
I'm unit testing code in Python 2.7 that writes numpy arrays via ndarray.tofile(fileHandle, ...). Since doing file I/O in unit tests is bad for a number of reasons, how do I substitute an in-memory byte stream for the file handle? (io.BytesIO failed to work because ndarray.tofile() asks it for a file name.)

ali_m
asked Jul 14, 2015 at 17:20

2 Answers


Shouldn't tobytes [1] and frombuffer [2] do what you need for testing purposes?

import numpy as np

m = np.random.rand(5, 3)                                # array to serialize
b = m.tobytes()                                         # raw bytes, no file I/O
mb = np.frombuffer(b, dtype=m.dtype).reshape(m.shape)   # reconstruct from the bytes
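For example, a self-contained test built around that idea might look like the sketch below; serialize() here is a hypothetical stand-in for whatever code you are actually testing:

import unittest
import numpy as np

def serialize(arr):
    # Hypothetical stand-in for the code under test: return the raw
    # bytes that would otherwise be written out with tofile().
    return arr.tobytes()

class SerializeTest(unittest.TestCase):
    def test_round_trip(self):
        m = np.random.rand(5, 3)
        b = serialize(m)
        mb = np.frombuffer(b, dtype=m.dtype).reshape(m.shape)
        np.testing.assert_array_equal(m, mb)

if __name__ == '__main__':
    unittest.main()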
answered Jul 15, 2015 at 3:04

1 Comment

That will work, assuming tofile() doesn't deviate from tobytes(). This is the best answer for the current state of numpy. It's unfortunate that tofile() doesn't accept a stream, so there's no way to unit test the tofile() API directly.
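For what it's worth, a quick sanity check along those lines could look like the sketch below; it uses a temporary file purely to exercise tofile(), and assumes a C-contiguous array written in binary mode (no sep argument):

import tempfile
import numpy as np

# Sanity check, not a unit test: for a C-contiguous array written in
# binary mode, tofile() and tobytes() should produce identical bytes.
a = np.random.rand(4, 7)
with tempfile.TemporaryFile() as f:
    a.tofile(f)
    f.seek(0)
    assert f.read() == a.tobytes()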

Would a tempfile.TemporaryFile suit your purposes?

It exposes the same interface as a normal file object, so you can pass it directly to np.ndarray.tofile(), and it will be deleted immediately when it is either explicitly closed or garbage collected:

import numpy as np
from tempfile import TemporaryFile

x = np.random.randn(1000)
with TemporaryFile() as t:
    x.tofile(t)
    # do your testing...
# t is closed and deleted

It will, however, reside temporarily on disk (usually in /tmp/ on a Linux machine), but I don't see an easy way to avoid I/O altogether, since .tofile() will ultimately need a valid OS-level file descriptor.
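If the test also needs to verify what was written, one option (an illustrative addition, not part of the question) is to rewind the handle and read the data back inside the same with block:

import numpy as np
from tempfile import TemporaryFile

x = np.random.randn(1000)
with TemporaryFile() as t:
    x.tofile(t)
    t.seek(0)                          # rewind before reading back
    y = np.fromfile(t, dtype=x.dtype)  # read the raw binary data back
np.testing.assert_array_equal(x, y)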

answered Jul 14, 2015 at 19:58

6 Comments

Building automated unit tests around file I/O introduces race conditions that make tests behave non-deterministically. If you add sleeps to ensure asynchronous file I/O has finished, you end up with slow unit tests, and slow tests don't scale to hundreds of unit tests that need to run in a few seconds. What you suggest is perfectly fine for a few system tests, but that's not what I'm doing.
It would be helpful if you could provide a bit more information about your requirements. How much data are you writing? Do you need to be able to read it back? What sort of race conditions are you concerned about? Do you absolutely need to use tofile?
I want to test an application that uses numpy. The application creates files. I need to write out anywhere from a few bytes to a few KB and confirm that the correct bytes are being produced. To write automated tests that behave consistently across hardware, I want to check the in-memory output stream before it is written to a file. This last part I'm finding difficult: although the API docs for ndarray.tofile() say it takes a file handle argument or a handle to a stream, it doesn't handle the BytesIO object I'm passing in. I fear that, as of now, it requires a real file handle. :-)
I'm looking at ndarray.tobytes() and doing some tests to see if I can assume tobytes() mirrors what tofile() writes out. If that works, I'll figure something out; maybe use it for mocking (see the sketch after these comments).
FWIW, here's the relevant method in the C source, which in turn calls npy_PyFile_Dup2 (note the use of os.dup to duplicate a file descriptor). This is all going on at C level, so I don't see an easy way to fake an open file via a Python object.
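Following up on the mocking idea above, one possible sketch is to patch the application's own writer seam and assert on tobytes(), so no file is ever touched. write_array() and produce_output() below are hypothetical stand-ins for the application code:

import numpy as np
try:
    from unittest import mock      # Python 3
except ImportError:
    import mock                    # Python 2.7: pip install mock

def write_array(arr, path):
    # Hypothetical production writer: the only code that touches the disk.
    with open(path, 'wb') as fh:
        arr.tofile(fh)

def produce_output(path):
    # Hypothetical production logic: compute an array, hand it to the writer.
    data = np.arange(6, dtype=np.float64)
    write_array(data, path)

def test_produce_output_bytes():
    # Replace the writer so no file is created, then assert on the bytes
    # the captured array would have written.
    with mock.patch(__name__ + '.write_array') as fake_write:
        produce_output('ignored.bin')
    (arr, _path), _kwargs = fake_write.call_args
    assert arr.tobytes() == np.arange(6, dtype=np.float64).tobytes()

test_produce_output_bytes()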