12

I can convert a numpy ndarray to bytes using myndarray.tobytes(). Now how can I get it back to an ndarray?

Using the example from the .tobytes() method docs:

>>> x = np.array([[0, 1], [2, 3]])
>>> bytes = x.tobytes()
>>> bytes
b'\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00'
>>> np.some_magic_function_here(bytes)
array([[0, 1], [2, 3]])
asked Dec 4, 2017 at 16:27
2
  • 3
    np.frombuffer()? Commented Dec 4, 2017 at 16:32
  • 1
    Or np.fromstring. In both cases you'll also need to specify the dtype if it's not the default (float). Commented Dec 4, 2017 at 19:58

3 Answers 3

26

To deserialize the bytes you need np.frombuffer().
tobytes() serializes the array into bytes, and np.frombuffer() deserializes them.

Bear in mind that once the array is serialized, its shape and dtype information is lost. After deserialization you must therefore pass the original dtype to np.frombuffer() and reshape the result back to its original shape.

Below is a complete example:

import numpy as np
x = np.array([[0, 1], [2, 3]], np.int8)
bytes = x.tobytes()
# bytes is a raw buffer: it carries no information about the shape or dtype of x.
# Sanity check: 4 values of dtype int8 (one byte per item) should give 4 bytes.
assert len(bytes) == 4, "Ha??? Weird machine..."
deserialized_bytes = np.frombuffer(bytes, dtype=np.int8)
# Pass the shape positionally; the newshape= keyword is deprecated in recent NumPy releases.
deserialized_x = np.reshape(deserialized_bytes, (2, 2))
assert np.array_equal(x, deserialized_x), "Deserialization failed..."
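Since the shape and dtype have to travel separately from the raw bytes, one option is to bundle them together. The following is only a minimal sketch under that assumption; the helper names to_payload and from_payload and the dictionary layout are hypothetical, not part of NumPy.

```python
import numpy as np

def to_payload(arr):
    """Bundle the raw bytes with the metadata that tobytes() drops (hypothetical helper)."""
    return {"data": arr.tobytes(), "dtype": arr.dtype.str, "shape": arr.shape}

def from_payload(payload):
    """Rebuild the array from the bundle (hypothetical helper)."""
    flat = np.frombuffer(payload["data"], dtype=payload["dtype"])
    # frombuffer on a bytes object returns a read-only view; copy() makes it writable.
    return flat.reshape(payload["shape"]).copy()

x = np.array([[0, 1], [2, 3]], dtype=np.int8)
restored = from_payload(to_payload(x))
assert np.array_equal(x, restored)
assert restored.dtype == x.dtype and restored.shape == x.shape
```

tobytes() writes C order by default, which matches the default order used by reshape, so no extra order argument is needed here.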
answered Apr 22, 2019 at 18:52

1 Comment

Thanks. Just a minor quibble: shouldn't use bytes as a name because it is a built-in type.
10

After your edit it seems you are going in the wrong direction!

You can't use ndarray.tobytes() to store a complete array, including information such as shape and dtype, if the array must later be reconstructed from those bytes alone! It only saves the raw data (cell values), flattened in C or Fortran order.

We don't know your task, but you will need something based on serialization. There are tons of approaches; the easiest is the following, based on Python's pickle (the example is for Python 3):

import pickle
import numpy as np
x = np.array([[0, 1], [2, 3]])
print(x)
x_as_bytes = pickle.dumps(x)
print(x_as_bytes)
print(type(x_as_bytes))
y = pickle.loads(x_as_bytes)
print(y)

Output:

[[0 1]
 [2 3]]
 b'\x80\x03cnumpy.core.multiarray\n_reconstruct\nq\x00cnumpy\nndarray\nq\x01K\x00\x85q\x02C\x01bq\x03\x87q\x04Rq\x05(K\x01K\x02K\x02\x86q\x06cnumpy\ndtype\nq\x07X\x02\x00\x00\x00i8q\x08K\x00K\x01\x87q\tRq\n(K\x03X\x01\x00\x00\x00<q\x0bNNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00tq\x0cb\x89C \x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00q\rtq\x0eb.'
<class 'bytes'>
[[0 1]
 [2 3]]

A better alternative would be joblib's dump/load functions, which specialize pickling for large arrays. joblib's functions are file-object based and can also be used in memory with byte strings via Python's io.BytesIO.
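For readers who want the same in-memory, file-object pattern without an extra dependency, here is a sketch that swaps in NumPy's own np.save/np.load together with io.BytesIO. This is a different technique from joblib, shown only to illustrate the idea; the .npy format stores the dtype and shape alongside the data.

```python
import io
import numpy as np

x = np.array([[0, 1], [2, 3]])

# Serialize into an in-memory file object instead of a file on disk.
buffer = io.BytesIO()
np.save(buffer, x)  # writes a .npy header (dtype, shape, order) followed by the data
x_as_bytes = buffer.getvalue()

# Deserialize from the byte string; no dtype or reshape is needed.
y = np.load(io.BytesIO(x_as_bytes), allow_pickle=False)
assert np.array_equal(x, y)
assert y.shape == x.shape and y.dtype == x.dtype
```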

answered Dec 4, 2017 at 21:27


2

If you know the dimensions of the array you are recreating ahead of time, use numpy.ndarray(<dimensions>, <dataType>, <bytes (aka buffer)>):

import numpy
x = numpy.array([[1.0,1.1,1.2,1.3],[2.0,2.1,2.2,2.3],[3.0,3.1,3.2,3.3]],numpy.float64)
#array([[1. , 1.1, 1.2, 1.3],
# [2. , 2.1, 2.2, 2.3],
# [3. , 3.1, 3.2, 3.3]])
xBytes = x.tobytes()
#b'\x00\x00\x00\x00\x00\x00\xf0?\x9a\x99\x99\x99\x99\x99\xf1?333333\xf3?\xcd\xcc\xcc\xcc\xcc\xcc\xf4?\x00\x00\x00\x00\x00\x00\x00@\xcd\xcc\xcc\xcc\xcc\xcc\x00@\x9a\x99\x99\x99\x99\x99\x01@ffffff\x02@\x00\x00\x00\x00\x00\x00\x08@\xcd\xcc\xcc\xcc\xcc\xcc\x08@\x9a\x99\x99\x99\x99\x99\t@ffffff\n@'
newX = numpy.ndarray((3,4),numpy.float64,xBytes)
#array([[1. , 1.1, 1.2, 1.3],
# [2. , 2.1, 2.2, 2.3],
# [3. , 3.1, 3.2, 3.3]])
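One detail worth knowing: an array built on top of an immutable bytes object is read-only. A small sketch of that behavior, using made-up values and the hypothetical names raw, view and editable:

```python
import numpy as np

raw = np.arange(12, dtype=np.float64).reshape(3, 4).tobytes()

# The array wraps the bytes object directly, so it cannot be modified in place.
view = np.ndarray((3, 4), np.float64, raw)
assert not view.flags.writeable

try:
    view[0, 0] = 42.0
except ValueError as err:
    print("write rejected:", err)

# Copy when a writable array is needed.
editable = view.copy()
editable[0, 0] = 42.0
```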

Another approach applies if you have stored your data as records of bytes rather than as an entire ndarray, and the selection of records varies from one ndarray to the next. In that case you can accumulate the raw records in a Python bytearray. Once it has reached the desired size you know the required dimensions, and you can supply those dimensions and the dataType together with the bytearray as the buffer.
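A minimal sketch of that bytearray approach, assuming every record holds three float64 values; the record contents and the names buf, values_per_record and table are made up for illustration.

```python
import numpy as np

record_dtype = np.float64
values_per_record = 3  # assumed record layout: three float64 values

# Accumulate raw records; the final number of rows is not known up front.
buf = bytearray()
for i in range(4):
    record = np.array([i, i + 0.5, i + 0.25], dtype=record_dtype)
    buf += record.tobytes()

# Once all records are in, derive the dimensions from the buffer size.
record_size = values_per_record * np.dtype(record_dtype).itemsize
n_records = len(buf) // record_size
table = np.ndarray((n_records, values_per_record), record_dtype, buf)
print(table)
```

The resulting array shares memory with the bytearray rather than copying it, so call table.copy() if the buffer needs to keep growing afterwards.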

answered Feb 9, 2021 at 20:02

