Since I was getting a MemoryError while concatenating pandas DataFrames, I decided to write each DataFrame to a binary file in append mode and then read that binary file back to reconstruct the whole DataFrame.
However, I got 'ValueError: cannot create an OBJECT array from memory buffer'.
If all DataFrames have only numeric columns, this problem does not occur. However, if one of the columns is a string column (in my case, my DataFrames have many string columns), this ValueError pops up. Here is code to reproduce the situation. Uncomment #works1 or #works2 to see that there is no error, but using the DataFrame under #does not work raises the ValueError.
import os
import pandas as pd
import numpy as np

mtot = 0
if os.path.exists('df_all.bin'):
    os.remove('df_all.bin')
for i in range(2):
    #works1
    # df = pd.DataFrame(np.random.randint(100, size=(5, 2)))
    #works2
    # df = pd.DataFrame({'A':[1,2,3], 'B':[1,2,3], 'C':[1.0,2.0,3.0]})
    # df = df.astype(dtype={'A': int, 'B': int, 'C': float})
    #does not work
    df = pd.DataFrame({'A':[1,2,3], 'B':['sample1','sample2','sample3'], 'C':[1.0,2.0,3.0]})
    df = df.astype(dtype={'A': int, 'B': str, 'C': float})
    typ = df.values.dtype
    print('dtype: %s' % typ)
    with open('df_all.bin', 'ab') as f:
        m, n = df.shape
        mtot += m
        f.write(df.values.tobytes())
with open('df_all.bin', 'rb') as f:
    buffer = f.read()
nparray = np.frombuffer(buffer, dtype=typ)
data = nparray.reshape(mtot, n)
whole_df = pd.DataFrame(data=data, columns=list(range(n)))
print(whole_df)
print(whole_df.shape)
os.remove('df_all.bin')
How to get rid of this ValueError?
Thanks
1 Answer
My guess is that you're using Python 3, which by default treats all strings as Unicode. Unicode text is not trivially converted to binary, because a single character may occupy multiple bytes. (More precisely, a DataFrame with mixed column types gives you an object array from .values, and np.frombuffer cannot reconstruct an object array, since object arrays store pointers to Python objects rather than the data itself.)
So I think you should have a look at this post:
Python: convert string to byte array
to convert your string data into proper binary data.
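One way to apply that idea, sketched below under some assumptions: encode the string column into a fixed-width bytes field of a structured dtype before writing, so every row has the same byte length and np.frombuffer can round-trip it. The 'S7' width is an assumption that happens to fit the sample strings; with real data you would pick the maximum encoded length you need (longer values are silently truncated).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': ['sample1', 'sample2', 'sample3'],
                   'C': [1.0, 2.0, 3.0]})

# Structured dtype with fixed-width fields; 'S7' (7 bytes) is an
# assumed width that fits the sample strings.
rec_dtype = np.dtype([('A', 'i8'), ('B', 'S7'), ('C', 'f8')])

# Build a structured array: each row becomes one fixed-size record.
records = np.empty(len(df), dtype=rec_dtype)
records['A'] = df['A'].to_numpy()
records['B'] = df['B'].str.encode('utf-8').to_numpy()
records['C'] = df['C'].to_numpy()

raw = records.tobytes()                     # no object dtype involved
back = np.frombuffer(raw, dtype=rec_dtype)  # round-trips cleanly

restored = pd.DataFrame(back)
restored['B'] = restored['B'].str.decode('utf-8')
print(restored)
```

Because every record has a fixed size, bytes written this way can also be appended to a file across iterations and read back in one np.frombuffer call, which is what the original loop was trying to do.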