I have a CSV file with ~50,000 rows and 300 columns. The following operation raises a memory error in pandas (Python):

merged_df.stack(0).reset_index(1)

The data frame looks like:

GRID_WISE_MW1  Col0  Col1  Col2  ...  Col300
7228260        1444  1819  2042
7228261        1444  1819  2042

I am using the latest pandas (0.13.1), and the error does not occur with DataFrames that have fewer rows (~2,000).

thanks!

asked Apr 21, 2014 at 20:01
  • That wouldn't help here because I am using merged_df.stack(0).reset_index(1) in a pandas.merge operation.... Commented Apr 21, 2014 at 20:54

2 Answers

On my 64-bit Linux machine (32 GB of RAM), this takes a little less than 2 GB of memory.

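# Assumes earlier cells ran: import numpy as np; from pandas import DataFrame
# %memit comes from the memory_profiler extension (%load_ext memory_profiler)
In [5]: def f():
   ...:     df = DataFrame(np.random.randn(50000, 300))
   ...:     df.stack().reset_index(1)
   ...:

In [6]: %memit f()
maximum of 1: 1791.054688 MB per loop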

You didn't specify your platform. This won't work on 32-bit at all (you usually can't allocate a 2 GB contiguous block), but it should work on 64-bit if you have a reasonable amount of memory / swap.
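To put the measured figure in context, here is a rough back-of-the-envelope sketch of the raw array sizes involved. It assumes 8-byte values (float64 data, int64 labels) and a hypothetical long-form layout of three 8-byte entries per value (row label, column label, value); the actual pandas internals may differ, and the gap between this estimate and the measured peak presumably comes from intermediate copies made during `stack()` and `reset_index()`.

```python
# Back-of-the-envelope size estimate; all layout assumptions are illustrative.
ROWS, COLS = 50_000, 300
BYTES_PER_VALUE = 8  # assumption: float64 data and int64 labels


def to_mb(n_bytes):
    """Convert bytes to decimal megabytes."""
    return n_bytes / 1_000_000


n_values = ROWS * COLS

# Wide frame: one 8-byte value per cell.
wide_bytes = n_values * BYTES_PER_VALUE

# Long (stacked) frame, assumed layout: row label + column label + value.
long_bytes = 3 * n_values * BYTES_PER_VALUE

print(f"cells:      {n_values:,}")
print(f"wide frame: {to_mb(wide_bytes):.0f} MB")
print(f"long frame: {to_mb(long_bytes):.0f} MB (before any temporary copies)")
```

Under these assumptions the final result alone is about three times the size of the input, so a peak several times larger again during the operation is plausible.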

answered Apr 21, 2014 at 23:30

3 Comments

Ahh, I am using Windows 7 64 bit, 8 GB RAM, but my pandas is 32 bit, could that be the issue?
Yep; you can install 64-bit Python (and all packages), or use conda to do so. 32-bit has a 4 GB addressable limit, but Python requires contiguous memory, so that's too big to stack reliably. In my experience 32-bit has issues with anything > 1 GB; 64-bit scales with no problem, however.
@Jeff Thanks for the remark! I've been fighting with pandas for a good week to load only ~400 MB of data into one DataFrame, when a list of smaller DataFrame instances, for the same total amount, can be loaded without a problem, and your explanation is surely the answer: I'm using 32-bit Python, as our OSs at work are stuck on 32-bit Windows. :-/
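Since the comments above turn on whether the Python interpreter itself is 32-bit or 64-bit (independent of the operating system), a quick way to check is sketched below. It uses only the standard library; the helper name `interpreter_bits` is made up for this illustration.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running Python interpreter in bits."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


bits = interpreter_bits()
# A second, independent check: sys.maxsize exceeds 2**32 only on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print(f"Python {sys.version.split()[0]} is a {bits}-bit build")
if not is_64bit:
    print("32-bit build: large contiguous allocations are likely to fail")
```

A 32-bit interpreter on a 64-bit OS still reports 32 here, which is the situation described in the first comment.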
As an alternative approach, you can use the Dask library, e.g.:

# Dask DataFrames implement a large subset of the pandas API
import dask.dataframe as dd
df = dd.read_csv('s3://.../2018-*-*.csv')
answered Apr 17, 2018 at 11:17
