I have a CSV file of around 800 MB which I'm trying to load into a dataframe via pandas, but I keep getting a memory error. I need to load it so I can join it to another, smaller dataframe.
Why am I getting a memory error even though I'm running 64-bit Windows and 64-bit Python 3.4, and have over 8 GB of RAM and plenty of hard disk space? Is this a bug in pandas? How can I solve this memory issue?
asked Jun 15, 2016 at 13:03
Nickpick
- Possible duplicate of Memory error when using pandas read_csv – hashcode55, Jun 15, 2016 at 13:36
- I knew the answer, but I forgot. – piRSquared, Jun 15, 2016 at 15:44
- You already have two questions about this: here and here. Stop reposting. – Noelkd, Jun 16, 2016 at 16:00
1 Answer
Reading your CSV in chunks might help:
import pandas as pd

# Read the file 100,000 rows at a time and stitch the chunks back together
chunk_size = 10**5
df = pd.concat([chunk for chunk in pd.read_csv(filename, chunksize=chunk_size)],
               ignore_index=False)
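If memory is still tight after chunking, telling read_csv which columns to load and which dtypes to use usually shrinks the result considerably. A minimal sketch, assuming hypothetical column names ('key', 'value') and placeholder dtypes that you would replace with your actual schema:

import pandas as pd

# Load only the columns needed for the join, with explicit, smaller dtypes.
# 'key', 'value' and the dtypes below are placeholders for your real schema.
reader = pd.read_csv(filename,
                     usecols=['key', 'value'],
                     dtype={'key': 'int32', 'value': 'float32'},
                     chunksize=10**5)
df = pd.concat(reader, ignore_index=False)

# Check how much memory the resulting dataframe actually uses
df.info(memory_usage='deep')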
answered Jun 15, 2016 at 20:51
MaxU - stand with Ukraine
4 Comments
Nickpick
That might help, but it won't solve the problem that the merge will kill it. Pandas is incredibly wasteful with memory.
MaxU - stand with Ukraine
@nickpick, so what is your problem: reading the 800 MB CSV file or merging your DF with another, smaller one?
Nickpick
Both are causing problems. The fact that reading in chunks and concatenating makes a difference at all points to issues in pandas.
MaxU - stand with Ukraine
@nickpick, did you try to read your CSV file in chunks? If yes, what does df.info() show after it's done?
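Since the thread above is really about the merge rather than the read, here is a minimal sketch of merging chunk by chunk so the full 800 MB frame never has to exist on its own. It assumes the smaller dataframe (small_df) already fits in memory and that the join column is named 'key'; both names are placeholders:

import pandas as pd

# small_df is assumed to already be loaded; 'key' is a placeholder column name.
pieces = []
for chunk in pd.read_csv(filename, chunksize=10**5):
    # Merge each chunk against the small dataframe and keep only the joined rows.
    pieces.append(chunk.merge(small_df, on='key', how='inner'))

# Combine the per-chunk results; this is usually much smaller than the raw CSV
# if the inner join discards non-matching rows.
result = pd.concat(pieces, ignore_index=True)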