High memory usage in Python

Question 1

The following simple Python code:

class Node:
    NumberOfNodes = 0
    def __init__(self):
        Node.NumberOfNodes += 1
if __name__ == '__main__':
    nodes = []
    for i in xrange(1, 7 * 1000 * 1000):
        if i % 1000 == 0:
            print i
        nodes.append(Node())

takes gigabytes of memory, which I think is unreasonable. Is that normal in Python?

How can I fix that? (In my original code, I have about 7 million objects, each with 10 fields, and that takes 8 GB of RAM.)

Question 2

Can we see the original code?

Question 3

So each object takes approximately 1 KB of memory. What kind of fields are they?

Question 4

I just simplified that. There are 10 other integer fields, all of which I initialize to zero. I don't think that matters much. The same code in C should take about 4 * 7000000 bytes (< 30 MB).
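The figures quoted in this thread can be sanity-checked with a little arithmetic. The sketch below assumes 4-byte integers, as the comment above does, and takes the 7 million objects, 10 fields, and 8 GB from the question. The variable names are only illustrative.

```python
# Back-of-the-envelope check of the numbers quoted in the question.
# Assumption: a C int occupies 4 bytes, as in the estimate above.
NUM_OBJECTS = 7 * 1000 * 1000
INT_BYTES = 4
NUM_FIELDS = 10

one_field_bytes = NUM_OBJECTS * INT_BYTES        # a single int per object
ten_fields_bytes = one_field_bytes * NUM_FIELDS  # ten ints per object

# Observed usage reported in the question: about 8 GB in total.
observed_bytes = 8 * 1024 ** 3
bytes_per_object = observed_bytes / float(NUM_OBJECTS)

print("one int field per object: %.0f MB" % (one_field_bytes / 1e6))
print("ten int fields per object: %.0f MB" % (ten_fields_bytes / 1e6))
print("observed bytes per Python object: %.0f" % bytes_per_object)
```

Under these assumptions, a single integer field per object stays under 30 MB, ten fields come to a few hundred megabytes, and the observed usage works out to roughly 1 KB per object.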

Question 5

If you have a fixed number of fields, then you can use __slots__ to save quite a lot of memory. Note that __slots__ does have some limitations, so make sure you read the Notes on using __slots__ carefully before choosing to use it in your application:

>>> import sys
>>> class Node(object):
...     NumberOfNodes = 0
...     def __init__(self):
...         Node.NumberOfNodes += 1
...
>>> n = Node()
>>> sys.getsizeof(n)
64
>>> class Node(object):
...     __slots__ = ()
...     NumberOfNodes = 0
...     def __init__(self):
...         Node.NumberOfNodes += 1
...
>>> n = Node()
>>> sys.getsizeof(n)
16
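The session above uses an empty __slots__ because the simplified class has no instance attributes. For a class that does carry fields, each one has to be named in __slots__. The following is a minimal sketch, assuming ten integer fields as described in the question. The field names f0 to f9 and the class names PlainNode and SlottedNode are hypothetical, and exact sizes depend on the Python version and platform, so none are shown.

```python
import sys

FIELD_NAMES = tuple("f%d" % i for i in range(10))  # hypothetical field names


class PlainNode(object):
    """Ordinary class: attributes live in a per-instance __dict__."""

    def __init__(self):
        for name in FIELD_NAMES:
            setattr(self, name, 0)


class SlottedNode(object):
    """Same fields, stored in fixed slots with no per-instance __dict__."""

    __slots__ = FIELD_NAMES

    def __init__(self):
        for name in FIELD_NAMES:
            setattr(self, name, 0)


plain = PlainNode()
slotted = SlottedNode()

# getsizeof does not follow references, so add the instance dict by hand.
plain_size = sys.getsizeof(plain) + sys.getsizeof(vars(plain))
slotted_size = sys.getsizeof(slotted)

print("plain instance + __dict__: %d bytes" % plain_size)
print("slotted instance: %d bytes" % slotted_size)
```

One trade-off of this design is that a slotted instance rejects attributes that are not listed in __slots__, which is among the limitations the answer refers to.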

Question 6

Thanks very much. I tried that and memory usage dropped to about 1.5 GB (originally 8 GB). But as I said in the comments, the equivalent C program should take about 300 MB, which means Python uses 5 times more memory.

Question 7

The size of an instance of the normal class should really include the size of its __dict__ as well, e.g. sys.getsizeof(n) + sys.getsizeof(vars(n)). getsizeof doesn't count the size of the dict because it's a separate object.
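A small helper makes that accounting explicit. This is only a sketch: the function name shallow_instance_size is made up for illustration, and it counts just the instance and its attribute dict, not the objects the attributes refer to.

```python
import sys


def shallow_instance_size(obj):
    """Size of obj plus its instance __dict__, if it has one.

    Hypothetical helper: referenced attribute values are not counted.
    """
    size = sys.getsizeof(obj)
    instance_dict = getattr(obj, "__dict__", None)
    if instance_dict is not None:
        size += sys.getsizeof(instance_dict)
    return size


class Node(object):
    def __init__(self):
        self.value = 0


n = Node()
print("instance alone: %d bytes" % sys.getsizeof(n))
print("instance + __dict__: %d bytes" % shallow_instance_size(n))
```

For an object without a __dict__, such as an instance of a class that defines __slots__, the helper simply returns what sys.getsizeof reports.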

Question 8

@Farzam Well, in Python everything is an object, including classes, integers, etc. That's not the case in C/C++.

Question 9

@AshwiniChaudhary I know, but even if you use a class in C++, the difference is not nearly as noticeable as it is with Python. So what does Python store in this extra memory?
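One way to see part of the answer is to measure a single integer. In CPython, every object carries a header (a reference count and a pointer to its type) in addition to its payload, and an instance attribute holds a pointer to a separate integer object rather than the raw value. The sketch below only prints what the running interpreter reports. Exact numbers vary by version and platform, so none are given here.

```python
import struct
import sys

# Size of a raw C int on this platform (typically 4 bytes).
c_int_bytes = struct.calcsize("i")

# Size of a pointer, which is what an attribute slot actually holds.
pointer_bytes = struct.calcsize("P")

# Size of a full Python integer object, header included.
py_int_bytes = sys.getsizeof(12345)

print("C int: %d bytes" % c_int_bytes)
print("pointer: %d bytes" % pointer_bytes)
print("Python int object: %d bytes" % py_int_bytes)
```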

Question 10

@Dunes Oh, yes! And the size of __slots__ in the case of the second one.

Question 11

Python is an inherently memory-heavy programming language. There are some ways you can get around this. __slots__ is one way. Another, more extreme approach is to use numpy to store your data. You can use numpy to create a structured array or record -- a complex data type that uses minimal memory, but suffers a substantial loss of functionality compared to a normal Python class. That is, you are working with the numpy array class rather than your own class, so you cannot define your own methods on your array.

import numpy as np
# data type for a record with three 32-bit ints called x, y and z
dtype = [(name, np.int32) for name in 'xyz']
arr = np.zeros(1000, dtype=dtype)
# access member of x of a record
arr[0]['x'] = 1 # name based access
# or
assert arr[0][0] == 1 # index based access
# accessing all x members of records in array
assert arr['x'].sum() == 1
# size of array used to store elements in memory
assert arr.nbytes == 12000 # 1000 elements * 3 members * 4 bytes per int

See more here.

Accepted answer (the __slots__ answer above) by Ashwini Chaudhary · 2015-01-07 23:50:49Z

edited Jan 7, 2015 at 23:57

answered Jan 7, 2015 at 23:50


Comments on the accepted answer, in order (author, timestamp):

Farzam, 2015-01-08T00:10:28.643Z
Dunes, 2015-01-08T00:13:35.087Z
Ashwini Chaudhary, 2015-01-08T00:23:07.587Z
Farzam, 2015-01-08T00:26:49.567Z
Ashwini Chaudhary, 2015-01-08T00:39:11.247Z
