
While trying to track down some memory leaks in the Python bindings for some C/C++ functions, I came across some strange behavior pertaining to the garbage collection of NumPy arrays.

I have created a couple of simplified cases to better illustrate the behavior. The code was run under memory_profiler, whose output follows immediately after. It appears that Python's garbage collection is not working as expected when it comes to NumPy arrays:

# File deallocate_ndarray.py
@profile
def ndarray_deletion():
    import numpy as np
    from gc import collect
    buf = 'abcdefghijklmnopqrstuvwxyz' * 10000
    arr = np.frombuffer(buf)
    del arr
    del buf
    collect()
    y = [i**2 for i in xrange(10000)]
    del y
    collect()

if __name__ == '__main__':
    ndarray_deletion()

I invoked memory_profiler with the following command:

python -m memory_profiler deallocate_ndarray.py

This is what I got:

Filename: deallocate_ndarray.py
Line #    Mem usage    Increment   Line Contents
================================================
     5   10.379 MiB    0.000 MiB   @profile
     6                             def ndarray_deletion():
     7   17.746 MiB    7.367 MiB       import numpy as np
     8   17.746 MiB    0.000 MiB       from gc import collect
     9   17.996 MiB    0.250 MiB       buf = 'abcdefghijklmnopqrstuvwxyz' * 10000
    10   18.004 MiB    0.008 MiB       arr = np.frombuffer(buf)
    11   18.004 MiB    0.000 MiB       del arr
    12   18.004 MiB    0.000 MiB       del buf
    13   18.004 MiB    0.000 MiB       collect()
    14   18.359 MiB    0.355 MiB       y = [i**2 for i in xrange(10000)]
    15   18.359 MiB    0.000 MiB       del y
    16   18.359 MiB    0.000 MiB       collect()

I don't understand why even the forced calls to collect() fail to reduce the program's memory usage. Moreover, even if NumPy arrays behave unusually because of the underlying C buffers, why doesn't the list (which is pure Python) get garbage collected?
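For reference, in CPython gc.collect() only reclaims reference cycles; objects like buf, arr, and y are non-cyclic, so they are destroyed by reference counting the moment del drops their count to zero, leaving nothing for collect() to do. A minimal sketch of the one case where collect() actually matters (assuming standard CPython semantics):

import gc

class Node(object):
    pass

a, b = Node(), Node()
a.other, b.other = b, a   # create a reference cycle
del a, b                  # refcounts never reach zero: the cycle keeps both alive
print gc.collect()        # non-zero: only here does the cycle collector reclaim them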

I know that del does not directly call the underlying __del__ method, but you will note that every del statement in the code ends up reducing the reference count of the corresponding object to zero (thereby making it eligible for garbage collection, AFAIK). Typically, I would expect to see a negative entry in the Increment column when an object is collected. Can anyone shed some light on what is going on here?
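One way to confirm that the del statements really do destroy the objects immediately, independent of what the profiler reports, is a weak reference (a minimal sketch under the same Python 2 / NumPy setup as above; ndarray supports weak references):

import weakref
import numpy as np

buf = 'abcdefghijklmnopqrstuvwxyz' * 10000
arr = np.frombuffer(buf)
ref = weakref.ref(arr)
del arr
print ref()   # None: the array object was destroyed by del alone, no collect() needed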

NOTE: This test was run on OS X 10.10.4, Python 2.7.10 (conda), NumPy 1.9.2 (conda), memory_profiler 0.33 (conda-binstar), psutil 2.2.1 (conda).

asked Jul 15, 2015 at 5:29

1 Answer


In order to see the memory garbage collected, I had to increase the size of buf by several orders of magnitude. Maybe the allocation is too small for memory_profiler to detect the change (it queries the OS, so measurements are not very precise), or maybe it's too small for the Python garbage collector to care, I don't know.
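A rough sketch of this size effect using psutil directly (already part of the setup listed in the question's NOTE; the exact numbers and the threshold at which freed memory is returned to the OS are platform- and allocator-dependent):

import os
import psutil

proc = psutil.Process(os.getpid())

def rss_mib():
    # resident set size as the OS reports it (what memory_profiler samples)
    return proc.memory_info().rss / (1024.0 * 1024.0)

big = 'abcdefghijklmnopqrstuvwxyz' * 100000000   # ~2.4 GiB; needs enough free RAM
print 'after big alloc:   %.1f MiB' % rss_mib()
del big
print 'after big del:     %.1f MiB' % rss_mib()  # drops: large block handed back to the OS

small = 'abcdefghijklmnopqrstuvwxyz' * 10000     # ~260 KiB
print 'after small alloc: %.1f MiB' % rss_mib()
del small
print 'after small del:   %.1f MiB' % rss_mib()  # often unchanged: block kept by the allocator for reuse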

For example, replacing the factor 10000 with 100000000 in the construction of buf yields:

Line #    Mem usage     Increment   Line Contents
================================================
    21    10.289 MiB     0.000 MiB   @profile
    22                               def ndarray_deletion():
    23    17.309 MiB     7.020 MiB       import numpy as np
    24    17.309 MiB     0.000 MiB       from gc import collect
    25  2496.863 MiB  2479.555 MiB       buf = 'abcdefghijklmnopqrstuvwxyz' * 100000000
    26  2496.867 MiB     0.004 MiB       arr = np.frombuffer(buf)
    27  2496.867 MiB     0.000 MiB       del arr
    28    17.312 MiB -2479.555 MiB       del buf
    29    17.312 MiB     0.000 MiB       collect()
    30    17.719 MiB     0.406 MiB       y = [i**2 for i in xrange(10000)]
    31    17.719 MiB     0.000 MiB       del y
    32    17.719 MiB     0.000 MiB       collect()

answered Aug 4, 2015 at 16:37
