This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in Python's Developer Guide.
Created on 2013-10-13 12:00 by Пётр.Дёмин, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| uglyhack.c | Esa.Peuha, 2013-10-14 09:23 | test program in C |
| Messages (23) | |||
|---|---|---|---|
| msg199698 | Author: Пётр Дёмин (Пётр.Дёмин) | Date: 2013-10-13 12:00 | |
Taken from http://stackoverflow.com/a/19287553/135079
When I consume all memory:
Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = {}
>>> for k in xrange(1000000): a['a' * k] = k
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
>>> len(a)
64036
If we take the total length of the keys:
>>> log(sum(xrange(64036)), 2)
30.93316861532543
we get close to 32-bit integer overflow. After that,
>>> a = {}
frees all 2 GB of allocated memory (as shown in Task Manager), but executing:
>>> for k in xrange(1000000): a[k] = k
causes:
MemoryError
with a dictionary length of something like:
>>> len(a)
87382 |
|||
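The arithmetic behind that figure can be checked directly. The sketch below uses Python 3 syntax, with `math.log2` standing in for the `log(..., 2)` call in the report. It sums the key lengths 0..64035 and compares the total with 2**31 bytes. One assumption that is not stated in the report: each key character costs one byte, so the total only approximates the string payload held at the time of the first MemoryError.

```python
import math

KEYS_STORED = 64036  # len(a) reported after the first MemoryError

# Key k is 'a' * k, so the key lengths are 0, 1, ..., KEYS_STORED - 1.
total_chars = sum(range(KEYS_STORED))
bits = math.log2(total_chars)

# Assumption: one byte per character, so this is a fraction of 2 GiB.
fraction_of_2gib = total_chars / 2**31

print(total_chars)  # 2050272630
print(bits)
print(fraction_of_2gib)
```

The total comes to roughly 95% of 2 GiB, which is at least consistent with a 32-bit process running out of address space rather than with an integer overflow.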
| msg199730 | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2013-10-13 16:47 | |
My guess would be you are dealing with memory fragmentation issues, but I'll let someone more knowledgeable confirm that before closing the issue :) |
|||
| msg199813 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-13 22:05 | |
Here on 32-bit Windows Vista, with Python 3:
C:\Python33>python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = {}
>>> for k in range(1000000): a['a' * k] = k
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
>>> del a
And here too Task Manager shows that Python has given back close to 2GB of memory.
>>> a = {}
>>> for k in range(100000): a['a' * k] = k
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
And here Task Manager shows that there's tons of memory still available. sys._debugmallocstats() shows nothing odd after another "a = {}" - only 7 arenas are allocated, less than 2 MB.
Of course this has nothing to do with running in interactive mode. Same thing happens in a program (catching MemoryError, etc).
So best guess is that Microsoft's allocators have gotten fatally fragmented, but I don't know how to confirm/refute that.
It would be good to get some reports from non-Windows 32-bit boxes. If those are fine, then we can be "almost sure" it's a Microsoft problem.
|
|||
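The two-phase experiment above can also be written as a small program instead of an interactive session. The sketch below is one possible shape, not the script that was actually run. `fill_dict` and its parameters are hypothetical names, and the demo calls use a deliberately tiny limit so that the block is safe to run anywhere.

```python
def fill_dict(make_key, limit):
    """Insert make_key(k) -> k for k in range(limit).

    Returns (number of keys stored, whether MemoryError was hit).
    The dict is local, so it is released when the function returns.
    """
    d = {}
    hit_memory_error = False
    try:
        for k in range(limit):
            d[make_key(k)] = k
    except MemoryError:
        hit_memory_error = True
    return len(d), hit_memory_error


# Phase 1: keys are strings of ever-increasing length.
phase1 = fill_dict(lambda k: 'a' * k, 1000)
# Phase 2: keys are plain ints, after phase 1's dict has been freed.
phase2 = fill_dict(lambda k: k, 1000)
print(phase1, phase2)
```

To repeat the experiment from this report, the limit would be raised to 1000000 on a 32-bit Windows build. On a 64-bit build the first phase may instead try to consume all physical memory, so that is best attempted only on a disposable machine.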
| msg199814 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-10-13 22:10 | |
Works fine on a 32-bit Linux build (64-bit machine, though):
>>> import sys
>>> sys.maxsize
2147483647
>>> a = {}
>>> for k in range(1000000): a['a' * k] = k
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
>>> del a
>>> a = {}
>>> for k in range(1000000): a[k] = k
...
>>>
Note that Linux says the process eats 4GB RAM. |
|||
| msg199815 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-10-13 22:14 | |
The int type of Python 2 uses an internal "free list" of unlimited size. Once you have 1 million different integers alive at the same time, that memory is never released, even if the container storing all these integers is removed, because a reference is kept in the free list. This is a known issue in Python 2, solved "indirectly" in Python 3, because the "int" type of Python 3 does not use a free list. The long type of Python 2 does not use a free list either. |
|||
| msg199817 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-13 22:22 | |
haypo, there would only be a million ints here even if the loop had completed. That's trivial in context (maybe 14 MB for the free list in Python 2?). And note that I did my example run under Python 3. Besides, the OP and I both reported that Task Manager showed that Python did release "almost all" of the memory back to the OS. While the first MemoryError occurs when available memory has been truly exhausted, the second MemoryError occurs with way over a gigabyte of memory still "free" (according to Task Manager). Best guess is that it is indeed free, but so fragmented that MS C's allocator can't deal with it. That would not be unprecedented on Windows ;-) |
|||
| msg199857 | Author: Esa Peuha (Esa.Peuha) | Date: 2013-10-14 09:23 | |
> So best guess is that Microsoft's allocators have gotten fatally fragmented, but I don't know how to confirm/refute that.
Let's test this in pure C. Compile and run the attached uglyhack.c on win32; if it reports something significantly less than 100%, it's probably safe to conclude that this has nothing to do with Python. |
|||
| msg199866 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-10-14 11:12 | |
Python uses an allocator called "pymalloc". For allocations smaller than 512 bytes, it uses arenas of 256 KB. If you allocate many small objects and later release most of them (but not all!), the memory becomes fragmented. For allocations larger than 512 bytes, Python falls back to malloc/free. Replacing pymalloc with the Windows Low-fragmentation Heap allocator has been discussed. |
|||
| msg199936 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-14 17:56 | |
@haypo, this has nothing to do with PyMalloc. As I reported in my first message, only 7 PyMalloc arenas are in use at the end of the program, less than 2 MB total. *All* other arenas ever used were released to the OS. And that's not surprising. The vast bulk of the memory used in the test case isn't in small objects, it's in *strings* of ever-increasing size. Those are gotten by many calls to the system malloc(). |
|||
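A rough way to see this is to count how many of the test's string keys are small enough for pymalloc at all. The sketch below assumes CPython, uses `sys.getsizeof` as an approximation of each string's allocation size, and takes the 512-byte threshold quoted earlier in this thread. The exact counts depend on the build, so none are claimed here.

```python
import sys

PYMALLOC_THRESHOLD = 512  # bytes, as quoted earlier in this thread
KEYS_STORED = 64036       # len(a) from the original report

small_count = 0
small_bytes = 0
for k in range(KEYS_STORED):
    size = sys.getsizeof('a' * k)
    if size > PYMALLOC_THRESHOLD:
        break  # sizes only grow with k, so every later key is large too
    small_count += 1
    small_bytes += size

# Total characters across all keys (lengths 0 .. KEYS_STORED - 1).
total_payload = sum(range(KEYS_STORED))

print(small_count, "of", KEYS_STORED, "keys fit under the threshold")
print("small-key share of the payload: %.6f%%"
      % (100.0 * small_bytes / total_payload))
```

Only the first few hundred keys can fall under the threshold, so nearly all of the memory in this test comes from requests that bypass pymalloc.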
| msg199940 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-14 19:07 | |
@Esa.Peuha, fine idea! Alas, on the same box I used before, uglyhack.c displays (it varies a tiny amount from run to run):
65198
65145
99.918709%
So it's not emulating enough of Python's malloc()/free() behavior to trigger the same kind of problem :-( |
|||
| msg199941 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-10-14 19:09 | |
By the way, in Python 3.4 arena allocation is done using VirtualAlloc and VirtualFree, which may make a difference too. |
|||
| msg199943 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-14 19:22 | |
@pitrou, maybe, but seems very unlikely. As explained countless times already ;-), PyMalloc allocates few arenas in the test program. "Small objects" are relatively rare here. Almost all the memory is consumed by strings of ever-increasing length. PyMalloc passes those large requests on to the system malloc(). |
|||
| msg199944 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-10-14 19:25 | |
> @pitrou, maybe, but seems very unlikely. As explained countless times
> already ;-),
Indeed, a 32-bit counter would already have overflowed :-D
You're right that's very unlikely. |
|||
| msg199945 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-14 19:27 | |
Just to be sure, I tried under current default (3.4.0a3+). Same behavior. |
|||
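| msg199950 | Author: Richard Oudkerk (sbt) * (Python committer) | Date: 2013-10-14 20:59 | |
After running ugly_hack(), trying to malloc a largeish block (1MB) fails:
#include <assert.h>
#include <stdlib.h>

int ugly_hack(void); /* assumed prototype; defined in the attached uglyhack.c, not shown here */

int main(void)
{
    int first;
    void *ptr;

    /* Before the stress run, a 1 MB request is satisfied. */
    ptr = malloc(1024*1024);
    assert(ptr != NULL); /* succeeds */
    free(ptr);

    /* Run the stress routine from uglyhack.c. */
    first = ugly_hack();

    /* The same 1 MB request now fails. */
    ptr = malloc(1024*1024);
    assert(ptr != NULL); /* fails */
    free(ptr);
    return 0;
}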
|
|||
| msg199958 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-14 21:51 | |
@sbt, excellent! Happens for me too: trying to allocate a 1MB block fails after running ugly_hack() once. That fits the symptoms: lots of smaller, varying-sized allocations, followed by free()s, followed by a "largish" allocation. Don't know _exactly_ which largish allocation is failing. Could be the next non-trivial dict resize, or, because I'm running under Python 3, a largish Unicode string allocation. Unfortunately, using the current default-branch Python in a debug build, the original test case doesn't misbehave, so I can't be more specific. That could be because, in a debug build, Python does more of the memory management itself. Or at least it used to - everything got more complicated in my absence ;-) Anyway, since "the problem" has been produced with a simple pure C program, I think we need to close this as "wont fix". |
|||
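Why plenty of free memory can still fail a "largish" request is easier to see in a toy model. The sketch below is only an illustration of fragmentation in general, not a model of how the Windows heap works, and this thread never established the exact mechanism. It treats the address space as 1000 cells in which a single 1-cell allocation survives between each pair of freed 9-cell blocks. `largest_free_run` is a hypothetical helper.

```python
def largest_free_run(used):
    """Return the length of the longest run of free (False) cells."""
    best = run = 0
    for cell in used:
        run = 0 if cell else run + 1
        best = max(best, run)
    return best


SPACE = 1000
used = [False] * SPACE
for start in range(0, SPACE, 10):
    # One small allocation that is never freed; the nine cells after it
    # stand for blocks that were allocated and then freed again.
    used[start] = True

free_cells = used.count(False)
biggest_hole = largest_free_run(used)
REQUEST = 50  # a "largish" contiguous request, in cells

print(free_cells, biggest_hole, biggest_hole >= REQUEST)
```

In this model 90% of the space is free, yet no hole is larger than 9 cells, so a 50-cell request cannot be placed anywhere.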
| msg199960 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-10-14 22:19 | |
> Anyway, since "the problem" has been produced with a simple pure C program, I think we need to close this as "wont fix".
Can someone try the low fragmentation allocator? |
|||
| msg199961 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-10-14 22:22 | |
I tried jemalloc on Linux, which behaves better than (g)libc in terms of RSS and VMS memory. I know that Firefox uses it on Windows (and maybe also Mac OS X). It may be interesting to try it and/or provide something to use it easily. |
|||
| msg199967 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-14 23:38 | |
@haypo, I'm not sure what you mean by "the low fragmentation allocator". If it's referring to this: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366750(v=vs.85).aspx it doesn't sound all that promising for this failing case. But, sure, someone should try it ;-) |
|||
| msg199968 | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2013-10-14 23:46 | |
BTW, everything I've read (including the MSDN page I linked to) says that the LFH is enabled _by default_ starting in Windows Vista (which I happen to be using). So unless Python does something to _disable_ it (I don't know), there's nothing to try here. |
|||
| msg199982 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-10-15 08:23 | |
Tim> http://msdn.microsoft.com/en-us/library/windows/desktop/aa366750(v=vs.85).aspx
Yes, this one.
Tim> BTW, everything I've read (including the MSDN page I linked to) says that the LFH is enabled _by default_ starting in Windows Vista (which I happen to be using). So unless Python does something to _disable_ it (I don't know), there's nothing to try here.
Extract of the link: "To enable the LFH for a heap, use the GetProcessHeap function to obtain a handle to the default heap of the calling process, or use the handle to a private heap created by the HeapCreate function. Then call the HeapSetInformation function with the handle."
It should be enabled explicitly. |
|||
| msg199983 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-10-15 08:26 | |
> It should be enabled explicitly.
Victor, please read your own link before posting:
"""The information in this topic applies to Windows Server 2003 and Windows XP. Starting with Windows Vista, the system uses the low-fragmentation heap (LFH) as needed to service memory allocation requests. Applications do not need to enable the LFH for their heaps. """ |
|||
| msg199984 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-10-15 08:33 | |
> Victor, please read your own link before posting:
Oh. I missed this part, that's why I didn't understand Tim's remark.
So the issue comes from the Windows heap allocator. I don't see any obvious improvement that Python can make to its memory usage. I'm closing the issue.
You have to modify your application to allocate objects differently, to manually limit the fragmentation of the heap. Another option, maybe more complex, is to create a subprocess to process the data, and destroy the process to release the memory. multiprocessing helps to implement that.
I will maybe try jemalloc on Windows, but I prefer to open a new issue if I find something interesting. |
|||
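The subprocess workaround suggested in the closing message can be sketched as follows. This is a minimal illustration under stated assumptions, not a recommended implementation: it uses the standard `subprocess` module rather than `multiprocessing` to stay self-contained, and the `WORKER` script and `count_keys_in_child` are hypothetical names. The memory-hungry work happens in a child interpreter, only a small summary is sent back, and the child's whole heap goes away when it exits.

```python
import subprocess
import sys

# Hypothetical worker: builds the big dict in the child process and
# prints only the number of keys it managed to store.
WORKER = """
import sys
n = int(sys.argv[1])
d = {}
try:
    for k in range(n):
        d['a' * k] = k
except MemoryError:
    pass
print(len(d))
"""


def count_keys_in_child(n):
    """Run WORKER in a fresh interpreter and return the key count."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER, str(n)],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Small demo value; the parent's heap is untouched by the child's work.
print(count_keys_in_child(1000))
```

Because each call starts from a fresh process, any fragmentation left behind by one job cannot affect the next one, at the cost of process start-up time and of having to pass results back explicitly.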
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:51 | admin | set | github: 63445 |
| 2013-10-15 08:53:48 | pitrou | set | resolution: fixed -> rejected |
| 2013-10-15 08:33:28 | vstinner | set | title: freeing then reallocating lots of memory fails under Windows -> high fragmentation of the memory heap on Windows |
| 2013-10-15 08:33:01 | vstinner | set | status: open -> closed resolution: fixed messages: + msg199984 |
| 2013-10-15 08:26:18 | pitrou | set | messages: + msg199983 title: freeing then reallocating lots of memory fails under Windows -> freeing then reallocating lots of memory fails under Windows |
| 2013-10-15 08:23:22 | vstinner | set | messages: + msg199982 |
| 2013-10-14 23:46:05 | tim.peters | set | messages: + msg199968 |
| 2013-10-14 23:38:08 | tim.peters | set | messages: + msg199967 |
| 2013-10-14 22:22:52 | vstinner | set | messages: + msg199961 |
| 2013-10-14 22:19:28 | vstinner | set | messages: + msg199960 |
| 2013-10-14 21:51:38 | tim.peters | set | messages: + msg199958 |
| 2013-10-14 20:59:15 | sbt | set | nosy: + sbt messages: + msg199950 |
| 2013-10-14 19:28:33 | brian.curtin | set | nosy: - brian.curtin |
| 2013-10-14 19:27:28 | tim.peters | set | messages: + msg199945 |
| 2013-10-14 19:25:28 | pitrou | set | messages: + msg199944 |
| 2013-10-14 19:22:34 | tim.peters | set | messages: + msg199943 |
| 2013-10-14 19:09:01 | pitrou | set | messages: + msg199941 |
| 2013-10-14 19:07:29 | tim.peters | set | messages: + msg199940 |
| 2013-10-14 17:56:25 | tim.peters | set | messages: + msg199936 |
| 2013-10-14 11:53:29 | pitrou | set | title: GC does not really free up memory in console -> freeing then reallocating lots of memory fails under Windows |
| 2013-10-14 11:12:37 | vstinner | set | messages: + msg199866 |
| 2013-10-14 09:23:40 | Esa.Peuha | set | files: + uglyhack.c nosy: + Esa.Peuha messages: + msg199857 |
| 2013-10-13 22:22:58 | tim.peters | set | messages: + msg199817 |
| 2013-10-13 22:14:10 | vstinner | set | nosy: + vstinner messages: + msg199815 |
| 2013-10-13 22:10:25 | pitrou | set | nosy: + pitrou messages: + msg199814 |
| 2013-10-13 22:05:18 | tim.peters | set | messages: + msg199813 versions: + Python 3.4 |
| 2013-10-13 16:47:23 | r.david.murray | set | nosy: + r.david.murray messages: + msg199730 |
| 2013-10-13 15:12:57 | pitrou | set | nosy: + tim.peters, tim.golden, brian.curtin |
| 2013-10-13 12:00:01 | Пётр.Дёмин | create | |