Issue 1092502: Memory leak in socket.py on Mac OS X

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/41374

classification

Type:	Stage:
Title:	Memory leak in socket.py on Mac OS X
Components:	Library (Lib)	Versions:

process

Dependencies:	Superseder:
Status:	closed	Resolution:	fixed
Assigned To:	Nosy List:	a_lauer, akuchling, bacchusrx, bob.ippolito, christian.heimes, gregory.p.smith, martey, mhammond, schmir, vila
Priority:	normal	Keywords:

Created on 2004-12-29 02:09 by bacchusrx, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
example.py	bacchusrx, 2004-12-29 02:09	example.py
example2.py	bacchusrx, 2005-01-01 23:02	example client
server.pl	bacchusrx, 2005-01-01 23:03	example server

Messages (17)
msg23834	Author: bacchusrx (bacchusrx)	Date: 2004-12-29 02:09
Some part of socket.py leaks memory on Mac OS X 10.3 (both with the Python 2.3 that ships with the OS and with Python 2.4).

I encountered the problem in John Goerzen's offlineimap. Transfers of messages over a certain size would cause the program to bail with malloc errors, e.g.

* malloc: vm_allocate(size=5459968) failed (error code=3)
* malloc[13730]: error: Can't allocate region

Inspecting the process as it runs shows that Python's total memory size grows wildly during such transfers.

The bug manifests in _fileobject.read() in socket.py. You can replicate the problem easily using the attached example with "nc -l -p 9330 < /dev/zero" running on some remote host.

The way _fileobject.read() is written, socket.recv is called with the larger of the minimum rbuf size or whatever's left to be read. Whatever is received is then appended to a buffer which is joined and returned at the end of the function.

It looks like each time through the loop, space for recv_size is allocated but not freed, so if the loop runs for enough iterations, Python exhausts the memory available to it.

You can sidestep the condition if recv_size is small (for instance, with a small _fileobject.default_bufsize).

I can't replicate this problem with Python 2.3 on FreeBSD 4.9 or FreeBSD 5.2, nor on Mac OS X 10.3 if the logic from _fileobject.read() is re-written in Perl (for example).
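The read loop described in the message above can be sketched roughly as follows. This is a simplified illustration, not the actual socket.py source: FakeSocket, read_exact, and the 8192-byte rbufsize are hypothetical names and values chosen for the example, and the handling of excess bytes left over after the final recv is omitted.

```python
class FakeSocket:
    """Hypothetical stand-in for a socket that delivers small packets."""

    def __init__(self, payload, packet_size):
        self._payload = payload
        self._packet_size = packet_size
        self.requested = []  # sizes passed to recv(), kept for inspection

    def recv(self, size):
        self.requested.append(size)
        n = min(size, self._packet_size)
        chunk, self._payload = self._payload[:n], self._payload[n:]
        return chunk


def read_exact(sock, size, rbufsize=8192):
    """Read `size` bytes using the loop structure described in the report."""
    buffers = []
    buf_len = 0
    while buf_len < size:
        left = size - buf_len
        # The larger of the buffer size and whatever is left to be read.
        recv_size = max(rbufsize, left)
        data = sock.recv(recv_size)
        if not data:
            break
        buffers.append(data)
        buf_len += len(data)
    return b"".join(buffers)


sock = FakeSocket(b"\0" * 100_000, packet_size=1492)
result = read_exact(sock, 100_000)
print(len(result), len(sock.requested), sock.requested[0])
```

Each recv call asks for nearly the whole remaining size even though only one small packet arrives, so the number of large requests grows with the number of packets.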
msg23835	Author: Bob Ippolito (bob.ippolito) * (Python committer)	Date: 2005-01-01 06:18
I can't reproduce this on either version of Python on a 10.3.7 machine w/ 1 GB RAM. Python's total memory usage seems stable to me even if the read is in a while loop. I can't see anything in sock_recv or _fileobject.read that will in any way leak memory. With a really large buffer size (always >17 MB, but it does vary with each run) it will get a memory error, but the Python process doesn't grow beyond 50 MB at the samples I looked at. That's pretty much the amount of RAM I'd expect it to use. It is kind of surprising it doesn't want to allocate a buffer of that size, because I have the RAM for it, but I don't think this is a bug.
msg23836	Author: Bob Ippolito (bob.ippolito) * (Python committer)	Date: 2005-01-01 06:27
I just played with it a bit more. If I catch the MemoryError and try again, most of the time it will work (sometimes on the second try). These malloc faults seem to be some kind of temporary condition.
msg23837	Author: bacchusrx (bacchusrx)	Date: 2005-01-01 23:01
I've been able to replicate the problem reliably on both 10.3.5 and 10.3.7. I've attached two more examples to demonstrate.

Try this: run "dd if=/dev/zero of=./data bs=1024 count=10240" and save server.pl wherever you put "data". Have three terminals open. In one, run "perl server.pl -s0.25". In another, run "top -ovsize", and in the third run "python example2.py".

After about 100 iterations, python's vsize is over 1 GB (just about the value of cumulative_req in example2.py), and if left running it will cause a malloc error at around 360 iterations with a vsize over 3.6 GB (again, just about what cumulative_req reports). Mind you, we've only received ~512 kbytes.

server.pl differs from the netcat method in that it defaults to sending only 1492 bytes at a time (configurable with the -b switch) and sleeps for however many seconds are specified with the -s switch. This guarantees enough iterations to raise the error each time around. When omitting the -s switch to server.pl, I don't get the error, but throughput is good enough that the loop in readFromSockUntil() only runs a few times.
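The figures in the message above can be checked with a little arithmetic. The sketch below assumes that every iteration requests everything still outstanding from the 10 MB file and receives exactly one 1492-byte packet. The function name cumulative_request is hypothetical and is not taken from example2.py.

```python
TOTAL_BYTES = 10240 * 1024  # size of the "data" file created by dd
PACKET_BYTES = 1492         # server.pl default send size


def cumulative_request(iterations, total=TOTAL_BYTES, packet=PACKET_BYTES):
    """Return (bytes requested so far, bytes received so far)."""
    requested = 0
    received = 0
    for _ in range(iterations):
        requested += total - received       # recv() asks for all that is left
        received += min(packet, total - received)
    return requested, received


for n in (100, 360):
    requested, received = cumulative_request(n)
    print(n, requested, received)
```

Under these assumptions the requested total is a little over 1.04e9 bytes after 100 iterations and about 3.68e9 bytes after 360, while only 537120 bytes have actually arrived, which is in line with the vsize and "~512 kbytes" figures reported.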
msg23838	Author: Bob Ippolito (bob.ippolito) * (Python committer)	Date: 2005-01-02 02:22
Ok. I've tracked it down. realloc(...) on Darwin doesn't actually resize memory unless it has to. For shrinking an allocation, it does not have to, therefore realloc(...) with a smaller size is a no-op. It seems that this may be a misunderstanding by Python. The man page for realloc(...) does not say that it will EVER free memory, EXCEPT in the case where it has to allocate a larger region. I'll attach an example that demonstrates this outside of Python.
msg23839	Author: Bob Ippolito (bob.ippolito) * (Python committer)	Date: 2005-01-02 02:23
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <malloc/malloc.h>  /* malloc_size() on Darwin */

    #define NUM_ALLOCATIONS 100000
    #define ALLOC_SIZE 10485760
    #define ALLOC_RESIZE 1492

    int main(int argc, char **argv) {
        /* exiting will free all this leaked memory */
        for (i = 0; i < NUM_ALLOCATIONS; i++) {
            void *orig_ptr, *new_ptr;
            size_t new_size, orig_size;

            /* allocate a large block, then shrink it to a packet-sized one */
            orig_ptr = malloc(ALLOC_SIZE);
            orig_size = malloc_size(orig_ptr);
            if (orig_ptr == NULL) {
                printf("failure to malloc %d\n", i);
                abort();
            }
            new_ptr = realloc(orig_ptr, ALLOC_RESIZE);
            new_size = malloc_size(new_ptr);
            printf("resized %zu[%p] -> %zu[%p]\n", orig_size, orig_ptr, new_size, new_ptr);
            if (new_ptr == NULL) {
                printf("failure to realloc %d\n", i);
                abort();
            }
        }
        return 0;
    }
msg23840	Author: Bob Ippolito (bob.ippolito) * (Python committer)	Date: 2005-01-02 02:25
That code paste is missing an "int i" at the beginning of main.
msg23841	Author: Andreas Lauer (a_lauer)	Date: 2005-11-10 07:42
The problem also occurs in rare cases under Windows XP with Python 2.3.4. I suspect the code line

    recv_size = max(self._rbufsize, left)

in socket.py to be a part of the problem. In the case that I investigated, this caused >600 allocations of up to 5 MBytes (which came in 8 KB packets). Sure, the memory allocator should be able to handle this in _socket.recv (first it allocates the X MBytes buffer, which is later resized with _PyString_Resize), but I think the correct line in socket.py is

    recv_size = min(self._rbufsize, left)

At least, after this my problem was gone.
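To illustrate the difference between the two lines discussed above, here is a minimal simulation of the sizes that would be passed to recv under each rule. The helper request_sizes and the concrete numbers (a 5 MB read, 8 KB packets, an 8 KB _rbufsize) are assumptions picked to resemble the case described, not measurements.

```python
def request_sizes(total, packet, rbufsize, pick):
    """Return the recv() sizes requested while reading `total` bytes.

    `pick` is either the built-in max (original line) or min (proposed line).
    """
    sizes = []
    received = 0
    while received < total:
        left = total - received
        recv_size = pick(rbufsize, left)
        sizes.append(recv_size)
        # The network delivers at most one packet per call.
        received += min(recv_size, packet, left)
    return sizes


total = 5 * 1024 * 1024
with_max = request_sizes(total, packet=8192, rbufsize=8192, pick=max)
with_min = request_sizes(total, packet=8192, rbufsize=8192, pick=min)
print(len(with_max), max(with_max))
print(len(with_min), max(with_min))
```

Both variants make the same number of calls here, but with max the first request is for the full 5 MB and later ones shrink only slowly, whereas with min no request exceeds the buffer size.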
msg59313	Author: Christian Heimes (christian.heimes) * (Python committer)	Date: 2008-01-05 19:39
Probably outdated. I haven't heard or seen any such problems in the past two years.
msg60130	Author: Martey Dodoo (martey)	Date: 2008-01-19 01:24
I am not sure that this is outdated. After running into memory errors trying to download messages with attachments using imaplib, I ran the example client and server. After iteration 357, there was a malloc error, just like etrepum suggested. I am using Mac OS X 10.5, with Python 2.5 (not the Apple-supplied Python).
msg61341	Author: Martey Dodoo (martey)	Date: 2008-01-20 19:24
Just wanted to note that the good people of comp.lang.python helped me figure out that the issue is actually http://bugs.python.org/issue1389051, in case anyone in similar straits ended up here.
msg62797	Author: A.M. Kuchling (akuchling) * (Python committer)	Date: 2008-02-23 19:31
Andreas Lauer's suggested fix is correct. Applied to 2.6 trunk in rev. 61008 and to 2.5-maint in rev. 61009.
msg65467	Author: Ralf Schmitt (schmir)	Date: 2008-04-14 16:33
I think this should be fixed somewhere in the C code. People calling sock.recv with a large recv size will also trigger this error. This fix is wrong: the fixed code reads one byte at a time. See: http://mail.python.org/pipermail/python-dev/2008-April/078613.html
msg65468	Author: A.M. Kuchling (akuchling) * (Python committer)	Date: 2008-04-14 17:30
Note that _rbufsize is only set to 1 if the _fileobject's bufsize is set to 0. So perhaps the bug is that some library is turning off buffering when it shouldn't. I don't see how you would fix this in the C code, other than manually doing two separate mallocs and copying the data, which would unfairly penalize platforms with smarter malloc() implementations. What sort of fix would you suggest?
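The one-byte-at-a-time behaviour under discussion follows directly from the min-based line when _rbufsize is 1. The sketch below counts recv calls under that rule. recv_calls is a hypothetical helper, and the 1492-byte packet size is borrowed from the earlier example server rather than from this message.

```python
def recv_calls(total, rbufsize, packet=1492):
    """Count recv() calls needed to read `total` bytes with min(rbufsize, left)."""
    calls = 0
    received = 0
    while received < total:
        left = total - received
        recv_size = min(rbufsize, left)
        # Each call returns at most one packet's worth of data.
        received += min(recv_size, packet)
        calls += 1
    return calls


print(recv_calls(4096, rbufsize=1))     # unbuffered file object
print(recv_calls(4096, rbufsize=8192))  # buffered file object
```

With buffering turned off, every byte costs a separate recv call, which is why an unbuffered file object becomes very slow after the change.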
msg65481	Author: Ralf Schmitt (schmir)	Date: 2008-04-14 21:07
Well, I think the right thing to do is limit the maximal size to be read inside the C function (just to make it impossible to pass around large values). This is basically the same fix, just at another place in the code.

http://twistedmatrix.com/trac/ticket/1079 describes the same problem (but with 64 k read requests: it can even leak with small requests). The fix there was to really copy those strings around into a StringIO object.

Note that the code does not read byte by byte when passing in no size argument. Instead it reads recv_size bytes:

    if self._rbufsize <= 1:
        recv_size = self.default_bufsize
    else:
        recv_size = self._rbufsize

This seems clearly wrong to me.
msg65482	Author: Ralf Schmitt (schmir)	Date: 2008-04-14 21:10
That is, it seems wrong that it uses 1 byte when a size is given, and recv_size when size is not given. By the way, I think if you ask for 4096 bytes and the buffering is set to 2048 bytes, it should still try to read the full 4096 bytes. The number of bytes it tries to read, however, should be limited by whatever the system maximally returns.
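One way to read the suggestion above is to request everything that is left, independent of the buffering setting, but never more than a fixed cap. The following is only a sketch of that idea: MAX_RECV and capped_request_sizes are hypothetical, and the 64 KB value is an arbitrary placeholder, not a limit taken from any platform or from the thread.

```python
MAX_RECV = 64 * 1024  # hypothetical cap on a single recv() request


def capped_request_sizes(total, packet, cap=MAX_RECV):
    """Return the recv() sizes requested when asking for min(left, cap)."""
    sizes = []
    received = 0
    while received < total:
        left = total - received
        recv_size = min(left, cap)
        sizes.append(recv_size)
        # Each call returns at most one packet's worth of data.
        received += min(recv_size, packet)
    return sizes


print(capped_request_sizes(4096, packet=2048))
print(max(capped_request_sizes(10 * 1024 * 1024, packet=1492)))
```

A 4096-byte read is attempted in full on the first call, while a 10 MB read never asks for more than the cap in a single call.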
msg65991	Author: Mark Hammond (mhammond) * (Python committer)	Date: 2008-04-30 05:59
FYI, #2632 is tracking a regression caused by this change.

History
Date	User	Action	Args
2022-04-11 14:56:08	admin	set	github: 41374
2008-04-30 05:59:20	mhammond	set	nosy: + mhammond messages: + msg65991
2008-04-15 15:24:03	gregory.p.smith	set	nosy: + gregory.p.smith
2008-04-14 21:10:17	schmir	set	messages: + msg65482
2008-04-14 21:07:38	schmir	set	messages: + msg65481
2008-04-14 17:30:42	akuchling	set	messages: + msg65468
2008-04-14 16:33:46	schmir	set	nosy: + schmir messages: + msg65467
2008-03-07 10:24:42	vila	set	nosy: + vila
2008-02-23 19:31:25	akuchling	set	nosy: + akuchling resolution: out of date -> fixed messages: + msg62797
2008-01-20 19:24:30	martey	set	messages: + msg61341
2008-01-19 01:24:23	martey	set	nosy: + martey messages: + msg60130 title: Memory leak in socket.py on Mac OS X 10.3 -> Memory leak in socket.py on Mac OS X
2008-01-05 19:39:04	christian.heimes	set	status: open -> closed nosy: + christian.heimes resolution: out of date messages: + msg59313
2004-12-29 02:09:35	bacchusrx	create
