This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Memory leak in socket.py on Mac OS X
Type:
Stage:
Components: Library (Lib)
Versions:
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: a_lauer, akuchling, bacchusrx, bob.ippolito, christian.heimes, gregory.p.smith, martey, mhammond, schmir, vila
Priority: normal
Keywords:

Created on 2004-12-29 02:09 by bacchusrx, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name    Uploaded                     Description
example.py   bacchusrx, 2004-12-29 02:09  example.py
example2.py  bacchusrx, 2005-01-01 23:02  example client
server.pl    bacchusrx, 2005-01-01 23:03  example server
Messages (17)
msg23834 - Author: bacchusrx (bacchusrx) Date: 2004-12-29 02:09
Some part of socket.py leaks memory on Mac OS X 10.3 (both with 
the python 2.3 that ships with the OS and with python 2.4).
I encountered the problem in John Goerzen's offlineimap. 
Transfers of messages over a certain size would cause the program 
to bail with malloc errors, e.g.:
*** malloc: vm_allocate(size=5459968) failed (error code=3)
*** malloc[13730]: error: Can't allocate region
Inspecting the process as it runs shows that python's total memory
size grows wildly during such transfers.
The bug manifests in _fileobject.read() in socket.py. You can 
replicate the problem easily using the attached example with "nc -l 
-p 9330 < /dev/zero" running on some remote host.
The way _fileobject.read() is written, socket.recv is called with the 
larger of the minimum rbuf size or whatever's left to be read. 
Whatever is received is then appended to a buffer which is joined 
and returned at the end of the function.
It looks like each time through the loop, space for recv_size is 
allocated but not freed, so if the loop runs for enough iterations, 
python exhausts the memory available to it.
You can sidestep the condition if recv_size is small (on the order of 
_fileobject.default_bufsize).
I can't replicate this problem with python 2.3 on FreeBSD 4.9 or 
FreeBSD 5.2, nor on Mac OS X 10.3 if the logic from 
_fileobject.read() is re-written in Perl (for example).
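The loop described in this report can be sketched roughly as follows. This is a simplified Python 3 re-creation for illustration, not the actual socket.py source: FakeSocket and read_exactly are hypothetical names, the 8192-byte buffer size is an assumption, and the fake socket only records how many bytes each recv call asks for.

```python
class FakeSocket:
    """Hypothetical stand-in for a socket that delivers small packets."""

    def __init__(self, payload, packet_size):
        self._payload = payload
        self._packet_size = packet_size
        self._pos = 0
        self.requested = []  # size argument of every recv() call

    def recv(self, size):
        self.requested.append(size)
        n = min(size, self._packet_size)
        chunk = self._payload[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk


def read_exactly(sock, size, rbufsize=8192):
    """Read `size` bytes the way the report describes _fileobject.read()."""
    buffers = []
    left = size
    while left > 0:
        # Ask for the larger of the buffer size and whatever is still missing.
        recv_size = max(rbufsize, left)
        data = sock.recv(recv_size)
        if not data:
            break  # peer closed the connection
        buffers.append(data)
        left -= len(data)
    return b"".join(buffers)


sock = FakeSocket(b"\0" * 100_000, packet_size=1492)
data = read_exactly(sock, 100_000)
# bytes received, number of recv calls, total bytes requested across all calls
print(len(data), len(sock.requested), sum(sock.requested))
```

The total requested across all calls is many times the payload size, which is the quantity that matters if each request's buffer is not released promptly.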
msg23835 - Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-01 06:18
Logged In: YES 
user_id=139309
I can't reproduce this with either version of Python on a 10.3.7 machine w/ 
1gb ram. Python's total memory usage seems stable to me even if the 
read is in a while loop.
I can't see anything in sock_recv or _fileobject.read that will in any way 
leak memory.
With a really large buffer size (always >17mb, but it does vary with each 
run) it will get a memory error but the Python process doesn't grow 
beyond 50mb at the samples I looked at. That's pretty much the amount 
of RAM I'd expect it to use. 
It is kind of surprising it doesn't want to allocate a buffer of that size, 
because I have the RAM for it.. but I don't think this is a bug.
msg23836 - Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-01 06:27
Logged In: YES 
user_id=139309
I just played with it a bit more. If I catch the MemoryError and try again, 
most of the time it will work (sometimes on the second try). These 
malloc faults seem to be some kind of temporary condition.
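A minimal sketch of the "catch the MemoryError and try again" experiment, assuming a hypothetical helper name (retry_on_memory_error) and arbitrary attempt and delay values. It illustrates the observation only and is not a recommended fix.

```python
import time


def retry_on_memory_error(func, attempts=3, delay=0.1):
    """Call func(); on MemoryError wait briefly and retry, up to `attempts` times."""
    for attempt in range(attempts):
        try:
            return func()
        except MemoryError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the error
            time.sleep(delay)


# Usage with a stand-in that fails once, then succeeds.
calls = {"n": 0}


def flaky_read():
    calls["n"] += 1
    if calls["n"] == 1:
        raise MemoryError("simulated temporary allocation failure")
    return b"ok"


print(retry_on_memory_error(flaky_read, delay=0))
```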
msg23837 - Author: bacchusrx (bacchusrx) Date: 2005-01-01 23:01
Logged In: YES 
user_id=646321
I've been able to replicate the problem reliably on both 10.3.5 and 
10.3.7. I've attached two more examples to demonstrate:
Try this: run "dd if=/dev/zero of=./data bs=1024 count=10240" and save 
server.pl wherever you put "data". Have three terminals open. In one, 
run "perl server.pl -s0.25". In another, run "top -ovsize" and in the third 
run "python example2.py". 
After about 100 iterations, python's vsize is over 1GB (just about the value 
of cumulative_req in example2.py) and if left running will cause a 
malloc error at around 360 iterations with a vsize over 3.6GB (again, just 
about what cumulative_req reports). Mind you, we've only received 
~512kbytes.
server.pl differs from the netcat method in that it defaults to sending 
only 1492 bytes at a time (configurable with the -b switch) and sleeps for 
however many seconds are specified with the -s switch. This guarantees 
enough iterations to raise the error each time around. When omitting 
the -s switch to server.pl, I don't get the error, but throughput is good 
enough that the loop in readFromSockUntil() only runs a few times.
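The figures quoted here can be checked with a little arithmetic. The sketch below assumes that each iteration asks for everything still missing from the 10240 KB file and receives exactly one 1492-byte packet; cumulative_request is a hypothetical name and is not taken from example2.py.

```python
REQUEST_SIZE = 10240 * 1024   # size of the "data" file created with dd
PACKET_SIZE = 1492            # server.pl default bytes per send


def cumulative_request(iterations, total=REQUEST_SIZE, packet=PACKET_SIZE):
    """Sum of the recv sizes requested over the first `iterations` calls."""
    requested = 0
    left = total
    for _ in range(iterations):
        requested += left   # each call asks for all that is still missing
        left -= packet      # but only one packet actually arrives
    return requested


for n in (100, 360):
    # iterations, bytes requested so far, bytes actually received so far
    print(n, cumulative_request(n), n * PACKET_SIZE)
```

Under these assumptions the requests add up to about 1.04e9 bytes after 100 iterations and about 3.68e9 bytes after 360, while only 537,120 bytes have arrived, which is consistent with the vsize figures reported.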
msg23838 - Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-02 02:22
Logged In: YES 
user_id=139309
Ok. I've tracked it down. realloc(...) on Darwin doesn't actually resize 
memory unless it *has* to. For shrinking an allocation, it does not have 
to, therefore realloc(...) with a smaller size is a no-op.
It seems that this may be a misunderstanding by Python. The man page 
for realloc(...) does not say that it will EVER free memory, EXCEPT in the 
case where it has to allocate a larger region.
I'll attach an example that demonstrates this outside of Python.
msg23839 - Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-02 02:23
Logged In: YES 
user_id=139309
#include <stdio.h>          /* printf() */
#include <stdlib.h>         /* malloc(), realloc(), abort() */
#include <unistd.h>
#include <malloc/malloc.h>  /* malloc_size(); Darwin-specific */

#define NUM_ALLOCATIONS 100000
#define ALLOC_SIZE 10485760
#define ALLOC_RESIZE 1492

int main(int argc, char **argv) {
    int i;  /* missing from the original paste; see the follow-up message */
    /* exiting will free all this leaked memory */
    for (i = 0; i < NUM_ALLOCATIONS; i++) {
        void *orig_ptr, *new_ptr;
        size_t new_size, orig_size;
        /* allocate a large block, then shrink it to one packet's worth */
        orig_ptr = malloc(ALLOC_SIZE);
        if (orig_ptr == NULL) {
            printf("failure to malloc %d\n", i);
            abort();
        }
        orig_size = malloc_size(orig_ptr);
        new_ptr = realloc(orig_ptr, ALLOC_RESIZE);
        if (new_ptr == NULL) {
            printf("failure to realloc %d\n", i);
            abort();
        }
        new_size = malloc_size(new_ptr);
        printf("resized %zu[%p] -> %zu[%p]\n",
               orig_size, orig_ptr, new_size, new_ptr);
    }
    return 0;
}
msg23840 - Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2005-01-02 02:25
Logged In: YES 
user_id=139309
that code paste is missing an "int i" at the beginning of main..
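For readers who would rather try the experiment from Python, the following is a rough ctypes sketch of a single malloc/realloc round trip. It assumes a POSIX system where ctypes.CDLL(None) exposes the C library, and it looks for malloc_size (Darwin) or malloc_usable_size (glibc); usable_size_function and shrink_once are hypothetical names. The sizes it reports depend entirely on the platform's allocator.

```python
import ctypes

ALLOC_SIZE = 10485760
ALLOC_RESIZE = 1492

libc = ctypes.CDLL(None)  # symbols of the running process; POSIX only
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.realloc.restype = ctypes.c_void_p
libc.realloc.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def usable_size_function():
    """Return malloc_size (Darwin) or malloc_usable_size (glibc)."""
    for name in ("malloc_size", "malloc_usable_size"):
        try:
            func = getattr(libc, name)
        except AttributeError:
            continue  # symbol not present in this C library
        func.restype = ctypes.c_size_t
        func.argtypes = [ctypes.c_void_p]
        return func
    raise RuntimeError("no usable-size function found in the C library")


def shrink_once():
    """malloc a large block, realloc it smaller, return both usable sizes."""
    usable_size = usable_size_function()
    orig_ptr = libc.malloc(ALLOC_SIZE)
    if not orig_ptr:
        raise MemoryError("malloc failed")
    orig_size = usable_size(orig_ptr)
    new_ptr = libc.realloc(orig_ptr, ALLOC_RESIZE)
    if not new_ptr:
        libc.free(orig_ptr)
        raise MemoryError("realloc failed")
    new_size = usable_size(new_ptr)
    libc.free(new_ptr)
    return orig_size, new_size


print(shrink_once())
```

If the second number stays close to the first, the allocator kept the whole region after the shrinking realloc, which is the behaviour described for Darwin in this thread.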
msg23841 - Author: Andreas Lauer (a_lauer) Date: 2005-11-10 07:42
Logged In: YES 
user_id=1376343
The problem also occurs in rare cases under Windows XP with
Python 2.3.4. I suspect the code line
recv_size = max(self._rbufsize, left)
in socket.py to be a part of the problem.
 
In the case that I investigated, this caused >600 allocations
of up to 5 MBytes (which came in 8 KB packets). 
Sure, the memory allocator should be able to handle this in
_socket.recv (first it allocates the X MBytes buffer, which
is later resized with _PyString_Resize), but I think the correct
line in socket.py is
recv_size = min(self._rbufsize, left).
At least, after this change my problem was gone.
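To make the difference between the two lines concrete, here is a small sketch that lists the recv sizes each policy would request. It assumes a 5 MiB read, an 8192-byte _rbufsize, and packets of exactly 8 KiB; request_sizes is a hypothetical helper, not code from socket.py.

```python
RBUFSIZE = 8192            # assumed _rbufsize
TOTAL = 5 * 1024 * 1024    # assumed size of the read (5 MiB)
PACKET = 8 * 1024          # assumed bytes delivered per recv call


def request_sizes(policy, total=TOTAL, rbufsize=RBUFSIZE, packet=PACKET):
    """Return the size passed to recv on each iteration under `policy`."""
    sizes = []
    left = total
    while left > 0:
        size = policy(rbufsize, left)   # policy is max (old) or min (proposed)
        sizes.append(size)
        left -= min(size, packet, left)
    return sizes


before = request_sizes(max)   # recv_size = max(self._rbufsize, left)
after = request_sizes(min)    # recv_size = min(self._rbufsize, left)
# number of calls, largest single request, sum of all requests
print(len(before), max(before), sum(before))
print(len(after), max(after), sum(after))
```

Both policies need 640 calls under these assumptions, matching the ">600 allocations" mentioned here, but with min no single request exceeds 8192 bytes.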
msg59313 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-05 19:39
Probably outdated
I haven't heard or seen any such problems in the past two years.
msg60130 - Author: Martey Dodoo (martey) Date: 2008-01-19 01:24
I am not sure that this is outdated. After running into memory errors
trying to download messages with attachments using imaplib, I ran the
example client and server. After iteration 357, there was a malloc
error, just like etrepum suggested.
I am using Mac OS X 10.5, with Python 2.5 (not the Apple-supplied Python).
msg61341 - Author: Martey Dodoo (martey) Date: 2008-01-20 19:24
Just wanted to note that the good people of comp.lang.python helped me
figure out that the issue is actually
http://bugs.python.org/issue1389051, in case anyone in similar straits
ended up here.
msg62797 - Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-02-23 19:31
Andreas Lauer's suggested fix is correct. Applied to 2.6 trunk in rev.
61008 and to 2.5-maint in rev. 61009.
msg65467 - Author: Ralf Schmitt (schmir) Date: 2008-04-14 16:33
I think this should be fixed somewhere in the C code. People calling
sock.recv with a large recv size will also trigger this error.
This fix is wrong: the fixed code reads one byte at a time.
See:
http://mail.python.org/pipermail/python-dev/2008-April/078613.html 
msg65468 - Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-04-14 17:30
Note that _rbufsize is only set to 1 if the _fileobject's bufsize is set
to 0. So perhaps the bug is that some library is turning off buffering
when it shouldn't.
I don't see how you would fix this in the C code, other than manually
doing two separate mallocs and copying the data, which would unfairly
penalize platforms with smarter malloc() implementations. What sort of
fix would you suggest?
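The interaction between bufsize and the min() fix can be sketched as below. Only the bufsize == 0 case comes from this thread; the other branches of rbufsize_for and the 8192 default are recalled from Python 2.x socket.py and should be treated as assumptions. Both function names are hypothetical.

```python
DEFAULT_BUFSIZE = 8192  # assumed value of _fileobject.default_bufsize


def rbufsize_for(bufsize):
    """Approximate how _fileobject derives _rbufsize from bufsize."""
    if bufsize < 0:
        bufsize = DEFAULT_BUFSIZE
    if bufsize == 0:
        return 1                  # unbuffered: the case discussed here
    if bufsize == 1:
        return DEFAULT_BUFSIZE    # line buffered (assumption)
    return bufsize


def recv_calls(size, bufsize):
    """recv calls needed for `size` bytes with recv_size = min(_rbufsize, left),
    assuming every call is filled completely."""
    rbufsize = rbufsize_for(bufsize)
    calls = 0
    left = size
    while left > 0:
        left -= min(rbufsize, left)
        calls += 1
    return calls


for bufsize in (0, 2048, -1):
    print(bufsize, rbufsize_for(bufsize), recv_calls(4096, bufsize))
```

With bufsize 0 the sketch needs one recv call per byte, which is the byte-at-a-time behaviour reported in the previous message.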
msg65481 - Author: Ralf Schmitt (schmir) Date: 2008-04-14 21:07
Well, I think the right thing to do is to limit the maximum size to be read
inside the C function (just to make it impossible to pass around large
values). This is basically the same fix, just at another place in the code.
http://twistedmatrix.com/trac/ticket/1079 describes the same problem
(but with 64 k read requests: it can even leak with small requests).
The fix there was to really copy those strings around into a StringIO
object.
Note that the code does not read byte by byte when passing in no size
argument. Instead it reads recv_size bytes:
    if self._rbufsize <= 1:
        recv_size = self.default_bufsize
    else:
        recv_size = self._rbufsize
This seems clearly wrong to me.
msg65482 - Author: Ralf Schmitt (schmir) Date: 2008-04-14 21:10
That is, it seems wrong that it uses 1 byte when a size is given, and
recv_size when size is not given.
By the way, I think if you ask for 4096 bytes and the buffering is set to
2048 bytes it should still try to read the full 4096 bytes.
The number of bytes it tries to read, however, should be limited by
whatever maximum the system will return.
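One way to picture this proposal is a read loop that honours the caller's size but never passes more than a fixed cap to recv, and that copies the pieces into a buffer object in the spirit of the Twisted fix mentioned earlier. MAX_RECV and read_capped are hypothetical; the thread does not name a cap value, and this is Python-level illustration rather than the C-level change being discussed.

```python
import io

MAX_RECV = 64 * 1024  # hypothetical cap; the thread does not name a value


def read_capped(recv, size, max_recv=MAX_RECV):
    """Read up to `size` bytes, never asking recv for more than max_recv."""
    buf = io.BytesIO()
    left = size
    while left > 0:
        data = recv(min(left, max_recv))
        if not data:
            break  # peer closed the connection
        buf.write(data)  # copy into the buffer instead of keeping each string
        left -= len(data)
    return buf.getvalue()


# Usage with a stand-in recv that delivers at most 1492 bytes per call.
source = io.BytesIO(b"x" * 200_000)
requests = []


def fake_recv(n):
    requests.append(n)
    return source.read(min(n, 1492))


result = read_capped(fake_recv, 200_000)
print(len(result), max(requests))
```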
msg65991 - Author: Mark Hammond (mhammond) * (Python committer) Date: 2008-04-30 05:59
FYI, #2632 is tracking a regression caused by this change.
History
Date                 User              Action  Args
2022-04-11 14:56:08  admin             set     github: 41374
2008-04-30 05:59:20  mhammond          set     nosy: + mhammond; messages: + msg65991
2008-04-15 15:24:03  gregory.p.smith   set     nosy: + gregory.p.smith
2008-04-14 21:10:17  schmir            set     messages: + msg65482
2008-04-14 21:07:38  schmir            set     messages: + msg65481
2008-04-14 17:30:42  akuchling         set     messages: + msg65468
2008-04-14 16:33:46  schmir            set     nosy: + schmir; messages: + msg65467
2008-03-07 10:24:42  vila              set     nosy: + vila
2008-02-23 19:31:25  akuchling         set     nosy: + akuchling; resolution: out of date -> fixed; messages: + msg62797
2008-01-20 19:24:30  martey            set     messages: + msg61341
2008-01-19 01:24:23  martey            set     nosy: + martey; messages: + msg60130; title: Memory leak in socket.py on Mac OS X 10.3 -> Memory leak in socket.py on Mac OS X
2008-01-05 19:39:04  christian.heimes  set     status: open -> closed; nosy: + christian.heimes; resolution: out of date; messages: + msg59313
2004-12-29 02:09:35  bacchusrx         create