Issue549731
Created on 2002-04-28 09:17 by mhammond, last changed 2022-04-10 16:05 by admin. This issue is now closed.
Files
File name | Uploaded | Description
encodeleak.py | mhammond, 2002-04-28 09:17 | Program demonstrating leak.
codecs.c.patch | mhammond, 2002-04-28 10:39 | Better patch
Messages (12) |
msg10603 |
Author: Mark Hammond (mhammond) * (Python committer) |
Date: 2002-04-28 09:17 |
Note the following Debug Python session:
>>> s=u"anything"
[8189 refs]
>>> v=s.encode("utf8")
[10967 refs]
>>> v=s.encode("utf8")
[10968 refs]
>>> v=s.encode("utf8")
[10969 refs]
>>> v=s.encode("utf8")
[10970 refs]
Each call to encode is losing a reference. Attaching a
test program that demonstrates this in more detail.
The output from my test program is:
After 10000 iterations, lost 12850 references
[15227 refs]
and for 100000:
After 100000 iterations, lost 102850 references
[105227 refs]
etc.
As far as I can tell, this appears in all Python 2.x
versions.
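The attached encodeleak.py is not reproduced in this thread. A minimal sketch of how such a measurement might look, assuming a debug build where `sys.gettotalrefcount()` exists; the function name `measure_refs_lost` is hypothetical, and on a release build it returns None instead of a count:

```python
import sys


def measure_refs_lost(iterations, encoding="utf8"):
    """Return the change in the interpreter-wide reference count across
    repeated encode() calls, or None when this is not a debug build.

    Hypothetical helper; not the attached encodeleak.py.
    """
    # sys.gettotalrefcount() is only present in debug builds of CPython.
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None

    s = u"anything"
    s.encode(encoding)  # warm-up call so one-time codec loading is not counted

    before = get_total()
    for _ in range(iterations):
        s.encode(encoding)
    return get_total() - before


if __name__ == "__main__":
    lost = measure_refs_lost(10000)
    if lost is None:
        print("Not a debug build; total reference count is unavailable")
    else:
        print("After 10000 iterations, lost %d references" % lost)
```

A result that grows in proportion to the iteration count would match the behaviour described above; a small constant would not.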
|
msg10604 |
Author: Mark Hammond (mhammond) * (Python committer) |
Date: 2002-04-28 09:26 |
Logged In: YES
user_id=14198
s/decode/encode/ :) Also meant to mention problem not
restricted to UTF8 - changing the encoding in the text file
to anything other than 'ascii' seems to leak in the same way.
|
msg10605 |
Author: Mark Hammond (mhammond) * (Python committer) |
Date: 2002-04-28 10:05 |
Logged In: YES
user_id=14198
Found it :) Attaching patch.
|
msg10606 |
Author: Mark Hammond (mhammond) * (Python committer) |
Date: 2002-04-28 10:39 |
Logged In: YES
user_id=14198
Oops - too quick. All calls to _PyCodec_Lookup() leak.
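`_PyCodec_Lookup()` is reachable from Python through `codecs.lookup()`, so the lookup path can be exercised without encoding anything. The sketch below is one rough way to do that on CPython. The thread does not say which object lost the reference, so watching the returned codec entry is an assumption, and the helper name `lookup_refcount_growth` is hypothetical:

```python
import codecs
import sys


def lookup_refcount_growth(name="utf8", iterations=1000):
    """Return how much the refcount of the codec entry for `name` grows
    after repeated lookups whose results are discarded.

    Hypothetical helper. It watches only the returned entry, which may
    not be the object affected by the leak reported in this issue.
    """
    entry = codecs.lookup(name)  # hold one reference for inspection
    before = sys.getrefcount(entry)
    for _ in range(iterations):
        codecs.lookup(name)  # result is dropped immediately
    return sys.getrefcount(entry) - before


if __name__ == "__main__":
    print("refcount growth:", lookup_refcount_growth())
```

On an interpreter without a leak on this particular object the growth is zero; a leak elsewhere in the lookup path would only show up in a total reference count on a debug build.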
|
msg10607 |
Author: Neal Norwitz (nnorwitz) * (Python committer) |
Date: 2002-06-04 01:35 |
Logged In: YES
user_id=33168
Patch makes sense to me.
If you add a test, I may be able to catch the problem w/purify
next time I run it (if purify works).
|
msg10608 |
Author: Nobody/Anonymous (nobody) |
Date: 2002-06-04 04:39 |
Logged In: NO
I'm not sure what sort of test you are suggesting I add. I
think the patch is pretty obvious and reasonable, so MAL
should just check it in or assign it back to me <wink>.
Earlier the better really.
|
msg10609 |
Author: Mark Hammond (mhammond) * (Python committer) |
Date: 2002-06-04 04:42 |
Logged In: YES
user_id=14198
damn sourceforge - it went to the trouble of asking my email
address when I submitted without being logged in, but it
doesn't seem to have done anything with it - so that was me
just in case you weren't sure :)
|
msg10610 |
Author: Marc-Andre Lemburg (lemburg) * (Python committer) |
Date: 2002-06-04 07:20 |
Logged In: YES
user_id=38388
I'll have a look later today.
|
msg10611 |
Author: Neal Norwitz (nnorwitz) * (Python committer) |
Date: 2002-06-04 17:25 |
Logged In: YES
user_id=33168
Basically the code in the report would be fine.
Purify *should* catch anything which causes the leak.
So:
s = u'anything'
assert(s.encode('utf-8') == s.encode('utf-8'))
should work. Perhaps there is already a test for this,
and purify didn't report leaks?
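The assertion on its own only checks that two encodes give equal results; any leak detection is left to an external tool watching the process. A sketch of how the suggested check could be wrapped so the encode path runs many times under such a tool; the function name `encode_repeatedly` and the iteration count are assumptions, not part of the report or the patch:

```python
def encode_repeatedly(iterations=1000):
    """Encode the same string many times and check each result.

    Hypothetical wrapper around the two-line test suggested above. It
    does not detect leaks by itself; it only gives a leak checker a
    repeated code path to observe.
    """
    s = u"anything"
    expected = s.encode("utf-8")
    for _ in range(iterations):
        assert s.encode("utf-8") == expected
    return expected


if __name__ == "__main__":
    encode_repeatedly()
    print("ok")
```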
|
msg10612 |
Author: Mark Hammond (mhammond) * (Python committer) |
Date: 2002-07-17 23:07 |
Logged In: YES
user_id=14198
A tickle for Marc, assuming his days aren't quite *that*
long <wink>. Just give the OK and I will check it in.
|
msg10613 |
Author: Marc-Andre Lemburg (lemburg) * (Python committer) |
Date: 2002-07-18 13:34 |
Logged In: YES
user_id=38388
Perfect. I've marked it as Python 2.2.1 candidate. Please
also mention this in the checkin message.
Thanks. (And sorry for not getting back earlier -- my days
are indeed *very* long ;-)
|
msg10614 |
Author: Mark Hammond (mhammond) * (Python committer) |
Date: 2002-07-18 23:07 |
Logged In: YES
user_id=14198
Checking in codecs.c;
/cvsroot/python/python/dist/src/Python/codecs.c,v <-- codecs.c
new revision: 2.14; previous revision: 2.13
done
|
History
Date | User | Action | Args
---|---|---|---
2022-04-10 16:05:16 | admin | set | github: 36513
2002-04-28 09:17:59 | mhammond | create |