This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.
Created on 2015-03-13 06:47 by nagle, last changed 2022-04-11 14:58 by admin.
Messages (8)

msg238009 - Author: John Nagle (nagle) - Date: 2015-03-13 06:47
I'm porting a large, working system from Python 2 to Python 3, using "six", so the same code works with both. One part of the system works a lot like the multiprocessing module, but predates it. It launches child processes with "Popen" and talks to them using "pickle" over stdin/stdout as pipes. Works fine under Python 2, and has been working in production for years.
Under Python 3, I'm getting errors that indicate memory corruption:
Fatal Python error: GC object already tracked
Current thread 0x00001a14 (most recent call first):
File "C:\python34\lib\site-packages\pymysql\connections.py", line 411
in description
File "C:\python34\lib\site-packages\pymysql\connections.py", line 1248
in _get_descriptions
File "C:\python34\lib\site-packages\pymysql\connections.py", line 1182
in _read_result_packet
File "C:\python34\lib\site-packages\pymysql\connections.py", line 1132
in read
File "C:\python34\lib\site-packages\pymysql\connections.py", line 929
in _read_query_result
File "C:\python34\lib\site-packages\pymysql\connections.py", line 768
in query
File "C:\python34\lib\site-packages\pymysql\cursors.py", line 282 in
_query
File "C:\python34\lib\site-packages\pymysql\cursors.py", line 134 in
execute
File "C:\projects\sitetruth\domaincacheitem.py", line 128 in select
File "C:\projects\sitetruth\domaincache.py", line 30 in search
File "C:\projects\sitetruth\ratesite.py", line 31 in ratedomain
File "C:\projects\sitetruth\RatingProcess.py", line 68 in call
File "C:\projects\sitetruth\subprocesscall.py", line 140 in docall
File "C:\projects\sitetruth\subprocesscall.py", line 158 in run
File "C:\projects\sitetruth\RatingProcess.py", line 89 in main
File "C:\projects\sitetruth\RatingProcess.py", line 95 in <module>
That's clear memory corruption.
Also:
File "C:\projects\sitetruth\InfoSiteRating.py", line 200, in scansite
if len(self.badbusinessinfo) > 0 : # if bad stuff
NameError: name 'len' is not defined
There are others, but those two should be impossible to cause from Python source.
I've done the obvious stuff - deleted all .pyc files and Python cache directories. All my code is in Python. Every library module came in via "pip", into a clean Python 3.4.3 (32 bit) installation on Win7/x86-64.
Currently installed packages (via "pip list")
beautifulsoup4 (4.3.2)
dnspython3 (1.12.0)
html5lib (0.999)
pip (6.0.8)
PyMySQL (0.6.6)
pyparsing (2.0.3)
setuptools (12.0.5)
six (1.9.0)
Nothing exotic there. The project has zero local C code; any C code came from the Python installation or the above packages, most of which are pure Python.
It all works fine with Python 2.7.9. Everything else in the program seems to be working fine under both 2.7.9 and 3.4.3, until subprocesses are involved.
What's being pickled is very simple; no custom objects, although Exception types are sometimes pickled if the subprocess raises an exception.
Pickler and Unpickler instances are being reused here. A message is pickled, piped to the subprocess, unpickled, work is done, and a response comes back later via the return pipe. A send looks like:
self.writer.dump(args) # send data
self.dataout.flush() # finish output
self.writer.clear_memo() # no memory from cycle to cycle
and a receive looks like:
result = self.reader.load() # read and return from child
self.reader.memo = {} # no memory from cycle to cycle
Those were the recommended way to reset "pickle" for new traffic years ago. (You have to clear the receive side as well as the send side, or the dictionary of saved objects grows forever.) My guess is that there's something about reusing "pickle" instances that botches memory use in CPython 3's C pickle implementation ("_pickle"). That should work, though; the "multiprocessing" module works by sending pickled data over pipes.
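For anyone trying to reproduce this, here is a minimal sketch of the reuse pattern described above (the file name, protocol number, and payload are assumptions for illustration, not the reporter's actual code; only the Pickler/Unpickler reuse and the memo clearing mirror the report):

# minimal_pipe_pickle.py -- hypothetical reduction of the pattern described above
import pickle
import subprocess
import sys

if len(sys.argv) > 1 and sys.argv[1] == "child":
    # Child: reuse one Unpickler/Pickler pair across many messages.
    reader = pickle.Unpickler(sys.stdin.buffer)
    writer = pickle.Pickler(sys.stdout.buffer, 2)
    while True:
        try:
            msg = reader.load()              # read the next request
        except EOFError:
            break                            # parent closed the pipe
        reader.memo = {}                     # no memory from cycle to cycle
        writer.dump(("echo", msg))           # send the reply
        sys.stdout.buffer.flush()
        writer.clear_memo()
else:
    # Parent: launch the child and exchange pickled messages over the pipes.
    proc = subprocess.Popen([sys.executable, __file__, "child"],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    writer = pickle.Pickler(proc.stdin, 2)
    reader = pickle.Unpickler(proc.stdout)
    for i in range(1000):
        writer.dump({"seq": i, "payload": "x" * 100})    # send data
        proc.stdin.flush()                               # finish output
        writer.clear_memo()                              # no memory from cycle to cycle
        reply = reader.load()                            # read the reply
        reader.memo = {}                                 # no memory from cycle to cycle
    proc.stdin.close()
    proc.wait()

Running something like this repeatedly, with larger or more varied payloads, would be one way to look for the corruption without any third-party modules.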
The only code difference between Python 2 and 3 is that under Python 3 I have to use "sys.stdin.buffer" and "sys.stdout.buffer" as arguments to Pickler and Unpickler. Otherwise they complain that they're getting type "str".
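For reference, that Python 3 requirement is just that Pickler and Unpickler need binary streams; a sketch of the version-dependent setup (not the reporter's actual code, and the protocol number is an assumption):

import pickle
import sys

if sys.version_info[0] >= 3:
    # Python 3: pickle reads and writes bytes, so use the binary buffers.
    reader = pickle.Unpickler(sys.stdin.buffer)
    writer = pickle.Pickler(sys.stdout.buffer, 2)
else:
    # Python 2: sys.stdin/sys.stdout are already byte streams.
    reader = pickle.Unpickler(sys.stdin)
    writer = pickle.Pickler(sys.stdout, 2)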
Unfortunately, I don't have an easy way to reproduce this bug yet.
Is there some way to force the use of the pure Python pickle module under Python 3? That would help isolate the problem.
John Nagle
msg238012 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) - Date: 2015-03-13 07:27
sys.modules['_pickle'] = None
del sys.modules['pickle']  # if exists
import pickle

Or just use pickle._Pickler instead of pickle.Pickler and the like (an implementation detail!).
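(A quick sanity check, added here for completeness rather than part of the original message: with _pickle blocked, the pickle module falls back to its pure-Python classes, which you can confirm directly.)

import sys
sys.modules['_pickle'] = None      # block the C accelerator module
sys.modules.pop('pickle', None)    # force a fresh import of pickle
import pickle

# After the fallback, the public names are the pure-Python classes.
assert pickle.Pickler is pickle._Pickler
assert pickle.Unpickler is pickle._Unpickler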
msg238049 - Author: John Nagle (nagle) - Date: 2015-03-13 19:48
> Or just use pickle._Pickler instead of pickle.Pickler and like > (implementation detail!). Tried that. Changed my own code as follows: 25a26 > 71,72c72,73 < self.reader = pickle.Unpickler(self.proc.stdout) # set up reader < self.writer = pickle.Pickler(self.proc.stdin,kpickleprotocolversion) --- > self.reader = pickle._Unpickler(self.proc.stdout) # set up reader > self.writer = pickle._Pickler(self.proc.stdin,kpickleprotocolversion 125,126c126,127 < self.reader = pickle.Unpickler(self.datain) # set up reader < self.writer = pickle.Pickler(self.dataout,kpickleprotocolversion) --- > self.reader = pickle._Unpickler(self.datain) # set up reader > self.writer = pickle._Pickler(self.dataout,kpickleprotocolversion) Program runs after those changes. So it looks like CPickle has a serious memory corruption problem. |
msg238053 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) - Date: 2015-03-13 20:42
Could you please try to minimize your data, and try to reproduce the issue without third-party modules if possible?
msg238055 - Author: John Nagle (nagle) - Date: 2015-03-13 21:30
"minimize you data" - that's a big job here. Where are the tests for "pickle"? Is there one that talks to a subprocess over a pipe? Maybe I can adapt that. |
msg238075 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) - Date: 2015-03-14 08:54
No, there are no subprocess-specific tests for pickle. The pickle tests are in Lib/test/pickletester.py and Lib/test/test_pickle.py. First, try dumping the pickled data to a file and then loading it in another process. Does it still fail?
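A minimal sketch of that two-process check (the file name and sample payload here are illustrative, not from the report):

# dump_side.py -- run first: write one pickled message to a file
import pickle

with open("message.pkl", "wb") as f:
    pickle.Pickler(f, 2).dump({"seq": 1, "payload": "x" * 100})

# load_side.py -- run as a separate process: read the message back
import pickle

with open("message.pkl", "rb") as f:
    print(pickle.Unpickler(f).load())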
msg238158 - Author: John Nagle (nagle) - Date: 2015-03-15 20:17
More info: the problem is on the unpickling side. If I use _Unpickler and Pickler, so the unpickling side is in Python but the pickling side is in C, there is no problem. If I use Unpickler and _Pickler, so the unpickling side is in C, it crashes.
msg288160 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) - Date: 2017-02-19 19:31
Without additional information we can't solve this issue. Can the problem still be reproduced?
History

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:58:13 | admin | set | status: pending -> open; github: 67843 |
| 2017-02-19 19:31:51 | serhiy.storchaka | set | status: open -> pending |
| 2017-02-19 19:31:43 | serhiy.storchaka | set | status: pending -> open; messages: + msg288160 |
| 2017-02-19 19:29:47 | serhiy.storchaka | set | status: open -> pending |
| 2015-07-21 07:22:27 | ethan.furman | set | nosy: - ethan.furman |
| 2015-03-18 16:51:06 | ethan.furman | set | nosy: + ethan.furman |
| 2015-03-15 20:17:37 | nagle | set | messages: + msg238158 |
| 2015-03-14 08:54:53 | serhiy.storchaka | set | messages: + msg238075 |
| 2015-03-13 21:30:32 | nagle | set | messages: + msg238055 |
| 2015-03-13 20:42:42 | serhiy.storchaka | set | messages: + msg238053 |
| 2015-03-13 19:48:31 | nagle | set | messages: + msg238049 |
| 2015-03-13 07:27:14 | serhiy.storchaka | set | versions: + Python 3.5; nosy: + alexandre.vassalotti, serhiy.storchaka, pitrou; messages: + msg238012; type: behavior; stage: test needed |
| 2015-03-13 06:47:14 | nagle | create | |