Message56604
| Author |
gvanrossum |
| Recipients |
brett.cannon, christian.heimes, gvanrossum |
| Date |
2007-10-20.15:36:06 |
| Message-id |
<ca471dc20710200836s6829a4c2l8c589fa69878cb6f@mail.gmail.com> |
| In-reply-to |
<47198279.9070102@cheimes.de> |
| Content |
Thanks for persevering!!!
The dangers of switching between fileno(fp) and fp are actually well
documented in the C and/or POSIX standards. The problem is caused in
PyFile_FromFileEx() -- it creates a Python file object from the file
descriptor. The fix actually only works because we're not using the
FILE struct once PyTokenizer_FindEncoding() is called. I think it
would be better to move the lseek() into call_find_module() so the
FILE abstraction is not broken by PyTokenizer_FindEncoding().
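
To make the hazard concrete, here is a minimal Python sketch of the same effect. It is not the CPython C code: the function name demo_offsets and the file contents are invented for illustration, and a buffered Python reader stands in for the FILE struct. Reading one line through the buffered layer pulls more data from the descriptor than the caller consumed, so the descriptor offset runs ahead of the logical position. Anyone who then takes over the descriptor has to lseek() it first.

```python
import os
import tempfile


def demo_offsets():
    """Return (logical position, descriptor offset, first line, re-read line)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"# -*- coding: latin-1 -*-\n" + b"x = 1\n" * 100)
        path = tmp.name
    try:
        with open(path, "rb") as fp:              # buffered, plays the role of FILE*
            first_line = fp.readline()            # fills the buffer, reads ahead
            logical = fp.tell()                   # position seen by the buffered layer
            fd = fp.fileno()
            raw = os.lseek(fd, 0, os.SEEK_CUR)    # where the descriptor really is
            # Rewind the descriptor before a second consumer wraps it.
            os.lseek(fd, 0, os.SEEK_SET)
            with os.fdopen(os.dup(fd), "rb") as second:
                reread = second.readline()
        return logical, raw, first_line, reread
    finally:
        os.unlink(path)
```

After the rewind the original buffered reader is no longer trustworthy, which is why the fix only holds as long as nothing touches the FILE afterwards.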
I think there's still a bug or two lurking in this area: first, each
time you call imp.find_module() you leak a FILE object; second, the
encoding allocated in PyTokenizer_FindEncoding() is leaked.
You're right that a lot of this could be avoided if we used file
descriptors consistently. It seems find_module() itself doesn't read
the file; it just needs to know that it's possible to open the file.
Rewriting everywhere that uses PyFile_FromFile[Ex] to use file
descriptors doesn't seem too hard; there are only a few places. |
|
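
A rough Python sketch of the descriptor-only approach, under the assumption that the finder only needs to prove the file can be opened. The names probe_openable and read_source are hypothetical, not existing CPython functions. The descriptor is wrapped in a buffered object exactly once, by whoever finally reads the source, so there is never a second view of the same offset.

```python
import os
import tempfile


def probe_openable(path):
    """Return an open read-only descriptor, or None if the file cannot be opened."""
    try:
        return os.open(path, os.O_RDONLY)
    except OSError:
        return None


def read_source(fd):
    """Wrap the descriptor exactly once, read all bytes, and close it."""
    with os.fdopen(fd, "rb") as fp:
        return fp.read()


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False, suffix=".py") as tmp:
        tmp.write(b"x = 1\n")
    fd = probe_openable(tmp.name)     # the finder stops here; nothing is read yet
    if fd is not None:
        print(read_source(fd))        # the loader wraps and reads later
    os.unlink(tmp.name)
```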