Message234731
| Author |
serhiy.storchaka |
| Recipients |
Arfrever, python-dev, serhiy.storchaka, vstinner |
| Date |
2015-01-26.10:26:41 |
| SpamBayes Score |
-1.0 |
| Marked as misclassified |
Yes |
| Message-id |
<129300265.9jKHJZWiYC@raxxla> |
| In-reply-to |
<CAMpsgwYgAVXE=H=yELMJAjjgnyZ5-7tXaOkcBQLoUWoyoj9_uQ@mail.gmail.com> |
| Content |
I think the changeset which made the decoders use _PyUnicodeWriter (issue16311)
is responsible for the regression.
For example, consider b'\x80abc'.decode('utf-8', 'backslashreplace').
The writer reserves a string buffer of size 4 (every byte produces at most 1
character). The first byte is invalid and is replaced by the 4-character string
'\\x80'. The writer increases min_length but doesn't resize the buffer, because
the buffer is still large enough to hold the replacement string. The following
writes of the ASCII characters then overflow the buffer. |
|
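The bookkeeping described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the actual _PyUnicodeWriter code: the function name simulate_writer is hypothetical, any byte >= 0x80 is treated as invalid, and the unchecked ASCII write is modelled by returning the input index at which the write would land outside the buffer.

```python
def simulate_writer(data: bytes):
    """Toy model (hypothetical, not CPython's code) of the described accounting.

    Returns the index of the first input byte whose output character would be
    written past the end of the buffer, or None if everything fits.
    """
    size = len(data)        # initial reservation: one character per input byte
    min_length = size       # the writer's estimate of the final length
    pos = 0                 # next write position in the buffer

    for i, byte in enumerate(data):
        if byte >= 0x80:
            # Assumption of this sketch: every non-ASCII byte is invalid and
            # is replaced by a 4-character escape such as '\\x80'.
            repl = '\\x%02x' % byte
            min_length += len(repl) - 1      # estimate is updated...
            if pos + len(repl) > size:       # ...but the buffer grows only
                size = max(min_length, pos + len(repl))  # if repl doesn't fit
            pos += len(repl)
        else:
            # Fast path for ASCII: the real code writes without a bounds
            # check; here the would-be overflow is reported instead.
            if pos >= size:
                return i
            pos += 1
    return None


# The replacement fills all 4 reserved slots, so 'a' (index 1) overflows.
overflow_at = simulate_writer(b'\x80abc')

# On an interpreter without the bug (3.5+, where 'backslashreplace' is
# supported for decoding), the result needs 7 characters, not 4.
decoded = b'\x80abc'.decode('utf-8', 'backslashreplace')
```

In this model the overflow disappears if the buffer is resized whenever min_length exceeds the allocated size, rather than only when the current write does not fit.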