This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author serhiy.storchaka
Recipients Arfrever, python-dev, serhiy.storchaka, vstinner
Date 2015-01-26.10:26:41
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <129300265.9jKHJZWiYC@raxxla>
In-reply-to <CAMpsgwYgAVXE=H=yELMJAjjgnyZ5-7tXaOkcBQLoUWoyoj9_uQ@mail.gmail.com>
Content
I think the changeset that made the decoders use _PyUnicodeWriter (issue16311) 
is responsible for the regression.
For example, consider b'\x80abc'.decode('utf-8', 'backslashreplace').
The writer reserves a string buffer of size 4 (every byte produces at most 1 
character). The first byte is invalid and is replaced by the 4-character string 
'\\x80'. The writer increases min_length but doesn't resize the buffer, because 
the buffer is large enough to hold the replacement string. But the following 
writes of the ASCII characters overflow the buffer.
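
The sequence described above can be sketched with a toy model in Python. This is only an illustration of the bookkeeping, not the real _PyUnicodeWriter code: the ToyWriter class, its methods, and the simulate helper are hypothetical names, and treating every byte >= 0x80 as invalid is a simplification that happens to hold for this input.

```python
class ToyWriter:
    """Toy model of the writer's bookkeeping (hypothetical, not CPython code)."""

    def __init__(self, size):
        self.size = size        # allocated buffer capacity, in characters
        self.min_length = size  # expected final length of the result
        self.pos = 0            # next write position

    def write_replacement(self, text):
        # The replacement stands in for one input byte, so the expected
        # final length grows by len(text) - 1.
        self.min_length += len(text) - 1
        # The buffer is resized only if the replacement itself does not fit.
        if self.pos + len(text) > self.size:
            self.size = self.min_length
        self.pos += len(text)

    def write_char_unchecked(self):
        # Fast path for ASCII: assumes space was reserved up front.
        overflowed = self.pos >= self.size
        self.pos += 1
        return overflowed


def simulate(data):
    """Replay the decode steps and record which ASCII writes overflow."""
    writer = ToyWriter(len(data))  # one character reserved per input byte
    overflows = []
    for byte in data:
        if byte >= 0x80:  # simplification: treat as an invalid byte
            writer.write_replacement('\\x%02x' % byte)
        else:
            overflows.append(writer.write_char_unchecked())
    return writer, overflows


writer, overflows = simulate(b'\x80abc')
print(writer.size, writer.min_length, writer.pos, overflows)
# prints: 4 7 7 [True, True, True]
```

In this model the replacement exactly fills the 4-character buffer, so min_length becomes 7 while the capacity stays at 4, and each of the three ASCII writes lands past the end.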
History
Date User Action Args
2015-01-26 10:26:42 serhiy.storchaka set recipients: + serhiy.storchaka, vstinner, Arfrever, python-dev
2015-01-26 10:26:42 serhiy.storchaka link issue23321 messages
2015-01-26 10:26:41 serhiy.storchaka create