Created on 2011-09-29 01:38 by ezio.melotti, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (6) | |||
|---|---|---|---|
| msg144583 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2011-09-29 01:38 | |
The test at Lib/test/test_multibytecodec.py:178 checks for len('\U00012345') == 2, and with PEP393 this is always False. I tried running the tests with a few changes and they seem to work, but the code doesn't raise any exception on c.reset():

---->8-------->8-------->8-------->8----
import io, codecs
s = io.BytesIO()
c = codecs.getwriter('gb18030')(s)
c.write('123'); s.getvalue()
c.write('\U00012345'); s.getvalue()
c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
c.write('\uac00'); s.getvalue()
c.reset()
s.getvalue()
---->8-------->8-------->8-------->8----

Result:

>>> import io, codecs
>>> s = io.BytesIO()
>>> c = codecs.getwriter('gb18030')(s)
>>> c.write('123'); s.getvalue()
b'123'
>>> c.write('\U00012345'); s.getvalue()
b'123\x907\x959'
>>> # '\U00012345'[0] is the same as '\U00012345' now
>>> c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851'
>>> c.write('\uac00'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'
>>> c.reset()  # is this supposed to raise an error?
>>> s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'

Victor suggested waiting until multibytecodec gets ported to the new API before fixing this. |
|||
| msg171346 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2012-09-26 16:47 | |
Victor, do you know if multibytecodec has been ported to the new API yet? If I remove the "if", I still get a failure:

test test_multibytecodec failed -- Traceback (most recent call last):
  File "/home/wolf/dev/py/py3k/Lib/test/test_multibytecodec.py", line 187, in test_gb18030
    self.assertEqual(s.getvalue(), b'123\x907\x959')
AssertionError: b'123\x907\x959\x907\x959' != b'123\x907\x959' |
|||
| msg171347 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2012-09-26 16:57 | |
> Victor, do you know if multibytecodec has been ported to the new API yet?

No, it has not. CJK codecs still use the legacy API (Py_UNICODE). |
|||
| msg184186 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-03-14 20:14 | |
I think these tests make no sense after PEP393. They test that StreamWriter works with non-BMP characters split inside a surrogate pair, i.e. that c.write(s[:i]); c.write(s[i:]) is always the same as c.write(s), even if i breaks s inside a surrogate pair. This case is impossible after PEP393. |
|||
| msg186591 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2013-04-11 20:41 | |
New changeset 78cd09d2f908 by Victor Stinner in branch 'default': Issue #13056: Reenable test_multibytecodec.Test_StreamWriter tests http://hg.python.org/cpython/rev/78cd09d2f908 |
|||
| msg186592 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-04-11 20:45 | |
CJK decoders have used the new Unicode API since changeset bcecf3910162.

"I think these tests make no sense after PEP393. They test that StreamWriter works with non-BMP characters split inside a surrogate pair, i.e. that c.write(s[:i]); c.write(s[i:]) is always the same as c.write(s), even if i breaks s inside a surrogate pair. This case is impossible after PEP393."

I re-enabled the tests, but simplified them to remove the parts related to surrogate pairs. The tests are shorter than before, but that's better than no tests at all. Can I close the issue, or does someone want to improve these tests? |
|||
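For readers following the thread, here is a minimal sketch of the property discussed above: on a PEP 393 build (Python 3.3+) a non-BMP character is a single code point, so no split point can fall inside a surrogate pair, and writing a string in two pieces should give the same bytes as encoding it in one go. The helper name `encode_in_two_writes` is hypothetical and is not taken from test_multibytecodec.py. The sample string reuses characters from Ezio's session.

```python
import codecs
import io


def encode_in_two_writes(text, split_at, encoding="gb18030"):
    """Encode text through a StreamWriter using two write() calls.

    Hypothetical helper for illustration only; not part of the CPython
    test suite.
    """
    stream = io.BytesIO()
    writer = codecs.getwriter(encoding)(stream)
    writer.write(text[:split_at])
    writer.write(text[split_at:])
    writer.reset()  # as reported in the thread, this raises no error
    return stream.getvalue()


# With PEP 393, a non-BMP character has length 1, never 2.
assert len("\U00012345") == 1

text = "123\U00012345\uac00\u00ac"
expected = text.encode("gb18030")

# Every possible split point yields the same bytes as a one-shot encode.
for i in range(len(text) + 1):
    assert encode_in_two_writes(text, i) == expected
```

Since indexing now operates on whole code points, the loop covers every split a caller could actually make, which is presumably why the surrogate-specific parts of the original tests could be dropped.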
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:22 | admin | set | github: 57265 |
| 2013-04-11 20:58:25 | ezio.melotti | set | status: open -> closed; stage: needs patch -> resolved; resolution: fixed; versions: - Python 3.3 |
| 2013-04-11 20:45:48 | vstinner | set | messages: + msg186592 |
| 2013-04-11 20:41:22 | python-dev | set | nosy: + python-dev; messages: + msg186591 |
| 2013-03-14 20:14:47 | serhiy.storchaka | set | messages: + msg184186 |
| 2013-03-14 03:49:04 | ezio.melotti | set | nosy: + serhiy.storchaka; versions: + Python 3.4 |
| 2012-09-26 16:57:10 | vstinner | set | messages: + msg171347 |
| 2012-09-26 16:47:21 | ezio.melotti | set | keywords: + 3.3regression; messages: + msg171346 |
| 2011-09-29 01:47:23 | vstinner | set | components: + Unicode |
| 2011-09-29 01:38:20 | ezio.melotti | create | |