Created on 2008-11-22 00:41 by terry.reedy, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| reqbytes.diff | loewis, 2008-11-30 13:48 | |
| Messages (7) | |||
|---|---|---|---|
| msg76226 | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2008-11-22 00:41 | |
Binascii b2a_xxx functions accept 'binary data' and return ascii-encoded bytes. The corresponding a2b_xxx functions turn the ascii-encoded bytes back to 'binary data' (bytes). If the binary data is bytes, these should be inverses of each other.
Somewhat surprisingly to me (because the corresponding base64 module functions raise "TypeError: expected bytes, not str") 3.0 strings (unicode) are accepted as 'binary data', though they will not 'round-trip'. Ascii chars almost do
>>> a='aaaa'
>>> c=b.b2a_base64(a)
>>> c
b'YWFhYQ==\n'
>>> d=b.a2b_base64(c)
>>> d
b'aaaa'
But general unicode chars generate nonsense.
>>> a='\u1000'
>>> c=b.b2a_base64(a)
>>> c
b'4YCA\n'
>>> d=b.a2b_base64(c)
>>> d
b'\xe1\x80\x80'
I also tried b2a_uu. Is this a bug? |
|||
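Editorial sketch, not part of the original thread: the round-trip the report expects can be obtained by encoding the string explicitly before calling `b2a_base64`. The choice of UTF-8 here is an assumption made for illustration; it matches the bytes seen in the session above, but any encoding the caller picks would work the same way.

```python
import binascii

text = '\u1000'

# Assumption: the caller wants UTF-8; the encoding step is made explicit
# instead of being applied silently by binascii.
raw = text.encode('utf-8')

encoded = binascii.b2a_base64(raw)      # bytes in, ASCII-encoded bytes out
decoded = binascii.a2b_base64(encoded)  # back to the original bytes

print(encoded)
print(decoded == raw)                   # bytes round-trip
print(decoded.decode('utf-8') == text)  # text recovered only after an explicit decode
```

With bytes on both sides, the two functions behave as inverses; the string is recovered only because the caller encodes and decodes it deliberately.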
| msg76233 | Author: Georg Brandl (georg.brandl) * (Python committer) | Date: 2008-11-22 08:16 | |
I vote yes. |
|||
| msg76628 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2008-11-29 22:48 | |
It's not /exactly/ nonsense, it seems to assume an utf8 encoding pass is
necessary:
>>> b'\xe1\x80\x80'.decode('utf8') == '\u1000'
True
IMO, while accepting unicode strings instead of bytes for the a2b_xx
functions is understandable (because in practice only ASCII characters
are allowed), it is not acceptable for b2a_xx functions to accept
unicode strings instead of bytes.
In other words, it might/should be ok for
`binascii.a2b_base64('YWFh\n')` to return the same as
`binascii.a2b_base64('YWFh\n')` (that is, b'aaa'), but
`binascii.b2a_base64('aaa')` should raise a TypeError rather than
applying an utf8 encoding pass before doing the actual b2a encoding.
I think this must be fixed before 3.0 final, and is therefore a release
blocker.
|
|||
| msg76629 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2008-11-29 22:49 | |
Hmm, I obviously meant:
[...] In other words, it might/should be ok for
`binascii.a2b_base64('YWFh\n')` to return the same as
`binascii.a2b_base64(b'YWFh\n')` (that is, b'aaa') [...]
|
|||
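Editorial sketch, not part of the original thread: the asymmetry proposed above can be checked on a current Python 3 interpreter. `b2a_rejects_str` is a hypothetical helper introduced only for this illustration. The acceptance of ASCII-only `str` by the a2b functions is documented for Python 3.3 and later, so the behaviour immediately after the fix may have differed.

```python
import binascii


def b2a_rejects_str(value):
    """Hypothetical helper: report whether b2a_base64 refuses a str argument."""
    try:
        binascii.b2a_base64(value)
    except TypeError:
        return True
    return False


# b2a direction: a str argument is rejected instead of being silently encoded.
rejects = b2a_rejects_str('aaa')

# a2b direction: bytes and ASCII-only str give the same result (Python 3.3+).
from_bytes = binascii.a2b_base64(b'YWFh\n')
from_str = binascii.a2b_base64('YWFh\n')

print(rejects)
print(from_bytes == from_str)
```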
| msg76639 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2008-11-30 13:48 | |
Here is a patch that fixes the problem. Notice that this affects the email API; base64mime.body_encode now also requires bytes (whereas quoprimime remains unchanged). There are probably more functions that still incorrectly accept strings, e.g. zlib.crc32. |
|||
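Editorial sketch, not part of the original thread: on a current Python 3 interpreter, `zlib.crc32` also insists on bytes, so the same explicit-encoding pattern applies. `crc32_rejects_str` is a hypothetical helper introduced for illustration, and the UTF-8 encoding is an assumption.

```python
import zlib


def crc32_rejects_str(value):
    """Hypothetical helper: report whether zlib.crc32 refuses a str argument."""
    try:
        zlib.crc32(value)
    except TypeError:
        return True
    return False


rejects = crc32_rejects_str('abc')

# Assumption: UTF-8 is the intended encoding; the checksum is computed over bytes.
checksum = zlib.crc32('abc'.encode('utf-8'))

print(rejects)
print(checksum == zlib.crc32(b'abc'))
```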
| msg76662 | Author: Barry A. Warsaw (barry) * (Python committer) | Date: 2008-11-30 20:14 | |
Martin, the patch looks okay to me. I vote for applying it. |
|||
| msg76724 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2008-12-02 06:00 | |
Committed as r67472. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:41 | admin | set | github: 48637 |
| 2008-12-02 06:00:34 | loewis | set | status: open -> closed messages: + msg76724 |
| 2008-11-30 20:14:38 | barry | set | nosy: + barry resolution: accepted messages: + msg76662 |
| 2008-11-30 13:58:56 | loewis | set | keywords: + needs review |
| 2008-11-30 13:48:30 | loewis | set | files: + reqbytes.diff nosy: + loewis messages: + msg76639 keywords: + patch |
| 2008-11-29 22:49:42 | pitrou | set | messages: + msg76629 |
| 2008-11-29 22:48:03 | pitrou | set | priority: release blocker nosy: + pitrou messages: + msg76628 |
| 2008-11-22 08:16:40 | georg.brandl | set | nosy: + georg.brandl messages: + msg76233 |
| 2008-11-22 00:41:18 | terry.reedy | create | |