Created on 2012-10-28 12:43 by takluyver, last changed 2022-04-11 14:57 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| format-bytes.patch | martin.panter, 2014-12-18 05:55 | |
| Messages (13) | |||
|---|---|---|---|
| msg174042 | Author: Thomas Kluyver (takluyver) * | Date: 2012-10-28 12:43 | |
At least in CPython, format strings can be given as bytes, as an alternative to str. E.g.
>>> struct.unpack(b'>hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
Looking at the source code [1], this appears to be consciously accounted for. But it doesn't seem to be mentioned in the documentation. I think the docs should either say it's a possibility, or warn that it's an implementation detail.
[1] http://hg.python.org/cpython/file/cde4b66699fe/Modules/_struct.c#l1340 |
|||
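A minimal sketch of the behaviour reported above, assuming CPython 3: the same format is passed once as str and once as bytes. Nothing in this thread establishes whether other implementations accept bytes, so treat that part as an open question rather than a guarantee.

```python
import struct

# Eight bytes holding 1, 2 and 3 in big-endian order.
data = b'\x00\x01\x00\x02\x00\x00\x00\x03'

# The same big-endian format, once as text and once as bytes.
from_str = struct.unpack('>hhl', data)
from_bytes = struct.unpack(b'>hhl', data)

print(from_str)    # (1, 2, 3)
print(from_bytes)  # (1, 2, 3)
```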
| msg174083 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2012-10-28 22:36 | |
Also it would be nice to clarify if struct.Struct.format is meant to be a byte string. Reading the documentation and examples I expected a character string. It was an issue for me when embedding one structure within another:
HSF_VOL_DESC = Struct("< B 5s B")
# Python 3.2.3's "Struct.format" is actually a byte string
NSR_DESC = Struct(HSF_VOL_DESC.format.decode() + "B")
|
|||
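The embedding workaround above breaks once `Struct.format` stops being bytes; the struct documentation notes that the attribute became str in Python 3.7. A small sketch that tolerates both types follows. The helper name `format_text` is hypothetical and not part of the struct module.

```python
from struct import Struct


def format_text(s):
    """Return s.format as str, whether the attribute is bytes or str.

    Hypothetical helper for illustration; not part of the struct module.
    """
    fmt = s.format
    return fmt.decode('ascii') if isinstance(fmt, bytes) else fmt


HSF_VOL_DESC = Struct("< B 5s B")
# Embed the first structure's format in a longer one.
NSR_DESC = Struct(format_text(HSF_VOL_DESC) + "B")

print(NSR_DESC.size)  # 8: one byte, five bytes, then two more single bytes
```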
| msg174584 | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2012-11-02 21:28 | |
For 3.3, I verified that adding a b prefix to the format in the first three doc examples gives the same output as without it. I also discovered that the example outputs are wrong, at least on Windows, because of byte-ordering issues.
>>> pack('hhl', 1, 2, 3)
b'\x01\x00\x02\x00\x03\x00\x00\x00'
>>> pack(b'hhl', 1, 2, 3)
b'\x01\x00\x02\x00\x03\x00\x00\x00'
>>> unpack(b'hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
(256, 512, 50331648)
>>> unpack('hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
(256, 512, 50331648)
|
|||
| msg174680 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2012-11-03 19:35 | |
> but also discovered that example outputs are wrong
That's documented to some extent: there's a line in the docs that says: "All examples assume a native byte order, size, and alignment with a big-endian machine". Given that little-endian machines are much more common than big-endian ones these days, it may be worth rewriting the examples for little-endian machines. |
|||
| msg174682 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2012-11-03 19:40 | |
> Also it would be nice to clarify if struct.Struct.format is meant to be
> a byte string.
Hmm. That seems wrong to me. After all, the format string is supposed to be a piece of human-readable text rather than a collection of bytes. I think it's borderline acceptable to allow a bytes instance to be passed in for the format (practicality beats purity and all that), but I'd say that the output format should definitely be unicode. |
|||
| msg174711 | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2012-11-03 22:13 | |
I think the examples should be switched *and* the formats should specify the endianness so the examples work on all systems. |
|||
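A sketch of what endianness-explicit examples could look like. With a `<` or `>` prefix the struct module uses standard sizes and no alignment padding, so the output below should not depend on the machine that runs it. This is an illustration, not the wording that went into the docs.

```python
from struct import pack, unpack

# Explicit byte order: '<' is little-endian, '>' is big-endian.
little = pack('<hhl', 1, 2, 3)
big = pack('>hhl', 1, 2, 3)

print(little)  # b'\x01\x00\x02\x00\x03\x00\x00\x00'
print(big)     # b'\x00\x01\x00\x02\x00\x00\x00\x03'

# Each buffer round-trips when unpacked with the matching prefix.
print(unpack('<hhl', little))  # (1, 2, 3)
print(unpack('>hhl', big))     # (1, 2, 3)
```

By contrast, a bare 'hhl' uses native size and alignment, which is why its output can differ between platforms.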
| msg176681 | Author: Thomas Kluyver (takluyver) * | Date: 2012-11-30 11:04 | |
I'm happy to put together a docs patch, but I don't have any indication of the right answer: is it a safe feature to use, or an implementation detail? Is there another venue where I should raise the question? |
|||
| msg176701 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-11-30 18:55 | |
Python 2 supports only str. Support for unicode objects was added in r59687 (merged with other unrelated changes in changeset 13aabc23cf2e). Maybe Raymond can explain why bytes, not unicode, was chosen as the type of Struct.format. |
|||
| msg176702 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-11-30 19:05 | |
No, this is not r59687. I can't find which revision in 59680-59695 it came from. |
|||
| msg216656 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2014-04-17 05:11 | |
The issue of Struct.format being a byte string has been raised separately in Issue 21071. |
|||
| msg232767 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2014-12-16 22:34 | |
Actually the "struct" module doc string seems to already hint that format strings can be byte strings: "Python bytes objects are used to hold the data representing the C struct and also as format strings . . ." |
|||
| msg232858 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2014-12-18 05:55 | |
Assuming it is intended to support byte strings, here is a patch that documents that they are allowed and adds a test case.
|||
| msg292554 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2017-04-29 02:46 | |
I think the direction to take for this depends on the outcome of Issue 21071. First we have to decide whether the "format" attribute is blessed as a byte string (and not deprecated), or whether it is deprecated or changed to a text string. Serhiy pointed out that it is not entirely "safe" because mixing equivalent byte and text formats can generate BytesWarning. |
|||
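A rough sketch of the hazard mentioned above, assuming a CPython interpreter that accepts the `-b` command-line option. Equivalent text and bytes formats never compare equal, and under `-b` such a comparison is reported as a BytesWarning. The helper name `stderr_with_bytes_warnings` is hypothetical, and the exact warning text is left unasserted because it may vary between versions.

```python
import subprocess
import sys

# A text format and a bytes format that spell the same thing are not equal.
print('>hhl' == b'>hhl')  # False


def stderr_with_bytes_warnings(snippet):
    """Run snippet in a child interpreter started with -b; return its stderr.

    Hypothetical helper for illustration only.
    """
    result = subprocess.run(
        [sys.executable, '-b', '-c', snippet],
        capture_output=True,
        text=True,
    )
    return result.stderr


# Under -b the mixed comparison is expected to produce a BytesWarning message.
report = stderr_with_bytes_warnings("'>hhl' == b'>hhl'")
print(report)
```

Keeping every format in one type throughout a program, for example always str, avoids the mixed comparison altogether.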
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:37 | admin | set | github: 60553 |
| 2017-09-14 03:25:33 | xiang.zhang | unlink | issue19985 dependencies |
| 2017-04-29 02:46:51 | martin.panter | set | dependencies: + struct.Struct.format is bytes, but should be str, - Document whether it's safe to use bytes for struct format string |
| 2017-04-29 02:46:51 | martin.panter | unlink | issue16349 dependencies |
| 2017-04-29 02:46:14 | martin.panter | set | dependencies: + Document whether it's safe to use bytes for struct format string; messages: + msg292554 |
| 2017-04-29 02:46:14 | martin.panter | link | issue16349 dependencies |
| 2016-04-15 04:00:33 | martin.panter | link | issue19985 dependencies |
| 2014-12-19 00:17:39 | Arfrever | set | nosy: + Arfrever |
| 2014-12-18 05:55:28 | martin.panter | set | files: + format-bytes.patch; keywords: + patch; messages: + msg232858 |
| 2014-12-16 22:34:55 | martin.panter | set | messages: + msg232767 |
| 2014-04-17 05:11:19 | martin.panter | set | messages: + msg216656 |
| 2012-11-30 19:05:18 | serhiy.storchaka | set | nosy: + christian.heimes; messages: + msg176702 |
| 2012-11-30 18:55:08 | serhiy.storchaka | set | versions: + Python 3.2, Python 3.3, Python 3.4; nosy: + rhettinger, serhiy.storchaka; messages: + msg176701; components: + Extension Modules, - Library (Lib) |
| 2012-11-30 11:04:26 | takluyver | set | messages: + msg176681 |
| 2012-11-03 22:13:58 | terry.reedy | set | messages: + msg174711 |
| 2012-11-03 19:40:04 | mark.dickinson | set | messages: + msg174682 |
| 2012-11-03 19:35:49 | mark.dickinson | set | messages: + msg174680 |
| 2012-11-02 21:28:35 | terry.reedy | set | nosy: + mark.dickinson, meador.inge, terry.reedy; messages: + msg174584 |
| 2012-10-28 22:36:28 | martin.panter | set | nosy: + martin.panter; messages: + msg174083 |
| 2012-10-28 12:43:42 | takluyver | create | |