Created on 2012-08-11 07:48 by loewis, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (15) | |||
|---|---|---|---|
| msg167937 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2012-08-11 07:48 | |
PEP 3118 specifies that the 'c' format denotes UCS-1 characters, yet .tolist() converts the memoryview into a list of bytes objects. This is incorrect; it ought to be a list of string objects (as it should for the 'u' and 'w' codes). The same holds for item access. |
|||
| msg167938 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2012-08-11 07:54 | |
To reproduce:
>>> import array
>>> memoryview(array.array('B', b'foo')).cast('c').tolist()
[b'f', b'o', b'o']
|
|||
| msg167940 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2012-08-11 08:04 | |
You have rejected the PEP-3118 'u' and 'w' specifiers here: http://mail.python.org/pipermail/python-dev/2012-March/117390.html Otherwise, memoryview follows the existing struct module syntax: http://docs.python.org/dev/library/struct.html#format-characters I hope it did not escape you that _testbuffer.c *uses* the struct module to verify the correctness of memoryview. |
|||
| msg167943 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2012-08-11 09:18 | |
No, I haven't rejected the format codes. What I did ask to revert is that 'u' in the array module denotes Py_UCS4; I requested that it should continue to be compatible with 3.2. I didn't have an opinion on memoryview at all then. It's unfortunate that PEP 3118 deviates from the struct module; however, memoryview is based on the buffer interface, and its format codes ought to conform to the PEP, not to the struct module (IMO). It's easy to see that it *doesn't* follow the struct syntax, as it is possible to create memoryview objects with other format codes in 3.3. |
|||
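The claim that a memoryview can carry a format code it does not itself interpret can be checked with a short sketch. The ctypes array below is only a convenient exporter chosen for illustration (it is not mentioned in this thread); it typically reports a byte-order-prefixed format such as `<d`, which memoryview may accept at construction time but decline to unpack. The outcome is printed rather than assumed, since it can differ between Python versions and platforms.

```python
import ctypes

# Hypothetical exporter for illustration: a ctypes array of three doubles.
buf = (ctypes.c_double * 3)(1.0, 2.0, 3.0)

# Construction succeeds regardless of whether memoryview can interpret
# the exporter's format string.
view = memoryview(buf)
print("format:", view.format)
print("itemsize:", view.itemsize, "nbytes:", view.nbytes)

# Unpacking is where the format code actually gets used.
try:
    print("tolist():", view.tolist())
except NotImplementedError as exc:
    print("tolist() is not supported for this format:", exc)
```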
| msg167944 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2012-08-11 09:32 | |
That the struct module hasn't been updated to support PEP 3118 is already reported as issue 3132; please don't confuse the issues. This issue is about memoryview. One solution would be to revert the PEP's decision that 'c' is UCS-1. |
|||
| msg167945 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012-08-11 09:39 | |
I don't know which behaviour is more desirable, but I would consider PEP 3118 a historical document more than a normative spec. Especially when it comes to struct format codes. |
|||
| msg167948 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2012-08-11 09:44 | |
Martin v. Löwis <report@bugs.python.org> wrote:
> It's unfortunate that PEP 3118 deviates from the struct module, however,
> memoryview is based on the buffer interface, and its format codes ought to
> conform to the PEP, not to the struct module (IMO).

The struct module itself should conform to PEP-3118, see #3132. I think the struct module should be updated first. The proliferation of subtly different format codes is not manageable. For example, if you use NumPy, there are already differences between NumPy syntax and struct syntax. Also, one should always be able to unpack the tobytes() representation using the struct module and get the same result as from flatten(tolist()).

> It's easy to see that it *doesn't* follow the struct syntax, as it is
> possible to create memoryview objects with other format codes in 3.3.

memoryview has *always* allowed arbitrary format strings during construction. In 3.3, it keeps this property for backwards compatibility. It does follow struct syntax whenever it *uses* one of the format codes, like in tolist(). |
|||
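The invariant described in the message above (unpacking the tobytes() representation with struct should give the same items as tolist()) can be sketched for the 'c' code using the same buffer as the reproduction. The variable names are illustrative only, and the buffer is one-dimensional, so no flattening step is needed.

```python
import array
import struct

# Same buffer as in the reproduction: three bytes, viewed with the 'c' code.
view = memoryview(array.array('B', b'foo')).cast('c')

# Items as memoryview reports them.
items = view.tolist()

# The same bytes unpacked by struct, using a repeat count built from
# the number of items in the view (here the format string is '3c').
unpacked = list(struct.unpack('{}c'.format(len(view)), view.tobytes()))

print(items)              # [b'f', b'o', b'o']
print(unpacked)           # [b'f', b'o', b'o']
print(items == unpacked)  # True
```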
| msg167949 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2012-08-11 10:07 | |
Martin v. Löwis <report@bugs.python.org> wrote:
> That the struct module hasn't been updated to support the PEP 3118 is
> already reported as issue 3132, please don't confuse the issues.
> This issue is about memoryview.

No, it isn't. It was always planned to use struct to do the unpacking for memoryview, see msg71338.

On a meta note, I'd appreciate it if you were less liberal with words like "confusing", especially if you are just beginning to work on an issue that other people have already spent a lot of time on. |
|||
| msg167951 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2012-08-11 10:35 | |
Do you agree or disagree that memoryview.tolist should return a list of str objects for the 'c' code? If you agree, can you please change the title back? If you disagree, please explain why, change the title back, and close the issue as rejected. If you agree, but think that struct should be changed first, create a new issue for the struct change, make that a dependency of this issue, and change the title back. |
|||
| msg167957 | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2012-08-11 14:20 | |
Whatever the struct module produces for a format code is the same thing that memoryview.tolist() should produce. PEP 3118 contains way too many errors (as has been found out the hard way) to be considered a normative document. |
|||
| msg167961 | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2012-08-11 14:40 | |
<Closing with rationale, as Martin requested>

The struct module documentation takes precedence over PEP 3118 when it comes to pre-existing format codes, as changing struct is not feasible due to backwards compatibility concerns, and we don't want two conflicting notations for binary format descriptions. PEP 3118 was intended only to define *additional* format characters, which may or may not yet be understood by the struct module.

As 'c' is defined by the struct module as returning a bytes object of length one, this is the same interpretation used by memoryview. Thus the current behaviour of both memoryview and struct is considered correct, while it is PEP 3118 that is incorrect in this case: the 'c' entry should not have been in the table, as 'c' was already defined at least as long ago as 1.5.2 (returning an 8-bit string, which then became a bytes object in 3.x).

The PEP was also written in a 2.x context (note the mention of "2.5" above the table of new format codes), where the idea of providing a separate code that implicitly performed x.decode("latin-1") to produce a unicode object instead of an 8-bit string object wouldn't necessarily come up. |
|||
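As a rough illustration of the closing rationale: struct's 'c' code yields a bytes object of length one, memoryview follows the same interpretation, and callers who want str items can apply the Latin-1 decode explicitly. The sample data and variable names below are assumptions made for this sketch, not something taken from the thread.

```python
import struct

# struct's 'c' code unpacks one byte into a bytes object of length one.
(single,) = struct.unpack('c', b'\xe9')
print(single)  # b'\xe9'

# Illustrative sample data: "caf" followed by byte 0xE9.
data = b'caf\xe9'

# memoryview uses the same interpretation for 'c' as struct does.
as_bytes = memoryview(data).cast('c').tolist()
print(as_bytes)  # [b'c', b'a', b'f', b'\xe9']

# The decode the PEP's 'c' would have implied is left to the caller.
as_text = [item.decode('latin-1') for item in as_bytes]
print(as_text == ['c', 'a', 'f', '\xe9'])  # True

# For comparison, the default 'B' format exposes the same bytes as ints.
as_ints = memoryview(data).tolist()
print(as_ints)  # [99, 97, 102, 233]
```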
| msg167969 | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2012-08-11 15:03 | |
However, based on this issue, I have added some comments to #3132 (I think PEP 3118's simplistic approach to embedded text data is broken and a bad idea) |
|||
| msg167970 | Author: Chris Jerdonek (chris.jerdonek) * (Python committer) | Date: 2012-08-11 15:06 | |
> <Closing with rationale, as Martin requested>

Status was still open. Was that a tracker bug? |
|||
| msg167973 | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2012-08-11 15:21 | |
Pretty sure it was just an error on my part. |
|||
| msg167980 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2012-08-11 17:09 | |
Nick: that's a reasonable view, thanks - in particular the point that PEP 3118 should not be considered normative. I still think that the c code in struct is fairly redundant (with B) as it stands, so I think it should get deprecated and removed - but that's a different issue. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:34 | admin | set | github: 59827 |
| 2012-08-11 17:09:51 | loewis | set | messages: + msg167980 |
| 2012-08-11 15:21:46 | ncoghlan | set | messages: + msg167973 |
| 2012-08-11 15:06:19 | chris.jerdonek | set | nosy: + chris.jerdonek; messages: + msg167970 |
| 2012-08-11 15:03:45 | ncoghlan | set | status: open -> closed |
| 2012-08-11 15:03:26 | ncoghlan | set | messages: + msg167969 |
| 2012-08-11 14:40:31 | ncoghlan | set | resolution: not a bug; dependencies: - implement PEP 3118 struct changes; messages: + msg167961; stage: resolved |
| 2012-08-11 14:20:31 | ncoghlan | set | messages: + msg167957 |
| 2012-08-11 10:35:20 | loewis | set | messages: + msg167951 |
| 2012-08-11 10:08:27 | skrah | set | dependencies: + implement PEP 3118 struct changes; title: memoryview.to_list() incorrect for 'c' format -> struct module 'c' specifier does not follow PEP-3118 |
| 2012-08-11 10:07:10 | skrah | set | messages: + msg167949; title: struct module 'c' specifier does not follow PEP-3118 -> memoryview.to_list() incorrect for 'c' format |
| 2012-08-11 10:00:19 | Arfrever | set | nosy: + Arfrever |
| 2012-08-11 09:44:06 | skrah | set | messages: + msg167948; title: memoryview.to_list() incorrect for 'c' format -> struct module 'c' specifier does not follow PEP-3118 |
| 2012-08-11 09:39:30 | pitrou | set | nosy: + pitrou; messages: + msg167945 |
| 2012-08-11 09:32:02 | loewis | set | messages: + msg167944; title: struct module 'c' specifier does not follow PEP-3118 -> memoryview.to_list() incorrect for 'c' format |
| 2012-08-11 09:18:15 | loewis | set | messages: + msg167943 |
| 2012-08-11 08:06:01 | skrah | set | title: memoryview.to_list() incorrect for 'c' format -> struct module 'c' specifier does not follow PEP-3118 |
| 2012-08-11 08:04:03 | skrah | set | nosy: + ncoghlan; messages: + msg167940 |
| 2012-08-11 07:54:10 | loewis | set | messages: + msg167938 |
| 2012-08-11 07:48:53 | loewis | create | |