This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2014年07月19日 18:30 by wayedt, last changed 2022年04月11日 14:58 by admin.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| issue-22012.txt | ronaldoussoren, 2014年07月21日 08:09 | review | ||
| Messages (13) | |||
|---|---|---|---|
| msg223470 - (view) | Author: Tyler Wade (wayedt) | Date: 2014年07月19日 18:30 | |
On Mac OSX, struct.unpack incorrectly handles bools.
Python 3.4.1 (default, May 19 2014, 13:10:29)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> struct.unpack('????', b'\x00\x01\x02\x03')
(False, True, False, True)
|
|||
| msg223472 - (view) | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2014年07月19日 20:11 | |
Doing a quick look, the results vary. Using current python.org 2.7.8 and 3.4.1 installers, the results are correct. These interpreters are built with Apple gcc-4.2 (non-LLVM) from Xcode 3.2.6. Other 2.7 and 3.4.x builds I have available at the moment, including the Apple-supplied Python 2.7, were built with clang LLVM and they all give the incorrect result.
$ /usr/bin/python2.7 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
((False, True, False, True), '2.7.5 (default, Mar 9 2014, 22:15:05) \n[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]')
$ arch -i386 /usr/bin/python2.7 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
((False, True, False, True), '2.7.5 (default, Mar 9 2014, 22:15:05) \n[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]')
$ /usr/local/bin/python2.7 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
((False, True, True, True), '2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]')
$ /usr/local/bin/python2.7-32 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
((False, True, True, True), '2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]')
$ /usr/local/bin/python3.4 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
(False, True, True, True) 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
$ /usr/local/bin/python3.4-32 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
(False, True, True, True) 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
$ ./bin/python3.4 -c "import sys,struct;print(struct.unpack('????', b'\x00\x01\x02\x03'), sys.version)"
(False, True, False, True) 3.4.1+ (3.4:d6b71971b228, Jul 16 2014, 16:30:56)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]
|
|||
| msg223480 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2014年07月19日 21:22 | |
The relevant piece of code in the struct module looks like this:
static PyObject *
nu_bool(const char *p, const formatdef *f)
{
BOOL_TYPE x;
memcpy((char *)&x, p, sizeof x);
return PyBool_FromLong(x != 0);
}
Is it possible that BOOL_TYPE is a bitfield of length 1, and that clang is somehow making use of that fact?
One thing I don't understand is that this shouldn't affect *standard* packing and unpacking (as opposed to native), but I still see the problem for a format of "<????". However, it's fine for ">????". Some debugging shows that we're calling 'nu_bool' even for "<????", which is a bit odd.
|
|||
| msg223481 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2014年07月19日 21:31 | |
> Some debugging shows that we're calling 'nu_bool' even for "<????", which is a bit odd. Ah, I see. There's an optimisation that uses the native table functions instead of the big/little-endian ones if the size and byte order match. |
|||
| msg223487 - (view) | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2014年07月19日 23:25 | |
FTR, the problem also shows up with the current clang-3.4 (3.4.2) and clang3.5 (svn213451-1) packages in Debian testing (on i386), building --with-pydebug and without, so the issue is not confined to OS X. |
|||
| msg223501 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2014年07月20日 07:06 | |
> On 19 jul. 2014, at 23:22, Mark Dickinson <report@bugs.python.org> wrote: > > > Mark Dickinson added the comment: > > The relevant piece of code in the struct module looks like this: > > static PyObject * > nu_bool(const char *p, const formatdef *f) > { > BOOL_TYPE x; > memcpy((char *)&x, p, sizeof x); > return PyBool_FromLong(x != 0); > } > > Is it possible that BOOL_TYPE is a bitfield of length 1, and that clang is somehow making use of that fact? I haven't found a definitive source yet, but it seems that the only valid values for _Bool, which BOOL_TYPE expands to, are 0 and 1. Clang might make use of that restriction. Ronald |
|||
| msg223502 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2014年07月20日 08:25 | |
For native struct packing / unpacking, it's not even clear to me that this should be considered a bug (though for standard unpacking it definitely is): the primary use-case for native unpacking is unpacking a collection of bytes that was written (from Python, or C, or ...) on the same machine, and in that case we're only going to be unpacking 0s and 1s. FWIW, the docs say: "Either 0 or 1 in the native or standard bool representation will be packed, and any non-zero value will be True when unpacking.", so for native packing either the docs or the code need to be fixed. The simplest solution would be to replace `nu_bool` with something identical to `bu_bool`. In theory that would break on any platform where "sizeof(_Bool)" is greater than 1. In practice, I doubt that Python's going to meet such platforms in a hurry. |
|||
| msg223503 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2014年07月20日 08:32 | |
> In practice, I doubt that Python's going to meet such platforms in a hurry. Hrm; looks like that's not a particularly safe assumption, especially with older compilers that might do a "typedef int _Bool". From http://msdn.microsoft.com/en-us/library/tf4dy80a.aspx: "That means that for Visual C++ 4.2, a call of sizeof(bool) yields 4, while in Visual C++ 5.0 and later, the same call yields 1." |
|||
| msg223511 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2014年07月20日 14:46 | |
> On 20 jul. 2014, at 10:32, Mark Dickinson <report@bugs.python.org> wrote: > > > Mark Dickinson added the comment: > >> In practice, I doubt that Python's going to meet such platforms in a hurry. > > Hrm; looks like that's not a particularly safe assumption, especially with older compilers that might do a "typedef int _Bool" I'm not sure about the actual definition, but at least for OSX on PPC sizeof(bool) is not 1. I ran into this with pyobjc when I tried to treat BOOL and bool the same. All of this is with Objective-C, but the same should be true for plain C. |
|||
| msg223565 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2014年07月21日 08:09 | |
I just confirmed that clang only uses the LSB for _Bool values by looking at the assembly generated for the following code:
<quote>
#include <stdbool.h>
#include <stdio.h>
int main(void)
{
_Bool x;
*(unsigned char*)&x = 42;
printf("%d\n", (int)x);
return 0;
}
</quote>
The attached patch fixes the issue for me. The new testcase fails without the changes to _struct.c and passes with those changes.
|
|||
| msg223566 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2014年07月21日 09:13 | |
The last draf of ISO C '11: <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf>. This says that _Bool is large enough to store 0 and 1, and that conversion of any integer data to _Bool results in 0 or 1. If I interpret the document correctly there is no conforming way to create a value of _Bool that has a value other than 0 or 1, and that should mean clang's behavior is not a bug. BTW. I haven't tested my patch on a PPC system yet (where sizeof(bool) != 1), and won't be able to do so until I'm back home (I'm currently at EuroPython) |
|||
| msg224030 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2014年07月26日 08:56 | |
Does anyone have feedback for my proposed patch (other the bug in test code when sizeof(bool) != 1, the test values for big and little endian are in the wrong order)? |
|||
| msg224710 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2014年08月04日 11:55 | |
BTW. There is also an argument to be made against my patch and for a documentation update: it is unclear to me if clang ever creates _Bool false values where the bit pattern isn't exactly the same as the zero value of an integer of the same size (for example due to generated code only updating the least significant bit and starting with a uninitialised value that has some of the other bits set). If it does so depends on what's more efficient to do in machine code, and my knowledge of assembly is too rusty to have anything useful to say about that (although I'd suspect that overwriting the complete _Bool value would be more efficient than loading the current value, updating the LSB and storing it again). If it can do so the current behavior of struct.unpack would be more correct than the version you get after applying my patch. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:06 | admin | set | github: 66211 |
| 2016年10月07日 17:12:39 | berker.peksag | link | issue28377 superseder |
| 2015年12月12日 00:36:30 | gregory.p.smith | set | versions: + Python 3.6 |
| 2015年12月12日 00:36:04 | gregory.p.smith | set | components: + Library (Lib) |
| 2015年12月12日 00:35:39 | gregory.p.smith | set | nosy:
+ gregory.p.smith |
| 2014年08月04日 11:55:02 | ronaldoussoren | set | messages: + msg224710 |
| 2014年07月26日 08:56:07 | ronaldoussoren | set | messages: + msg224030 |
| 2014年07月21日 09:13:17 | ronaldoussoren | set | keywords:
+ patch, needs review messages: + msg223566 stage: patch review |
| 2014年07月21日 08:09:15 | ronaldoussoren | set | files:
+ issue-22012.txt messages: + msg223565 |
| 2014年07月20日 14:46:00 | ronaldoussoren | set | messages:
+ msg223511 title: struct.unpack('?', '\x02') returns (False,) when Python is built with clang -> struct.unpack('?', '\x02') returns (False,) on Mac OSX |
| 2014年07月20日 08:57:06 | ned.deily | set | title: struct.unpack('?', '\x02') returns (False,) on Mac OSX -> struct.unpack('?', '\x02') returns (False,) when Python is built with clang |
| 2014年07月20日 08:32:06 | mark.dickinson | set | messages: + msg223503 |
| 2014年07月20日 08:25:18 | mark.dickinson | set | messages: + msg223502 |
| 2014年07月20日 07:06:21 | ronaldoussoren | set | messages:
+ msg223501 title: struct.unpack('?', '\x02') returns (False,) when Python is built with clang -> struct.unpack('?', '\x02') returns (False,) on Mac OSX |
| 2014年07月19日 23:26:57 | ned.deily | set | title: struct.unpack('?', '\x02') returns (False,) on Mac OSX -> struct.unpack('?', '\x02') returns (False,) when Python is built with clang |
| 2014年07月19日 23:25:58 | ned.deily | set | components: + Build, - macOS |
| 2014年07月19日 23:25:06 | ned.deily | set | assignee: ronaldoussoren -> messages: + msg223487 |
| 2014年07月19日 21:31:03 | mark.dickinson | set | messages: + msg223481 |
| 2014年07月19日 21:22:45 | mark.dickinson | set | nosy:
+ mark.dickinson messages: + msg223480 |
| 2014年07月19日 20:11:29 | ned.deily | set | nosy:
+ ned.deily messages: + msg223472 versions: - Python 3.1, Python 3.2, Python 3.3 |
| 2014年07月19日 18:30:27 | wayedt | create | |