
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: non-deterministic behavior of int subclass
Type: behavior
Stage: commit review
Components: Interpreter Core
Versions: Python 3.2, Python 3.3

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: brechtm, mark.dickinson, pitrou, python-dev, skrah
Priority: high
Keywords: patch

Created on 2012-04-20 07:51 by brechtm, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name          Uploaded
pdf.py             brechtm, 2012-04-20 07:51
issue14630.patch   mark.dickinson, 2012-04-20 17:24
Messages (15)
msg158803 - Author: Brecht Machiels (brechtm) Date: 2012-04-20 07:51
I have subclassed int to add an extra attribute:
class Integer(int):
    def __new__(cls, value, base=10, indirect=False):
        try:
            obj = int.__new__(cls, value, base)
        except TypeError:
            obj = int.__new__(cls, value)
        return obj
    def __init__(self, value, base=10, indirect=False):
        self.indirect = indirect
Using this class in my application, int(Integer(b'0')) sometimes returns a value of 48 (= ord('0')!) or 192, instead of the correct value 0. str(Integer(b'0')) always returns '0'. This seems to only occur for the value 0. First decoding b'0' to a string, or passing int(b'0') to Integer makes no difference. The problem lies with converting an Integer(0) to an int with int().
Furthermore, this occurs at random. Subsequent runs will produce 48 or 192 at different points in the application (a parser). Both Python 3.2.2 and 3.2.3 behave the same (32-bit, Windows XP). Apparently, the 64-bit Windows Python 3.2.3 does not show this behavior [1]. I haven't tested on other operating systems.
I cannot seem to reproduce this in a simple test program. The following produces no output:
for i in range(100000):
    integer = int(Integer(b'0'))
    if integer > 0:
        print(integer)
In my application, however, when I check for the condition int(Integer()) > 0 (in places where I know the argument to Integer is b'0') and then print int(Integer(b'0')) a number of times, the results 48 and 192 do show up now and then.
As I can't reproduce the problem in a short test program, I have attached the relevant code. It is basically a PDF parser. The output for this [2] PDF file is, for example:
b'0' 0 Integer(0) 192 0 b'0' 16853712
b'0' 0 Integer(0) 48 0 b'0' 16938088
b'0' 0 Integer(0) 192 0 b'0' 17421696
b'0' 0 Integer(0) 48 0 b'0' 23144888
b'0' 0 Integer(0) 48 0 b'0' 23185408
b'0' 0 Integer(0) 48 0 b'0' 23323272
Search for print function calls in the code to see what this represents.
[1] http://stackoverflow.com/questions/10230604/non-deterministic-behavior-of-int-subclass#comment13156508_10230604
[2] http://www.gust.org.pl/projects/e-foundry/math-support/vieth2008.pdf 
msg158812 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 10:01
I can reproduce this on a 32-bit OS X build of the default branch, so it doesn't seem to be Windows specific (though it may be 32-bit specific).
Brecht, if you can find a way to reduce the size of your example at all that would be really helpful.
msg158814 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-04-20 10:53
Reproduced under 32-bit Linux.
The problem seems to be that Py_SIZE(x) == 0 when x is Integer(0), but ob_digit[0] is still supposed to be significant. There's probably some overwriting with the trailing attributes.
By forcing Py_SIZE(x) == 1, the bug disappears, but it probably breaks lots of other stuff in longobject.c.
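To make the diagnosis above concrete, here is a minimal pure-Python model of the sign-magnitude layout it refers to. The function name `long_value` and the 15-bit digit width are assumptions for illustration (builds with 30-bit digits exist as well); this is a sketch of the idea, not CPython's code.

```python
SHIFT = 15  # assumed digit width in bits; some builds use 30

def long_value(ob_size, ob_digit):
    """Decode a sign-magnitude integer.

    abs(ob_size) is the number of significant digits and the sign of
    ob_size is the sign of the value. Digits beyond that count are ignored.
    """
    ndigits = abs(ob_size)
    magnitude = 0
    for i in reversed(range(ndigits)):
        magnitude = (magnitude << SHIFT) | ob_digit[i]
    return -magnitude if ob_size < 0 else magnitude

# Zero has ob_size == 0, so whatever happens to sit in ob_digit[0] is not significant.
print(long_value(0, [48]))     # 0
print(long_value(1, [48]))     # 48
print(long_value(-2, [1, 1]))  # -32769
```

Any code path that reads `ob_digit[0]` without first checking the size is, in this model, reading memory that carries no meaning for a zero value.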
msg158815 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 10:56
If we're accessing ob_digit[0] when Py_SIZE(x) == 0, that sounds like a bug to me.
msg158816 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-04-20 11:07
> If we're accessing ob_digit[0] when Py_SIZE(x) == 0, that sounds like a 
> bug to me.
_PyLong_Copy does.
It's ok as long as the object is int(0), because it's part of the small ints and its allocated size is one digit.
The following hack seems to fix the issue here. Perhaps we can simply fix _PyLong_Copy, but I wonder how many other parts of longobject.c rely on accessing ob_digit[0].
diff --git a/Objects/longobject.c b/Objects/longobject.c
--- a/Objects/longobject.c
+++ b/Objects/longobject.c
@@ -4194,6 +4194,8 @@ long_subtype_new(PyTypeObject *type, PyO
     n = Py_SIZE(tmp);
     if (n < 0)
         n = -n;
+    if (n == 0)
+        n = 1;
     newobj = (PyLongObject *)type->tp_alloc(type, n);
     if (newobj == NULL) {
         Py_DECREF(tmp);
diff --git a/Objects/object.c b/Objects/object.c
--- a/Objects/object.c
+++ b/Objects/object.c
@@ -1010,6 +1010,8 @@ PyObject **
         tsize = ((PyVarObject *)obj)->ob_size;
         if (tsize < 0)
             tsize = -tsize;
+        if (tsize == 0 && PyLong_Check(obj))
+            tsize = 1;
         size = _PyObject_VAR_SIZE(tp, tsize);
 
         dictoffset += (long)size;
@@ -1090,6 +1092,8 @@ PyObject *
                 tsize = ((PyVarObject *)obj)->ob_size;
                 if (tsize < 0)
                     tsize = -tsize;
+                if (tsize == 0 && PyLong_Check(obj))
+                    tsize = 1;
                 size = _PyObject_VAR_SIZE(tp, tsize);
 
                 dictoffset += (long)size;
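The offset arithmetic touched by the object.c hunks can be sketched in Python to show why a zero-sized instance is special. All sizes below are assumptions for a typical 32-bit build (12-byte object header, 2-byte digits, 4-byte pointers, and a subclass whose only extra slot is a trailing `__dict__` pointer); the helper names are hypothetical.

```python
def round_up(n, align):
    """Round n up to the next multiple of align (align must be a power of two)."""
    return (n + align - 1) & ~(align - 1)

def dict_ptr_offset(ob_size, header=12, digit=2, pointer=4):
    """Byte offset of the instance __dict__ pointer in an int subclass instance.

    Mirrors the computation in the hunks above: a negative dict offset
    (-pointer) is added to the rounded-up size of the variable-size object.
    """
    basicsize = header + pointer  # assumed: the subclass basicsize includes the dict slot
    tsize = abs(ob_size)
    size = round_up(basicsize + tsize * digit, pointer)  # models _PyObject_VAR_SIZE
    return -pointer + size

OB_DIGIT_OFFSET = 12  # assumed: ob_digit[0] starts right after the header

print(dict_ptr_offset(0))  # 12
print(dict_ptr_offset(1))  # 16
```

Under these assumptions, the dict pointer of a zero-valued instance lands at the same offset as `ob_digit[0]`, which would be consistent with the "overwriting with the trailing attributes" suspected earlier in the thread. Forcing the size to 1 moves the pointer clear of the first digit.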
msg158817 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 11:35
> _PyLong_Copy does.
Grr. So it does. That at least should be fixed, but I agree that it would be good to have the added protection of ensuring that we always allocate space for at least one limb.
We should also check whether 2.7 is susceptible.
msg158819 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 11:53
Self-contained example that fails for me on 32-bit OS X.
class Integer(int):
    def __new__(cls, value, base=10, indirect=False):
        try:
            obj = int.__new__(cls, value, base)
        except TypeError:
            obj = int.__new__(cls, value)
        return obj
    def __init__(self, value, base=10, indirect=False):
        self.indirect = indirect
integers = []
for i in range(1000):
    integer = Integer(b'0')
    integers.append(integer)
for integer in integers:
    assert int(integer) == 0
msg158822 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-04-20 12:06
The fix for _PyLong_Copy is the following:
diff --git a/Objects/longobject.c b/Objects/longobject.c
--- a/Objects/longobject.c
+++ b/Objects/longobject.c
@@ -156,7 +156,7 @@ PyObject *
     if (i < 0)
         i = -(i);
     if (i < 2) {
-        sdigit ival = src->ob_digit[0];
+        sdigit ival = (i == 0) ? 0 : src->ob_digit[0];
         if (Py_SIZE(src) < 0)
             ival = -ival;
         CHECK_SMALL_INT(ival);
msg158823 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 12:18
Using MEDIUM_VALUE also works.
I'll cook up a patch tonight, after work.
diff -r 6762b943ee59 Objects/longobject.c
--- a/Objects/longobject.c	Tue Apr 17 21:42:07 2012 -0400
+++ b/Objects/longobject.c	Fri Apr 20 13:18:01 2012 +0100
@@ -156,9 +156,7 @@
     if (i < 0)
         i = -(i);
     if (i < 2) {
-        sdigit ival = src->ob_digit[0];
-        if (Py_SIZE(src) < 0)
-            ival = -ival;
+        sdigit ival = MEDIUM_VALUE(src);
         CHECK_SMALL_INT(ival);
     }
     result = _PyLong_New(i);
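The difference between the old fast path and the MEDIUM_VALUE version can be modelled in a few lines of Python. This is a sketch under assumptions: the small-int cache is taken to cover -5 to 256, and `copy_fast_path`, `medium_value` and `is_small_int` are hypothetical names standing in for the C code and macros.

```python
NSMALLNEGINTS, NSMALLPOSINTS = 5, 257  # assumed cache range: -5 .. 256

def is_small_int(ival):
    """Model of the range test performed by CHECK_SMALL_INT."""
    return -NSMALLNEGINTS <= ival < NSMALLPOSINTS

def medium_value(ob_size, ob_digit):
    """Value of an integer with at most one digit; never reads ob_digit[0] for zero."""
    if ob_size == 0:
        return 0
    return -ob_digit[0] if ob_size < 0 else ob_digit[0]

def copy_fast_path(ob_size, ob_digit, fixed):
    """Return the cached small int picked by the fast path, or None if it falls through."""
    if abs(ob_size) >= 2:
        return None
    if fixed:
        ival = medium_value(ob_size, ob_digit)
    else:
        ival = ob_digit[0]  # old behaviour: read even when ob_size == 0
        if ob_size < 0:
            ival = -ival
    return ival if is_small_int(ival) else None

print(copy_fast_path(0, [48], fixed=False))     # 48
print(copy_fast_path(0, [48], fixed=True))      # 0
print(copy_fast_path(0, [10960], fixed=False))  # None
```

One possible reading, not stated in the thread, is that this also accounts for the intermittent failures: a stray value in `ob_digit[0]` only leaks out when it happens to fall inside the small-int range, and otherwise the general copy path runs and copies zero digits.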
msg158854 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 17:23
Here's the patch. I searched through the rest of Objects/longobject.c for other occurrences of [0], and found nothing else that looked suspicious, so I'm reasonably confident that this was an isolated case.
msg158861 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 17:52
Also, Python 2.7 looks safe here.
msg158863 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-04-20 17:57
The patch works fine here, and the test exercises the issue correctly.
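The test shipped with the patch is not reproduced in this thread. A regression check along the following lines, based on the self-contained example posted earlier, is one way such a test might look; the helper name `failing_conversions` is hypothetical.

```python
class Integer(int):
    """int subclass with an extra attribute, mirroring the class in the report."""

    def __new__(cls, value, base=10, indirect=False):
        try:
            return int.__new__(cls, value, base)
        except TypeError:
            # int() rejects an explicit base for non-string arguments
            return int.__new__(cls, value)

    def __init__(self, value, base=10, indirect=False):
        self.indirect = indirect


def failing_conversions(count=1000):
    """Create many zero-valued instances first, then convert each back to int.

    Returns (index, value) pairs for every conversion that did not give
    a plain int equal to 0.
    """
    instances = [Integer(b'0', indirect=bool(i % 2)) for i in range(count)]
    return [(i, int(obj)) for i, obj in enumerate(instances)
            if int(obj) != 0 or type(int(obj)) is not int]


# An empty list means every conversion produced a plain int 0.
print(failing_conversions())
```

Creating all instances before converting any of them follows the structure of the earlier example, which was what triggered the failure on the affected 32-bit builds.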
msg158877 - Author: Stefan Krah (skrah) * (Python committer) Date: 2012-04-20 19:49
The patch looks good to me.
msg158886 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-04-20 20:44
New changeset cdcc6b489862 by Mark Dickinson in branch '3.2':
Issue #14630: Fix an incorrect access of ob_digit[0] for a zero instance of an int subclass.
http://hg.python.org/cpython/rev/cdcc6b489862
New changeset c7b0f711dc15 by Mark Dickinson in branch 'default':
Issue #14630: Merge fix from 3.2.
http://hg.python.org/cpython/rev/c7b0f711dc15 
msg158888 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-20 20:45
Fixed. Thanks Brecht for the report (and Antoine for diagnosing the problem).
History
Date                 User            Action  Args
2022-04-11 14:57:29  admin           set     github: 58835
2012-04-20 21:02:16  mark.dickinson  set     status: open -> closed
2012-04-20 20:45:28  mark.dickinson  set     resolution: fixed; messages: + msg158888
2012-04-20 20:44:30  python-dev      set     nosy: + python-dev; messages: + msg158886
2012-04-20 19:49:51  skrah           set     messages: + msg158877
2012-04-20 17:57:51  pitrou          set     messages: + msg158863
2012-04-20 17:52:15  mark.dickinson  set     stage: needs patch -> commit review
2012-04-20 17:52:08  mark.dickinson  set     messages: + msg158861; versions: - Python 2.7
2012-04-20 17:24:06  mark.dickinson  set     files: + issue14630.patch; keywords: + patch
2012-04-20 17:23:55  mark.dickinson  set     messages: + msg158854
2012-04-20 12:18:37  mark.dickinson  set     assignee: mark.dickinson; messages: + msg158823
2012-04-20 12:06:37  pitrou          set     components: + Interpreter Core, - None; stage: needs patch
2012-04-20 12:06:28  pitrou          set     assignee: mark.dickinson -> (no value); messages: + msg158822
2012-04-20 11:53:11  mark.dickinson  set     messages: + msg158819
2012-04-20 11:45:03  mark.dickinson  set     assignee: mark.dickinson
2012-04-20 11:35:59  mark.dickinson  set     messages: + msg158817; versions: + Python 2.7
2012-04-20 11:07:07  pitrou          set     messages: + msg158816
2012-04-20 10:56:45  mark.dickinson  set     messages: + msg158815
2012-04-20 10:53:32  pitrou          set     nosy: + skrah, pitrou; messages: + msg158814
2012-04-20 10:01:56  mark.dickinson  set     priority: normal -> high; messages: + msg158812; versions: + Python 3.3
2012-04-20 08:07:05  mark.dickinson  set     nosy: + mark.dickinson
2012-04-20 07:51:22  brechtm         create
