homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: zlib.crc32() and adler32() return value
Type: behavior Stage:
Components: Extension Modules Versions: Python 3.0, Python 3.1, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: arigo, facundobatista, gdementen, gregory.p.smith, gvanrossum, jcea, jdavid.ibp@gmail.com, mbecker, tlesher
Priority: normal Keywords: 64bit

Created on 2007年09月25日 11:31 by arigo, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
Document.odt jdavid.ibp@gmail.com, 2009年04月06日 09:42
Messages (23)
msg56130 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2007年09月25日 11:31
The functions zlib.crc32() and zlib.adler32() return a signed value
in the range(-2147483648, 2147483648) on 32-bit platforms, but an
unsigned value in the range(0, 4294967296) on 64-bit platforms. This
means that half the possible answers are numerically different on these
two kinds of platforms.
Ideally, this should be fixed by having them always return unsigned
numbers (their C return type is unsigned too). It's unclear if we can
do this without breaking existing code, though :-(
msg56146 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007年09月26日 16:51
Since it's basically a magic cookie, not a meaningful numeric value, I'd
propose sticking with backwards compatibility and fixing the 64-bit
version to return a signed version.
return x - ((x & 0x80000000) <<1)
anyone?
msg58336 - (view) Author: Tim Lesher (tlesher) * Date: 2007年12月10日 01:45
Both CRC-32 and ADLER32 are standards (described in ISO 3309 and RFC
1950 respectively); whatever fix implemented should make sure that the
output complies. 
ISO 3309 isn't available online as far as I can see, but CRC-32
reference code is published in RFC 1952; RFC 1950 contains reference
code for ADLER32.
msg58344 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2007年12月10日 12:29
The C reference code in rfc1950 for Adler-32 and in rfc1952 for CRC-32
compute with and return "unsigned long" values. From this point of
view, returning negative values on 32-bit machines from CPython's zlib
module can be considered a bug. That only leaves open the question of
backward compatibility.
msg58362 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007年12月10日 19:40
How about, in Python 2.6 we make the 64-bit version return a signed
value for better compatibility with the 32-bit version, and in Python
3.0 we make both versions return the signed value for better compliance
with the standard?
msg58375 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2007年12月10日 22:23
Obscure but reasonable. (I suspect you meant to say that py3k should
return the *unsigned* value for better compliance with the standard.)
msg58376 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007年12月10日 22:25
> Obscure but reasonable. (I suspect you meant to say that py3k should
> return the *unsigned* value for better compliance with the standard.)
Yes. :)
msg63664 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008年03月17日 16:27
working on this now. foo = 'abcdefghijklmnop'
2.x 32-bit long: zlib.crc32(foo) returns -1808088941
2.x 64-bit long: zlib.crc32(foo) returns 2486878355
This is because PyInt_FromLong() happily fits the value into a signed
long internally to the integer object on 64-bit platforms. They are
both the same number if considered with & 0xffffffff.
I'm doing as guido suggests and leaving this slightly odd behavior for
2.x so that crc32 and adler32 always return an integer object. in 3.0
they'll always return an unsigned value.
msg63666 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008年03月17日 16:33
question: should I also make 64-bit 2.x return a signed value as well to
be consistent with 32bit python 2.x?
Consistency in case someone ever pickles the value and sends it to
another python instance of a different word length would be good...
msg63668 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008年03月17日 16:35
Sure. (Though sending pickles to 3.0 would still cause problems. 
Consumers of pickled checksums would do wise to *always* take the CRC
mod 2**32 before doing comparisons.)
msg63676 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2008年03月17日 17:05
I store CRC in reed-solomon schema of mine. I compare with equality, so,
I think we should enforce "CRC(python 32 bits) == CRC(python 64 bits)".
I will need to touch my code in python 3.0, but that will be inevitable
anyway...
msg63756 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008年03月17日 20:25
fixed.
3.0 always returns unsigned.
2.6 always returns signed, 2**31...2**31-1 come back as negative integers.
trunk r61449
branches/py3k r61459 
msg64416 - (view) Author: Michael Becker (mbecker) Date: 2008年03月24日 14:58
In case it isn't obvious the work around for pre 3.0 to get the right
sum is something like:
x=zlib.adler32(str)
if x < 0:
 x=(long(x) + 4294967296L) # 2^32, long may or may not be needed here
msg64420 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008年03月24日 16:55
The workaround I prefer to use is:
x = zlib.adler32(mystr) & 0xffffffffL
msg74380 - (view) Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008年10月06日 16:10
Let me reopen this, I think we have an issue with this fix.
The conclusion of this discussion so far is that in 3.0 the crc32 will
behave like the standard, which is a good thing (tm), but in 2.6 it will
not: it should return a signed integer. I agree with this outcome!
The documentation for 2.6, the commit message for the fix and what it's
said here states that: "2.6 always returns signed, 2**31...2**31-1 come
back as negative integers."
This is *not* actually happening:
>>> s = "*"*100000
>>> print zlib.crc32(s) # 2.6, 32 bits
-872059092
>>> print zlib.crc32(s) # 2.6, 64 bits
3422908204
The problem in the code is, IMHO, that the "32b rounding" is being
forced by assigning the result to an int (Modules/zlibmodule.c, line
929), but this "rounding" does not actually work for 64b (because the
int has 64 bit, and even as it's signed, the number stays big and positive).
Thank you!
msg74384 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008年10月06日 17:53
An int is 32-bits on all popular platforms. Anyways i'll double check
things. What platforms did you run your test on?
msg79408 - (view) Author: J. David Ibáñez (jdavid.ibp@gmail.com) Date: 2009年01月08日 12:13
I believe I have hit this bug. With Python 2.6.1 in a Gentoo Linux
64 bits.
This code:
 from zipfile import ZipFile
 inzip = ZipFile('Document.odt')
 outzip = ZipFile('out.zip', 'w')
 for f in inzip.infolist():
 if f.filename != 'content.xml':
 print f.filename, '(CRC: %s)' % f.CRC
 outzip.writestr(f, inzip.read(f.filename))
 outzip.close()
Produces this output:
 ...
 styles.xml (CRC: 3930581528)
 test_odt.py:10: DeprecationWarning: 'l' format requires -2147483648 <=
number <= 2147483647
 outzip.writestr(f, inzip.read(f.filename))
 ...
The CRC is not a 32bits signed, and then the module struct complains,
here:
 zipfile.py:1098
 self.fp.write(struct.pack("<lLL", zinfo.CRC, zinfo.compress_size,
Apparently 'struct.pack' expects 'zinfo.CRC' to be a 32 signed it,
which is not.
I can attach the 'Document.odt' file if you want.
msg79527 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009年01月10日 03:09
seems there are bugs with it not staying signed as it should on some 
64bit platforms. i'll be looking into this shortly. its a good 
candidate bug for 2.6.x and 3.0.x releases.
msg79528 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009年01月10日 03:10
err not 3.0.x, 3.0 is always unsigned like anyone sane would want. :)
msg85557 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009年04月05日 19:13
J. David Ibáñez - do you still happen to have that Document.odt laying
around?
msg85626 - (view) Author: J. David Ibáñez (jdavid.ibp@gmail.com) Date: 2009年04月06日 09:42
There it is.
msg86192 - (view) Author: Gaëtan de Menten (gdementen) Date: 2009年04月20日 09:13
Regarding the issue J. David Ibáñez has, I have a few comments to add:
- It's also present on 32bit.
- AFAICT:
 * it's present in both 2.6 branch & trunk (as of 68886),
 * it's a problem with line 1110 (in 2.6 branch), or line 1122 in
trunk, which should read "<LLL" instead of "<lLL",
 * it is an omission from changeset 61591: 
http://svn.python.org/view/python/trunk/Lib/zipfile.py?view=diff&r1=61590&r2=61591
which changed line 964 (at the time)
http://svn.python.org/view/python/trunk/Lib/zipfile.py?annotate=61591#l964
but not the corresponding one in writestr.
I'm not sure whether or not a separate bug report should be opened for
this issue.
msg89723 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009年06月26日 08:19
fix for J. David's issue submitted to trunk r73565 and py3k r73566 just in 
time for the 3.1 release.
release30-maint r73567
release26-maint r73568 
History
Date User Action Args
2022年04月11日 14:56:27adminsetgithub: 45543
2009年06月26日 08:20:10gregory.p.smithsetresolution: fixed
2009年06月26日 08:19:28gregory.p.smithsetstatus: open -> closed

messages: + msg89723
versions: + Python 3.0, Python 3.1
2009年04月20日 09:13:46gdementensetnosy: + gdementen
messages: + msg86192
2009年04月06日 09:42:23jdavid.ibp@gmail.comsetfiles: + Document.odt

messages: + msg85626
2009年04月05日 19:13:36gregory.p.smithsetmessages: + msg85557
2009年01月10日 03:10:01gregory.p.smithsetmessages: + msg79528
versions: - Python 3.0
2009年01月10日 03:09:17gregory.p.smithsetkeywords: - easy
messages: + msg79527
2009年01月10日 03:06:59gregory.p.smithlinkissue4903 dependencies
2009年01月08日 12:13:47jdavid.ibp@gmail.comsetnosy: + jdavid.ibp@gmail.com
messages: + msg79408
2008年10月06日 17:53:06gregory.p.smithsetmessages: + msg74384
2008年10月06日 16:10:43facundobatistasetstatus: closed -> open
nosy: + facundobatista
resolution: fixed -> (no value)
messages: + msg74380
2008年03月24日 16:55:18gregory.p.smithsetmessages: + msg64420
2008年03月24日 14:58:55mbeckersetnosy: + mbecker
messages: + msg64416
2008年03月17日 20:25:17gregory.p.smithsetstatus: open -> closed
resolution: fixed
messages: + msg63756
2008年03月17日 17:05:03jceasetmessages: + msg63676
2008年03月17日 16:35:36gvanrossumsetmessages: + msg63668
2008年03月17日 16:33:13gregory.p.smithsetmessages: + msg63666
2008年03月17日 16:27:11gregory.p.smithsetmessages: + msg63664
2008年01月23日 08:34:23gregory.p.smithsetpriority: normal
assignee: gregory.p.smith
versions: + Python 2.6, Python 3.0
keywords: + 64bit, easy
nosy: + gregory.p.smith
2008年01月16日 02:28:45jceasetnosy: + jcea
2007年12月10日 22:25:48gvanrossumsetmessages: + msg58376
2007年12月10日 22:23:52arigosetmessages: + msg58375
2007年12月10日 19:40:40gvanrossumsetmessages: + msg58362
2007年12月10日 12:29:55arigosetmessages: + msg58344
2007年12月10日 01:45:23tleshersetnosy: + tlesher
messages: + msg58336
2007年09月26日 16:51:11gvanrossumsetnosy: + gvanrossum
messages: + msg56146
2007年09月25日 11:31:11arigocreate

AltStyle によって変換されたページ (->オリジナル) /