This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Classification
Title: bz2.BZ2DEcompressor.decompress fail on large files
Type: crash
Stage: resolved
Components: Extension Modules
Versions: Python 3.2, Python 3.3, Python 3.4, Python 2.7
Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: nadeem.vawda
Nosy List: Laurent.Gautier, benjamin.peterson, georg.brandl, loewis, nadeem.vawda, python-dev, serhiy.storchaka
Priority: normal
Keywords:

Created on 2012-03-24 16:15 by Laurent.Gautier, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name | Uploaded | Description
testbz2.py | nadeem.vawda, 2012-03-24 17:33 |
Messages (18)
msg156698 - Author: Laurent Gautier (Laurent.Gautier) Date: 2012-03-24 16:15
The call ends with:
Objects/stringobject.c:3884: bad argument to internal function
sys.version:
'2.7.2 (default, Jun 13 2011, 15:14:50) \n[GCC 4.4.5]'
(on 64bit Linux)
msg156701 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2012-03-24 16:36
I can't reproduce this. Can you please provide a test script along with input data that allows us to reproduce this error?
msg156705 - Author: Laurent Gautier (Laurent.Gautier) Date: 2012-03-24 16:45
Wow! Quick follow-up.

The data file is about 1.6 GB. Is there a preferred way to pass it on? (I suspect that the bug tracker is not the preferred way.)
The code looks like this:
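import bz2

# Read the whole compressed file into memory (about 1.6 GB in this report).
with open("foobar.bz2", "rb") as f:
    src_buf = f.read()

decomp = bz2.BZ2Decompressor()
# This call fails with "bad argument to internal function".
tmp = decomp.decompress(src_buf)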
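For comparison, here is a minimal sketch of feeding the decompressor in pieces and writing the output to disk, so that no single call has to return the whole decompressed payload. The function name `decompress_in_chunks`, the paths, and the chunk size are illustrative assumptions. Nothing in this thread confirms that chunking avoids the failure on affected versions: highly compressible input can still expand enormously from a small chunk.

```python
import bz2


def decompress_in_chunks(src_path, dst_path, chunk_size=1 << 20):
    """Decompress a single-stream .bz2 file piece by piece.

    Hypothetical helper: returns the number of decompressed bytes written.
    """
    decomp = bz2.BZ2Decompressor()
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of the compressed file
            out = decomp.decompress(chunk)
            dst.write(out)
            total += len(out)
    return total
```

A call such as `decompress_in_chunks("foobar.bz2", "foobar.out")` would process the reporter's file this way. The sketch assumes the file holds exactly one bz2 stream; data after the end of the stream makes the decompressor raise `EOFError`.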
msg156709 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2012-03-24 17:33
I have been able to reproduce it; see attached script. It happens for
inputs of 2 GB (decompressed), but not for ones of 1 GB.
It seems that bz2module.c doesn't guard against 32-bit overflows when
handling the size of the decompressed data. This affects both the
BZ2Decompressor object's decompress() method, and the module-level
decompress() function. All Python versions prior to 3.3 are affected.
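The 1 GB versus 2 GB boundary fits a size being held in a signed 32-bit integer. The sketch below only illustrates that general failure class; it is not the code in bz2module.c, and the helper name `as_c_int32` is made up for this example. It relies on the fact that ctypes integer types perform no overflow checking.

```python
import ctypes


def as_c_int32(n):
    """Return n as it would read back from a signed 32-bit C integer."""
    return ctypes.c_int32(n).value


one_gib = 1 << 30  # fits in a signed 32-bit integer
two_gib = 1 << 31  # one past the largest signed 32-bit value

print(as_c_int32(one_gib))  # 1073741824
print(as_c_int32(two_gib))  # -2147483648
```

A size that wraps to a negative number like this is the sort of value that an internal buffer-resizing routine would reject as a bad argument.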
msg156710 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2012-03-24 17:35
(The contents of the input file don't matter; I just pulled out a
bunch of zeros from /dev/zero and compressed them with bzip2.)
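A scaled-down version of that setup can be sketched in pure Python. This is not the attached testbz2.py; the function name `roundtrip_zeros` and the sizes are assumptions chosen so the example runs quickly. Reaching the failing case on an affected build would mean raising `n_bytes` to about 2**31, which needs several gigabytes of memory.

```python
import bz2


def roundtrip_zeros(n_bytes, chunk=1 << 20):
    """Compress n_bytes of zeros, then decompress them in a single call.

    Returns (compressed_length, decompressed_length).
    """
    comp = bz2.BZ2Compressor()
    zeros = b"\0" * chunk
    parts = []
    remaining = n_bytes
    while remaining > 0:
        block = zeros[:min(chunk, remaining)]
        parts.append(comp.compress(block))
        remaining -= len(block)
    parts.append(comp.flush())
    compressed = b"".join(parts)

    # Single-shot decompression, as in the original report.
    out = bz2.BZ2Decompressor().decompress(compressed)
    return len(compressed), len(out)


clen, olen = roundtrip_zeros(4 * 1024 * 1024)
print(clen, olen)
```

Compressing incrementally keeps the input side small, so the only large object is the decompressed result, which is the quantity the bug concerns.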
msg156711 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2012-03-24 17:52
This should be fixed for 2.7.3. I'll have a patch ready in the next day
or two.
msg156713 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2012-03-24 19:31
This isn't a regression, is it? If it's not, I don't think it's essential to get into 2.7.3.
msg156714 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2012-03-24 19:35
No, it's been around since at least 2.6. I wasn't really sure what the
protocol was for bugs found during the RC process. It'd be nice to get
a fix for this into 2.7.3 (and 3.2.3), but it's not urgent.
msg156715 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2012-03-24 19:37
Nadeem: the final release candidate of 2.7.3 was already made. Any further change would require another release candidate, which in turn would delay the release further. This has to wait for 2.7.4.
msg156717 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2012-03-24 19:38
That's fine by me, then. Sorry for the confusion.
msg173471 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-21 19:22
New changeset ebb8c7d79f52 by Nadeem Vawda in branch '3.2':
Issue #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/ebb8c7d79f52
New changeset 25fdf297c077 by Nadeem Vawda in branch '3.3':
Merge #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/25fdf297c077
New changeset d6bf506ea13f by Nadeem Vawda in branch 'default':
Merge #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/d6bf506ea13f 
msg173479 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-10-21 20:20
What about 2.7?
msg173481 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2012-10-21 20:30
I'm working on it now. Will push in the next 15 minutes or so.
msg173483 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-21 21:09
New changeset f03a335621ce by Nadeem Vawda in branch '2.7':
Issue #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/f03a335621ce 
msg173484 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2012-10-21 21:12
All fixed, along with some other similar but harder-to-trigger bugs.
Thanks for the bug report, Laurent!
msg187083 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2013-04-16 14:16
Why does only 2.7 have tests?
msg187298 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2013-04-18 21:40
An oversight on my part, I think. I'll add tests for 3.x this weekend.
msg187533 - Author: Nadeem Vawda (nadeem.vawda) (Python committer) Date: 2013-04-21 22:30
Hmm, so actually most of the bugs fixed in 2.7 and 3.2 weren't present
in 3.3 and 3.4, and those versions already had tests equivalent to the
tests I added for 2.7/3.2.
As for the changes that I did make to 3.3/3.4:
- two of the three cover cases that only occur if the output data is
 larger than ~32GiB. Even if we have a buildbot with enough memory for
 it (which I don't think we do), actually running such tests would take
 forever and then some.
- the third is for a condition that's actually pretty much impossible to
 trigger - grow_buffer() has to be called on a buffer that is already at
 least 8*((size_t)-1)/9 bytes long. On a 64-bit system this is
 astronomically large, while on a 32-bit system the OS will probably
 have reserved more than 1/9th of the virtual address space for itself,
 so it won't be possible to allocate a large enough buffer.
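The threshold in the last point can be worked out numerically. The sketch below reads 8*((size_t)-1)/9 as the mathematical fraction eight ninths of the largest size_t value, not as wrapping C unsigned arithmetic; that reading, and the helper name `grow_limit`, are assumptions for illustration.

```python
def grow_limit(bits):
    """Eight ninths of the largest size_t value for a given pointer width."""
    size_max = (1 << bits) - 1  # (size_t)-1
    return 8 * size_max // 9


GIB = 1 << 30

limit32 = grow_limit(32)
print(limit32)        # 3817748706
print(limit32 / GIB)  # roughly 3.56 GiB out of a 4 GiB address space

limit64 = grow_limit(64)
print(limit64 // GIB)  # billions of GiB, far beyond any real allocation
```

On a 32-bit system the buffer would need to occupy all but about one ninth of the address space, which is why the condition is described as practically unreachable.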
History
Date | User | Action | Args
2022-04-11 14:57:28 | admin | set | github: 58606
2013-04-21 22:30:05 | nadeem.vawda | set | status: open -> closed; messages: + msg187533
2013-04-18 21:40:44 | nadeem.vawda | set | status: closed -> open; messages: + msg187298
2013-04-16 14:16:35 | benjamin.peterson | set | messages: + msg187083
2012-10-21 21:12:54 | nadeem.vawda | set | status: open -> closed; resolution: fixed; messages: + msg173484; stage: needs patch -> resolved
2012-10-21 21:09:42 | python-dev | set | messages: + msg173483
2012-10-21 20:30:19 | nadeem.vawda | set | messages: + msg173481
2012-10-21 20:20:08 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg173479; versions: + Python 3.3, Python 3.4
2012-10-21 19:22:09 | python-dev | set | nosy: + python-dev; messages: + msg173471
2012-03-24 19:38:53 | nadeem.vawda | set | messages: + msg156717
2012-03-24 19:37:11 | loewis | set | messages: + msg156715
2012-03-24 19:35:51 | nadeem.vawda | set | priority: release blocker -> normal; messages: + msg156714
2012-03-24 19:31:42 | benjamin.peterson | set | messages: + msg156713
2012-03-24 17:52:25 | nadeem.vawda | set | priority: normal -> release blocker; nosy: + georg.brandl, benjamin.peterson; messages: + msg156711
2012-03-24 17:39:16 | nadeem.vawda | set | versions: + Python 3.2
2012-03-24 17:35:02 | nadeem.vawda | set | messages: + msg156710
2012-03-24 17:33:46 | nadeem.vawda | set | files: + testbz2.py; assignee: nadeem.vawda; components: + Extension Modules; nosy: + nadeem.vawda; messages: + msg156709; stage: needs patch
2012-03-24 16:45:34 | Laurent.Gautier | set | messages: + msg156705
2012-03-24 16:36:13 | loewis | set | nosy: + loewis; messages: + msg156701
2012-03-24 16:15:19 | Laurent.Gautier | create |
