This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2009年04月28日 05:57 by MizardX, last changed 2022年04月11日 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| bz2module-v1.diff | nadeem.vawda, 2011年01月25日 03:15 | BZ2File rewrite + tests | ||
| bz2module-v2.diff | nadeem.vawda, 2011年01月26日 00:07 | Python implementation of BZ2File rewrite + tests | ||
| bz2-v3.diff | nadeem.vawda, 2011年01月30日 21:12 | Cleaned-up Python implementation with more tests | ||
| bz2-doc.diff | nadeem.vawda, 2011年02月06日 03:31 | Documentation update | ||
| bz2-v3b.diff | nadeem.vawda, 2011年02月09日 04:57 | bz2-v3.diff + tidied-up docstrings | ||
| bz2-v4.diff | nadeem.vawda, 2011年03月21日 01:15 | |||
| bz2-v4-doc.diff | nadeem.vawda, 2011年03月21日 01:18 | |||
| bz2-v5.diff | nadeem.vawda, 2011年04月02日 13:34 | Add read1() and clean up _read_block() | ||
| bz2-v5-doc.diff | nadeem.vawda, 2011年04月02日 13:38 | |||
| bz2-v6.diff | nadeem.vawda, 2011年04月03日 00:14 | |||
| bz2-v6-doc.diff | nadeem.vawda, 2011年04月03日 00:14 | |||
| Messages (54) | |||
|---|---|---|---|
| msg86716 - (view) | Author: MizardX (MizardX) | Date: 2009年04月28日 05:57 | |
bz2.BZ2File should, like gzip.GzipFile, accept a fileobj argument. If implemented, you could much more easily pipe BZ2-data from other sources, such as stdin or a socket. |
|||
| msg122630 - (view) | Author: Éric Araujo (eric.araujo) * (Python committer) | Date: 2010年11月28日 05:01 | |
Do you want to work on patch? |
|||
| msg122790 - (view) | Author: MizardX (MizardX) | Date: 2010年11月29日 11:43 | |
Would if I could. But, No. |
|||
| msg122797 - (view) | Author: Éric Araujo (eric.araujo) * (Python committer) | Date: 2010年11月29日 13:44 | |
That’s a perfectly fine reply. Someone will see this feature request and propose a patch eventually. Another way to help is to write tests, since those are in Python. |
|||
| msg122845 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010年11月29日 18:42 | |
For the record, this will need a comprehensive rewrite of bz2module, since it uses FILE pointers right now. |
|||
| msg122849 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2010年11月29日 18:51 | |
Without a patch and compelling use cases, this has no chance. Recommend closing. |
|||
| msg123362 - (view) | Author: Xuanji Li (xuanji) * | Date: 2010年12月04日 15:29 | |
I'll try working on a patch. |
|||
| msg123929 - (view) | Author: Xuanji Li (xuanji) * | Date: 2010年12月14日 11:28 | |
Sorry, I'm giving up. The copyright notice for bz2module.c lists "Gustavo Niemeyer" as one of the holders, is he the maintainer? Maybe he should be notified of this bug. |
|||
| msg123937 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010年12月14日 13:38 | |
> Sorry, I'm giving up. Indeed, I think only an extensive rewrite could fulfill the feature request here. > The copyright notice for bz2module.c lists "Gustavo Niemeyer" as one > of the holders, is he the maintainer? Maybe he should be notified of > this bug. He hasn't been active for years, so I don't think he can still be considered the maintainer. |
|||
| msg126194 - (view) | Author: wrobell (wrobell) | Date: 2011年01月13日 19:50 | |
A use case wget -O http://planet.openstreetmap.org/planet-110112.osm.bz2 | tee planet.bz2 | osm2sql | psql osm planet-*osm.bz2 files are 14GB at the moment. it would be great to read them from stdin while downloading from a server and uploading to a database at the same time. Of course, you can insert "bzip2 -d" into the pipe... but then why to bother with bz2 module in Python? ;) |
|||
| msg126195 - (view) | Author: Éric Araujo (eric.araujo) * (Python committer) | Date: 2011年01月13日 19:53 | |
We’ve already agreed the feature is desirable; what’s missing is a patch, not user stories :) |
|||
| msg126196 - (view) | Author: wrobell (wrobell) | Date: 2011年01月13日 20:02 | |
OK! :) |
|||
| msg126807 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月22日 00:49 | |
I have been working on a patch for this issue. I've implemented everything except for readline(), readlines() and the iterator protocol. In the existing implementation, the reading methods seem to interact weirdly - iternext() uses a readahead buffer, while none of the other methods do. Does anyone know if there's a reason for this? I was planning on having all the reading methods use a common buffer, which should allow free mixing of read methods and iteration. Looking at issue8397, I'm guessing it would be fine, but I wanted to double-check in case there's a quirk of the iteration protocol that I've overlooked, or something like that. |
|||
| msg126983 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月25日 03:15 | |
Here is a patch that rewrites BZ2File to implement the requested feature, and adds some tests using BytesIO objects. Some notes: * iteration and the read*() method now use the same buffering machinery, so they can be mixed freely. The test for issue8397 has been updated accordingly. * readlines() now respects its size argument. The existing implementation appears to effectively ignore it. * writelines() no longer uses the (deprecated) old buffer protocol, and is now much simpler. * Currently, calling next() on a writable BZ2File results in a rather unhelpful error message; the patched version checks that the file is readable before trying to actually read. * The docstrings have been rewritten to clarify that all of the methods deal with bytes and not text strings. One thing I was unsure of is how to handle exceptions that occur in BZ2File_dealloc(). Does the error status need to be cleared before it returns? The documentation for the bz2 module appears to be quite out of date; I will upload a patch in the next day or so. On a related note, the 'buffering' argument to __init__() is ignored, and I was wondering whether this should be documented explicitly? The current documentation claims that it allows the caller to specify a buffer size, or request unbuffered I/O. |
|||
| msg126984 - (view) | Author: Anthony Long (antlong) | Date: 2011年01月25日 03:27 | |
Are there tests for this? |
|||
| msg126985 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月25日 03:33 | |
Yes, see bz2module-v1.diff. |
|||
| msg127006 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年01月25日 13:07 | |
Interesting! If you are motivated, a further approach would be to expose the compressor and decompressor objects from the C extension, and write the file object in Python (as in Lib/gzip.py). > One thing I was unsure of is how to handle exceptions that occur in > BZ2File_dealloc(). Does the error status need to be cleared before it > returns? Yes, it should. Actually, it would be better to write out the exception using PyErr_WriteUnraisable(). > On a related note, the 'buffering' argument to __init__() is ignored, > and I was wondering whether this should be documented explicitly? Yes, it should probably be deprecated if it's not useful anymore. By the way, the current patch produces reference leaks: $ ./python -m test -R 3:2 test_bz2 [1/1] test_bz2 beginning 5 repetitions 12345 ..... test_bz2 leaked [44, 44] references, sum=88 |
|||
| msg127008 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月25日 13:51 | |
> Interesting! If you are motivated, a further approach would be to expose > the compressor and decompressor objects from the C extension, and write > the file object in Python (as in Lib/gzip.py). I had initially considered doing something that, but I decided not to for reasons that I can't quite remember. However, in hindsight it seems like it would have been a better approach than doing everything in C. I'll start on it ASAP. >> On a related note, the 'buffering' argument to __init__() is ignored, >> and I was wondering whether this should be documented explicitly? > Yes, it should probably be deprecated if it's not useful anymore. How would I go about doing this? Would it be sufficient to raise a DeprecationWarning if the argument is provided by the caller, and add a note to the docstring and documentation? |
|||
| msg127009 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月25日 13:54 | |
* I had initially considered doing something *like* that |
|||
| msg127010 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年01月25日 14:13 | |
> How would I go about doing this? Would it be sufficient to raise a > DeprecationWarning if the argument is provided by the caller, and add > a note to the docstring and documentation? Yes, totally. |
|||
| msg127074 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月26日 00:07 | |
Here is a quick-and-dirty reimplementation of BZ2File in Python, on top of the existing C implementation of BZ2Compressor and BZ2Decompressor. There are a couple of issues with this code that need to be fixed: * BZ2Decompressor doesn't signal when it reaches the EOS marker, so doesn't seem possible to detect a premature end-of-file. This was easy in the C implementation, when using bzDecompress() directly. * The read*() methods are implemented very inefficiently. Since they have to deal with the bytes objects returned by BZ2Decompressor.decompress(), a large read results in lots of allocations that weren't necessary in the C implementation. I hope to resolve both of these issues (and do a general code cleanup), by writing a C extension module that provides a thin wrapper around bzCompress()/bzDecompress(), and reimplementing the module's public interface in Python on top of it. This should reduce the size of the code by close to half, and make it easier to read and maintain. I'm not sure when I'll be able to get around to it, though, so I thought I should post what I've done so far. Other changes in the patch: * write(), writelines() and seek() now return meaningful values instead of None, in line with the behaviour of other file-like objects. * Fixed a typo in test_bz2's testReadChunk10() that caused the test to pass regardless of whether the data read was correct (self.assertEqual(text, text) -> self.assertEqual(text, self.TEXT)). This one might be worth committing now, since it isn't dependent on the rewrite. |
|||
| msg127115 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年01月26日 13:33 | |
> * The read*() methods are implemented very inefficiently. Since they > have to deal with the bytes objects returned by > BZ2Decompressor.decompress(), a large read results in lots of > allocations that weren't necessary in the C implementation. It probably depends on the buffer size. Trying to fix this /might/ be premature optimization. Also, as with GzipFile one goal should be for BZFile to be wrappable in a io.BufferedReader, which has its own very fast buffering layer (and also a fast readline() if you implement peek() in BZFile). > * Fixed a typo in test_bz2's testReadChunk10() that caused the test to > pass regardless of whether the data read was correct > (self.assertEqual(text, text) -> self.assertEqual(text, self.TEXT)). > This one might be worth committing now, since it isn't dependent on > the rewrite. Ah, thank you. Will take a look. |
|||
| msg127120 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月26日 15:04 | |
>> * The read*() methods are implemented very inefficiently. Since they >> have to deal with the bytes objects returned by >> BZ2Decompressor.decompress(), a large read results in lots of >> allocations that weren't necessary in the C implementation. > > It probably depends on the buffer size. Trying to fix this /might/ be > premature optimization. Actually, looking at the code again (and not being half-asleep this time), I think readline() and readlines() are fine. My worry is about read(), where the problem isn't the size of the buffer but rather the fact that every byte that is read gets copied around more than necessary: * Read into the readahead buffer in _fill_readahead(). * Copy into 'data' in _read_block() * Copy into newly-allocated bytes object for read()'s return value But you're right; this is probably premature optimization. I'll do some proper performance measurements before I jump into rewriting. In the meanwhile, FWIW, I noticed that with the Python implementation, test_bz2 took 20% longer than with my C implementation (~1.5s up from ~1.25s). I don't think this is a very reliable indicator of real-world performance, though. > Also, as with GzipFile one goal should be for BZFile to be wrappable in > a io.BufferedReader, which has its own very fast buffering layer (and > also a fast readline() if you implement peek() in BZFile). Ah, OK. I suppose that is a sensible way of using it. peek() will be quite easy to implement. How should it interpret its argument, though? PEP3116 (New I/O) makes no mention of the function. BufferedReader appears to ignore it and return however much data is convenient. |
|||
| msg127551 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年01月30日 21:12 | |
OK, I've rewritten the whole bz2 module (patch attached), and I think it is now ready for review. The BZ2File implementation is a cleaned-up version of the one from my previous patch, with some further additions. I've factored out the common compressor/decompressor stuff into classes Compressor and Decompressor in the _bz2 extension module; with these, BZ2Compressor, BZ2Decompressor, compress() and decompress() are trivial to implement in Python. My earlier efficiency concerns seem to have been unfounded; I ran some quick tests with a 4MB bz2 file, and there wasn't any measurable performance difference from the existing all-C implementation. I have added a peek() method to BZ2File, in accordance with Antoine's suggestion, but it's not clear how it should interpret its argument. I followed the lead of io.BufferedReader, and simply ignored the arg, returning whatever data as is already buffered. The patch also includes tests for peek() in test_bz2, based on test_io's BufferedRWPairTest. Also, while looking at io.BufferedReader's implementation, I noticed that it doesn't actually seem to use raw.peek() at all. If this is correct, then perhaps peek() is unnecessary, and shouldn't be added. The patch also adds a property 'eof' to BZ2Decompressor, so that the user can test whether EOF has been reached on the compressed stream. For the new files (Modules/_bz2module.c and Lib/bz2.py), I'm guessing there should be some license boilerplate stuff added at the top of each. I wasn't sure exactly what this should look like, though - some advice would be helpful here. |
|||
| msg128038 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年02月06日 03:31 | |
Here's an update to the documentation for the bz2 module. |
|||
| msg128203 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年02月09日 04:55 | |
Here's a revised version of bz2-v3.diff, with docstrings that are more consistent with the updated documentation. |
|||
| msg128204 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年02月09日 04:57 | |
Weird, the patch didn't upload... |
|||
| msg130626 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月11日 21:55 | |
Patch posted for review at http://codereview.appspot.com/4274045/. Still have to do a review though :) |
|||
| msg130750 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月13日 17:32 | |
Review posted at http://codereview.appspot.com/4274045/ |
|||
| msg130751 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月13日 17:39 | |
Reviewers: nadeem vawda <nadeem.vawda_gmail.com>, http://codereview.appspot.com/4274045/diff/1/Lib/bz2.py File Lib/bz2.py (right): http://codereview.appspot.com/4274045/diff/1/Lib/bz2.py#newcode25 Lib/bz2.py:25: class BZ2File: Is there any reason it doesn't inherit io.BufferedIOBase? (it should also bring you a couple of methods implemented for free: readlines, writelines, __iter__, __next__, __enter__, __exit__) You should probably also implement fileno() (simply return self.fp.fileno()) and the `closed` property. http://codereview.appspot.com/4274045/diff/1/Lib/bz2.py#newcode386 Lib/bz2.py:386: class BZ2Compressor: I don't think there's a point in a Python wrapper, since the wrapper is so trivial. Just do the lock operations in C. Same for BZ2Decompressor. http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c File Modules/_bz2module.c (left): http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c#oldcode123 Modules/_bz2module.c:123: #ifdef WITH_THREAD As mentioned in Lib/bz2.py, I would keep the lock on the C side since it isn't significantly more complicated, and it avoids having to write a Python wrapper around the compressor and decompressor types. http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c File Modules/_bz2module.c (right): http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c#newcode3 Modules/_bz2module.c:3: #include "Python.h" Since this is a new start, perhaps we should add #define PY_SSIZE_T_CLEAN before including "Python.h"? This will ensure no code in the module will rely on the old behaviour. http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c#newcode48 Modules/_bz2module.c:48: "libbzip2 was not compiled correctly"); Just a nit, but I'm not sure there's any point in renaming "the bz2 library" to "libbzip2"? (also, under Windows I'm not sure the library naming convention is the same as under Unix) http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c#newcode78 Modules/_bz2module.c:78: "Unrecognized error from libbzip2: %d", bzerror); Out of curiousity, did you encounter this condition? http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c#newcode122 Modules/_bz2module.c:122: c->bzs.avail_out = PyBytes_GET_SIZE(result); Do note that avail_in and avail_out are 32-bit ints, and therefore this is not 64-bit clean. I guess you're just copying the old code here, but that would deserve a separate patch later. Perhaps add a comment in the meantime. http://codereview.appspot.com/4274045/diff/1/Modules/_bz2module.c#newcode209 Modules/_bz2module.c:209: "Provide a block of data to the compressor."}, You could instead re-use the old, more precise docstrings. Also, using PyDoc_STRVAR is preferred so as to make it easier to modify multi-line docstrings. http://codereview.appspot.com/4274045/diff/1/setup.py File setup.py (right): http://codereview.appspot.com/4274045/diff/1/setup.py#newcode1236 setup.py:1236: exts.append( Extension('_bz2', ['_bz2module.c'], The Windows build files probably need updating as well. Can you do it? Otherwise I'll have a try. Please review this at http://codereview.appspot.com/4274045/ Affected files: A Lib/bz2.py M Lib/test/test_bz2.py M Modules/_bz2module.c M setup.py |
|||
| msg130759 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年03月13日 20:17 | |
Thanks for the review. I'll try and have an updated patch ready by next weekend. Regarding your comments: > Is there any reason it doesn't inherit io.BufferedIOBase? No, there isn't; I'll fix that in my revised patch. > Since this is a new start, perhaps we should add > #define PY_SSIZE_T_CLEAN > before including "Python.h"? Sounds like a good idea. > Just a nit, but I'm not sure there's any point in renaming "the bz2 library" > to "libbzip2"? > (also, under Windows I'm not sure the library naming convention is the same > as under Unix) Well, the official name for the library is "libbzip2" <bzip.org>. I thought that the "lib" prefix would make it clearer that the error is referring to the that library and not _bz2module.c. But if you think it would be better not to make this change, I'll leave it out. >> Modules/_bz2module.c:78: "Unrecognized error from libbzip2: %d", bzerror); > Out of curiousity, did you encounter this condition? No, I was just programming defensively (in case the underlying library adds more error codes in future). Unlikely, but I would think it's better than taking the risk of silently ignoring an error. > Do note that avail_in and avail_out are 32-bit ints, and therefore this is > not 64-bit clean. I guess you're just copying the old code here, but that > would deserve a separate patch later. Perhaps add a comment in the meantime. Good catch. I'll make a note of it. This would only be a problem for avail_in, though. The output buffer never grows by more than BIGCHUNK (512KiB) at a time (see grow_buffer()) so there is no risk of overflowing in avail_out. > The Windows build files probably need updating as well. Can you do it? > Otherwise I'll have a try. I'll give it a try, and let you know if I can't get it to work. |
|||
| msg130811 - (view) | Author: Michiel de Hoon (mdehoon) * | Date: 2011年03月14日 13:22 | |
Would it be possible to add an open() function to the bz2 module? Currently gzip has such a function, but bz2 does not: >>> import gzip >>> gzip.open <function open at 0x781f0> >>> import bz2 >>> bz2.open Traceback (most recent call last): File "<stdin>", line 1, in ? AttributeError: 'module' object has no attribute 'open' >>> |
|||
| msg130824 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月14日 14:57 | |
> Would it be possible to add an open() function to the bz2 module? > Currently gzip has such a function, but bz2 does not: Well, it could be a topic for a separate issue. |
|||
| msg130828 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年03月14日 15:32 | |
> Would it be possible to add an open() function to the bz2 module? Yes, it would be quite trivial, though I don't think it would be worthwhile - all it would do is provide a direct alias for the BZ2File constructor. But as Antoine said, that is a topic for a separate issue. @Antoine: Regarding the use of PY_SSIZE_T_CLEAN, I assume that Py_ssize_t is to be preferred over plain ssize_t. Is this correct? Also, I was wondering whether I need to add some sort of license boilerplate to the beginning of bz2.py? With _bz2module.c, I presume I should retain the copyright information from the old bz2module.c. Would something like this be ok? /* _bz2 - Low-level Python interface to libbzip2. * * Copyright (c) 2011 Nadeem Vawda <nadeem.vawda@gmail.com> * * Based on bz2module.c: * * Copyright (c) 2002 Gustavo Niemeyer <niemeyer@conectiva.com> * Copyright (c) 2002 Python Software Foundation; All Rights Reserved */ (Browsing through the source files in Lib/ and Modules/, there doesn't seem to be a clear convention for this sort of thing...) |
|||
| msg130837 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月14日 15:58 | |
> Regarding the use of PY_SSIZE_T_CLEAN, I assume that Py_ssize_t is to be > preferred over plain ssize_t. Is this correct? Yes, ssize_t doesn't exist everywhere AFAIK. (size_t does, or at least we assume it does) > Also, I was wondering whether I need to add some sort of license boilerplate to > the beginning of bz2.py? With _bz2module.c, I presume I should retain the > copyright information from the old bz2module.c. Would something like this be ok? Well, I would personally advocate not re-adding a license boilerplate, since it doesn't serve a purpose (nearly all of Python is freely usable under the PSF License, and the authors are documented by version control). You could ask on python-dev to get other opinions, though. |
|||
| msg130908 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年03月14日 21:32 | |
> Well, I would personally advocate not re-adding a license boilerplate, > since it doesn't serve a purpose (nearly all of Python is freely usable > under the PSF License, and the authors are documented by version control). That sounds sensible to me. I'll see what the rest of python-dev thinks, though. |
|||
| msg131440 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月19日 19:43 | |
Given the absence of response on python-dev, I'd say simply remove the obsolete copyright notice. |
|||
| msg131593 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年03月21日 01:15 | |
Here is an updated patch, incorporating the feedback from your review. The new patch no longer checks for errors in bz2CompressEnd()/bz2DecompressEnd() in the dealloc functions for BZ2Compressor/BZ2Decompressor. I found that calling PyErr_WriteUnraisable() results in spurious error messages if an exception is raised by the init function, and in any case, the output would not be of much use if a genuine error were to occur. The patch adds implementations of most of the io.BufferedIOBase methods (everything except detach(), read1() and truncate()), and includes unit tests for fileno() and readinto(). |
|||
| msg131596 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年03月21日 01:18 | |
Corresponding patch for the module docs. |
|||
| msg131666 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年03月21日 15:27 | |
From the discussion on python-dev, it seems that I will need to submit a Contributor Agreement to the PSF. Can I ask that you not commit this patch until the CA has been submitted? I will need to clear it with my employer, and it might complicate things if the code in question has already been committed. |
|||
| msg131667 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月21日 15:29 | |
> >From the discussion on python-dev, it seems that I will need to submit a > Contributor Agreement to the PSF. Can I ask that you not commit this > patch until the CA has been submitted? I will need to clear it with my > employer, and it might complicate things if the code in question has > already been committed. Ok, I was planning to do another review anyway. |
|||
| msg132606 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年03月30日 19:09 | |
Nadeem, > Can I ask that you not commit this > patch until the CA has been submitted? I will need to clear it with my > employer, and it might complicate things if the code in question has > already been committed. Apparently the PSF has received your contributor agreement. Does it mean the situation is cleared? I plan to do a review of your latest patch. |
|||
| msg132609 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年03月30日 19:42 | |
> Apparently the PSF has received your contributor agreement. Great; I was just about to send them an email to check. > Does it mean the situation is cleared? I plan to do a review of your latest patch. Yes, everything's sorted out. Go ahead :) |
|||
| msg132795 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年04月02日 13:34 | |
Here is an updated patch that adds read1() to BZ2File. This should fix things for issue10791 from the bz2 side. I also took the opportunity to clean up _read_block() to be more readable. As per Martin's suggestion on python-dev, I put the copyright notice in the patch header, rather than in the code itself. |
|||
| msg132796 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年04月02日 13:38 | |
Updated documentation. |
|||
| msg132807 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年04月02日 18:28 | |
> Here is an updated patch that adds read1() to BZ2File. This should fix things > for issue10791 from the bz2 side. I also took the opportunity to clean up > _read_block() to be more readable. As per Martin's suggestion on python-dev, I > put the copyright notice in the patch header, rather than in the code itself. Thank you! A couple of comments: - please avoid C++-style comments ("// ..."), some compilers don't like them - BZ2Decompressor.eof would be better as a T_BOOL than a T_INT, IMO - BZ2Decompressor.eof should be documented as new in 3.3 - instead of using PyObject_GetBuffer(), I think it's better to call PyArg_ParseTuple with the "y*" typecode: it makes sure it does the right thing - instead of "int(size)", use "size = size.__index__()" so as to forbid floats |
|||
| msg132819 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年04月02日 23:34 | |
Thanks for the review. I've made most of the changes you suggested, but there's
one thing I wanted to check about:
> - instead of "int(size)", use "size = size.__index__()" so as to forbid floats
The tests for readline() and readlines() expect a TypeError if size is None.
Calling size.__index__() in this case raises an AttributeError instead. Should I
change the tests to expect an AttributeError? Alternatively, something like this
would more closely match the behaviour of the old code:
try:
size = size.__index__()
except AttributeError:
raise TypeError("Integer argument expected")
|
|||
| msg132820 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年04月02日 23:45 | |
> The tests for readline() and readlines() expect a TypeError if size is None.
> Calling size.__index__() in this case raises an AttributeError instead. Should I
> change the tests to expect an AttributeError? Alternatively, something like this
> would more closely match the behaviour of the old code:
>
> try:
> size = size.__index__()
> except AttributeError:
> raise TypeError("Integer argument expected")
Ah, you're right, TypeError should be raised.
|
|||
| msg132821 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年04月03日 00:14 | |
Here's the updated patch. |
|||
| msg132822 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011年04月03日 00:14 | |
... and the corresponding updated documentation patch. |
|||
| msg132840 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011年04月03日 15:09 | |
New changeset 2cb07a46f4b5 by Antoine Pitrou in branch 'default': Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept http://hg.python.org/cpython/rev/2cb07a46f4b5 |
|||
| msg132841 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年04月03日 15:11 | |
Thank you very much, Nadeem. The patch is now in. |
|||
| msg132865 - (view) | Author: Oliver Deppert (Kontr-Olli) | Date: 2011年04月03日 18:13 | |
Hi, thanks for the patch. Could you also publish a version for older python 2.x ? regards, Olli |
|||
| msg132866 - (view) | Author: Éric Araujo (eric.araujo) * (Python committer) | Date: 2011年04月03日 18:27 | |
As a new feature, this can’t go into older versions. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:56:48 | admin | set | github: 50113 |
| 2011年04月07日 13:20:27 | jcea | set | nosy:
+ jcea |
| 2011年04月03日 18:27:06 | eric.araujo | set | messages: + msg132866 |
| 2011年04月03日 18:13:06 | Kontr-Olli | set | nosy:
+ Kontr-Olli messages: + msg132865 |
| 2011年04月03日 15:11:33 | pitrou | set | status: open -> closed resolution: fixed messages: + msg132841 stage: patch review -> resolved |
| 2011年04月03日 15:09:11 | python-dev | set | nosy:
+ python-dev messages: + msg132840 |
| 2011年04月03日 00:14:52 | nadeem.vawda | set | files:
+ bz2-v6-doc.diff messages: + msg132822 |
| 2011年04月03日 00:14:11 | nadeem.vawda | set | files:
+ bz2-v6.diff messages: + msg132821 |
| 2011年04月02日 23:45:48 | pitrou | set | messages: + msg132820 |
| 2011年04月02日 23:34:21 | nadeem.vawda | set | messages: + msg132819 |
| 2011年04月02日 18:28:51 | pitrou | set | messages: + msg132807 |
| 2011年04月02日 13:38:16 | nadeem.vawda | set | files:
+ bz2-v5-doc.diff messages: + msg132796 |
| 2011年04月02日 13:35:02 | nadeem.vawda | set | files:
+ bz2-v5.diff messages: + msg132795 |
| 2011年03月30日 19:42:38 | nadeem.vawda | set | messages: + msg132609 |
| 2011年03月30日 19:09:59 | pitrou | set | messages: + msg132606 |
| 2011年03月24日 06:32:49 | asvetlov | set | nosy:
+ asvetlov |
| 2011年03月21日 15:29:39 | pitrou | set | nosy:
rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg131667 |
| 2011年03月21日 15:27:29 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg131666 |
| 2011年03月21日 01:18:44 | nadeem.vawda | set | files:
+ bz2-v4-doc.diff nosy: rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg131596 |
| 2011年03月21日 01:15:47 | nadeem.vawda | set | files:
+ bz2-v4.diff nosy: rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg131593 |
| 2011年03月19日 19:43:54 | pitrou | set | nosy:
rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg131440 |
| 2011年03月14日 21:32:32 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130908 |
| 2011年03月14日 15:58:04 | pitrou | set | nosy:
rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130837 |
| 2011年03月14日 15:32:55 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130828 |
| 2011年03月14日 14:57:30 | pitrou | set | nosy:
rhettinger, niemeyer, mdehoon, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130824 |
| 2011年03月14日 13:22:16 | mdehoon | set | nosy:
+ mdehoon messages: + msg130811 |
| 2011年03月13日 20:17:10 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130759 |
| 2011年03月13日 17:39:54 | pitrou | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130751 title: bz2.BZ2File should accept other file-like objects. -> bz2.BZ2File should accept other file-like objects. (issue4274045) |
| 2011年03月13日 17:32:49 | pitrou | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130750 |
| 2011年03月11日 21:55:38 | pitrou | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg130626 |
| 2011年02月09日 04:57:19 | nadeem.vawda | set | files:
+ bz2-v3b.diff nosy: rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg128204 |
| 2011年02月09日 04:55:39 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg128203 |
| 2011年02月06日 03:31:23 | nadeem.vawda | set | files:
+ bz2-doc.diff nosy: rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg128038 |
| 2011年01月30日 21:12:52 | nadeem.vawda | set | files:
+ bz2-v3.diff nosy: rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127551 |
| 2011年01月26日 15:04:01 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127120 |
| 2011年01月26日 13:33:06 | pitrou | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127115 |
| 2011年01月26日 00:07:18 | nadeem.vawda | set | files:
+ bz2module-v2.diff nosy: rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127074 |
| 2011年01月25日 14:13:12 | pitrou | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127010 |
| 2011年01月25日 13:54:00 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127009 |
| 2011年01月25日 13:51:00 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127008 |
| 2011年01月25日 13:07:35 | pitrou | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg127006 stage: needs patch -> patch review |
| 2011年01月25日 03:33:45 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, antlong, xuanji messages: + msg126985 |
| 2011年01月25日 03:27:18 | antlong | set | nosy:
+ antlong messages: + msg126984 |
| 2011年01月25日 03:15:40 | nadeem.vawda | set | files:
+ bz2module-v1.diff messages: + msg126983 keywords: + patch nosy: rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, xuanji |
| 2011年01月22日 00:49:16 | nadeem.vawda | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, xuanji messages: + msg126807 |
| 2011年01月13日 20:02:11 | wrobell | set | nosy:
rhettinger, niemeyer, pitrou, wrobell, nadeem.vawda, eric.araujo, MizardX, xuanji messages: + msg126196 |
| 2011年01月13日 19:53:26 | eric.araujo | set | nosy:
+ niemeyer messages: + msg126195 versions: + Python 3.3, - Python 3.2 |
| 2011年01月13日 19:50:48 | wrobell | set | nosy:
+ wrobell messages: + msg126194 |
| 2010年12月14日 13:38:34 | pitrou | set | messages: + msg123937 |
| 2010年12月14日 11:28:40 | xuanji | set | messages: + msg123929 |
| 2010年12月04日 15:29:36 | xuanji | set | nosy:
+ xuanji messages: + msg123362 |
| 2010年11月29日 18:52:13 | nadeem.vawda | set | nosy:
+ nadeem.vawda |
| 2010年11月29日 18:51:57 | rhettinger | set | nosy:
+ rhettinger messages: + msg122849 |
| 2010年11月29日 18:42:59 | pitrou | set | nosy:
+ pitrou messages: + msg122845 |
| 2010年11月29日 13:44:32 | eric.araujo | set | messages: + msg122797 |
| 2010年11月29日 11:43:47 | MizardX | set | messages: + msg122790 |
| 2010年11月28日 05:01:43 | eric.araujo | set | nosy:
+ eric.araujo messages: + msg122630 components: + Extension Modules, - Library (Lib), IO stage: test needed -> needs patch |
| 2010年07月10日 11:47:01 | BreamoreBoy | set | stage: test needed versions: + Python 3.2 |
| 2009年04月28日 05:57:42 | MizardX | create | |