This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2008年12月24日 16:29 by ebfe, last changed 2022年04月11日 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| zlib_threads-3.diff | ebfe, 2009年01月02日 10:38 | |||
| zlibtest2.py | ebfe, 2009年01月02日 10:39 | |||
| Messages (15) | |||
|---|---|---|---|
| msg78266 - (view) | Author: Lukas Lueg (ebfe) | Date: 2008年12月24日 16:29 | |
My application needs to pack and unpack workunits all day long and does this using multiple threading.Threads. I've noticed that the zlib module seems to use only one thread at a time when using [de]compressobj(). As the comment in the sourcefile zlibmodule.c already says the module uses a global lock to protect different threads from accessing the object. While the c-functions release the GIL while waiting for the global lock, only one thread at a time can use zlib. My app ends up using only one CPU to compress/decompress it's workunits... The patch (svn diff to ) attached here fixes this problem by extending the compressobj-structure by an additional member to create object-specific locks and removes the global lock. The lock protects each compressobj individually and allows multiple python threads to use zlib in parallel, utilizing all available CPUs. |
|||
| msg78282 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2008年12月25日 23:14 | |
A call to PyThread_free_lock(this->lock) in Comp_dealloc() and Decomp_dealloc(). Comp_dealloc() and Decomp_dealloc() code may also be factorized (write a common function to free unused_data, unconsumed_tail and self). |
|||
| msg78287 - (view) | Author: Lukas Lueg (ebfe) | Date: 2008年12月26日 00:42 | |
new svn diff |
|||
| msg78289 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2008年12月26日 01:30 | |
Ok, the patch looks fine and I like finer locks ;-) |
|||
| msg78318 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2008年12月26日 23:34 | |
New comments about the last patch: - GIL is not released for adler() or crc32() whereas these functions may be slow for long strings: just add Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS before / after adler(...) and crc32(...) - (As ENTER_HASHLIB, issue #4751) I think that Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS are useless in ENTER_ZLIB - You might add explicit self to ENTER/LEAVE_ZLIB because the macros are now dependent of self (and not the whole module) => ENTER_ZLIB(self) and LEAVE_ZLIB(self) Are deflateCopy() and inflateCopy() slow enough to release the GIL? |
|||
| msg78326 - (view) | Author: Lukas Lueg (ebfe) | Date: 2008年12月27日 01:15 | |
new svn diff attached - GIL is now released for adler32 and crc32 if the buffer is larger than 5kb (we don't want to risk burning cpu cycles by GIL-stuff) - adler32 got it's param by s# but now does s* - why s# anyway? - ENTER_ZLIB no longer gives away the GIL. It's dangerous and useless as there is no pressure on the object's lock. - deflateCopy() and inflateCopy() are not worth the trouble.u |
|||
| msg78329 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2008年12月27日 01:38 | |
Comments on zlib_threads-2.diff: - the indentation is strange: don't mix spaces and tabs! - I prefer ";" after a call to a macro: "ENTER_ZLIB(self);" instead of "ENTER_ZLIB(self)". It makes vim happy (auto indent code correctly) and it works for ENTER_ZLIB and LEAVER_ZLIB. - ENTER_ZLIB and LEAVER_ZLIB prototype is wrong if WITH_THREAD is not defined - oh yeah, s* is needed to protect the buffer with a lock - why 5kb? is it a random value? I prefer power of two, like 4096 bytes :-) |
|||
| msg78331 - (view) | Author: Lukas Lueg (ebfe) | Date: 2008年12月27日 02:02 | |
new svn diff attached
the indentation in this file is not my fault, it has tabs all over it...
The 5kb limits protects from the overhead of releasing the GIL. With
very small buffers the overall runtime in my benchmark tends to double.
I set it based on my testing and it remains being arbitrary to a certain
degree. Set the limit to 1 and try 1.000.000 times b'abc'...
May I also suggest to change the zlib module not to accept s* but y*:
- Internally zlib operates on bytes, characters don't mean a thing in
zlib-land.
- We rely on s* performing the encoding into default for us. This
behaviour is hidden from the programmer and somewhat violates the rule
of least surprise.
- type(zlib.decompress(zlib.compress('abc'))) == bytes
- Changing from s* to y* forces the programmer to use .encode() on his
strings (e.g. zlib.compress('abc'.encode()) which very clearly shows
what's happening.
|
|||
| msg78350 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2008年12月27日 10:38 | |
> May I also suggest to change the zlib module not to accept s* but y* You are probably right, but this would also break the API and can't be done lightheartedly. You can open a new bug entry about this though. |
|||
| msg78359 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2008年12月27日 13:01 | |
I opened a separate issue for the unicode problem: #4757. |
|||
| msg78669 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009年01月01日 00:07 | |
Same comment about potential deadlocks as in #4751. |
|||
| msg78771 - (view) | Author: Lukas Lueg (ebfe) | Date: 2009年01月02日 10:38 | |
Here is a small test-script with concurrent access to a single compressosbj. The original patch will immediately deadlock. The patch attached releases the GIL before trying to get the zlib-lock. This allows the other thread to release the zlib-lock but comes at the cost of one additional GIL lock/unlock. |
|||
| msg78772 - (view) | Author: Lukas Lueg (ebfe) | Date: 2009年01月02日 10:39 | |
test-script |
|||
| msg78849 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009年01月02日 17:35 | |
Checked in r68165. Thanks! |
|||
| msg78850 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2009年01月02日 17:49 | |
@pitrou: You added "Also, the GIL is now released when computing the CRC of a large buffer." in the NEWS. The GIL released for crc32 but also for adler32. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:56:43 | admin | set | github: 48988 |
| 2009年01月02日 17:49:52 | vstinner | set | messages: + msg78850 |
| 2009年01月02日 17:35:18 | pitrou | set | status: open -> closed resolution: fixed messages: + msg78849 |
| 2009年01月02日 10:39:36 | ebfe | set | files:
+ zlibtest2.py messages: + msg78772 |
| 2009年01月02日 10:38:17 | ebfe | set | files:
+ zlib_threads-3.diff messages: + msg78771 |
| 2009年01月02日 10:33:46 | ebfe | set | files: - zlib_threads-2.diff |
| 2009年01月01日 00:07:07 | pitrou | set | messages: + msg78669 |
| 2008年12月27日 13:01:01 | vstinner | set | messages: + msg78359 |
| 2008年12月27日 10:38:39 | pitrou | set | messages: + msg78350 |
| 2008年12月27日 02:03:02 | ebfe | set | files: - zlib_threads-2.diff |
| 2008年12月27日 02:02:45 | ebfe | set | files:
+ zlib_threads-2.diff messages: + msg78331 |
| 2008年12月27日 01:38:06 | vstinner | set | messages: + msg78329 |
| 2008年12月27日 01:15:30 | ebfe | set | files: - zlib_threads.diff |
| 2008年12月27日 01:15:22 | ebfe | set | files:
+ zlib_threads-2.diff messages: + msg78326 |
| 2008年12月26日 23:34:43 | vstinner | set | messages: + msg78318 |
| 2008年12月26日 22:28:57 | pitrou | set | priority: normal nosy: + pitrou stage: patch review versions: - Python 2.6, Python 2.5, Python 3.0 |
| 2008年12月26日 01:30:24 | vstinner | set | messages: + msg78289 |
| 2008年12月26日 01:29:19 | vstinner | set | files: - zlib_threads.diff |
| 2008年12月26日 00:42:35 | ebfe | set | files:
+ zlib_threads.diff messages: + msg78287 |
| 2008年12月25日 23:14:30 | vstinner | set | nosy:
+ vstinner messages: + msg78282 |
| 2008年12月24日 16:29:30 | ebfe | create | |