This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2012年06月22日 14:59 by gvanrossum, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| base64.patch | gvanrossum, 2012年06月22日 16:55 | review | ||
| base64.patch | gvanrossum, 2012年06月22日 17:19 | review | ||
| base64_27.diff | Devin Jeanpierre, 2015年05月30日 18:56 | |||
| Messages (18) | |||
|---|---|---|---|
| msg163420 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2012年06月22日 14:59 | |
This bug is still present in Python 3.3. Some people use urlsafe strings *a lot*. And they find that the (obvious) implementation in base64.py is way too slow. In fact, I found myself writing the following a few times in App Engine libraries:
# This is 3-4x faster than urlsafe_b64decode()
urlsafe = base64.b64encode(<arg>)
return urlsafe.rstrip('=').replace('+', '-').replace('/', '_')
Someone from YouTube told me they patch the stdlib for the same reason (they claim their solution is 10x faster).
IIUC the cause of the slowness is the (mostly irrelevant) generality in b64encode and the _translate() function.
|
|||
| msg163422 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年06月22日 15:32 | |
This is a feature request, not a bug. Marking for 3.3. |
|||
| msg163424 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年06月22日 15:42 | |
Guido, would your patch be enough?. Is a 3x improvement enough?. What improvement could be acceptable?. Implemented in C in "binascii" module? |
|||
| msg163430 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2012年06月22日 16:28 | |
Why would it be a feature? There is no change in semantics, only in speed. Do we have a policy that forbids performance enhancements in bugfix releases? (It doesn't really matter, I'm just curious. Also, if it really was a feature request, it couldn't be introduced once the 3.3 beta has started, which would strike me as silly.) Anyway, I don't think my hack should be applied literally; I was more thinking of memoizing the translation table and inlining the _translate call. |
|||
| msg163432 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年06月22日 16:38 | |
> Do we have a policy that forbids performance enhancements in bugfix > releases? Yes, except when it fixes a regression. I don't know where that policy is codified, but it seems quite ingrained. It also means the window for getting it in 3.3 is getting narrow, the beta being so soon :) |
|||
| msg163435 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2012年06月22日 16:55 | |
Here's a patch. It makes urlsafe_b64{en,de}code about 4x faster on my machine. The tests in base64_test.py still pass.
|
|||
| msg163439 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年06月22日 17:11 | |
$ ./python -m timeit -s "from base64 import urlsafe_b64encode; url=bytes(range(256))" "urlsafe_b64encode(url)" 10000 loops, best of 3: 73.5 usec per loop $ ./python -m timeit -s "from base64 import b64encode; url=bytes(range(256))" "b64encode(url).replace(b'+', b'-').replace(b'/', b'_')" 100000 loops, best of 3: 18.9 usec per loop $ ./python -m timeit -s "from base64 import b64encode; url=bytes(range(256)); tr=bytes.maketrans(b'+/', b'-_')" "b64encode(url).translate(tr)" 100000 loops, best of 3: 15.6 usec per loop |
|||
| msg163440 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2012年06月22日 17:19 | |
Heh. I'd forgotten about maketrans. Here's a patch that kills the _translate() helper function completely. |
|||
| msg163441 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年06月22日 17:26 | |
You're faster than me. I think that asserts should be replaced by if/raise. |
|||
| msg163443 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2012年06月22日 17:28 | |
I didn't touch that assert. |
|||
| msg163475 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年06月22日 20:07 | |
Antoine thinks so: http://bugs.python.org/issue15139 :-) I think we should have an official policy, though. |
|||
| msg163494 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2012年06月22日 21:38 | |
So, can I check this into 3.3? |
|||
| msg163496 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年06月22日 21:52 | |
I haven't tested the patch, but it looks fine to me. |
|||
| msg163499 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2012年06月22日 22:19 | |
Submitted. |
|||
| msg244485 - (view) | Author: Devin Jeanpierre (Devin Jeanpierre) * | Date: 2015年05月30日 18:56 | |
Here's a backport of the patch to 2.7. It's pretty rad, and basically identical to how YouTube monkeypatches base64. Not sure what will happen to this patch. According to recent discussion on the list (e.g. https://mail.python.org/pipermail/python-dev/2015-May/140380.html ), performance improvements are open for inclusion in 2.7 if anyone wants to bother with merging this in and taking on the review / maintenance burden. I'm OK with just publishing it for others to merge in with their own private versions of Python. It is only relevant if you use base64 a lot. :) |
|||
| msg247672 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2015年07月30日 14:26 | |
I'm going to apply the 2.7 patch, except I'm going to leave the (now unused) _translate() function in place, in case it's used anywhere (minimizing the risk of breaking anything). |
|||
| msg247675 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年07月30日 14:50 | |
New changeset 309cbbc32491 by Guido van Rossum in branch '2.7': Issue #15138: Speed up base64.urlsafe_b64* considerably (2.7 backport). https://hg.python.org/cpython/rev/309cbbc32491 |
|||
| msg247677 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2015年07月30日 14:58 | |
And just in case someone objects to my breaking the policy stated by Antoine in msg163432: see http://bugs.python.org/issue4753 -- it's not actually a real policy. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:31 | admin | set | github: 59343 |
| 2015年07月30日 14:58:04 | gvanrossum | set | messages:
+ msg247677 versions: + Python 2.7 |
| 2015年07月30日 14:50:45 | python-dev | set | nosy:
+ python-dev messages: + msg247675 |
| 2015年07月30日 14:26:56 | gvanrossum | set | messages: + msg247672 |
| 2015年05月30日 18:56:12 | Devin Jeanpierre | set | files:
+ base64_27.diff nosy: + Devin Jeanpierre messages: + msg244485 |
| 2012年06月22日 22:19:47 | gvanrossum | set | status: open -> closed resolution: fixed messages: + msg163499 |
| 2012年06月22日 21:52:23 | pitrou | set | messages:
+ msg163496 versions: - Python 3.4 |
| 2012年06月22日 21:39:43 | gvanrossum | set | assignee: gvanrossum stage: needs patch -> patch review |
| 2012年06月22日 21:38:38 | gvanrossum | set | messages: + msg163494 |
| 2012年06月22日 20:07:26 | jcea | set | messages: + msg163475 |
| 2012年06月22日 17:28:38 | gvanrossum | set | messages: + msg163443 |
| 2012年06月22日 17:26:01 | serhiy.storchaka | set | messages: + msg163441 |
| 2012年06月22日 17:19:43 | gvanrossum | set | files:
+ base64.patch messages: + msg163440 |
| 2012年06月22日 17:11:05 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg163439 |
| 2012年06月22日 16:55:23 | gvanrossum | set | files:
+ base64.patch keywords: + patch messages: + msg163435 |
| 2012年06月22日 16:38:01 | pitrou | set | nosy:
+ pitrou messages: + msg163432 |
| 2012年06月22日 16:28:25 | gvanrossum | set | messages: + msg163430 |
| 2012年06月22日 15:42:01 | jcea | set | messages: + msg163424 |
| 2012年06月22日 15:32:06 | jcea | set | nosy:
+ jcea messages: + msg163422 versions: - Python 2.6, Python 3.1, Python 2.7, Python 3.2 |
| 2012年06月22日 14:59:59 | gvanrossum | create | |