This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2012年12月23日 16:50 by serhiy.storchaka, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| unicode_findmaxchar.patch | serhiy.storchaka, 2012年12月23日 16:50 | review | ||
| format_bench.sh | serhiy.storchaka, 2012年12月23日 16:51 | Benchmark script | ||
| unicode_findmaxchar_2.patch | serhiy.storchaka, 2012年12月24日 20:53 | review | ||
| Messages (9) | |||
|---|---|---|---|
| msg177997 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月23日 16:50 | |
The proposed patch optimizes _PyUnicode_FindMaxChar(). This affects string formatting of long patterns (speedup to 15-25% for classic formatting and 5-8% for new style formatting). |
|||
| msg178066 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2012年12月24日 15:59 | |
See issues #14687 and #14716 for benchmarks and information about the changes introducing and using _PyUnicodeWriter for str%args and str.format(args). |
|||
| msg178067 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2012年12月24日 16:03 | |
Avoid scanning a substring to compute the maximum character is a good thing, so +1 for the idea. Instead of modifying an existing function, it might be safer to rename it. Even if it is private, a third party module *can* use it outside Python. It is also surprising that you have to specifiy a "maxchar" argument whereas the purpose of the function is to compute the maximum character. Ex: rename _PyUnicode_FindMaxChar() to _PyUnicode_FindMaxChar2(), and add _PyUnicode_FindMaxChar() as the macro: #define _PyUnicode_FindMaxChar(str, start, length) _PyUnicode_FindMaxChar2(str, start, length, 127) |
|||
| msg178076 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月24日 16:48 | |
It is not more surprising that specify a "start" argument for sum(). I don't think we should worry about third party module which violate the API. It will crash in any case when the exported function will disappear and will be replaced by macros. |
|||
| msg178081 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2012年12月24日 17:45 | |
> It is not more surprising that specify a "start" argument for sum(). Un pythin, the start attribute of sum is optional. > I don't think we should worry about third party module which violate the API. It will crash in any case when the exported function will disappear and will be replaced by macros. The compilation will fail, whereas it will crash with your patch. |
|||
| msg178099 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月24日 20:53 | |
I think it is redundant and useless, but do as you want. |
|||
| msg185866 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013年04月03日 00:02 | |
Related commit: changeset: 83066:b5d5f422299f tag: tip user: Victor Stinner <victor.stinner@gmail.com> date: Wed Apr 03 01:48:39 2013 +0200 files: Include/unicodeobject.h Lib/test/test_format.py Objects/stringlib/unicode_format.h Objects/unicodeobject.c description: Add _PyUnicodeWriter_WriteSubstring() function Write a function to enable more optimizations: * If the substring is the whole string and overallocation is disabled, just keep a reference to the string, don't copy characters * Avoid a call to the expensive _PyUnicode_FindMaxChar() function when possible |
|||
| msg185867 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2013年04月03日 00:03 | |
New changeset 7132bca093ad by Victor Stinner in branch 'default': Close #16757: Avoid calling the expensive _PyUnicode_FindMaxChar() function http://hg.python.org/cpython/rev/7132bca093ad |
|||
| msg185868 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013年04月03日 00:06 | |
> New changeset 7132bca093ad by Victor Stinner in branch 'default': > Close #16757: Avoid calling the expensive _PyUnicode_FindMaxChar() function This changeset is almost the same than unicode_findmaxchar_2.patch. I prefered to keep the API unchanged and not call _PyUnicode_FindMaxChar() instead of adding a test in the function to exit it. There is just a minor difference in Python/formatter_unicode.c: the test for _PyUnicode_FindMaxChar() is done after the test on format->fill_char (which should avoid a call to for _PyUnicode_FindMaxChar() if fill_char is wider than the actual buffer). Thanks Serhiy for your great idea! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:39 | admin | set | github: 60961 |
| 2013年04月03日 00:06:36 | vstinner | set | messages: + msg185868 |
| 2013年04月03日 00:03:49 | python-dev | set | status: open -> closed nosy: + python-dev messages: + msg185867 resolution: fixed stage: patch review -> resolved |
| 2013年04月03日 00:02:52 | vstinner | set | messages: + msg185866 |
| 2012年12月24日 20:53:38 | serhiy.storchaka | set | files:
+ unicode_findmaxchar_2.patch messages: + msg178099 |
| 2012年12月24日 17:45:10 | vstinner | set | messages: + msg178081 |
| 2012年12月24日 16:48:09 | serhiy.storchaka | set | messages: + msg178076 |
| 2012年12月24日 16:03:58 | vstinner | set | messages: + msg178067 |
| 2012年12月24日 15:59:04 | vstinner | set | messages: + msg178066 |
| 2012年12月23日 16:51:11 | serhiy.storchaka | set | files: + format_bench.sh |
| 2012年12月23日 16:50:35 | serhiy.storchaka | create | |