homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: find() slower than rfind()
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: flox, pitrou, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2011年10月07日 19:46 by pitrou, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
findperf.patch pitrou, 2011年10月07日 19:46 review
Messages (7)
msg145136 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年10月07日 19:46
With some gcc versions, str.find() is slower than str.rfind():
 - 	11.22	0.0	s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
 - 	4.56	0.0	s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
 - 	6.71	0.0	s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
 - 	11.22	0.0	s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
 - 	11.52	0.0	s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
 - 	8.79	0.0	s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
 - 	3.86	0.0	s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
 - 	8.80	0.0	s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
 - 	9.80	0.0	s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
 - 	9.83	0.0	s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
 - 	11.56	0.0	s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
Attached patch seems to fix it.
msg157846 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年04月09日 14:29
I checked one example on a 32-bit system (you have a 64-bit?)), because I was afraid pessimization because of a lack of registers. str.find() is faster than str.rfind(), but the patch makes it even faster.
But I would like to see the script and the results of benchmarking of the 1/2/3/20-character ascii/ucs1/ucs2/ucs4-substring in ascii/ucs1/ucs2/ucs4-string, in all possible combinations. May be, such benchmark scripts already exist?
msg157849 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年04月09日 15:16
> But I would like to see the script and the results of benchmarking of
> the 1/2/3/20-character ascii/ucs1/ucs2/ucs4-substring in ascii/ucs1
> /ucs2/ucs4-string, in all possible combinations. May be, such benchmark 
> scripts already exist?
stringbench (the tool which produced those results) now exists in Tools/stringbench/stringbench.py.
msg157853 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年04月09日 15:44
> stringbench (the tool which produced those results) now exists in Tools/stringbench/stringbench.py.
Thank you, yesterday they were not.
msg157872 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年04月09日 17:57
I used stringbench and self-writen script (see issue13165) for comparison and saw no convincing difference. The difference to str.find does not exceed accidental deviations for other functions which are not affected by the patch. Apparently, the accuracy of stringbench is not enough for a reliable measurement.
========== late match, 100 characters
 +8.6	 +8.8	s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
 +0.1	 +0.2	s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
 +7.9	 +7.4	s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
 +7.2	 +7.3	s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
 +8.0	 +7.9	s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
 -4.3	 -4.3	s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
 -4.9	 -6.9	s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
 -3.0	 -3.0	s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
 -3.7	 -4.2	s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
 -4.0	 -2.6	s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
 +28.0	 +6.5	s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
msg186251 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年04月07日 22:05
I still see a difference between find and rfind, even if the different is low (11%).
$ ./python -m timeit -s 's="ABC"*33; a=((s+"D")*500+s+"E"); b=s+"E"' 'a.find(b)'
10000 loops, best of 3: 93.6 usec per loop
$ ./python -m timeit -s 's="ABC"*33; a=("E"+s+("D"+s)*500); b="E"+s' 'a.rfind(b)'
10000 loops, best of 3: 84.3 usec per loop
Patched Python:
$ ./python -m timeit -s 's="ABC"*33; a=((s+"D")*500+s+"E"); b=s+"E"' 'a.find(b)'
10000 loops, best of 3: 84.7 usec per loop
$ ./python -m timeit -s 's="ABC"*33; a=("E"+s+("D"+s)*500); b="E"+s' 'a.rfind(b)'
10000 loops, best of 3: 84.5 usec per loop
I'm unable to explain why GCC (4.7 in my case) produces faster code with the patch, but the patch is simple and does not make the code (much) harder to understand.
So Antoine, please go ahead and apply it.
msg186253 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013年04月07日 22:27
New changeset c5e2ea9e3aa7 by Victor Stinner in branch 'default':
Close #13126: "Simplify" FASTSEARCH() code to help the compiler to emit more
http://hg.python.org/cpython/rev/c5e2ea9e3aa7 
History
Date User Action Args
2022年04月11日 14:57:22adminsetgithub: 57335
2013年04月07日 22:27:13python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg186253

resolution: fixed
stage: patch review -> resolved
2013年04月07日 22:05:29vstinnersetmessages: + msg186251
versions: + Python 3.4, - Python 3.3
2012年04月09日 17:57:15serhiy.storchakasetmessages: + msg157872
2012年04月09日 15:44:49serhiy.storchakasetmessages: + msg157853
2012年04月09日 15:16:21pitrousetmessages: + msg157849
2012年04月09日 14:29:25serhiy.storchakasetmessages: + msg157846
2012年04月07日 17:25:48serhiy.storchakasetnosy: + serhiy.storchaka
2011年11月16日 16:41:28floxsetpriority: low -> normal
stage: patch review
2011年10月09日 21:09:13floxsetnosy: + flox
2011年10月07日 19:46:35pitroucreate

AltStyle によって変換されたページ (->オリジナル) /