Package: grep;
Reported by: Martin Raszyk <m.raszyk <at> gmail.com>
Date: 2019年12月20日 15:33:02 UTC
Severity: normal
To reply to this bug, email your comments to 38690 AT debbugs.gnu.org.
the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-grep <at> gnu.org:bug#38690; Package grep.
(2019年12月20日 15:33:02 GMT) Full text and rfc822 format available.Martin Raszyk <m.raszyk <at> gmail.com>:bug-grep <at> gnu.org.
(2019年12月20日 15:33:02 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Martin Raszyk <m.raszyk <at> gmail.com> To: bug-grep <at> gnu.org Subject: grep -P quadratic on long lines Date: 2019年12月20日 10:54:14 +0100
Dear grep maintainers, I've realized that grep -P takes quadratic time on long lines with many short matches. For instance, executing './src/grep -P -o "a" a.txt > a.out' on a file 'a.txt' consisting of N characters 'a' takes time quadratic in N. I've used grep-3.3 and pcre-8.43 for the benchmarks. The root causes for this behavior are as follows: 1. in src/pcresearch.c on l. 222 (at commit cf09252295c554dd3eba4cdb8eb53911fb495f40), the end of the line is searched each time a new match is searched; this already results in quadratic runtime in the above mentioned case 2. the function 'pcre_exec' from pcre-8.43 called in src/pcresearch.c on l. 71 for each match checks if the provided string is valid UTF-8 (code implemented in pcre_valid_utf8.c); this also results in quadratic runtime On your side, it is possible to fix the first root cause. I'll post an e-mail to PCRE mailing list about the second root cause. Best regards, Martin
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.