Package: grep
Reported by: starlight.2014q3 <at> binnacle.cx
Date: Thu, 4 Sep 2014 21:39:01 UTC
Severity: wishlist
Tags: patch
To reply to this bug, email your comments to 18406 AT debbugs.gnu.org.

Message #5 received at submit <at> debbugs.gnu.org:
From: starlight.2014q3 <at> binnacle.cx
To: bug-grep <at> gnu.org
Subject: O_NOATIME patch
Date: Thu, 4 Sep 2014 16:46:27 -0400

Wrote a quick (but clean) patch to have O_NOATIME applied on file opens. I find this handy for bulk find/grep when I'd prefer not to update atime.

If the patch is of interest I'm willing to improve it by having the feature present conditionally on the appearance of HAVE_WORKING_O_NOATIME in 'config.h'.

Perhaps my choice of -N or --no-atime is not agreeable to all. Possibly the option does not belong under "miscellaneous." Easy to change.

[grep-noatime.patch (application/octet-stream, attachment)]
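
The attached patch is not reproduced on this page. Purely as an illustration of what "O_NOATIME applied on file opens" means, here is a minimal C sketch, assuming Linux with glibc. The helper name open_preserving_atime is hypothetical and is not taken from the patch. It relies on the documented Linux behavior that open(2) fails with EPERM when O_NOATIME is requested by a caller who neither owns the file nor has CAP_FOWNER, and falls back to a plain open in that case.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* glibc exposes O_NOATIME only with this defined */
#endif
#include <errno.h>
#include <fcntl.h>

#ifndef O_NOATIME
# define O_NOATIME 0   /* platforms without the flag: behave like a plain open */
#endif

/* Hypothetical helper, not taken from the attached patch.
   Open PATH read-only, asking the kernel not to update its atime.
   Linux rejects O_NOATIME with EPERM unless the caller owns the file
   (or has CAP_FOWNER), so retry without the flag in that case.  */
int
open_preserving_atime (const char *path)
{
  int fd = open (path, O_RDONLY | O_NOATIME);
  if (fd < 0 && errno == EPERM && O_NOATIME != 0)
    fd = open (path, O_RDONLY);
  return fd;
}
```

The fallback matters for the find/grep use case: a tree being searched often contains files owned by other users, and a search that failed on those would be worse than one that updates their atime.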

Message #12 received at 18406 <at> debbugs.gnu.org:
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: starlight.2014q3 <at> binnacle.cx, 18406 <at> debbugs.gnu.org
Subject: Re: bug#18406: O_NOATIME patch
Date: Thu, 11 Sep 2014 13:13:54 -0700

> If the patch is of interest I'm willing
> to improve it by having the feature
> present conditionally on the appearance of
> HAVE_WORKING_O_NOATIME
> in 'config.h'.

Thanks, but there's no need for that; just have 'grep' complain if the option is used and O_NOATIME == 0.

I'm of two minds about this suggestion. On the one hand we don't want to add an option like this to every utility that reads files. On the other hand grep is used soooo often that it may be justifiable. What do other people think?

If we add it, it should not have a single-letter option, though, and the long option should be called "--atime-preserve" for compatibility with tar, and the patch should also use FTS_NOATIME to avoid updating atime on directories with grep -r, and it should be documented properly in grep.texi and in 'grep --help' output and in NEWS (plus maybe write a test case or two....).
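
The "complain if the option is used and O_NOATIME == 0" suggestion could be sketched roughly as follows. This is only an illustration, assuming that a platform lacking the flag is modeled by defining O_NOATIME as 0; the helper name atime_preserve_flags and the wording of the diagnostic are hypothetical and do not come from grep's source.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* needed for O_NOATIME in glibc's <fcntl.h> */
#endif
#include <fcntl.h>
#include <stdio.h>

#ifndef O_NOATIME
# define O_NOATIME 0   /* flag unavailable on this platform */
#endif

/* Hypothetical helper. Return the extra open(2) flags for the option:
   0 if the option was not requested, O_NOATIME if it was requested
   and is available, and -1 (after printing a diagnostic) if it was
   requested but O_NOATIME == 0.  */
int
atime_preserve_flags (int requested)
{
  if (!requested)
    return 0;
  if (O_NOATIME == 0)
    {
      /* Hypothetical message text.  */
      fputs ("grep: --atime-preserve is not supported on this platform\n",
             stderr);
      return -1;
    }
  return O_NOATIME;
}
```

No configure-time probe such as HAVE_WORKING_O_NOATIME is needed with this approach, since the check is a compile-time constant comparison.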

Message #15 received at 18406 <at> debbugs.gnu.org:
From: Eric Blake <eblake <at> redhat.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, starlight.2014q3 <at> binnacle.cx, 18406 <at> debbugs.gnu.org
Subject: Re: bug#18406: O_NOATIME patch
Date: Thu, 11 Sep 2014 14:50:10 -0600

On 09/11/2014 02:13 PM, Paul Eggert wrote:
>> If the patch is of interest I'm willing
>> to improve it by having the feature
>> present conditionally on the appearance of
>> HAVE_WORKING_O_NOATIME
>> in 'config.h'.
>
> Thanks, but there's no need for that; just have 'grep' complain if the
> option is used and O_NOATIME == 0.
>
> I'm of two minds about this suggestion. On the one hand we don't want
> to add an option like this to every utility that reads files. On the
> other hand grep is used soooo often that it may be justifiable. What do
> other people think?
>
> If we add it, it should not have a single-letter option, though, and the
> long option should be called "--atime-preserve" for compatibility with
> tar, and the patch should also use FTS_NOATIME to avoid updating atime
> on directories with grep -r, and it should be documented properly in
> grep.texi and in 'grep --help' output and in NEWS (plus maybe write a
> test case or two....).

Lots of work, but I like the idea. In fact, I proposed a similar idea for coreutils' du several years ago, and the only reason I haven't actually submitted a patch is _because_ it is a lot of this detail work.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Message #18 received at 18406 <at> debbugs.gnu.org:
From: starlight.2014q3 <at> binnacle.cx
To: Paul Eggert <eggert <at> cs.ucla.edu>, Blake <eblake <at> redhat.com>, 18406 <at> debbugs.gnu.org
Subject: Re: bug#18406: O_NOATIME patch
Date: Thu, 11 Sep 2014 17:43:52 -0400

At 13:13 9/11/2014 -0700, Paul Eggert wrote:
>> If the patch is of interest I'm willing
>> to improve it by having the feature
>> present conditionally on the appearance of
>> HAVE_WORKING_O_NOATIME
>> in 'config.h'.
>
>Thanks, but there's no need for that; just
>have 'grep' complain if the option is used
>and O_NOATIME == 0.

Sure, that is a better approach.

>I'm of two minds about this suggestion. On
>the one hand we don't want to add an option
>like this to every utility that reads
>files. On the other hand grep is used soooo
>often that it may be justifiable. What do
>other people think?

I don't feel a compulsion to utilize O_NOATIME all over the place--it really seems like a specific use case to me, namely one where one cares about and refers to ATIME with something like 'ls -otru' and frequently runs

   find * -type f -print | xargs egrep somestring

while trying to find code fragments. For years it has annoyed me that the 'find/grep' nukes ATIME values for the entire tree one is working on.

>If we add it, it should not have a
>single-letter option, though, and the long
>option should be called "--atime-preserve"
>for compatibility with tar,

To avoid semantic confusion, I specifically chose not to use the 'tar' option, because it has two variants: a) =replace and b) =system. The 'tar' default is --atime-preserve=replace, which nukes CTIME values and is evil in my opinion. But I don't feel strongly about the naming choice here.

However I do feel strongly that a short option should be defined. If one embraces the feature it would be typed often.

>and the patch
>should also use FTS_NOATIME to avoid updating
>atime on directories with grep -r, and it
>should be documented properly in grep.texi
>and in 'grep --help' output and in NEWS (plus
>maybe write a test case or two....).

I can do the above if a decision is taken to adopt the feature.

Message #21 received at 18406 <at> debbugs.gnu.org:
From: starlight.2014q3 <at> binnacle.cx
To: Paul Eggert <eggert <at> cs.ucla.edu>, Blake <eblake <at> redhat.com>, 18406 <at> debbugs.gnu.org
Subject: Re: bug#18406: O_NOATIME patch
Date: Thu, 11 Sep 2014 18:08:38 -0400

Another argument in favor of adding O_NOATIME support to a limited set of utilities (just 'grep' IMO) is that recent 'ext4' file system behavior defaults to a mode where ATIME is updated only once relative to a given MTIME. For O_NOATIME to be of use, one must care enough about ATIME to add 'strictatime' to the mount options in /etc/fstab. I find this matters to me only in the context of code development where seeing what has been recently viewed or compiled (mainly by me) is of interest. It's ok if ATIME is nuked wholesale on occasion since the usefulness of the value is short.

Message #24 received at 18406 <at> debbugs.gnu.org:
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: starlight.2014q3 <at> binnacle.cx
Cc: Eric Blake <eblake <at> redhat.com>, 18406 <at> debbugs.gnu.org
Subject: Re: bug#18406: O_NOATIME patch
Date: Mon, 31 Aug 2020 19:21:06 -0700

On 9/11/14 1:13 PM, Paul Eggert wrote:
> Thanks, but there's no need for that; just have 'grep' complain if the option is
> used and O_NOATIME == 0.

On looking into this more today, O_NOATIME seems to be just a best-effort thing as some GNU/Linux filesystems ignore it, so grep should just join the throng and not worry whether O_NOATIME actually works.

Also, the O_NOATIME support was withdrawn from fts a couple of years ago, so 'grep -r' can't easily avoid updating atime on directories.

A patch is attached. I'm still of two minds about this. The efficiency argument for the new option is not as strong as it used to be, now that relatime has taken over on ext4 style filesystems. So the main argument is "I want to search through this directory but don't want it to count as an access"; although that's indeed a use case I'm not quite sure it's worth modifying 'grep' over. It doesn't seem to be worth using up a scarce option letter over, anyway, so the attached patch uses just a long option.
[0001-grep-new-atime-preserve-option.patch (text/x-patch, attachment)]

Message #27 received at 18406 <at> debbugs.gnu.org:
From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 18406 <at> debbugs.gnu.org, starlight.2014q3 <at> binnacle.cx
Subject: Re: bug#18406: O_NOATIME patch
Date: Fri, 4 Sep 2020 18:12:07 +0200

On Tue, Sep 1, 2020 at 4:22 AM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 9/11/14 1:13 PM, Paul Eggert wrote:
> > Thanks, but there's no need for that; just have 'grep' complain if the option is
> > used and O_NOATIME == 0.
>
> On looking into this more today, O_NOATIME seems to be just a best-effort thing
> as some GNU/Linux filesystems ignore it, so grep should just join the throng and
> not worry whether O_NOATIME actually works.
>
> Also, the O_NOATIME support was withdrawn from fts a couple of years ago, so
> 'grep -r' can't easily avoid updating atime on directories.
>
> A patch is attached. I'm still of two minds about this. The efficiency argument
> for the new option is not as strong as it used to be, now that relatime has
> taken over on ext4 style filesystems. So the main argument is "I want to search
> through this directory but don't want it to count as an access"; although that's
> indeed a use case I'm not quite sure it's worth modifying 'grep' over. It
> doesn't seem to be worth using up a scarce option letter over, anyway, so the
> attached patch uses just a long option.

I confess to similar ambivalence, but do like the idea. Has anyone run tests to compare performance on file systems like ext4, btrfs (the default with Fedora 33) and xfs?

Message #30 received at 18406 <at> debbugs.gnu.org:
From: Zev Weiss <zev <at> bewilderbeest.net>
To: Jim Meyering <jim <at> meyering.net>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 18406 <at> debbugs.gnu.org, starlight.2014q3 <at> binnacle.cx
Subject: Re: bug#18406: O_NOATIME patch
Date: Fri, 4 Sep 2020 12:43:12 -0500

On Fri, Sep 04, 2020 at 11:12:07AM CDT, Jim Meyering wrote:
>On Tue, Sep 1, 2020 at 4:22 AM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
>> On 9/11/14 1:13 PM, Paul Eggert wrote:
>> > Thanks, but there's no need for that; just have 'grep' complain if the option is
>> > used and O_NOATIME == 0.
>>
>> On looking into this more today, O_NOATIME seems to be just a best-effort thing
>> as some GNU/Linux filesystems ignore it, so grep should just join the throng and
>> not worry whether O_NOATIME actually works.
>>
>> Also, the O_NOATIME support was withdrawn from fts a couple of years ago, so
>> 'grep -r' can't easily avoid updating atime on directories.
>>
>> A patch is attached. I'm still of two minds about this. The efficiency argument
>> for the new option is not as strong as it used to be, now that relatime has
>> taken over on ext4 style filesystems. So the main argument is "I want to search
>> through this directory but don't want it to count as an access"; although that's
>> indeed a use case I'm not quite sure it's worth modifying 'grep' over. It
>> doesn't seem to be worth using up a scarce option letter over, anyway, so the
>> attached patch uses just a long option.
>
>I confess to similar ambivalence, but do like the idea. Has anyone run
>tests to compare performance on file systems like ext4, btrfs (the
>default with Fedora 33) and xfs?

For what my two cents are worth: while yes, the performance angle is I'd guess probably not real relevant these days in the face of widespread noatime/relatime mount options (though I haven't done any measurements), I can see the semantic angle -- but adding flags to individual tools seems like an awkward way to go about solving the problem. Something like an LD_PRELOAD hack to shove O_NOATIME into the flags argument of every open(2) call comes to mind...

Zev
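
No such shim is posted in the thread. As a rough idea of what the LD_PRELOAD approach might look like, here is a minimal C sketch, assuming 64-bit Linux with glibc. It interposes only `open` and forwards to the openat system call directly. A more common formulation looks up the next `open` with dlsym(RTLD_NEXT, "open"); the direct syscall is used here only to keep the sketch free of extra link dependencies. Programs that reach the kernel through openat, open64 or fopen would bypass it, so treat this as an illustration rather than a complete tool.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* needed for O_NOATIME and O_TMPFILE in glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_NOATIME
# define O_NOATIME 0   /* flag unavailable: the shim becomes a pass-through */
#endif

/* Interposed open(2): same signature as libc's, found first by the
   dynamic linker when this object is listed in LD_PRELOAD.  */
int
open (const char *path, int flags, ...)
{
  mode_t mode = 0;
  int needs_mode = (flags & O_CREAT) != 0;
#ifdef O_TMPFILE
  needs_mode |= (flags & O_TMPFILE) == O_TMPFILE;
#endif
  if (needs_mode)
    {
      /* The third argument is only present when a file may be created.  */
      va_list ap;
      va_start (ap, flags);
      mode = (mode_t) va_arg (ap, int);
      va_end (ap);
    }

  /* First attempt: add O_NOATIME to whatever the caller asked for.  */
  long fd = syscall (SYS_openat, AT_FDCWD, path, flags | O_NOATIME, mode);

  /* EPERM typically means we do not own the file; retry with the
     caller's original flags so the open still succeeds.  */
  if (fd < 0 && errno == EPERM && O_NOATIME != 0)
    fd = syscall (SYS_openat, AT_FDCWD, path, flags, mode);

  return (int) fd;
}
```

Built as a shared object, for example with `gcc -shared -fPIC -o noatime.so noatime.c` (both file names are placeholders), it would be activated per command with `LD_PRELOAD=./noatime.so` in front of the grep invocation.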