Dealing with a whole bunch of two-line config files, I'd like a way to exclude any files that have a different number of lines.
So, something like:
mv * destdir only if file contains exactly two lines
Or:
wc -l * | grep '^ *2' | xargs mv {} destdir
Except that neither of those is actual working code.
While writing this I realized I do have a way to do this, which is ugly as heck, and I've included it below as an answer.
Is there an easy/clean way to do this?
5 Answers
Your kludgy solution isn't too bad for starters. You're just missing the fact that awk can not only give you the number of lines but also exit with the right status code, so you can chain it with the cp command:
for file in * ; do awk 'NR==3{exit}END{exit NR!=2}' "$file" && cp "$file" /tmp; done
NR is the number of records read so far and, as suggested in @don_crissti's answer, the NR==3 check stops further processing once a third line is encountered. NR!=2 looks funny because awk's true/false values are 1/0, but in the shell we need 0 to represent a success status for && to work correctly. The inverse works too (depending on how strongly you react to seeing !=):
for file in * ; do awk 'NR==3{exit}END{exit NR==2}' "$file" || cp "$file" /tmp; done
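To make the exit-status logic easier to check, here is a minimal sketch that wraps the first awk test in a shell function and tries it on a few scratch files. The function name has_two_lines and the scratch file names are made up for this illustration; they are not part of the answer above.

```shell
#!/bin/sh
# has_two_lines FILE: succeed (exit status 0) only if FILE has exactly two lines.
# The name is hypothetical, chosen for this illustration.
has_two_lines() {
    # "exit" on line 3 stops reading early; the END rule still runs, and NR
    # is 3 there, so NR!=2 evaluates to 1 (failure in the shell).
    # With exactly two lines, NR!=2 evaluates to 0 (success in the shell).
    awk 'NR==3{exit} END{exit NR!=2}' "$1"
}

# Try it on scratch files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a\n'       > "$tmpdir/one"
printf 'a\nb\n'    > "$tmpdir/two"
printf 'a\nb\nc\n' > "$tmpdir/three"

for f in "$tmpdir"/*; do
    if has_two_lines "$f"; then
        echo "two lines: ${f##*/}"
    fi
done
rm -rf "$tmpdir"
```

Only the file named two should be reported. An empty file fails the test as well, since NR is 0 in the END rule.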
- I like this solution because (a) it doesn't assume the filenames are sane and (b) it is POSIX compatible. (Also it doesn't depend on zsh, which I know nothing about.) – Wildcard, Dec 4, 2015 at 6:33
You could use awk: exit on line 3 (the END rule is still executed) and exit 1 in the END block if the number of lines is not 2, e.g. with zsh:
print -rl -- *(.e_'awk "NR==3{exit}END{if(NR!=2){exit 1}}" $REPLY'_)
will list two-line files in the current directory; replace print -rl with mv and add the destination if you want to move them.
With other shells (ones that support [[ ]], such as bash or ksh):
for file in ./*; do [[ -f $file ]] && \
awk 'NR==3{exit};END{if(NR!=2){exit 1}}' "$file" && mv "$file" "$dest"
done
Other ways, e.g. with zsh and GNU awk:
awk 'ENDFILE{if(FNR==2){print FILENAME}}' ./*(.)
or GNU sed (v. 4.2.2 or later):
sed -ns '2{$F}' ./*(.)
to list the two-line files (see note 1), and e.g.:
for f (./*(.))
sed -n '2{$Q 1};3q' $f || mv $f $dest
to move them.
1: those would both go through the whole input, so they're not really suited to huge files; in that case, you may want to run sed -n '2{$F};3q' on each file or use the first awk solution.
- What does F do in sed? I can't find documentation on it anywhere. Or is that a typo? – Wildcard, Nov 26, 2015 at 19:44
- Using GNU sed 4.2.1, sed F filename returns unknown command: 'F'. And looking through the entirety of info sed I don't see it mentioned anywhere. Did you test it? What version of sed are you using? EDIT: Ah, I see the link in your comment now. I suspected that's what it was supposed to do... but it doesn't work on my CentOS 6 vagrant box. – Wildcard, Nov 26, 2015 at 20:19
- @Wildcard - yes, my bad for not specifying that F was added in sed 4.2.2. – don_crissti, Nov 26, 2015 at 20:22
If your filenames are fairly sane, and you can delimit on both : and a newline, then:
grep -m3 '' ./* ./*/* |
cut -d: -f1 | uniq -c |
grep -v '^ *[13] '
That command will list all non-dot files in the current directory and in all immediate child directories which contain exactly two lines.
You don't really need to worry about sorting for uniq, because globs are sorted. I use the GNU -m (max matches) option because it is much faster if grep quits at the third input line than if it continues through to the end, but it will work without it as well. The idea is to get grep to print the filename once for each line a file contains, then to count the occurrences of each filename in its output, and then to filter out anything with more or fewer than 2.
I ran it against some random source code dirs, and, of all of them, I had two files which contained only the two lines:
2 ./dex/coll.sh
2 ./jimtcl/jim-config.h.in
it would be neater to replace the last line with:
... |
sed -ne's/^ *2 *//p'
...though.
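For anyone who wants to try the idea in isolation, here is a rough sketch of the same pipeline as a function that looks at a single directory only. The name list_two_line_files is hypothetical, the extra /dev/null argument is my own addition (it makes grep print the file name even when the glob matches just one file), and the sketch assumes file names without : or newlines and a grep that supports -m.

```shell
#!/bin/sh
# list_two_line_files DIR: print the names of files directly in DIR that have
# exactly two lines. Hypothetical helper; assumes names without ':' or
# newlines and a grep with -m (GNU and BSD grep both have it).
list_two_line_files() (
    cd "$1" || exit 1
    # /dev/null adds no lines, but forces the "name:" prefix on every row.
    grep -m3 '' ./* /dev/null 2>/dev/null |  # at most three rows per file
        cut -d: -f1 |                        # keep only the file name
        uniq -c |                            # count the rows per name
        sed -n 's/^ *2 //p'                  # keep names counted exactly twice
)

# Usage (the directory is whatever you want to inspect):
#   list_two_line_files /path/to/dir
```

The body runs in a subshell, so the cd doesn't affect the calling shell. Errors from grep (for directories, say) are discarded, which is a design choice for the sketch rather than something the original command does.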
- Assuming sane filenames in any context is something I dislike. But yes, if you're comfortable assuming that (and you're doing it interactively, not scripting it), then this is good. – Wildcard, Dec 4, 2015 at 6:29
- @Wildcard - if you're not comfortable with it, you can do find . ! -type d ! -path "*[:$IFS]*" -exec ... {} + and then invert the selection with a more conservative approach for a second run. Just make sure $IFS is set to a default value first, or drop the space and tab if you like. – mikeserv, Dec 4, 2015 at 6:51
I worked out the following kludgy solution:
for file in * ; do if [ "$(wc -l "$file" | awk '{print $1}')" == "2" ] ; then cp "$file" /tmp/ ; fi; done
There must be a better way which doesn't start two processes for every single file in the current directory.
- If process count is a concern, you can save a process by doing if [ -r "$file" ] && [ $(wc -l < "$file") -eq 2 ] – Mark Plotnick, Nov 26, 2015 at 7:08
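Spelled out as a full loop, the redirect-into-wc suggestion might look like the sketch below. The function name copy_two_line_files and its two arguments are assumptions made for illustration. One thing to keep in mind: wc -l counts newline characters, so a file whose second line lacks a trailing newline is counted as having one line here, whereas awk's NR would report two.

```shell
#!/bin/sh
# copy_two_line_files SRC DEST: copy regular, readable files that contain
# exactly two newline characters from SRC into DEST.
# Hypothetical helper name and arguments, for illustration only.
copy_two_line_files() {
    for file in "$1"/*; do
        # Skip directories, unreadable files and an unmatched glob.
        [ -f "$file" ] && [ -r "$file" ] || continue
        # Redirecting into wc keeps the file name out of its output, so no
        # awk is needed. The substitution is left unquoted on purpose: some
        # wc implementations pad the count with spaces, and the output
        # contains nothing but digits and blanks.
        if [ $(wc -l < "$file") -eq 2 ]; then
            cp "$file" "$2"
        fi
    done
}

# Usage:
#   copy_two_line_files . /tmp
```

This still starts one wc per file, but it drops the awk process from the original loop.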
Using the -t target directory option on mv with xargs:
wc -l * | sed -n 's/^[[:space:]]*2[[:space:]]\+//p' | xargs mv -t "$DESTDIR"
- fwiw, this won't work if the file names contain spaces or other funky chars. – don_crissti, Nov 26, 2015 at 2:06