I have several grep filters that I usually use to parse specific information.
1st grep: grep "pattern1\|pattern2\|pattern3" file.txt
2nd grep: grep "patternA\|patternB\|patternC" file.txt
etc.
I usually apply each grep to the same file.txt to get an independent output.
I would like to know how I can group this bunch of greps into a single bash script in order to get independent outputs based on each type of grep.
For example, the input file.txt looks as follows:
This line1 is the first line in here1
This line2 is the second line in here2
This line3 is the third line in here3
This line4 is the fourth line in here4
I usually apply separate greps here to get specific patterns.
grep -h -r --color=always "line1\|here1" file.txt >>pattern1.txt
or
grep -h -r --color=always "line2\|here2" file.txt >>pattern2.txt
This highlights only the required information and gives me separate pattern*.txt files to work on.
The objective here is to run all these greps in a single run against the same file and print in the shell as follows:
Pattern1
Pattern2
Pattern3
etc.
Each grep should evaluate the complete file independently.
1 Answer
If I understand the question correctly,
it’s about how to do the same command (grep),
with the same options and the same input,
but multiple times,
with different regex arguments and different outputs.
And I guess you want to avoid unnecessary repetition/duplication.
It seems like you want an array of regular expressions (search strings):
declare -A regex
regex[1]="line1\|here1"
regex[2]="line2\|here2"
regex[3]="line3\|here3"
regex[4]="line4\|here4"
for i in "${!regex[@]}"
do
    grep -h -r --color=always "${regex["$i"]}" file.txt >> "pattern$i.txt"
done
The first line (declare -A regex) declares an associative array called regex;
that means it creates the array as a placeholder,
but doesn't enter any information (elements) into it.
The next four lines populate the array with four elements,
which are the regular expressions,
indexed by the numbers 1, 2, 3, and 4.
(I’m using those indices because that’s what you seem to want,
but you can use any distinct strings as indices:
for example, uno, dos, tres and cuatro,
or ant, bat, cat and dog.†)
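As a rough illustration of the string-index variant, here is a minimal sketch. It assumes Bash 4.0 or later (older versions have no declare -A), and the keys ant and bat, the two-line sample file, and the scratch directory are only placeholders for the demonstration.

```bash
#!/usr/bin/env bash
# Sketch: string keys require an associative array (declare -A, Bash 4.0+).

workdir=$(mktemp -d)            # scratch directory, so no real file is touched
cd "$workdir" || exit 1
echo "working in $workdir"

printf '%s\n' \
    'This line1 is the first line in here1' \
    'This line2 is the second line in here2' > file.txt

declare -A regex
regex[ant]='line1\|here1'
regex[bat]='line2\|here2'

# The order in which the keys are visited is not guaranteed.
for key in "${!regex[@]}"
do
    grep "${regex[$key]}" file.txt > "pattern_$key.txt"
done
```

Each key ends up in the output file name, so this run should leave pattern_ant.txt and pattern_bat.txt in the scratch directory.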
The for statement, for i in "${!regex[@]}",
causes the variable i to iterate through
the index values 1, 2, 3, and 4
(for an associative array, not necessarily in that order).
(If I had left out the ! and said for i in "${regex[@]}",
it would have iterated through the element values,
line1\|here1, line2\|here2, line3\|here3 and line4\|here4.)
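The difference between the two expansions can be checked with a small sketch. It uses a plain indexed array rather than an associative one (an assumption made only so that the order of the output is predictable), and the variable names keys and values are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: "${!regex[@]}" expands to the indices, "${regex[@]}" to the values.

regex[1]='line1\|here1'
regex[2]='line2\|here2'

keys=$(printf '%s\n' "${!regex[@]}")      # one index per line
values=$(printf '%s\n' "${regex[@]}")     # one element value per line

printf 'indices:\n%s\n' "$keys"
printf 'values:\n%s\n' "$values"
```

Looping over the indices is what makes it possible to build the output file name from $i; looping over the values alone would lose that information.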
When $i is 1, ${regex["$i"]} reduces to ${regex[1]},
which expands to line1\|here1.
So, the loop iterates (executes) four times,
executing the four grep commands that you want.
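If the results should also appear on the terminal under a heading per pattern, one possible variation is sketched below. The "Pattern$i" heading is my guess at the layout the question asks for, the use of tee is a swapped-in technique that the loop above does not use, and --color=always is dropped here so that the saved files contain plain text.

```bash
#!/usr/bin/env bash
# Sketch: run every pattern against the same file and label each result.

workdir=$(mktemp -d)            # scratch directory for the demonstration
cd "$workdir" || exit 1

cat > file.txt <<'EOF'
This line1 is the first line in here1
This line2 is the second line in here2
This line3 is the third line in here3
This line4 is the fourth line in here4
EOF

regex[1]='line1\|here1'
regex[2]='line2\|here2'

for i in "${!regex[@]}"
do
    echo "Pattern$i"
    # tee prints the matches and also appends them to the per-pattern file
    grep "${regex[$i]}" file.txt | tee -a "pattern$i.txt"
done
```

Every grep still reads the complete file independently; only the way the output is displayed changes.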
If you want to run the grep processes in parallel, just do:
for i in "${!regex[@]}"
do
    grep -h -r --color=always "${regex["$i"]}" file.txt >> "pattern$i.txt" &
done
wait
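A bare wait returns 0 regardless of how the background jobs ended. If it matters whether each grep actually found something, a sketch along these lines might help; the pids array and the failed counter are names introduced here for illustration, and the sample file is a placeholder.

```bash
#!/usr/bin/env bash
# Sketch: start one background grep per pattern, then check each exit status.

workdir=$(mktemp -d)            # scratch directory for the demonstration
cd "$workdir" || exit 1
printf '%s\n' 'This line1 is in here1' 'This line2 is in here2' > file.txt

regex[1]='line1\|here1'
regex[2]='line2\|here2'

pids=()
for i in "${!regex[@]}"
do
    grep "${regex[$i]}" file.txt > "pattern$i.txt" &
    pids+=("$!")                # remember the process ID of this job
done

failed=0
for pid in "${pids[@]}"
do
    # wait PID returns that job's status (grep exits 1 when nothing matched)
    wait "$pid" || failed=$((failed + 1))
done
echo "jobs with no match or an error: $failed"
```

For a handful of patterns on a small file the parallel version is unlikely to be noticeably faster; it starts to pay off when the input is large.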
______________
† If the indices are numerically distinct non-negative integers,
you can leave out the declare statement.
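The footnote can be checked with a short sketch, which also shows what happens when string subscripts are used without declare -A. The array name names and the subscripts test_one and test_two are illustrative, and the sketch assumes no shell variables with those names are set.

```bash
#!/usr/bin/env bash
# Sketch: indexed arrays need no declare, and the indices may be sparse.

regex[10]='line1\|here1'
regex[20]='line2\|here2'
indices="${!regex[*]}"
echo "indices: $indices"

# Without declare -A, a string subscript is evaluated arithmetically.
# An unset name evaluates to 0, so both assignments land in element 0
# and the second one overwrites the first.
names[test_one]='line1\|here1'
names[test_two]='line2\|here2'
count=${#names[@]}
echo "elements in names: $count"
```

With only one element left in the array, a loop over its indices runs a single time, which is consistent with a loop that seems to stop after the first grep.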
- Thanks! It is working as expected. However, if I use strings as indices for regex, for example regex[test_one]="line1\|here1", it doesn't work. Is there something I'm missing here? When using strings, the for loop exits after evaluating the first grep. – raver83, Apr 8, 2017 at 22:55
- Sorry; it's my fault. I left out a line. I've fixed the answer. – G-Man Says 'Reinstate Monica', Apr 9, 2017 at 19:46
- Thanks, G-Man. I believe it should work, but after some research I found that declare -A is only available from bash v4.0. Unfortunately, I'm a Mac OS X user, and bash there is only available up to v3.2.57. I'm thinking of using Python instead, with dictionaries as associative arrays. Currently this is what I get as output: declare: usage: declare [-afFirtx] [-p] [name[=value] ...] – raver83, Apr 9, 2017 at 22:18
- Well, as I said, if you use numbers as indices, you don’t need to declare the array at all. – G-Man Says 'Reinstate Monica', Apr 10, 2017 at 6:06
grep criteria to the name "Pattern1" as the output?
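For shells without associative arrays, such as the Bash 3.2 mentioned in the comments, one possible workaround is to keep each output name next to its pattern in a here-document and read the pairs in a loop. This is only a sketch: the names Pattern1 and Pattern2, the variables name and pattern, and the sample file are assumptions for the demonstration, and it is a different technique from the array-based loop in the answer.

```bash
#!/usr/bin/env bash
# Sketch: pair each output name with its pattern, without declare -A.

workdir=$(mktemp -d)            # scratch directory for the demonstration
cd "$workdir" || exit 1
printf '%s\n' 'This line1 is in here1' 'This line2 is in here2' > file.txt

# One "name pattern" pair per line. The name must not contain whitespace;
# everything after the first run of whitespace is taken as the pattern.
while read -r name pattern
do
    grep "$pattern" file.txt > "$name.txt"
done <<'EOF'
Pattern1 line1\|here1
Pattern2 line2\|here2
EOF
```

Because grep is given file.txt as an argument, it does not consume the here-document that feeds the loop, and read -r keeps the backslashes in the patterns intact.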