grep the last occurrence of a pattern before another pattern

Question 1

I have a huge file that contains two types of patterns, say pattern1 and pattern2. pattern1 may appear many times before pattern2 appears. I want to grep the last occurrence of pattern1 before each pattern2.

Input file:

some text
pattern1=1
some lines
pattern1=2
some lines
pattern1=3
some lines
pattern2
some lines
pattern1=4
some lines
pattern1=5
some lines
pattern1=6
some lines
pattern1=7
some lines
pattern2

Desired output:

pattern1=3
pattern1=7

I tried grep, which works when I know the number of lines between pattern2 and the previous pattern1:

grep -B400 "pattern2" | grep "pattern1"

but I need a single command that can be run over any file, regardless of the number of lines between the two patterns.

Question 2

To be clear, you want to return the last instance of pattern 1 that appears before any instance of pattern 2 regardless of what other strings are between the two of them, yes?

Question 3

Please read how-do-i-find-the-text-that-matches-a-pattern to understand the issue and then replace pattern everywhere in your question with whatever you really want to match - full or partial + word or line + string or regexp + anchored or not. Right now the answers you have assume you want unanchored partial line regexp matches which seems unlikely to be the most robust solution for you but that's based on what your code does since you haven't yet stated/shown what you actually need.
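
To illustrate the distinction, here is a minimal sketch that uses string comparisons instead of unanchored regexp matches. It assumes that a pattern1 line always starts with the literal string `pattern1=` and that a pattern2 line is exactly `pattern2`; the question does not state either, so treat both as assumptions. The sample data is made up.

```shell
# Made-up sample: "not pattern1=9" and "xpattern2" contain the strings
# but should not count as matches under the assumptions above.
input=$(mktemp)
printf '%s\n' 'pattern1=1' 'not pattern1=9' 'xpattern2' 'pattern1=2' 'pattern2' > "$input"

# index($0, s) == 1 : the line starts with the literal string s (no regexp).
# $0 == "pattern2"  : the whole line is exactly that string.
out=$(awk 'index($0, "pattern1=") == 1 { x = $0 }
           $0 == "pattern2"            { print x }' "$input")
printf '%s\n' "$out"
rm -f "$input"
```

With an unanchored regexp such as `/pattern2/`, the `xpattern2` line would also have triggered a print.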

Question 4

@NasirRiley Yes, exactly. Anyway, there are a lot of accepted answers for my question here.

Question 5

$ awk '/pattern1/{x=$0} /pattern2/{print x}' input
pattern1=3
pattern1=7

Saves each line that matches pattern1 (the whole line) in the variable x and prints x whenever a line matches pattern2. It prints a blank line if a pattern2 appears before any pattern1; detecting that would take more logic if it is not desirable. Any trailing pattern1 lines that are not followed by a pattern2 before the end of the input are dropped.
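
If the dropped trailing matches matter, one possible sketch is to track whether the saved line has been printed yet and report a leftover in an `END` block. The `pending` flag, the message text, and the sample data are illustrative assumptions, not part of the original one-liner.

```shell
input=$(mktemp)
err=$(mktemp)
printf '%s\n' 'pattern1=1' 'pattern1=2' 'pattern2' 'pattern1=3' > "$input"

out=$(awk '
  /pattern1/ { x = $0; pending = 1 }                  # remember the latest pattern1 line
  /pattern2/ { if (pending) print x; pending = 0 }    # print it at most once
  END        { if (pending) print "no pattern2 after: " x | "cat 1>&2" }
' "$input" 2>"$err")

printf 'stdout: %s\n' "$out"
printf 'stderr: %s\n' "$(cat "$err")"
leftover=$(cat "$err")
rm -f "$input" "$err"
```

Because of the `pending` flag, this variant also prints nothing for a pattern2 that comes before the first pattern1.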

Question 6

@thrig's answer is good, but I made some modifications to handle some extra test cases. The following script:

Will not print an empty line if pattern2 appears before the first appearance of pattern1.
Will not print duplicate lines if pattern2 appears multiple times after pattern1.

With the modified input file:

pattern2
some text
pattern1=1
some lines
pattern1=2
some lines
pattern1=3
some lines
pattern2
pattern2
some lines
pattern1=4
some lines
pattern1=5
pattern2
some lines
pattern1=6
some lines
pattern1=7
some lines
pattern2

The following script seems to do what you describe in the text:

$ awk '/pattern1/{x=$0} length(x) && /pattern2/{print x;x=""}' file
pattern1=3
pattern1=5
pattern1=7

Question 7

Three grep calls:

Extract only lines that match ^pattern1= or ^pattern2$ from the original input file
```
grep -e '^pattern1=' -e '^pattern2$' file
```
Get the lines that match ^pattern2$, and the lines immediately before these (using the non-standard -B option):
```
grep -B1 '^pattern2$'
```
Get all lines matching ^pattern1= from these:
```
grep '^pattern1='
```

All together:

grep -e '^pattern1=' -e '^pattern2$' file |
grep -B1 '^pattern2$' |
grep '^pattern1='

This handles the same edge cases as user000001's answer: it does not output duplicate lines if there are several pattern2 lines with no pattern1 line between them, and it does not produce empty lines for pattern2 lines at the start of the file.

Using sed:

sed -e '/^pattern1=/ { h; d; }' \
 -e '/^pattern2$/ x' \
 -e '/^pattern1=/ !d' file

If the current line is a pattern1 line, it saves it into the hold space and discards it.
If the current line is a pattern2 line, it swaps in the hold space.
If the current line now is not a pattern1 line, it is discarded.
(implicitly) print the current line. The current line, by the preceding commands, must be a pattern1 line swapped in from the hold space due to finding a pattern2 line. The hold space would, therefore, by necessity, hold a pattern2 line, ensuring that the pattern1 line would not be outputted multiple times.
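
The steps above can be checked on a small sample. The following is a sketch with made-up data that writes the same three sed commands as a single commented script; it includes a leading pattern2 and two consecutive pattern2 lines to exercise the edge cases.

```shell
file=$(mktemp)
printf '%s\n' 'pattern2' 'pattern1=1' 'pattern1=2' 'pattern2' 'pattern2' \
              'pattern1=3' 'some lines' 'pattern2' > "$file"

out=$(sed '
  # pattern1 line: copy it to the hold space and print nothing
  /^pattern1=/ { h; d; }
  # pattern2 line: swap pattern space and hold space
  /^pattern2$/ x
  # whatever is in the pattern space now is dropped unless it is a pattern1 line
  /^pattern1=/!d
' "$file")
printf '%s\n' "$out"
rm -f "$file"
```

The leading pattern2 swaps in an empty hold space, and the second of two consecutive pattern2 lines swaps in the previous pattern2, so both are deleted rather than printed.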

Question 8

❯ printf 'g/pattern2/?pattern1?p\n' | ed -s in.txt
pattern1=3
pattern1=7

Explanation:

g/pattern2/ → For every match of the pattern2, do

?pattern1? → Search backwards for pattern1

p → print

Limitations:

Search backwards wraps back to end of file and selects the last match of pattern1
Prints the same line multiple times if there are multiple pattern2

E.g.

pattern1=1
pattern2
pattern2

This will print pattern1=1 twice.

Question 9

Could you use /pattern1/,$ (or something like it) as the address range for the g command, possibly, to fix the wrapping issue? I'm on a phone and can't test...

Question 10

egrep "^pattern1|^pattern2" <file> | grep -B 1 "^pattern2" | grep "^pattern1"

The first egrep will get only the lines that contain either pattern (stripping all other unknown lines from the output). The second grep will get the pattern2 and whatever line is before it. This will be used to remove lines where there is no pattern1 before the pattern2. The third grep will just return the remaining pattern1 lines.
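
A sketch of the same pipeline on made-up sample data, using `grep -E` in place of `egrep` (recent GNU grep releases warn that `egrep` is obsolescent). Note that `-B` is an extension rather than POSIX, though GNU and BSD grep both provide it.

```shell
file=$(mktemp)
printf '%s\n' 'pattern2' 'some text' 'pattern1=1' 'pattern1=2' 'pattern2' \
              'pattern2' 'some lines' 'pattern1=3' 'some lines' 'pattern2' > "$file"

out=$(grep -E '^pattern1|^pattern2' "$file" |   # keep only the two kinds of lines
      grep -B 1 '^pattern2' |                   # each pattern2 plus the line before it
      grep '^pattern1')                         # drop pattern2 lines and "--" separators
printf '%s\n' "$out"
rm -f "$file"
```

The last stage also removes the `--` group separators that `grep -B` prints between non-adjacent groups.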

Question 11

awk '{a[++i]=$0}/pattern2/{for(x=NR-2;x<=NR;x++)print a[x]}' file1.txt| awk '/pattern1/'

output

pattern1=3
pattern1=7

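In the sample input, the last pattern1 line is never more than two lines above pattern2, so the awk command above looks at the two lines preceding each pattern2 (and the pattern2 line itself) and then filters for pattern1 to produce the required output.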

Question 12

You should probably mention the assumptions you're making, such as there not being more than two lines between the pattern. It's unclear why you have that second awk call, as you could do the same thing with an if statement in front of the print.

Question 13

Think about the values your variable i will have. Now think about the values the builtin variable NR will have, and consider whether you need to introduce i at all.
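
Putting the two comments together, a possible single-awk sketch indexes the array by NR and tests each buffered line before printing. It keeps the original assumption that the wanted pattern1 line is at most two lines above pattern2; the sample data is made up.

```shell
file=$(mktemp)
printf '%s\n' 'some text' 'pattern1=1' 'some lines' 'pattern1=2' 'some lines' \
              'pattern2' 'pattern1=3' 'some lines' 'pattern2' > "$file"

out=$(awk '
  { a[NR] = $0 }                        # NR already counts lines, so no separate i
  /pattern2/ {
    for (x = NR - 2; x <= NR; x++)      # assumption: pattern1 is within two lines
      if (a[x] ~ /pattern1/) print a[x] # the "if" replaces the second awk call
  }
' "$file")
printf '%s\n' "$out"
rm -f "$file"
```

This still buffers the whole file in memory and would print more than one line if two pattern1 lines fell inside the window, so it is only a tidier form of the same approach, not a general solution.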

Accepted answer: the awk one-liner shown under Question 5 above, by thrig (2022-08-23 03:33:39Z).
