This question/answer has some good solutions for deleting identical lines in a file, but they won't work in my case because the otherwise-duplicate lines carry a timestamp.
Is it possible to tell awk to ignore the first 26 characters of a line in determining duplicates?
Example:
[Fri Oct 31 20:27:05 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:10 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:13 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:16 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:21 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:22 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:23 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:24 2014] The Brown Cow Jumped Over The Moon
Would become
[Fri Oct 31 20:27:24 2014] The Brown Cow Jumped Over The Moon
(keeping the most recent timestamp)
- Yes. If you were to post some example input and output, then this might amount to a question. – jasonwryan, Nov 3, 2014 at 16:21
- When asking this type of question, you need to include your input and your desired output. We can't help if we have to guess. – terdon ♦, Nov 3, 2014 at 16:24
- "Yes" or "no" seems to be an acceptable answer; what are you going to do with that knowledge? In case of "no", extend awk? – Anthon, Nov 3, 2014 at 16:32
- Wow. 80,000-rep users claim this was an unusable question (I would not call it a good one) but not a single close vote? – Hauke Laging, Nov 3, 2014 at 16:45
- @HaukeLaging it seems reasonable to give the OP the chance to react to our comments. They have now done so and the question is greatly improved. – terdon ♦, Nov 3, 2014 at 17:39
5 Answers
You can just use uniq with its -f option:

uniq -f 4 input.txt

From man uniq:

 -f, --skip-fields=N
 avoid comparing the first N fields
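To see why 4 is the right count here: uniq treats a field as a run of blanks followed by non-blank characters, so the bracketed timestamp splits into [Fri, Oct, 31, 20:27:05 and 2014]. Skipping the first four fields starts the comparison from 2014] onward, past the time-of-day that varies between the sample lines. A quick way to inspect that split (an illustrative check using awk's similar blank-splitting, not part of the original answer; input.txt is assumed to hold the sample):

awk '{print 5ドル}' input.txt    # field 5 is where 'uniq -f 4' begins comparing

Every sample line prints 2014], which is why they all compare equal.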
Actually this will display the first line:
[Fri Oct 31 20:27:05 2014] The Brown Cow Jumped Over The Moon
If that is a problem you can do:

tac input.txt | uniq -f 4

or, if you don't have tac but your tail supports -r:

tail -r input.txt | uniq -f 4
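Either reversal leaves the output in reverse chronological order. If you want to keep the most recent entry of each run but preserve the file's original order, you can reverse a second time (a small variation on the above; assumes GNU tac):

tac input.txt | uniq -f 4 | tac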
- That's wickedly awesome :) – Ramesh, Nov 3, 2014 at 19:18
- @Ramesh Some of these tools have some nasty useful options that, when you know them, beat any awk/perl/python stuff you can come up with. – Anthon, Nov 3, 2014 at 19:20
awk '!seen[substr(0,27ドル)]++' file
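For readers less fluent in awk, here is the same one-liner spelled out (a restatement of the line above, not a different method): substr($0, 27) is the current line with its first 26 characters, i.e. the timestamp, removed, and !seen[key]++ is true only the first time a given key appears:

awk '
 { key = substr(0,ドル 27) }   # drop the 26-character timestamp prefix
 !seen[key]++               # print a line only on its first occurrence
' file

Note this keeps the first occurrence of each message; to keep the last one instead, reverse the input first with tac, as in the uniq answer above.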
- This solution does not cover the timestamp part, as that was not part of the question when this answer was written. – Hauke Laging, Nov 3, 2014 at 17:18
- This is exactly why many of us work to close these until the Qs have been fully fleshed out. Otherwise these Qs are wasting your time and the OP's. – Nov 3, 2014 at 18:30
Try this one:
awk -F ']' '{a[2ドル]=1ドル}END{for(i in a){print a[i]"]"i}}'
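On the sample input this keeps the newest timestamp for each distinct message, since each assignment to a[2ドル] overwrites the previous one. One caveat: for (i in a) iterates in an unspecified order, so with several distinct messages the output may not preserve file order. For example (input.txt is an assumed file name holding the sample lines):

awk -F ']' '{a[2ドル]=1ドル}END{for(i in a){print a[i]"]"i}}' input.txt
[Fri Oct 31 20:27:24 2014] The Brown Cow Jumped Over The Moon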
A perl solution:
perl -F']' -anle '$h{$F[1]} = $_; END{print $h{$_} for keys %h}' file
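The same caveat applies here: keys %h returns keys in an arbitrary order that can change between runs. If deterministic output matters, sorting the keys is a one-word change (a minor variation, not from the original answer):

perl -F']' -anle '$h{$F[1]} = $_; END{print $h{$_} for sort keys %h}' file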
One can use the power of vim:

:g/part of duplicate string/d

Very easy. If you have a couple more files (such as gzipped rotated logs), vim will open them without any preliminary decompression on your side, and you can repeat the last command by pressing : and ↑, just like repeating the last command in a terminal.
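One caveat: :g/.../d deletes every line matching the pattern, including the final occurrence you want to keep. To keep the last line in the sample you could restrict the command's range to all but the last matching line (a sketch using the sample's hypothetical line numbers):

:1,7g/The Brown Cow Jumped Over The Moon/d

which removes the matches on lines 1-7 and leaves line 8 intact.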