This question/answer has some good solutions for deleting identical lines in a file, but they won't work in my case because the otherwise-duplicate lines carry a timestamp.
Is it possible to tell awk to ignore the first 26 characters of a line in determining duplicates?
Example:
[Fri Oct 31 20:27:05 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:10 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:13 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:16 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:21 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:22 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:23 2014] The Brown Cow Jumped Over The Moon
[Fri Oct 31 20:27:24 2014] The Brown Cow Jumped Over The Moon
Would become
[Fri Oct 31 20:27:24 2014] The Brown Cow Jumped Over The Moon
(keeping the most recent timestamp)
- Yes. If you were to post some example input and output, then this might amount to a question. – jasonwryan, Nov 3, 2014 at 16:21
- When asking this type of question, you need to include your input and your desired output. We can't help if we have to guess. – terdon ♦, Nov 3, 2014 at 16:24
- "Yes" or "no" seems to be an acceptable answer; what are you going to do with that knowledge? In case of "no", extend awk? – Anthon, Nov 3, 2014 at 16:32
- Wow. 80,000-rep users claim this was an unusable question (I would not call it a good one) but not a single close vote? – Hauke Laging, Nov 3, 2014 at 16:45
- @HaukeLaging it seems reasonable to give the OP the chance to react to our comments. They have now done so and the question is greatly improved. – terdon ♦, Nov 3, 2014 at 17:39
5 Answers
You can just use uniq with its -f option:

uniq -f 4 input.txt

From man uniq:

 -f, --skip-fields=N
 avoid comparing the first N fields
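To see why 4 is the right count here: uniq treats a field as a run of blanks followed by non-blank characters, so the bracketed timestamp splits into [Fri, Oct, 31, 20:27:05 and 2014]. Skipping the first four fields starts the comparison from 2014] onward, past the time-of-day that varies between the sample lines. A quick way to inspect that split (an illustrative check using awk's similar blank-splitting, not part of the original answer; input.txt is assumed to hold the sample):

awk '{print 5ドル}' input.txt    # field 5 is where 'uniq -f 4' begins comparing

Every sample line prints 2014], which is why they all compare equal.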
Actually this will display the first line:
[Fri Oct 31 20:27:05 2014] The Brown Cow Jumped Over The Moon
If that is a problem you can do:

tac input.txt | uniq -f 4

or, if you don't have tac but your tail supports -r:

tail -r input.txt | uniq -f 4
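Either reversal leaves the output in reverse chronological order. If you want to keep the most recent entry of each run but preserve the file's original order, you can reverse a second time (a small variation on the above; assumes GNU tac):

tac input.txt | uniq -f 4 | tac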
- That's wickedly awesome :) – Ramesh, Nov 3, 2014 at 19:18
- @Ramesh Some of these tools have some nasty useful options that, when you know them, beat any awk/perl/python stuff you can come up with. – Anthon, Nov 3, 2014 at 19:20
awk '!seen[substr(0,27ドル)]++' file
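For readers less fluent in awk, here is the same one-liner spelled out (a restatement of the line above, not a different method): substr($0, 27) is the current line with its first 26 characters, i.e. the timestamp, removed, and !seen[key]++ is true only the first time a given key appears:

awk '
 { key = substr(0,ドル 27) }   # drop the 26-character timestamp prefix
 !seen[key]++               # print a line only on its first occurrence
' file

Note this keeps the first occurrence of each message; to keep the last one instead, reverse the input first with tac, as in the uniq answer above.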
- This solution does not cover the timestamp part, as that was not part of the question when this answer was written. – Hauke Laging, Nov 3, 2014 at 17:18
- This is exactly why many of us work to close these until the Qs have been fully fleshed out. Otherwise these Qs are wasting your time and the OP's. – Nov 3, 2014 at 18:30
Try this one:
awk -F ']' '{a[2ドル]=1ドル}END{for(i in a){print a[i]"]"i}}'
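On the sample input this keeps the newest timestamp for each distinct message, since each assignment to a[2ドル] overwrites the previous one. One caveat: for (i in a) iterates in an unspecified order, so with several distinct messages the output may not preserve file order. For example (input.txt is an assumed file name holding the sample lines):

awk -F ']' '{a[2ドル]=1ドル}END{for(i in a){print a[i]"]"i}}' input.txt
[Fri Oct 31 20:27:24 2014] The Brown Cow Jumped Over The Moon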
A perl solution:
perl -F']' -anle '$h{$F[1]} = $_; END{print $h{$_} for keys %h}' file
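The same caveat applies here: keys %h returns keys in an arbitrary order that can change between runs. If deterministic output matters, sorting the keys is a one-word change (a minor variation, not from the original answer):

perl -F']' -anle '$h{$F[1]} = $_; END{print $h{$_} for sort keys %h}' file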
One can use the power of vim:

:g/part of duplicate string/d

Very easy. If you have a couple more files (such as gzipped rotated logs), vim will open them without any preliminary decompression on your side, and you can repeat the last command by pressing : and ↑, just like repeating the last command in a terminal.
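One caveat: :g/.../d deletes every line matching the pattern, including the final occurrence you want to keep. To keep the last line in the sample you could restrict the command's range to all but the last matching line (a sketch using the sample's hypothetical line numbers):

:1,7g/The Brown Cow Jumped Over The Moon/d

which removes the matches on lines 1-7 and leaves line 8 intact.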