I created a text file from two queries, one LDAP and one SQL. The results of both queries are stored in the same file.

The file looks like this:

[email protected]
[email protected]
[email protected]
[email protected]
[email protected]

Because a user can be in both databases, I need to delete duplicate entries using bash.
How can I do that?

taliezin
asked Jun 11, 2015 at 8:23
1 Answer

If you don't mind your file ending up sorted, sort it and filter it; either

sort -u file

if your sort supports it, or

sort file | uniq

if not, and you'll get on standard output the sorted list of unique email addresses.

If you want to keep the addresses in the original order, use awk:

awk '!(count[$0]++)' file
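
To illustrate how the awk filter behaves, here is a minimal sketch run against made-up data. The temporary file and the addresses in it are placeholders, not the asker's real file. The expression `count[$0]++` yields the number of times the line has been seen before, so `!(count[$0]++)` is true only on a line's first appearance, and awk's default action prints it.

```shell
# Hypothetical sample data standing in for the merged LDAP + SQL output.
tmpfile=$(mktemp)
printf '%s\n' \
    'alice@example.com' \
    'bob@example.com' \
    'alice@example.com' \
    'carol@example.com' \
    'bob@example.com' > "$tmpfile"

# Keep only the first occurrence of each line, preserving input order.
deduped=$(awk '!(count[$0]++)' "$tmpfile")
printf '%s\n' "$deduped"

rm -f "$tmpfile"
```

Like sort, awk writes its result to standard output, so to replace the original file you would typically redirect to a second file and rename it afterwards rather than redirecting onto the input.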
answered Jun 11, 2015 at 8:26
  • sort -u doesn't report unique lines; it reports the first of each run of lines that sort the same in the current locale. Commented Jun 11, 2015 at 8:43
  • @cuonglm Indeed, but is there a case where two different email addresses would have the same collation? Commented Jun 11, 2015 at 8:51
  • @StephenKitt: 1@example.com and 2@example.com in en_US.utf8 locale. Commented Jun 11, 2015 at 9:18
  • @cuonglm: LC_ALL=en_US.UTF-8; (echo 1@example.com; echo 2@example.com) | sort | uniq also merges both lines, so only the awk solution is viable in that case. Commented Jun 11, 2015 at 18:23
  • @StephenKitt: It seems that you are using GNU uniq, which is not POSIX compliant in this case; you must use uniq -i. Commented Jun 12, 2015 at 1:09
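
One possible way around the collation issue raised in these comments, offered as a sketch rather than as something the commenters proposed, is to run sort in the C locale. There, lines are compared byte by byte, so only identical lines are treated as duplicates. The addresses below are the placeholder ones from the comments, and the `unique` variable name is an assumption for illustration.

```shell
# Two distinct addresses plus one genuine duplicate.
# LC_ALL=C applies only to this sort invocation and forces byte-wise comparison.
unique=$(printf '%s\n' '2@example.com' '1@example.com' '2@example.com' | LC_ALL=C sort -u)
printf '%s\n' "$unique"
```

The output is still sorted, so this does not preserve the original order; the awk approach remains the option for that.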
