I created a text file using two queries, one LDAP and one SQL; the results of both queries are stored in the same file. The file looks like this:
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
Because a user can be in both databases, I need to delete the duplicate entries using bash. How can I do that?
1 Answer
If you don't mind your file ending up sorted, sort it and filter it, either with
sort -u file
if your sort supports it, or with
sort file | uniq
if not; either way you'll get the sorted list of unique email addresses on standard output.
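For example, here is a minimal sketch assuming a file named file that contains a duplicated address (the addresses are purely illustrative):
$ cat file
bob@example.com
alice@example.com
bob@example.com
$ sort -u file
alice@example.com
bob@example.com
Note that the output is sorted, so the original order (bob first) is lost.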
If you want to keep the addresses in their original order, use awk:
awk '!(count[0ドル]++)' file
- sort -u doesn't report unique lines but the first of the lines that sort the same in the current locale. – cuonglm, Jun 11, 2015 at 8:43
- @cuonglm Indeed, but is there a case where two different email addresses would have the same collation? – Stephen Kitt, Jun 11, 2015 at 8:51
- @StephenKitt: 1@example.com and 2@example.com in the en_US.utf8 locale. – cuonglm, Jun 11, 2015 at 9:18
- @cuonglm: LC_ALL=en_US.UTF-8; (echo 1@example.com; echo 2@example.com) | sort | uniq also merges both lines, so only the awk solution is viable in that case. – Stephen Kitt, Jun 11, 2015 at 18:23
- @StephenKitt: It seems that you are using GNU uniq; it's not POSIX compliant in this case, you must use uniq -i. – cuonglm, Jun 12, 2015 at 1:09