3

I'm having trouble listing only uniquely named files across directories and subdirectories. Files must be unique by name, not by md5 sum or content.

With the code below I've managed to get a list of unique file names, but without their location (directory name). I can't sort properly or use uniq if the directory name is part of the string...

find . -type f -name "*" | xargs -I% basename % | sort -u

Example of result I got:

same_name
some_file
test_file

Result expected:

./dir1/same_name
./dir1/some_file
./dir3/test_file

This would be an example of directory tree, but it can be a lot larger and deeper

.
├── dir1
│ ├── same_name
│ └── some_file
├── dir2
│ └── same_name
├── dir3
│ └── test_file
└── same_name
asked Mar 29, 2016 at 21:00
  • There are 3 files named same_name (./same_name, ./dir1/same_name, ./dir2/same_name), but only one is listed in the example result, so how do you choose the location? Commented Mar 29, 2016 at 21:15
  • Location is not crucial, as long as I get only 1 file of each name. Commented Mar 30, 2016 at 7:35

1 Answer

6

Something like

find . -type f -printf "%f:%p\n" | awk -F: '!seen[$1]++ {print $2}'

Let find print out the basename for you, and then use awk to print out the pathname only the first time the basename is seen.
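To see this on the tree from the question, here is a minimal sketch that rebuilds that tree in a scratch directory and runs the same pipeline. The scratch directory and the `tmp` and `result` variable names are only for illustration, and `-printf` is a GNU find extension, so this assumes GNU findutils.

```shell
#!/bin/sh
# Rebuild the example tree from the question in a scratch directory
# (the location and variable names are illustrative only).
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2" "$tmp/dir3"
touch "$tmp/same_name" "$tmp/dir1/same_name" "$tmp/dir1/some_file" \
      "$tmp/dir2/same_name" "$tmp/dir3/test_file"

# %f is the basename and %p the full path; awk prints the path only the
# first time a given basename shows up in field 1.
result=$(cd "$tmp" && find . -type f -printf '%f:%p\n' |
  awk -F: '!seen[$1]++ {print $2}')

printf '%s\n' "$result"
rm -rf "$tmp"
```

This prints three paths, one per distinct name. Which of the three same_name files is reported depends on the order in which find happens to walk the tree, which matches the asker's comment that the location is not crucial.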

I used a colon as the field separator and a newline as the (default) record separator. Both are valid filename characters, so the command above can misbehave on names that contain either. The following version uses the null character as the record separator (not legal in filenames) and is more robust:

find . -type f -printf "%f\0%p\0" |
 awk -v RS='\0' '{basename=$0; getline} !seen[basename]++'
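To make the separator problem concrete, the sketch below shows a name containing a colon tripping up the `-F:` version. It then tries a variant that is not from the answer: it uses `/` as the separator, on the assumption that a basename can never contain a slash. The file names `a:b` and `a` are made up for the demonstration, GNU find is assumed for `-printf`, and this variant still does not cope with newlines in names, which the null-separated version handles.

```shell
#!/bin/sh
# Hypothetical demo files: one basename contains a colon.
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2"
touch "$tmp/dir1/a:b" "$tmp/dir2/a"

# Colon version: the line "a:b:./dir1/a:b" splits into a, b, ./dir1/a, b,
# so both files appear to have the basename "a" and one of them is dropped.
colon_out=$(cd "$tmp" && find . -type f -printf '%f:%p\n' |
  awk -F: '!seen[$1]++ {print $2}' | sort)

# Slash variant (an assumption, not the answer's method): the first "/" on
# each line always ends the basename, because basenames cannot contain "/".
slash_out=$(cd "$tmp" && find . -type f -printf '%f/%p\n' |
  awk '{ i = index($0, "/"); name = substr($0, 1, i - 1) }
       !seen[name]++ { print substr($0, i + 1) }' | sort)

printf 'colon version:\n%s\n' "$colon_out"
printf 'slash variant:\n%s\n' "$slash_out"
rm -rf "$tmp"
```

The colon version reports a single line here even though the two names differ, while the slash variant lists both paths.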
answered Mar 29, 2016 at 21:24
  • I'm really glad to see someone using awk with such elegance. Commented Mar 29, 2016 at 22:41
  • Great, this works wonderful. Thanks for the answer! Commented Mar 30, 2016 at 7:38
  • Voted up! I love the robust version: it should succeed even if the directory tree contains a directory like this: mkdir "$(printf "\1\2\3\4\5\6\7\10\11\12\13\14\15\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37\40\41\42\43\44\45\46\47testdir" "")" Commented Mar 31, 2016 at 19:01
  • BTW, the title says "recursively", but all the commands here walk the directory tree iteratively, which is efficient for this kind of task. Commented Mar 31, 2016 at 19:07
