Enhanced version of finddupe, a duplicate file detector for Windows


thomas694/finddupe


finddupe

License: GPL v3 Build status

Enhanced version of finddupe, a duplicate file detector and eliminator for Windows, originally by Matthias Wandel.

Reasons

I really like finddupe when I look for duplicate files. It is fast and clever. The match candidates are clustered according to the signature of the first 32k, then checked byte for byte. It can also create and find NTFS hard links. Creating hard links saves you disk space. Listing all existing hard links is very difficult otherwise.
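The two-stage matching described above can be sketched in Python. This is only an illustration of the idea, not finddupe's actual code (which is written in C): the function names are hypothetical, and the use of SHA-256 plus the file size as the "signature" is an assumption, since the README only states that the signature is based on the first 32k.

```python
import hashlib
import os
from collections import defaultdict

SIG_BYTES = 32 * 1024  # the README says the signature covers the first 32k


def signature(path):
    """Cheap signature: file size plus a hash of the first 32 KiB.

    The hash function and the inclusion of the size are assumptions.
    """
    with open(path, "rb") as f:
        head = f.read(SIG_BYTES)
    return os.path.getsize(path), hashlib.sha256(head).hexdigest()


def same_bytes(a, b, chunk=1 << 16):
    """Compare two files byte for byte, reading them in chunks."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(chunk), fb.read(chunk)
            if ca != cb:
                return False
            if not ca:  # both files ended at the same point
                return True


def find_duplicates(paths):
    """Return groups (lists of paths) whose contents are identical."""
    # Stage 1: cluster candidates by their cheap signature.
    clusters = defaultdict(list)
    for p in paths:
        clusters[signature(p)].append(p)

    # Stage 2: confirm each cluster with a full byte-for-byte comparison.
    groups = []
    for candidates in clusters.values():
        remaining = candidates
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [p for p in rest if same_bytes(first, p)]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return groups
```

The signature pass keeps the expensive full comparison limited to files that already agree on their first 32k, which is why this approach stays fast on large collections.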

Please refer to Matthias' site for the full description. My favourite commands are finddupe -bat d:\ImageLibray\Hardlinks_to_be_created.bat -ref d:\ImageLibray\originals1\** -ref d:\ImageLibray\originals2\** d:\ImageLibray\**\*.jpg to remove duplicates in an image collection, and finddupe -listlink d:\ImageLibray to list the existing hard links.

However, Matthias' current version 1.23 does not meet my requirements. It is also ASCII-only and fails on non-ASCII filenames, which are common nowadays.

Enhancements

I added the following features to finddupe:

  • multiple reference directories that shall not be touched (v1.24)
  • Unicode support (v1.25)
  • alert message if order of options is wrong (v1.26)
  • support for ignoring files by patterns (v1.26)
  • checking for NTFS file system in batch and hardlink mode (v1.27)
  • performance optimizations (especially for very large amounts of files) (v1.28)
  • new option to skip linked duplicates in output list (v1.30)
  • 64-bit version for addressing more memory (for large amounts of files) (v1.33)

It works for me, but some more testing is desirable.

I've updated the project to use Visual Studio 2019.

Usage

finddupe v1.32 compiled Jan 27 2024
an enhanced version by thomas694 (@GH), originally by Matthias Wandel
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you
are welcome to redistribute it under certain conditions; view GNU GPLv3 for more.
Usage: finddupe [options] [-ign <substr> ...] [-ref <filepat> ...] <filepat>...
Options:
 -bat <file.bat> Create batch file with commands to do the hard
                 linking. run batch file afterwards to do it
 -hardlink       Create hardlinks. Works on NTFS file systems only.
                 Use with caution!
 -del            Delete duplicate files
 -v              Verbose
 -sigs           Show signatures calculated based on first 32k for each file
 -rdonly         Apply to readonly files also (as opposed to skipping them)
 -z              Do not skip zero length files (zero length files are ignored
                 by default)
 -u              Do not print a warning for files that cannot be read
 -sl             Skip linked duplicates and show only unlinked ones
 -p              Hide progress indicator (useful when redirecting to a file)
 -j              Follow NTFS junctions and reparse points (off by default)
 -listlink       hardlink list mode. Not valid with -del, -bat, -hardlink,
                 or -rdonly, options
 -ign <substr>   Ignore file pattern, eg. .bak or .tmp (repeatable)
 -ref <filepat>  Following file pattern are files that are for reference, NOT to
                 be eliminated, only used to check duplicates against (repeatable)
 filepat         Pattern for files. Examples:
                  c:\**        Match everything on drive C
                  c:\**\*.jpg  Match only .jpg files on drive C
                  **\foo\**    Match any path with component foo
                               from current directory down
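
To illustrate what replacing a duplicate with a hard link amounts to (the job of -hardlink, or of the batch file written by -bat), here is a minimal Python sketch. It is not how finddupe itself does it: the function names and the ".linktmp" temporary suffix are hypothetical, and the only library calls used are the standard os.link, os.replace and os.stat. Both paths must live on the same volume.

```python
import os


def replace_with_hardlink(original, duplicate):
    """Replace `duplicate` with a hard link to `original`.

    Afterwards both names refer to the same data on disk, so the
    space used by the duplicate copy is freed.
    """
    tmp = duplicate + ".linktmp"  # hypothetical temporary name
    os.link(original, tmp)        # create the new link first
    os.replace(tmp, duplicate)    # then swap it over the duplicate


def link_count(path):
    """Number of directory entries (hard links) pointing at the file's data."""
    return os.stat(path).st_nlink
```

Creating the link under a temporary name before swapping it in means the duplicate is never deleted unless the link could actually be created. A link count greater than 1 is also a simple way to spot files that are already hard-linked.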

Download:

Latest release can be found here.

Authors

finddupe by thomas694 is licensed under GNU GPLv3.
Based on a work at https://www.sentex.ca/~mwandel/finddupe/.
