3

I have a directory with several thousand files. Many of them are images where the filename begins with the image's resolution. The height and width of the images can be two, three, four or five digits long. For example:

 - 986x1088_lalbslj.jpg
 - 2043x924_fjnndkk.jpg
 - 9560x12643_fjknfd.jpg
 - 24x24_dnjkan.jpg

I'd like to collect all these images in a new directory (say ./images/). Sounds simple enough.
The simple regex [0-9]+x[0-9]+.* matches these filenames, but as far as I've understood, you can't use regex with mv.
Ideally, I'd like something like this to work: mv [0-9]+x[0-9]+.* images, but of course, it doesn't.

It seems this question has been asked a lot before, and I've already looked at several dozen (literally!) similar threads on this and other StackExchanges, but sadly I haven't yet seen an answer that helps me. Most of the time, the accepted answers simply explain that mv uses globs and not regex, or suggest a corresponding glob that helps the author but doesn't work in my case. So I'm trying my luck by asking my own question; I hope that's OK.

I can't seem to condense my pattern into any reasonable glob, and regex doesn't work with mv, so what do I do?
Surely this must be possible somehow?!

Thank you very much for all your kind replies!

asked Jun 10, 2020 at 2:51
1
  • Pipe the file list into sed (good regex) to filter in your files and pipe that into xargs to move them. Something like ls | sed -n '/your_regex/p' | xargs -I{} mv {} ~/new_directory Commented Sep 1 at 14:23

6 Answers

4

You can use extended globs:

If the extglob shell option is enabled using the shopt builtin, several extended pattern matching operators are recognized. In the following description, a pattern-list is a list of one or more patterns separated by a ‘|’. Composite patterns may be formed using one or more of the following sub-patterns: [...]

+(pattern-list)
Matches one or more occurrences of the given patterns.

So:

shopt -s extglob
mv +([0-9])x+([0-9])* ./images
answered Jun 10, 2020 at 3:15
0
2

You can use the following loop to achieve your objective:

for i in *
do
 echo "$i" | grep -qE "^[[:digit:]]+x[[:digit:]]+.*" && mv "$i" images
done

You can change the regular expression to specify between 2 and 5 digits as follows:

for i in *
do
 echo "$i" | grep -qE "^[[:digit:]]{2,5}x[[:digit:]]{2,5}.*" && mv "$i" images
done
answered Jun 10, 2020 at 3:11
3
  • Note that it would move a $'foo\n10x10\nbar' file, and if there was a $'-t\n10x10' file and 10x10 directory, with GNU mv, it would move images into it and then rename each other matching file to images in turn, losing all but the last. Commented Sep 1 at 12:39
  • With UNIX compliant echos, it would also move a file called 12\0170234 Commented Sep 1 at 12:41
  • Note that "^[[:digit:]]{2,5}x[[:digit:]]{2,5}.*" also matches on 99x123456789, the upper bound in the second {2,5} is redundant as it's followed by .* which will match anything anyway. Same for + which is short for {1,} which can be removed. Commented Sep 1 at 12:43
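The problems raised in these comments come from passing names through echo and a line-based grep. A minimal sketch of the same loop that keeps the test inside the shell, using only POSIX parameter expansion and case, could look like the following. The scratch directory and the two non-matching names (notes.txt, 12xyz.txt) are assumptions made for the demo, not part of the original answer.

```shell
# Demo in a scratch directory; two names from the question plus two
# hypothetical names that should stay behind.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir images
touch 986x1088_lalbslj.jpg 24x24_dnjkan.jpg notes.txt 12xyz.txt

for f in ./*; do
    [ -f "$f" ] || continue                  # skip directories such as ./images
    name=${f#./}
    digits=${name%%[!0123456789]*}           # leading run of ASCII digits, possibly empty
    [ -n "$digits" ] || continue             # name does not start with a digit
    case ${name#"$digits"} in                # inspect what follows those digits
        x[0123456789]*) mv -- "$f" images/ ;;
    esac
done
```

Because no name is ever printed and re-read, newlines, backslashes and leading dashes in file names are not interpreted, and the ./ prefix plus -- keep mv from treating a name as an option.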
1

mv doesn't take globs or regexps as arguments, it only takes file paths.

Most shells however have a special feature called filename generation or globbing (or pathname expansion in POSIX terminology) whereby a pattern can be expanded into a list of matching file paths.

The term globbing actually comes from the /etc/glob helper program that the original Unix shell from the early 70s was calling to perform that feature.

The syntax of those patterns varies from shell to shell, but they are not regular expressions. However, several shells have extended the original wildcards, * (match any number of characters), ? (match a single character) and [...] (match any character in a set), to make them feature-compatible with regexps.

ksh in the late 80s added *(x), @(x|y), +(x) as equivalents of ERE x*, (x|y) and x+ and even !(x) which has no equivalent in EREs. Those were copied by bash in the late 90s (2.02) under an extglob option. EREs in the 80s didn't have x{2,3} (those were first added in BREs), {2,3}(x) was only added to ksh in ksh93, but not copied by bash.

zsh in the early 90s added x#, (x|y), x## as equivalents of x*, (x|y) and x+ (# and ## requiring the extendedglob option to be enabled) and many more, some of them added later, such as x(#c2,3) as an equivalent to ERE x{2,3}.

In particular and of interest here, it has a <x-y> to match on string representations of numbers from x to y, so <-> can be used in place of [0-9]## (equivalent to ERE [0-9]+).

Besides {2,3}(x) ksh93 added many more matching operators including the ability to actually use regular expressions (basic, extended, perl-like or augmented).

For instance, here, you can do:

mv ~(E:[0-9]+x[0-9]+.*) images/

To move the files whose name matches the [0-9]+x[0-9]+.* ERE (which btw is equivalent to [0-9]+x[0-9].*). You can even replace [0-9] with perl-style \d there as an extension over the standard ERE operators.

In ksh88 or bash -O extglob (or zsh --emulate ksh):

mv +([0-9])x[0-9]* images/

Though in zsh, you'd just write:

mv <->x<->* images/

Now beware that in bash or ksh (not zsh unless in sh/ksh emulation), if the pattern does not match any file, the pattern is passed as-is to mv (a misfeature inherited from the Bourne shell from the late 70s, earlier Unix shells were not affected) so mv could end up moving a file called literally +([0-9])x[0-9]*. In bash, that can be avoided by enabling the failglob option.
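As an alternative to failglob, the no-match case can be handled explicitly in bash by expanding the pattern into an array under nullglob and only calling mv when the array is non-empty. This is a sketch under assumptions: it needs bash (not plain sh), the function name move_matching is made up for illustration, and the scratch directory exists only for the demo.

```shell
#!/usr/bin/env bash
# Scratch directory with one matching and one non-matching file.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir images
touch 24x24_dnjkan.jpg notes.txt

# extglob must be enabled before the lines using +(...) are parsed.
# nullglob makes an unmatched pattern expand to nothing instead of itself.
shopt -s extglob nullglob

move_matching() {    # hypothetical helper name
    local files=( +([0123456789])x[0123456789]* )
    if (( ${#files[@]} == 0 )); then
        echo "no matching files, nothing moved"
        return 0
    fi
    mv -- "${files[@]}" images/
}

move_matching    # moves 24x24_dnjkan.jpg into images/
move_matching    # nothing left to match, so mv is never called
```

The explicit [0123456789] set is used instead of [0-9] so that only the ten ASCII digits are matched.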

Also beware that in bash and ksh (not zsh), [0-9] matches hundreds more characters than just 0123456789 (that's also the case in some regexp engines). So in those shells, you'd likely want to use [0123456789] instead. An alternative would be to use [[:digit:]] (of which ksh93's \d is an alias) which matches on decimal digits only (though depending on the shell, locale and system, including other representations of those in other languages).

Zsh globs can also do regexp matching by way of its e[code] glob qualifier with something like:

mv *(e['[[ $REPLY =~ ^[0-9]+x[0-9] ]]']) images

After set -o rematchpcre, you can use perl-compatible regexps, so:

mv *(e['[[ $REPLY =~ "^\d+x\d" ]]']) images

Beside shells with their globs, some tools can also generate lists of files based on patterns and pass them as separate arguments to commands.

find is one of them with its -name predicate which takes a basic glob pattern. Some find implementations also take a -regex predicate to match with regexps, but with some important caveats:

  • it matches on the file path (whole name in GNU terminology) like -path, not name.
  • the default regex flavour varies with the implementation (BRE in BSD find, ancient form of emacs regexp in GNU find).
  • the syntax to switch regex flavour varies with the implementation (-regextype predicate in GNU find, -E option in BSD find to switch to ERE).

With GNU find (and GNU mv for its -t):

find . -regextype posix-extended ! -name . -prune \
 -regex '\./[0-9]+x[0-9].*' -exec mv -t images {} +

With BSD find (and BSD xargs for its -J):

find -E . ! -name . -prune -regex '\./[0-9]+x[0-9].*' -print0 |
 xargs -r0J@ mv @ images/

The canonical tool to do regexp matching is grep, however grep works on lines of text and of course file names can be made of any number of lines since the newline character is as valid as any in a file name (file names don't even have to be made of text).

The GNU implementation of grep however has a -z option to work on NUL-delimited records (NUL being the only character, or more to the point 0 being the only byte that cannot occur in a file path) instead of lines and recent versions of the GNU implementation of ls have a --zero to list files in that format, so on recent GNU systems, you can do something like:

ls --zero | grep -zxE '[0-9]+x[0-9].*' | xargs -r0 mv -t images
answered Sep 1 at 6:52
-1

Safely mving many (thousands, millions) files, which may have funny characters in their filenames, calls for find, xargs, and the --target-directory, -t option to mv.

find . -maxdepth 0 -type f -name '[0-9]+x[0-9]+' -print0 | 
 xargs -0 -r | 
 mv -t ./images
terdon
answered Aug 31 at 17:27
5
  • -name takes a basic glob pattern as argument, not a regexp. Some find implementations have a -regex predicate, but it matches on the full path, not just the name, and the default regexp flavour and how to switch to another vary with the implementation. Commented Aug 31 at 17:41
  • -maxdepth 0 means it will only work on . itself, which obviously does not match the pattern. Commented Aug 31 at 17:41
  • xargs -0 -r without argument will call echo. Commented Aug 31 at 17:42
  • -t / --target-directory is a GNU extension. Commented Aug 31 at 17:42
  • No point in using -print0 | xargs -r0 when you can use -exec. Commented Aug 31 at 17:43
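Taking the comments above into account, a corrected version of this answer's command might look like the sketch below. It assumes GNU find (for -regextype) and GNU mv (for -t); the scratch directory, notes.txt and the nested sub/10x10_nested.jpg file are made up for the demo.

```shell
# Scratch directory: one matching file, one non-matching file, and a
# matching file in a subdirectory that should be left alone.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir images sub
touch 2043x924_fjnndkk.jpg notes.txt sub/10x10_nested.jpg

# -maxdepth 1 examines the entries of . without descending further.
# -regex matches the whole path, hence the leading \./ in the pattern.
# -exec ... {} + passes many names per mv call, so no xargs is needed.
find . -maxdepth 1 -type f -regextype posix-extended \
    -regex '\./[0-9]+x[0-9].*' -exec mv -t ./images {} +
```

Since every path handed to mv starts with ./, names beginning with a dash are not mistaken for options.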
-2

mv $(ls | grep YOUR_REGEX) destination_directory/

should do the trick

Cheers

answered Mar 9, 2024 at 18:26
4
  • 2
    Not if any of the files in the current directory contain whitespace characters in their name. Commented Mar 9, 2024 at 19:21
  • Names beginning with - could also be a problem. Commented Mar 9, 2024 at 19:56
  • I think your regex should handle that and you can verify if the ls | grep REGEX prints the contents you wish to perform the operation (in this case mv) on. Commented Mar 10, 2024 at 20:50
  • Not if there are MANY long filenames. DATA LOSS WARNING! Use mv -t . Commented Aug 31 at 17:09
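For comparison, here is a sketch of the same list, filter, move idea that avoids the word splitting and argument-limit problems raised in these comments by using NUL-delimited records. It assumes GNU grep (-z), GNU xargs (-r, -0) and GNU mv (-t); the scratch directory and the sample names are made up for the demo.

```shell
# Scratch directory with a name containing a space, a plain matching
# name, and a file that should not be moved.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir images
touch '24x24 with space.jpg' 9560x12643_fjknfd.jpg notes.txt

# printf writes each name followed by a NUL byte; grep -z and xargs -0
# keep that delimiter, so a name containing whitespace stays one argument.
# -x makes the regex match the whole name; mv -t puts the destination
# first, so xargs can split a long list across several mv calls safely.
printf '%s\0' * |
    grep -zxE '[0-9]+x[0-9].*' |
    xargs -r0 mv -t images/ --
```

If nothing in the directory matches, grep outputs nothing and xargs -r does not run mv at all.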
-3

You can use find with xargs:

find {FROM_DIRECTORY} -regextype sed -regex '{YOUR_REGEX}' | xargs -I {} mv {} ./images/
answered Aug 30 at 21:01
