Globbing Regex And Bash

Question 1

I can never work out whether I am supposed to be using globbing or regex with bash. My book on bash shell scripting is confusing precisely because it doesn't clear this up, so I never get my understanding right. To give an example, it states the following: "The . (dot) character means 'any single character.' Thus, a.c matches all of abc, aac, aqc, and so on."

OK, great. I'm thinking he's wrong because this is regex, but the first thing I do is test it anyway:

$ touch abc aac aqc
$ ls
aac abc aqc
$ ls a.c
ls: cannot access 'a.c': No such file or directory

I then go and google globbing, and come across this post called "globbing tutorial", and I'm thinking, right this is the one.

https://linuxhint.com/bash_globbing_tutorial/

I'm almost immediately thinking it's all wrong, because half of his "globbing" is done via grep, which uses BRE, and that isn't globbing. For example, he states:

"$ is used to define the ending character"

This is wrong, because that's the regex meaning, and it's not globbing. So I test it:

$ ls
aac abc aqc
$ ls c$
ls: cannot access 'c$': No such file or directory

So the number one hit on Google is wrong as well. There seems to be nothing, in books or online, that clarifies this topic, so I need some help pinning down the difference between regex and globbing with some certainty.

Comment 1

Yeah, that article appears to go completely off the rails partway through the Caret – (^) section. IMHO you'd be better off reading the Pattern Matching section of the bash manual.

Comment 2

As a general rule you never want the top hit, you want a respectable site. Preferably the manual, as steeldriver suggested, but certainly not whatever random website Google decides to show you. That said, does "What is the definition of a regular expression?" answer your question? This might be a duplicate, if it answers you to your satisfaction.

score 4 · Accepted Answer · 2022-12-15 19:01:59Z

The only place where bash uses regexps is with the =~ operator of its [[ ... ]] construct, and it's POSIX extended regexps in that case:

if [[ axb =~ ^a.b$ ]]; then
 echo 'axb matches the ^a.b$ ERE'
fi
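
As a small sketch of how =~ behaves in practice (the variable names name and re are made up for illustration): storing the regex in a variable and expanding it unquoted keeps it a regex, whereas quoting the right-hand side makes bash match it as a literal string. Parenthesised groups end up in the BASH_REMATCH array.

```shell
#!/usr/bin/env bash
# Illustrative only: 'name' and 're' are hypothetical variable names.
name='axb'
re='^a(.)b$'    # ERE: "a", any one character (captured), "b"

# Unquoted $re on the right of =~ is interpreted as a regex.
if [[ $name =~ $re ]]; then
  echo "regex match, middle character: ${BASH_REMATCH[1]}"
fi

# Quoted "$re" is matched literally, so "axb" does not match it.
if [[ $name =~ "$re" ]]; then
  echo "literal match"
else
  echo "no literal match"
fi
```

With the values above, the first test succeeds and the second one does not, because the string axb does not contain the literal text ^a(.)b$.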

Everywhere else:

case axb in (a?b) echo 'axb matches the a?b glob pattern'; esac
[[ axb = a?b ]] && echo 'axb matches the a?b glob pattern'
printf '%s\n' a?b    # actual globbing, aka filename generation, aka pathname expansion
printf '%s\n' "${var#a?b}" "${var%a?b}" "${var##a?b}" "${var%%a?b}" "${var/a?b/x}"
compgen -G 'a?b'     # same with complete
help 'r??d'

That's shell wildcards aka glob patterns aka filename / fnmatch patterns.
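
To tie this back to the files from the question, here is a minimal sketch, run in a throwaway directory created with mktemp, that shows why ls a.c failed while a?c would have worked. In a glob, ? plays the role that . plays in a regex, and a dot is just a dot.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch abc aac aqc

# Glob: ? stands for exactly one character, so a?c expands to all three files.
glob_hits=$(printf '%s ' a?c)

# a.c contains no wildcard at all, so the shell passes it through unchanged
# as the literal word "a.c" (and no file of that name exists).
dot_hits=$(printf '%s ' a.c)

echo "a?c -> $glob_hits"
echo "a.c -> $dot_hits"

cd / && rm -rf "$tmpdir"
```

Roughly speaking, the glob ? corresponds to the regex . and the glob * to the regex .*, and a glob always has to match the whole name, as if the regex were anchored with ^ and $.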

Run info bash pattern to learn about those in bash specifically; info -n conditional bash will get you to the Conditional Constructs section, inside which you'll find the description of [[ ... ]] and its =~ operator.

Other tools like grep, find, vim, perl, firefox can use either or both in varying contexts. Their documentation will tell you. Also beware, there are many flavours of both types of patterns. As a rule of thumb, glob patterns are typically used for matching filenames (like in shell globs or find's -name/-path) and regexp for arbitrary text matching.
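
As a hedged illustration of that rule of thumb, the sketch below selects the same three files twice: once with find's -name, which takes a glob, and once with grep -E, which takes an ERE and is fed the names as lines of text. The temporary directory and variable names are only there for the example.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch abc aac aqc other

# find -name takes a glob; quote it so the shell does not expand it first.
find_hits=$(find . -type f -name 'a?c' | sort | tr '\n' ' ')

# grep -E takes an ERE and matches text, here one filename per line.
grep_hits=$(printf '%s\n' * | grep -E '^a.c$' | tr '\n' ' ')

echo "glob via find: $find_hits"
echo "ERE via grep:  $grep_hits"

cd / && rm -rf "$tmpdir"
```

Note that the regex needs explicit ^ and $ anchors to match the whole name, while the glob given to -name is matched against the entire filename anyway.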

ksh93 is a shell that can use regexps (basic, extended, perl-like or augmented) in its globs:

$ printf '%s\n' ~(E:^a.b$)
a=b
axb

In zsh, you can use regexps (extended or pcre) in its globs via the e glob qualifier:

$ printf '%s\n' *(e['[[ $REPLY =~ "^a.b$" ]]'])
a=b
axb

$ zmodload zsh/pcre
$ printf '%s\n' *(e['[[ $REPLY -pcre-match "^a.b\z" ]]'])
a=b
axb

(Here \z is the PCRE equivalent of the ERE $, because in PCRE $ matches at the end of the subject but also before a newline at the end of the subject.)

If you set the rematchpcre option in zsh (set -o rematchpcre), [[ =~ ]] uses PCRE instead of ERE there.
