Bash 5 bug in Cygwin only: empty pattern inside a regular expression
Corinna Vinschen
corinna-cygwin@cygwin.com
Mon Feb 24 17:05:54 GMT 2025
On Feb 24 17:01, LLoyd via Cygwin wrote:
> Hello.
>
> I'll try to keep this short:
> In Cygwin only, using bash 5.2.21-1 or 5.2.15-3 (the only 5.* versions
> available), "empty" in a regular expression is not properly matched
> and breaks the regular expression.
> It's not a quoting issue, I also tested with:
> reg='foo|'; [[ foo =~ $reg ]]
>
> GNU bash, version 5.2.15(3)-release (x86_64-pc-cygwin)
> GNU bash, version 5.2.21(1)-release (x86_64-pc-cygwin)
> [[ foo =~ foo| ]] (is false, should be true)
> [[ foo =~ foo|a ]] (is true)
> [[ '' =~ foo| ]] (is false, should be true)
>
> GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
> GNU bash, version 5.2.21(1)-release (x86_64-pc-linux-gnu)
> [[ foo =~ foo| ]] (is true)
> [[ foo =~ foo|a ]] (is true)
> [[ '' =~ foo| ]] (is true)
This isn't actually a bug in Cygwin's bash, but a characteristic of the
underlying FreeBSD-based regex library. Per POSIX, empty branches in a
regular expression are undefined. Quoting from the Open Group:
  The <vertical-line> is special except when used in a bracket
  expression (see 9.3.5 RE Bracket Expression). A <vertical-line>
  appearing first or last in an ERE, or immediately following a
  <vertical-line> or a <left-parenthesis>, or immediately preceding a
  <right-parenthesis>, produces undefined results.
(https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_04_03)
Funnily enough, even `man 7 regex' contains a description along the lines
of POSIX:
  A (modern) RE is one(!) or more nonempty(!) branches, separated by
  '|'. It matches anything that matches one of the branches.
(https://man7.org/linux/man-pages/man7/regex.7.html)
Given that, empty branches are a glibc extension, which other regex
libraries do not necessarily support.
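If the intent is "this branch or nothing", one way to stay within what
POSIX defines is to make the branch optional instead of adding an empty
alternative. The following is a minimal sketch, not a drop-in fix:
full_match_or_empty is a hypothetical helper name, and the anchoring
with ^ and $ is an assumption about what the caller wants (an unanchored
'foo|' would match any string at all on libraries that accept it).

```shell
#!/usr/bin/env bash
# full_match_or_empty TEXT BRANCH (hypothetical helper): succeed if TEXT
# is empty or is matched in full by the ERE BRANCH.
# "^(BRANCH)?$" expresses "BRANCH or nothing" without an empty branch.
full_match_or_empty() {
    local text=$1 branch=$2
    local re="^(${branch})?\$"
    [[ $text =~ $re ]]
}

full_match_or_empty foo foo && echo "foo: match"
full_match_or_empty ''  foo && echo "empty: match"
full_match_or_empty bar foo || echo "bar: no match"
```

Keeping the pattern in a variable and expanding it unquoted on the right
of =~ keeps the shell's own quoting rules out of the picture.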
Corinna