9.5 Quantifiers

The quantifiers *, +, and ? match, respectively, zero or more, one or more, and zero or one instances of the preceding subpattern.

> (regexp-match-positions #rx"c[ad]*r" "cadaddadddr")

'((0 . 11))

> (regexp-match-positions #rx"c[ad]*r" "cr")

'((0 . 2))

> (regexp-match-positions #rx"c[ad]+r" "cadaddadddr")

'((0 . 11))

> (regexp-match-positions #rx"c[ad]+r" "cr")

#f

> (regexp-match-positions #rx"c[ad]?r" "cadaddadddr")

#f

> (regexp-match-positions #rx"c[ad]?r" "cr")

'((0 . 2))

> (regexp-match-positions #rx"c[ad]?r" "car")

'((0 . 3))

In #px syntax, you can use braces to specify much finer-tuned quantification than is possible with *, +, ?:

  • The quantifier {m} matches exactly m instances of the preceding subpattern; m must be a nonnegative integer.

  • The quantifier {m,n} matches at least m and at most n instances. m and n are nonnegative integers with m less than or equal to n. You may omit either or both numbers, in which case m defaults to 0 and n to infinity.

It is evident that + and ? are abbreviations for {1,} and {0,1} respectively, and * abbreviates {,}, which is the same as {0,}.
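The equivalences between the shorthand quantifiers and their brace forms can be checked directly. The following is a minimal sketch, not a proof: same-match? is a hypothetical helper introduced here for illustration, and the sample inputs are arbitrary choices.

```racket
#lang racket/base

;; same-match? is a hypothetical helper (not part of Racket): it reports
;; whether two patterns produce the same result on a given string.
(define (same-match? px1 px2 str)
  (equal? (regexp-match px1 str) (regexp-match px2 str)))

;; Sample inputs chosen for illustration only.
(define inputs '("cr" "car" "cadaddadddr" "xyz"))

;; + compared with {1,}
(define plus-same?
  (for/and ([s (in-list inputs)])
    (same-match? #px"c[ad]+r" #px"c[ad]{1,}r" s)))

;; ? compared with {0,1}
(define opt-same?
  (for/and ([s (in-list inputs)])
    (same-match? #px"c[ad]?r" #px"c[ad]{0,1}r" s)))

;; * compared with {0,}
(define star-same?
  (for/and ([s (in-list inputs)])
    (same-match? #px"c[ad]*r" #px"c[ad]{0,}r" s)))

(list plus-same? opt-same? star-same?)
```

Each flag should be #t for these inputs, since both patterns in a pair describe the same repetition range.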

> (regexp-match #px"[aeiou]{3}" "vacuous")

'("uou")

> (regexp-match #px"[aeiou]{3}" "evolve")

#f

> (regexp-match #px"[aeiou]{2,3}" "evolve")

#f

> (regexp-match #px"[aeiou]{2,3}" "zeugma")

'("eu")

The quantifiers described so far are all greedy: they match the maximal number of instances that would still lead to an overall match for the full pattern.

> (regexp-match #rx"<.*>" "<tag1> <tag2> <tag3>")

'("<tag1> <tag2> <tag3>")

To make these quantifiers non-greedy, append a ? to them. Non-greedy quantifiers match the minimal number of instances needed to ensure an overall match.

> (regexp-match #rx"<.*?>" "<tag1> <tag2> <tag3>")

'("<tag1>")

The non-greedy quantifiers are *?, +?, ??, {m}?, and {m,n}?, although {m}? is always the same as {m}. Note that the metacharacter ? has two different uses, and both uses are represented in ??.
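To see the remaining non-greedy forms side by side with their greedy counterparts, here is a small sketch. first-match is a hypothetical convenience wrapper introduced for this example, and the input strings are arbitrary.

```racket
#lang racket/base

;; first-match is a hypothetical convenience wrapper (not part of Racket):
;; it returns the matched string, or #f when there is no match.
(define (first-match px str)
  (define m (regexp-match px str))
  (and m (car m)))

(first-match #px"a+" "aaaa")       ; greedy: "aaaa"
(first-match #px"a+?" "aaaa")      ; non-greedy: "a"
(first-match #px"a?" "aaaa")       ; greedy: "a"
(first-match #px"a??" "aaaa")      ; non-greedy: ""
(first-match #px"a{2,3}" "aaaa")   ; greedy: "aaa"
(first-match #px"a{2,3}?" "aaaa")  ; non-greedy: "aa"
(first-match #px"a{2}?" "aaaa")    ; same as a{2}: "aa"
(first-match #px"a+?b" "aaab")     ; "aaab"
```

The last line illustrates that a non-greedy quantifier still takes as many instances as the overall match requires: a+? must consume all three a characters before b can match.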
