
Re: LPEG 're' module self-test fails



Following up (again) on my message about the re module self-test failure: I think I have worked it out (it took a few days).
The self-test fails to parse any line containing "<-", which led me to ask why the test for "name S !arrow" failed.
The relevant part of the grammar is here:
 pattern <- exp !.
 exp <- S (alternative / grammar)
 alternative <- seq ('/' S seq)*
 seq <- prefix*
 ...
 grammar <- definition+
An "exp" is either an "alternative" or a "grammar".
Assuming the alternative doesn't use the "/" symbol, we effectively have this:
 pattern <- S (prefix* / definition+) !.
-----
We can make up a similar test case:
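 local re = require "re"  -- bind the module locally; newer LPeg releases do not set a global "re"
 local target = "foo"
 local grammar = " ('foo'* / 'bar'+) !."
 -- re.match returns the position just past the match, or nil on failure
 print (re.match (target, grammar))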
That matches a target of "foo" but not "bar". Why? Because even zero instances of "foo" are acceptable as a match, so the first alternative always succeeds and the "'bar'+" alternative is never tried. Likewise, in the real grammar "alternative" can match the empty string. A line like this still matches "alternative" (without consuming any characters):
 pattern <- exp !.
Now the final test, "!." (the check that we are at end-of-subject), fails, because the whole line is still unconsumed.
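
To see the ordered-choice behaviour in isolation, here is a small sketch that runs the same toy pattern against three targets. It assumes LPeg and its "re" module are installed; the variable names are only for illustration.

```lua
local re = require "re"

-- Same toy pattern as above: the zero-or-more branch comes first.
local pattern = " ('foo'* / 'bar'+) !."

-- re.match returns the position just past the match, or nil on failure.
local on_foo   = re.match("foo", pattern)
local on_bar   = re.match("bar", pattern)
local on_empty = re.match("", pattern)

print(on_foo)    --> 4   ('foo'* consumed everything, so !. succeeds)
print(on_bar)    --> nil ('foo'* matched zero times, then !. saw "b")
print(on_empty)  --> 1   (empty match, already at end-of-subject)
```

The "bar" case is the interesting one: once the first branch has succeeded (even on nothing), the choice does not fall back to the second branch when the later "!." fails.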
However, putting "grammar" first (i.e. "(grammar / alternative)" rather than "(alternative / grammar)") makes this work, because "grammar" matches ONE or more definitions (not zero or more) and so fails on a non-grammar line, letting the PEG try the "alternative" route.
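
The same reordering can be tried on the toy pattern. This is only a sketch of the idea on the made-up 'foo'/'bar' example, not the actual change to the re grammar, and it assumes LPeg's "re" module is available.

```lua
local re = require "re"

-- Branch that must consume something goes first, the can-be-empty branch last.
local reordered = " ('bar'+ / 'foo'*) !."

local r_foo   = re.match("foo", reordered)
local r_bar   = re.match("bar", reordered)
local r_empty = re.match("", reordered)

print(r_foo)    --> 4  ('bar'+ fails, so 'foo'* is tried and consumes "foo")
print(r_bar)    --> 4  ('bar'+ succeeds directly)
print(r_empty)  --> 1  ('bar'+ fails, 'foo'* matches zero times)
```

With this ordering both targets match, and the empty subject is still accepted through the second branch.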
You could work around it as well by insisting that "seq" matches at least something:
 seq <- prefix+
However, that change rejects a totally empty grammar.
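
The trade-off of that workaround shows up on the toy pattern too. In this sketch (again assuming LPeg's "re" module, with 'foo'+ standing in for "prefix+"), both branches must consume something, so the empty subject no longer matches.

```lua
local re = require "re"

-- Analogue of "seq <- prefix+": neither branch may match the empty string.
local strict = " ('foo'+ / 'bar'+) !."

local s_foo   = re.match("foo", strict)
local s_bar   = re.match("bar", strict)
local s_empty = re.match("", strict)

print(s_foo)    --> 4
print(s_bar)    --> 4
print(s_empty)  --> nil (nothing to consume, so both branches fail)
```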
Reference: http://www.inf.puc-rio.br/~roberto/lpeg/re.html
