
There's a bug in my LPeg code, but I can't find it



 Usually, I'm the one to answer LPeg questions, but tonight I need some
help with LPeg, and I'm hoping someone might see something I'm missing. It
has to do with my URL parsing module [1]. The following code demonstrates the
bug:
url = require "org.conman.parsers.url.url"
lpeg = require "lpeg"
x = url * lpeg.Cp()
a,b = x:match "/status" print(b) -- prints 8, okay
a,b = x:match "/status/" print(b) -- prints 9, okay
a,b = x:match "/status " print(b) -- prints 8, okay
a,b = x:match "/status/ " print(b) -- prints 8, WAT?
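To read the numbers above: a trailing lpeg.Cp() captures the index just past the last character the pattern consumed, so 8 for "/status/ " means only "/status" was matched. A minimal sketch of that arithmetic in plain Lua (the helper name consumed is hypothetical, not part of LPeg or the url module):

```lua
-- Hypothetical helper: the text a pattern consumed, given the position
-- reported by a trailing lpeg.Cp() (the index just past the match).
local function consumed(subject, next_pos)
  return subject:sub(1, next_pos - 1)
end

print(consumed("/status/", 9))   --> /status/
print(consumed("/status/ ", 8))  --> /status
```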
The relevant part of the url grammar [2]:
path_absolute <- {| {:root: %istrue :} '/' (segment_nz ('/' segment)* )? |}
segment_nz <- {~ pchar+ ~}
segment <- ! . / {~ pchar+ ~} -- NOTE
pchar <- unreserved / pct_encoded / sub_delims / ':' / '@'
pct_encoded <- %pct_encoded
sub_delims <- '!' / '$' / '&' / "'" / '(' / ')'
 / '*' / '+' / ',' / ';' / '='
unreserved <- %ALPHA / %DIGIT / '-' / '.' / '_' / '~'
The 'segment' rule *should* be 
segment <- ! . / {~ pchar* ~}
 But fixing that doesn't resolve my current problem. Why is the
trailing slash, when followed by a space, not parsed as part of the URL? I
can work around the bug (for some use cases; see below for a possibly related
issue), but it's annoying me that I can't seem to locate the cause.
 Possibly related:
a,b = x:match "/status#a" print(b) -- prints 10, okay
a,b = x:match "/status/#a" print(b) -- prints 8, WAT?
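For reference, the grammar as transcribed above can be modelled in isolation. The following is a minimal sketch, not the author's module: it rebuilds only path_absolute with plain lpeg patterns, drops the captures, and uses simplified stand-ins for %ALPHA, %DIGIT and %pct_encoded (all assumptions). The helper path_absolute() and the names strict and relaxed are hypothetical. In this reduced model the pchar+ form of segment reproduces the reported 8 and the pchar* form yields 9. Since the post says that change does not help in the real module, something outside this fragment may also be involved.

```lua
local lpeg = require "lpeg"
local P, R, S, Cp = lpeg.P, lpeg.R, lpeg.S, lpeg.Cp

-- Simplified stand-ins for the RFC 3986 character classes (assumptions;
-- the real module takes these from %ALPHA, %DIGIT and %pct_encoded).
local unreserved  = R("az", "AZ", "09") + S "-._~"
local pct_encoded = P "%" * R("09", "AF", "af") * R("09", "AF", "af")
local sub_delims  = S "!$&'()*+,;="
local pchar       = unreserved + pct_encoded + sub_delims + S ":@"

-- Hypothetical helper: build path_absolute around a given 'segment' pattern.
local function path_absolute(segment)
  local segment_nz = pchar^1
  return P "/" * (segment_nz * (P "/" * segment)^0)^-1
end

-- segment <- ! . / pchar+   (as posted)
local strict  = path_absolute(-P(1) + pchar^1) * Cp()
-- segment <- ! . / pchar*   (the form the post says it should be)
local relaxed = path_absolute(-P(1) + pchar^0) * Cp()

print(strict:match  "/status/")    --> 9
print(strict:match  "/status/ ")   --> 8
print(strict:match  "/status/#a")  --> 8
print(relaxed:match "/status/ ")   --> 9
print(relaxed:match "/status/#a")  --> 9
```

In the strict form, once the second '/' is consumed, `! .` fails because a character remains, and pchar+ fails because neither a space nor '#' is a pchar. The whole ('/' segment) repetition then fails and the match backs up to just before that slash.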
 -spc (Puzzled by this ... )
[1]	Installable as
		luarocks install org.conman.parsers.url.url
	Also as part of
		https://github.com/spc476/LPeg-Parsers
	viewable at:
		https://github.com/spc476/LPeg-Parsers/blob/9fe3db4c0a52264f9e0e78200cc0f7dda0008f04/url/url.lua
[2]	The code is literally transcribed from RFC-3986.
