
Questions about Lpeg (semantics of captures)


A few questions about lpeg...
I still don't understand lpeg very well, and I have the (naive?)
impression that patterns-with-captures are implemented on top of
patterns-without-captures, in a way that even allows "projecting" a
pattern-with-captures onto the lower level by discarding all the
information about captures... Also, matching a pattern-with-captures
involves some backtracking, and some operations on the captures - like
"patt / function" - should only be performed after the (super)pattern
succeeds; so, at first lpeg.match keeps both the backtracking
information and the instructions for performing captures; at some point
the pattern is "closed", the backtracking information is dropped, and
the instructions for performing captures are executed...
Is that mental model correct? Is there a way to force a subpattern to
be closed, and its captures performed?
Now let me show why I stumbled on that question, and why I was
somewhat surprised when I discovered that the execution of the
function in "patt / function" is delayed.
I am trying to htmlize some files that have lots of "Elisp hyperlinks"
embedded in comments. For example, in
 # (info "(bash)Shell Parameter Expansion")
the "(info ...)" can be used as a hyperlink inside Emacs - executing
it as Lisp opens a page of the Bash manual. Not all sexps are
hyperlinks, and only a few of the sexps that work as hyperlinks inside
Emacs can be htmlized in meaningful ways. I have a table whose keys
are the symbols that can be heads of htmlizable hyperlink sexps, and I
was trying to build a pattern that would fail immediately when it
noticed that it was processing a sexp that is not htmlizable.
My first attempts to build patterns that would match only the "head
symbols" were more or less like this (I'm reconstructing that from
memory - it didn't work...):
 -- A symbol is one or more symbol characters (hence the ^1).
 SSymbol = (lpeg.R("AZ", "az", "09") + lpeg.S("-+_"))^1
 headsymbols = { ["info"]=true, ["man"]=true }
 setsymbol = function (str) symbol = str end
 isheadsymbol = function (subj, pos)
     return headsymbols[symbol] and pos
   end
 -- Intended: store the symbol first, then test it at match time.
 SHeadSymbol = (SSymbol / setsymbol) * lpeg.P(isheadsymbol)
but then I discovered that the "/ setsymbol" part was being
executed after the "lpeg.P(isheadsymbol)", not before, so "symbol"
was not yet set when isheadsymbol looked it up...
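
The ordering described above can be made visible with a small log. This is only a sketch: it assumes an LPeg build where lpeg.P(function) calls the function during matching, and the names "log", "record" and "check" are hypothetical, not part of the original code.

```lua
local lpeg = require("lpeg")

-- Hypothetical log of the order in which the two functions run.
local log = {}

local SSymbol = (lpeg.R("AZ", "az", "09") + lpeg.S("-+_"))^1

-- Function capture: runs only when the captures are evaluated.
local record = function (str)
  log[#log + 1] = "capture: " .. str
end

-- Match-time function: runs while the subject is being matched.
local check = function (subj, pos)
  log[#log + 1] = "check at " .. pos
  return pos          -- always succeed, consuming nothing
end

local patt = (SSymbol / record) * lpeg.P(check)
lpeg.match(patt, "info")

for i, entry in ipairs(log) do print(i, entry) end
```

With this pattern the "check" entry should be logged before the "capture" entry, which is the delay that made the first attempt fail.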
My current solution (which works!) is like this - again, I'm
reconstructing this from memory; the real implementation is more
complex:
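 -- A symbol is one or more symbol characters (hence the ^1).
 SSymbol = (lpeg.R("AZ", "az", "09") + lpeg.S("-+_"))^1
 headsymbols = { ["info"]=true, ["man"]=true }
 -- Runs at match time: remember where the symbol starts.
 setmark = function (subj, pos)
     mark = pos
     return pos
   end
 -- Runs at match time: cut the symbol out of the subject and look it up.
 isheadsymbol = function (subj, pos)
     local symbol = string.sub(subj, mark, pos - 1)
     return headsymbols[symbol] and pos
   end
 SHeadSymbol = lpeg.P(setmark) * SSymbol * lpeg.P(isheadsymbol)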
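
For comparison, here is a sketch of the same check written with a match-time capture. It assumes an LPeg version that provides lpeg.Cmt, which may not match the version used for the code above. lpeg.Cmt evaluates its nested captures immediately and passes them to the function, so no global "mark" is needed.

```lua
local lpeg = require("lpeg")

local SSymbol = (lpeg.R("AZ", "az", "09") + lpeg.S("-+_"))^1
local headsymbols = { ["info"]=true, ["man"]=true }

-- Called during matching with the subject, the position after the
-- symbol, and the text captured by lpeg.C(SSymbol).
local isheadsymbol = function (subj, pos, symbol)
  return headsymbols[symbol] and pos   -- nil makes the match fail
end

local SHeadSymbol = lpeg.Cmt(lpeg.C(SSymbol), isheadsymbol)

print(lpeg.match(SHeadSymbol, "info"))
print(lpeg.match(SHeadSymbol, "foo"))
```

The first call should succeed and the second should fail, because "foo" is not a key of headsymbols.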
Cheers, more later, thanks in advance, etc,
 Eduardo Ochs
 eduardoochs@gmail.com
 http://angg.twu.net/
