Re: Matching sequences of identical characters with lpeg.
- Subject: Re: Matching sequences of identical characters with lpeg.
- From: Sean Conner <sean@...>
- Date: 20 Nov 2018 20:04:15 -0500
It was thus said that the Great Gabriel Bertilson once stated:
> Here's a solution using Cmt. As Andrew pointed out, Cb just inserts a
> capture into the list of captures returned by the current pattern; it
> doesn't match anything.
>
>   Cg(1, "char") * Cmt(C(1) * Cb'char', function (_, _, char1, char2)
>     return char1 == char2 end)^0
>
> It matches one character, labels it as "char", then matches further
> characters if they are equal to "char". To use it on "aaabbbcccd" in
> the Lua interpreter (with LPeg functions available as variables):
>
> > patt = Cg(1, "char") * Cmt(C(1) * Cb'char', function (_, _, cur, prev) return cur == prev end)^0
> > (C(patt)^1):match "aaabbbcccd"
> aaa bbb ccc d
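
To run the quoted pattern as a script instead of at the prompt, something like the following should do. It is only a sketch: the `runs` name and the Ct wrapper are illustrative additions, not part of the quoted post. It also shows that the pattern compares single bytes, so a two-byte UTF-8 character such as "©" never forms a run:

```lua
local lpeg = require "lpeg"
local C, Cb, Cg, Cmt, Ct = lpeg.C, lpeg.Cb, lpeg.Cg, lpeg.Cmt, lpeg.Ct

-- One byte, remembered under the name "char", followed by any number
-- of further bytes that compare equal to it.
local run = Cg(1, "char")
          * Cmt(C(1) * Cb"char", function(_, _, cur, prev)
              return cur == prev
            end)^0

-- `runs` (hypothetical name): collect every run into a table.
local runs = Ct(C(run)^1)

print(table.concat(runs:match "aaabbbcccd", " "))  --> aaa bbb ccc d

-- "©©" is the four bytes \194\169\194\169; no two neighbours are
-- equal, so each byte ends up as its own run.
print(#(runs:match "\194\169\194\169"))            --> 4
```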
And it can be further extended to UTF-8:
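  local lpeg = require "lpeg"
  local C, Cb, Cg, Cmt, R = lpeg.C, lpeg.Cb, lpeg.Cg, lpeg.Cmt, lpeg.R

  -- one UTF-8 character: an ASCII byte, or a lead byte followed by
  -- one or more continuation bytes (not a strict validator)
  local char = R"\1\127"
             + R"\194\244" * R"\128\191"^1
  -- a character, then any number of repeats of that same character
  local seq  = Cg(char,'char')
             * Cmt(
                    C(char) * Cb'char',
                    function(_,_,cur,prev)
                      return cur == prev
                    end
                  )^0
  local patt = C(seq)^1

  print(patt:match "aaabbbcccd")
  print(patt:match "###aaaa©©©©####bbbbb####")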
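
The `char` pattern in this message accepts any number of continuation bytes after a lead byte, which is fine for well-formed input. If a tighter definition is wanted, one possible sketch ties the number of continuation bytes to the lead byte. The names `cont` and `runs` are illustrative additions, and this still does not reject overlong encodings or surrogates, so treat it as an approximation rather than a validator:

```lua
local lpeg = require "lpeg"
local C, Cb, Cg, Cmt, Ct, R = lpeg.C, lpeg.Cb, lpeg.Cg, lpeg.Cmt, lpeg.Ct, lpeg.R

-- A continuation byte, and a character whose length follows its lead byte.
local cont = R"\128\191"
local char = R"\1\127"                          -- 1 byte  (ASCII)
           + R"\194\223" * cont                 -- 2 bytes
           + R"\224\239" * cont * cont          -- 3 bytes
           + R"\240\244" * cont * cont * cont   -- 4 bytes

-- Same run logic as before, but comparing whole characters.
local seq = Cg(char, "char")
          * Cmt(C(char) * Cb"char", function(_, _, cur, prev)
              return cur == prev
            end)^0

-- `runs` (hypothetical name): collect every run into a table.
local runs = Ct(C(seq)^1)

-- Assumes this file is saved as UTF-8, so "©" is the bytes \194\169.
for _, run in ipairs(runs:match "###aaaa©©©©####bbbbb####") do
  print(#run, run)   -- byte length, then the run itself
end
```

With a UTF-8 source file this should print six runs; the "©©©©" run reports a length of 8 because `#` counts bytes, not characters.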
-spc