Re: LPEG: captures


On Thu, May 28, 2015 at 12:52 AM, Alexander Mashin
<alex.mashin@gmail.com> wrote:
> This is the input:
> Perhaps, [[Peter|Simon]], or [[Paul]], so they say (see [[:Apocrypha]])
>
> This is the desired output:
>
> table {
>   full = Perhaps, [[Name::Peter|Simon]], or [[Name::Paul]], so they say (see [[:Apocrypha]])
>   items = table {
>     1 = Peter,
>     2 = Paul
>   }
>   separator = ,
> }

That output would be very tricky to produce directly. "Peter" and "Paul"
would each have to be captured inside two different named captures (full
and items). Additionally, Peter and Paul would have to be appended to the
same named capture (items) even though they occur at different locations
in the input.

Will the following work for you?

-- Tokenize the input into a flat list of link / separator / text tables.
-- Each token table carries its kind in the named field `t`.
local grammar = [==[
  wikitext  <- {| ( link / separator / text )* |}
  link      <- {|
                 {:t:''->'link':}
                 {'[['}
                 !':'
                 ''->'Name::'
                 { ( !']]' !'|' . )+ }
                 { '|' ( !']]' . )* / }
                 {']]'}
               |}
  separator <- {|
                 {:t:''->'separator':}
                 { [,;*#] } { %s* }
               |}
  text      <- {|
                 {:t:''->'text':}
                 { ( !link !separator . )+ }
               |}
]==]

local s = 'Perhaps, [[Peter|Simon]], or [[Paul]], so they say (see [[:Apocrypha]])'
local parser = require ( 're' ).compile ( grammar )
local t = parser : match ( s )

print ( s )
print ()
-- ipairs visits the tokens in input order
for k, v in ipairs ( t ) do
  print ( string.format ( '%d %-20s %s', k, v.t, table.concat ( v ) ) )
end
print ()
-- For a link, v[3] is the target name and v[4] the optional '|label' part
for k, v in ipairs ( t ) do
  if v.t == 'link' then
    print ( string.format ( '%d %s %d %-5s %s',
      k, 'link', #v, v[3], v[4] ) )
  end
end
---
The above will output:

Perhaps, [[Peter|Simon]], or [[Paul]], so they say (see [[:Apocrypha]])

1 text                 Perhaps
2 separator            ,
3 link                 [[Name::Peter|Simon]]
4 separator            ,
5 text                 or
6 link                 [[Name::Paul]]
7 separator            ,
8 text                 so they say (see [[:Apocrypha]])

3 link 5 Peter |Simon
6 link 5 Paul
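
If the full / items / separator table from the original question is still wanted, one option is to fold the token list in plain Lua after parsing. The following is only a sketch: fold_tokens is a hypothetical helper name, the sample table is hand-built to mirror the token shapes printed above so that it runs without LPeg, and taking the first separator found as "the" separator is an assumption.

```lua
-- Hypothetical helper (not part of LPeg or re): fold a list of tokens
-- shaped like the parser's result into { full, items, separator }.
local function fold_tokens ( tokens )
  local parts, items, separator = {}, {}, nil
  for _, tok in ipairs ( tokens ) do
    -- every token contributes its concatenated captures to `full`
    parts[#parts + 1] = table.concat ( tok )
    if tok.t == 'link' then
      -- the third capture of a link is the target name
      items[#items + 1] = tok[3]
    elseif tok.t == 'separator' and not separator then
      -- assumption: the first separator character seen is the one wanted
      separator = tok[1]
    end
  end
  return { full = table.concat ( parts ), items = items, separator = separator }
end

-- Hand-built sample mirroring the token shapes shown in the output above
local sample = {
  { t = 'text', 'Perhaps' },
  { t = 'separator', ',', ' ' },
  { t = 'link', '[[', 'Name::', 'Peter', '|Simon', ']]' },
  { t = 'separator', ',', ' ' },
  { t = 'text', 'or ' },
  { t = 'link', '[[', 'Name::', 'Paul', '', ']]' },
  { t = 'separator', ',', ' ' },
  { t = 'text', 'so they say (see [[:Apocrypha]])' },
}

local result = fold_tokens ( sample )
print ( result.full )
print ( table.concat ( result.items, ', ' ) )
print ( result.separator )
```

With LPeg installed, passing the parser's result instead of the sample, as in fold_tokens ( parser : match ( s ) ), should build the same table.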