Re: Any LPEG tutorial for laymen ?


Subject: Re: Any LPEG tutorial for laymen ?
From: Sean Conner <sean@...>
Date: 25 Sep 2013 01:28:58 -0400

It was thus said that the Great David Crayford once stated:
> On 24/09/2013 8:51 PM, Luiz Henrique de Figueiredo wrote:
> >>Take the following output from a netstat command.
> >>
> >>Client Name: SMTP Client Id: 000000B7
> >[...]
> >>I would love to learn how to write LPeg parser to yank the key->values
> >>from that multi-line report easily.
> >You don't need LPeg for this task. Try
> >	for k,v in T:gmatch("(%u[%w ]-):%s*(.-)%s") do print(k,v) end
> >where T contains the netstat output.
> 
> Thanks. This is how dumbstruck I am WRT pattern matching. I want to 
> parse the following piece of netstat output
> 
> SKRBKDC 00000099 UDP
> Local Socket: 172.17.69.30..464
> Foreign Socket: *..*
> 
> The top line is the user, connection id and state. All I want to do is 
> capture three whitespace-separated words.
> 
> In REXX I would do this:
> 
> parse var line userid connid state
> 
> What is the most succinct way of doing something similar in Lua?
 Using LPeg:
lpeg = require "lpeg"	-- load up the module
-- This defines whitespace. Here it's just a space (ASCII 32).
-- Alternatively, you can define it as:
--
-- SP = lpeg.S" \t"
--
-- Which defines whitespace as a set of characters (ASCII 32
-- and ASCII 9).
SP = lpeg.P" "
-- This defines a word: at least one character (lpeg.P(1)) that is NOT
-- whitespace (- SP). The "^1" is LPeg's repetition operator; here it
-- means "one or more". "lpeg.C()" is the capture function, and it is
-- what "captures" (or returns) the text we are interested in.
word = lpeg.C( (lpeg.P(1) - SP)^1 )
-- And our line, which is three space-separated words. To account for
-- multiple spaces, we use the repetition operator on the whitespace. The
-- first bit, "SP^0", means "zero or more whitespace characters at the
-- start of the line." The "*" here can be read as "and", so translated:
-- "optional whitespace, then a word, some space, a word, some space, and
-- a word."
line = SP^0 * word * SP^1 * word * SP^1 * word
-- That's it for the parsing. This function just takes a line of text, and
-- splits it into three separate words. Right now, we just print them one
-- to a line, but the code could return all three or do whatever.
function parse(text)
 local w1,w2,w3 = line:match(text)
 print(w1)
 print(w2)
 print(w3)
 print()
end
-- And some tests ... 
parse "SKRBKDC 00000099 UDP" 
parse " Local Socket: 172.17.69.30..464" 
parse " Foreign Socket: *..*"
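
 If you would rather get the words back than print them, here is a minimal
sketch of one way to do it. It assumes the lpeg.S" \t" definition of
whitespace mentioned in the comments above, and it uses lpeg.Ct() to collect
however many words the line holds into a table. The names "words" and
"split_words" are just made up for this example.

```lua
local lpeg = require "lpeg"

-- Whitespace is a space or a tab (the lpeg.S variant from above).
local SP   = lpeg.S" \t"

-- A word: one or more characters that are not whitespace, captured.
local word = lpeg.C((lpeg.P(1) - SP)^1)

-- Optional leading whitespace, one word, then zero or more
-- "whitespace + word" pairs. lpeg.Ct() gathers every capture
-- into a single table.
local words = lpeg.Ct(SP^0 * word * (SP^1 * word)^0)

-- Returns a table of words, or nil if the line holds no word at all.
function split_words(text)
  return words:match(text)
end

local t = split_words("SKRBKDC 00000099 UDP")
local userid, connid, state = t[1], t[2], t[3]
print(userid, connid, state)
```

 Since the pattern is not tied to exactly three words, the same function
also splits the "Local Socket:" and "Foreign Socket:" lines; the caller
decides how many fields to expect.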
 -spc
