Re: Yieldable/streaming LPEG



It was thus said that the Great Paul K once stated:
> 
> Are there any other options for parsing incomplete strings using LPEG?
 I don't think LPeg contains much state itself (seeing how it's composable)
so there isn't much to save---it either matches a pattern, or it doesn't.
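To illustrate the "matches or doesn't" point, here is a small sketch. The pattern and the sample strings are made up for the illustration; the only LPeg calls used are lpeg.P, lpeg.C, lpeg.Cp and match.

```lua
local lpeg = require "lpeg"

-- a line is any run of non-newline characters followed by '\n';
-- Cp() captures the position just past the newline
local line = lpeg.C((lpeg.P(1) - lpeg.P"\n")^0) * lpeg.P"\n" * lpeg.Cp()

-- incomplete input: no '\n' yet, so the whole match fails with nil
local partial = line:match("hel")

-- one complete line is available, followed by the start of another
local text, pos = line:match("hello\nwor")

print(partial)     -- nil
print(text, pos)   -- the captured line, then the position to resume from (7)
```

Nothing is left behind inside LPeg after a failed match, so the only state the caller has to carry is the buffer and the resume position.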
 But I think that if you can break your parsing up into natural "units",
you should be able to parse a bit and if you fail to parse, add more data
until you can. As a proof-of-concept:
==[ proof-of-concept.lua ]=======================
local BUF = 2 -- we'll be reading in this many characters at a time
local lpeg = require "lpeg"

-- ************************************************************************
-- parse a line of text. If a line doesn't end with a '\n' then the match
-- fails. Return the line of text and position at the end of the line, such
-- that we can resume parsing there if need be.
-- ************************************************************************

local line = lpeg.C((lpeg.P(1) - lpeg.P"\n")^0)
           * lpeg.P"\n"
           * lpeg.Cp()

-- ************************************************************************
-- Iterator to get lines from a file.
-- ************************************************************************

local function getlines(file)
  local function getnext(state,var)
    -- -------------------------------
    -- attempt to get a line of data
    -- -------------------------------

    local l,pos = line:match(state.text,state.pos)

    -- -------------------------------------------------------------------
    -- if we get nil, or the new position is the same as the old position,
    -- then we've exhausted our buffer of data. Attempt to read more from
    -- our stream (in this case, a file)
    -- -------------------------------------------------------------------

    if l == nil or pos == state.pos then
      local data = state.file:read(BUF)

      -- ----------------------------------------------------------------
      -- check for end of our stream. If so, and there is still unparsed
      -- data left in the buffer, then we've encountered a partial,
      -- non-'\n' terminated line, so signal an error. Otherwise, we've
      -- successfully reached the end of the input.
      -- ----------------------------------------------------------------

      if not data or data == "" then
        if state.pos <= #state.text then
          error("bad token on line " .. tostring(state.line))
        end
        return nil
      end

      -- ------------------------------------------------------------------
      -- We've read more data from the stream. Discard what we've parsed so
      -- far from our buffered data, keep what we haven't parsed and append
      -- the new data, and try our parse again.
      -- ------------------------------------------------------------------

      state.text = state.text:sub(state.pos,-1) .. data
      state.pos  = 1
      return getnext(state,var)
    end

    -- --------------------------------------------------------------
    -- okay, update our state, and return the data we've just parsed.
    -- --------------------------------------------------------------

    state.line = state.line + 1
    state.pos  = pos
    return l
  end

  return getnext,{                -- our iterator function
    file = file,                  -- stream we're reading from
    text = file:read(BUF) or "",  -- buffered data ("" for an empty file)
    pos  = 1,                     -- starting position in buffer
    line = 1                      -- line (for error reporting)
  }
end

local f = assert(io.open(arg[1],"r"))
for line in getlines(f) do
  print(line)
end
f:close()
 
==[ END OF LINE ]========================================
 So that's one way of doing it. It isn't that pretty, but it works. 
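
If the data arrives in chunks from somewhere other than a file (a socket callback, say), the same idea can be turned around so the caller pushes data in. This is only a sketch under that assumption: new_feeder, feed and pending are names made up here, not anything LPeg provides, and the chunks in the usage loop are invented sample data.

```lua
local lpeg = require "lpeg"

local line = lpeg.C((lpeg.P(1) - lpeg.P"\n")^0) * lpeg.P"\n" * lpeg.Cp()

-- new_feeder() is a hypothetical helper, not part of LPeg
local function new_feeder()
  local buffer = ""   -- unparsed data carried over between calls

  -- append a chunk and return a list of every complete line now available
  local function feed(chunk)
    buffer = buffer .. chunk
    local lines, pos = {}, 1
    while true do
      local l, nextpos = line:match(buffer, pos)
      if not l then break end
      lines[#lines + 1] = l
      pos = nextpos
    end
    buffer = buffer:sub(pos)   -- keep only the unparsed tail
    return lines
  end

  -- at end of stream, anything still pending is an unterminated line
  local function pending()
    return buffer
  end

  return feed, pending
end

local feed, pending = new_feeder()
for _, chunk in ipairs { "fi", "rst\nsec", "ond\nthi" } do
  for _, l in ipairs(feed(chunk)) do
    print(l)
  end
end
print("leftover: " .. pending())
```

The caller decides what a non-empty leftover means once the stream closes; in the file version above it is treated as an error.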
 
 -spc
