Re: tables as parsers


Subject: Re: tables as parsers
From: "Soni \"They/Them\" L." <fakedme@...>
Date: 2019-03-27 23:06:05 -0300

On 2019-03-27 10:10 p.m., Coda Highland wrote:

On Wed, Mar 27, 2019 at 6:47 PM Soni "They/Them" L. <fakedme@gmail.com> wrote:
 I'm rolling with this weird idea. is it possible to make a simple parser
 out of tables? I tried some stuff but ran into a few issues. I don't
 wanna use LPeg or anything more complicated than
 string.find(foo, bar, 1, true).
 e.g.:
 local ltk = {} -- lua_tokens
 ltk.string = {}
 ltk['"'] = "string"
 ltk["'"] = "string" -- how to handle end of string correctly?
 ltk.string['\\'] = "escape"
 ltk.string.escape = {}
 ltk.string.escape['z'] = {}
 ltk.string.escape['z'].next = ltk.string.escape['z']
 for i, v in ipairs({" ", "\t", "\n"}) do -- etc.: plus the other whitespace characters
    ltk.string.escape['z'][v] = "next"
 end
 -- not sure if there'd be a better way to do this
 ltk.string.escape['z'][''] = ltk.string
 -- etc
Not a bad start for someone who's never done this formally before.
There are a zillion different ways to define a parser. The Dragon Book recommendation is well-given.
I think the insight you're missing is that the table really represents the structure of a state machine. Your top-level table shouldn't contain things like ["'"] = "string" directly; rather, you'd want something like ltk.base["'"]. Then you wouldn't have ltk.string.escape, you'd just have ltk.string and ltk.escape. The nesting doesn't belong in the table itself; instead, your parser will maintain a stack of states, and you push a state when you start into a parsing rule and you pop it off when you've finished that rule.
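
For illustration, the flat-tables-plus-stack layout described above might look roughly like the sketch below. The names states, run, push, pop and default are hypothetical and not from the thread; the only point is that the nesting lives in the runtime stack rather than in the tables.

```lua
-- Hypothetical flat state tables: each maps a character to an action.
-- None of these names come from the original post.
local states = {
  base   = { ['"'] = { push = "string" } },
  string = { ['"'] = { pop = true }, ['\\'] = { push = "escape" } },
  escape = { default = { pop = true } }, -- any single character ends the escape
}

-- Feed the input one character at a time, keeping a stack of state names.
-- Returns the state on top of the stack after each character, plus the stack.
local function run(input)
  local stack = { "base" }
  local trace = {}
  for i = 1, #input do
    local c = input:sub(i, i)
    local state = states[stack[#stack]]
    local action = state[c] or state.default
    if action then
      if action.push then
        stack[#stack + 1] = action.push -- entering a rule
      elseif action.pop then
        stack[#stack] = nil             -- finished the current rule
      end
    end
    trace[#trace + 1] = stack[#stack]
  end
  return trace, stack
end
```

Running it on a"b\"c"d visits base, string, string, escape, string, string, base, base, and the stack ends up holding only "base" again.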

Yeah but then I wouldn't have self-referential tables. Besides, I wasn't planning on having a stack. FSMs (same class as regex) don't have a stack.
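
A minimal sketch of the stackless alternative, where every state is a table and transitions point straight at other tables (or back at the same one). The '' key is borrowed from the snippet quoted earlier, but treating it as "any other character" is an assumption here, and base, str, esc and final_state are made-up names.

```lua
-- Each state is a table; each entry points directly at the next state table.
local base, str, esc = {}, {}, {}
base['"'] = str   -- opening quote enters the string state
base['']  = base  -- assumed fallback: any other character stays in base
str['"']  = base  -- closing quote returns to base
str['\\'] = esc   -- backslash enters the escape state
str['']   = str   -- any other character stays in the string (self-reference)
esc['']   = str   -- whatever follows the backslash returns to the string

-- Walk the tables one character at a time; the only memory is the
-- current state, so there is no stack.
local function final_state(input)
  local state = base
  for i = 1, #input do
    local c = input:sub(i, i)
    state = state[c] or state['']
  end
  return state
end
```

With these tables, final_state('"a"') ends in base, while an unterminated '"a' ends in str.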

The place where this is suddenly going to get complicated is if your grammar contains any prefix ambiguities (e.g. "fork" vs "forward"), because then you have to implement backtracking. For most simpler grammars it's best to just break it into two passes -- one that makes simple lexical tokens out of a string of characters, and one that evaluates the syntax out of a list of lexical tokens.

"fork" vs "forward" is unambiguous and doesn't require backtracking.
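
One way to see this is a character trie built from nested tables: the shared prefix "for" is a single path, and the fourth character alone picks the branch. This is only a sketch; END, build_trie and match are hypothetical names, and match checks a whole word rather than scanning a longer input.

```lua
local END = {} -- unique sentinel key marking the end of a keyword

-- Build a trie of nested tables, one level per character.
local function build_trie(words)
  local root = {}
  for _, word in ipairs(words) do
    local node = root
    for i = 1, #word do
      local c = word:sub(i, i)
      node[c] = node[c] or {}
      node = node[c]
    end
    node[END] = word
  end
  return root
end

-- Walk the trie left to right; each character is looked at exactly once.
local function match(trie, input)
  local node = trie
  for i = 1, #input do
    node = node[input:sub(i, i)]
    if not node then return nil end
  end
  return node[END]
end

local trie = build_trie({ "fork", "forward" })
```

Here match(trie, "fork") and match(trie, "forward") both succeed in a single pass, while "for" and "forx" return nil.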

I don't mind offering assistance with this. I've actually built a complete standards-compliant parser for C++ from first principles before. I think it's pretty fun.
/s/ Adam

