pyPEG – a PEG Parser-Interpreter in Python
Requires Python 3.x or 2.7
Older versions: pyPEG 1.x
Offers parsing and composing capabilities. Implements an intrinsic Packrat parser.
pyPEG uses memoization to speed up parsing. Creating a new Parser instance gives you an empty cache. This is usually recommended when you parse another text: a stale cache will not produce wrong results, but resetting it reduces memory consumption. If you alter the grammar, however, you must clear the cache for the affected things to get correct parsing results; use the clear_memory() method in that case.
The instance variables represent the parser's state.
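The caching behaviour described here can be pictured with a toy memoizing matcher. This is only a sketch of the idea and not pyPEG's implementation: the class ToyPackrat, its match() method, the calls counter and the cache layout are assumptions made for illustration. Only the method name clear_memory() and its thing=None default are taken from this documentation.

```python
import re


class ToyPackrat:
    """Hypothetical memoizing matcher; NOT pyPEG's actual Parser."""

    def __init__(self):
        # Cache layout is an assumption of this sketch:
        # {thing: {text: matched_string_or_None}}
        self._memory = {}
        self.calls = 0  # counts real (uncached) matching work

    def match(self, text, thing):
        """Match the regex `thing` at the start of `text`, memoizing the result."""
        per_thing = self._memory.setdefault(thing, {})
        if text not in per_thing:
            self.calls += 1
            m = thing.match(text)
            per_thing[text] = m.group(0) if m else None
        return per_thing[text]

    def clear_memory(self, thing=None):
        """Forget cached results for `thing`, or for everything if thing is None."""
        if thing is None:
            self._memory.clear()
        else:
            self._memory.pop(thing, None)


word = re.compile(r"\w+")
number = re.compile(r"\d+")

p = ToyPackrat()
p.match("hello world", word)   # computed
p.match("hello world", word)   # served from the cache
p.match("42 apples", number)   # computed
p.clear_memory(word)           # drop only the entries cached for `word`
p.match("hello world", word)   # computed again after the reset
print(p.calls)
```

Clearing the cache for a single thing leaves the entries of all other things untouched, which is why a partial reset is enough after changing only one part of a grammar.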
whitespace
Regular expression to scan whitespace; default: re.compile(r"(?m)\s+"). Set to None to disable automatic whitespace removal.
comment
grammar to parse comments; default: None. If a grammar is set here, comments will be removed from the source text automatically.
last_error
after parsing, the SyntaxError which ended parsing
indent
string to use to indent while composing; default: four spaces
indention_level
level to indent to; default: 0
text
original text to parse; set for decorated syntax errors
filename
name of the file the text originates from
autoblank
add blanks while composing if grammar would possibly be violated otherwise; default: True
keep_feeble_things
keep otherwise cropped things like comments and whitespace; these things are put into the feeble_things attribute
__init__(self)
Initialize instance variables to their defaults.
clear_memory(self, thing=None)
Clear cache memory for packrat parsing.
This method clears the cache memory for thing. If None is given as thing, it clears the cache completely.
thing
thing for which cache memory is cleared; default: None
parse(self, text, thing, filename=None)
(Partially) parse text following thing as grammar and return the resulting things.
This method parses as far as possible. It does not raise a SyntaxError if the source text does not parse completely. If the beginning of the source text does not comply with the grammar thing, a SyntaxError object is returned as the result part of the return value.
text
text to parse
thing
grammar for things to parse
filename
name of the file the text originates from
Returns (text, result) with:
text
unparsed text
result
generated objects
ValueError
if input does not match types
TypeError
if output classes have wrong syntax for their respective __init__(self, ...)
GrammarTypeError
if grammar contains an object of unknown type
GrammarValueError
if grammar contains an illegal cardinality value
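Example:
>>> from pypeg2 import Parser, csl, word
>>> p = Parser()
>>> p.parse("hello, world!", csl(word))
('!', ['hello', 'world'])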
compose(self, thing, grammar=None)
Compose text using thing with grammar. If thing.compose() exists, execute it, otherwise use grammar to compose.
thing
thing containing other things with grammar
grammar
grammar to use for composing thing; default: type(thing).grammar
Returns: composed text
ValueError
if thing does not match grammar
GrammarTypeError
if grammar contains an object of unknown type
GrammarValueError
if grammar contains an illegal cardinality value
Example:
>>> from pypeg2 import Parser, csl, word
>>> p = Parser()
>>> p.compose(['hello', 'world'], csl(word))
'hello, world'
generate_syntax_error(self, msg, pos)
Generate a syntax error construct.
msg
string with error message
pos
(lineNo, charInText) with positioning information
Returns: instance of SyntaxError with error text
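To illustrate what such an error construct might carry, here is a minimal sketch using only the standard attributes of Python's SyntaxError. The helper make_syntax_error() and its text and filename parameters are hypothetical, and treating charInText as a 0-based index into the whole text is an assumption of this sketch. It does not reproduce pyPEG's own code.

```python
def make_syntax_error(msg, pos, text, filename=None):
    """Hypothetical helper: build (not raise) a SyntaxError with position info.

    pos is (line_no, char_in_text). This sketch assumes line_no is 1-based
    and char_in_text is a 0-based index into the complete text.
    """
    line_no, char_in_text = pos
    error = SyntaxError(msg)
    error.lineno = line_no
    # Locate the line that contains the offending character.
    line_start = text.rfind("\n", 0, char_in_text) + 1
    line_end = text.find("\n", char_in_text)
    if line_end == -1:
        line_end = len(text)
    error.text = text[line_start:line_end]
    error.offset = char_in_text - line_start + 1  # 1-based column
    error.filename = filename
    return error


source = "alpha\nbeta ?\n"
err = make_syntax_error("expecting a word", (2, 11), source, "demo.txt")
print(err.lineno, err.offset, repr(err.text))
```

Because the error is returned instead of raised, the caller can decide whether to store it (as in last_error) or raise it later.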
parse(text, thing, filename=None, whitespace=whitespace, comment=None, keep_feeble_things=False)
Parse text following thing as grammar and return the resulting things or raise an error.
text
text to parse
thing
grammar for things to parse
filename
name of the file the text originates from
whitespace
regular expression to skip whitespace; default: re.compile(r"(?m)\s+")
comment
grammar to parse comments; default: None
keep_feeble_things
keep otherwise cropped things like comments and whitespace; these things are put into the feeble_things attribute; default: False
Returns: generated things
SyntaxError
if text does not match the grammar in thing
ValueError
if input does not match types
TypeError
if output classes have wrong syntax for __init__()
GrammarTypeError
if grammar contains an object of unknown type
GrammarValueError
if grammar contains an illegal cardinality value
Example:
>>> from pypeg2 import parse, csl, word
>>> parse("hello, world", csl(word))
['hello', 'world']
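The difference between Parser.parse(), which returns the unparsed rest, and the module-level parse(), which raises an error, can be sketched without pyPEG. The function names partial_parse() and strict_parse() are hypothetical, and the grammar is hard-wired to comma-separated words; the sketch only illustrates the two contracts and is not the library's code.

```python
import re

WORD = re.compile(r"\w+")
SEPARATOR = re.compile(r"\s*,\s*")


def partial_parse(text):
    """Parse comma-separated words as far as possible.

    Returns (rest, result), mirroring the (text, result) pair described
    for Parser.parse(). result is a SyntaxError object if not even the
    beginning of the text matches.
    """
    text = text.lstrip()
    m = WORD.match(text)
    if not m:
        return text, SyntaxError("expecting a word")
    result = [m.group(0)]
    rest = text[m.end():]
    while True:
        sep = SEPARATOR.match(rest)
        if not sep:
            break
        m = WORD.match(rest, sep.end())
        if not m:
            break
        result.append(m.group(0))
        rest = rest[m.end():]
    return rest, result


def strict_parse(text):
    """Like partial_parse(), but raise if the text does not parse completely."""
    rest, result = partial_parse(text)
    if isinstance(result, SyntaxError):
        raise result
    if rest.strip():
        raise SyntaxError("unparsed text: %r" % rest)
    return result


print(partial_parse("hello, world!"))
print(strict_parse("hello, world"))
```

In this sketch strict_parse("hello, world!") raises a SyntaxError, whereas partial_parse() hands the trailing "!" back to the caller.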
compose(thing, grammar=None, indent="    ", autoblank=True)
Compose text using thing with grammar.
thing
thing containing other things with grammar
grammar
grammar to use to compose thing; default: thing.grammar
indent
string to use to indent while composing; default: four spaces
autoblank
add blanks if grammar would possibly be violated otherwise; default: True
Returns: composed text
ValueError
if input does not match grammar
GrammarTypeError
if grammar contains an object of unknown type
GrammarValueError
if grammar contains an illegal cardinality value
Example:
>>> from pypeg2 import compose, csl, word
>>> compose(['hello', 'world'], csl(word))
'hello, world'
attributes(grammar, invisible=False)
Iterates all attributes of a grammar.
This function can be used to iterate through all attributes which will be generated for the top level object of the grammar. If invisible is False, attributes whose names start with an underscore (_) are omitted.
Example:
>>> from pypeg2 import attr, name, attributes, word, restline
>>> class Me:
...     grammar = name(), attr("typing", word), restline
...
>>> for a in attributes(Me.grammar): print(a.name)
...
name
typing
>>>
how_many(grammar)
Determines how many objects grammar can possibly yield when parsed.
This function is meant to check if the results of a grammar can be stored in a single object or a collection will be needed.
0
if there will be no objects
1
if there will be a maximum of one object
2
if there can be more than one object
GrammarTypeError
if grammar contains an object of unknown type
GrammarValueError
if grammar contains an illegal cardinality value
Example:
>>> from pypeg2 import how_many, word, csl
>>> how_many("some")
0
>>> how_many(word)
1
>>> how_many(csl(word))
2
GrammarError
Base class for all errors pyPEG delivers.
GrammarTypeError
A grammar contains an object of a type which cannot be parsed, for example an instance of an unknown class or of a basic type like float. It can be caused by an int at the wrong place, too.
GrammarValueError
A grammar contains an object with an illegal value, for example an undefined cardinality.