Operator-precedence grammar

An operator precedence grammar is a kind of grammar for formal languages.

Technically, an operator precedence grammar is a context-free grammar that has the property (among others)^[1] that no production has either an empty right-hand side or two adjacent nonterminals in its right-hand side. These properties allow precedence relations to be defined between the terminals of the grammar. A parser that exploits these relations is considerably simpler than more general-purpose parsers, such as LALR parsers. Operator-precedence parsers can be constructed for a large class of context-free grammars.

Precedence relations

[edit ]

Operator precedence grammars rely on the following three precedence relations between the terminals:^[2]

Relation	Meaning
$a\lessdot b$ {\displaystyle a\lessdot b}	a yields precedence to b
$a\doteq b$ {\displaystyle a\doteq b}	a has the same precedence as b
$a\gtrdot b$ {\displaystyle a\gtrdot b}	a takes precedence over b

These operator precedence relations allow to delimit the handles in the right sentential forms: $\lessdot$ {\displaystyle \lessdot } marks the left end, $\doteq$ {\displaystyle \doteq } appears in the interior of the handle, and $\gtrdot$ {\displaystyle \gtrdot } marks the right end. Contrary to other shift-reduce parsers, all nonterminals are considered equal for the purpose of identifying handles.^[3] The relations do not have the same properties as their un-dotted counterparts; e. g. $a\doteq b$ {\displaystyle a\doteq b} does not generally imply $b\doteq a$ {\displaystyle b\doteq a}, and $b\gtrdot a$ {\displaystyle b\gtrdot a} does not follow from $a\lessdot b$ {\displaystyle a\lessdot b}. Furthermore, $a\doteq a$ {\displaystyle a\doteq a} does not generally hold, and $a\gtrdot a$ {\displaystyle a\gtrdot a} is possible.

Let us assume that between the terminals a_i and a_i+1 there is always exactly one precedence relation. Suppose that $ is the end of the string. Then for all terminals b we define: $\$\lessdot b$ {\displaystyle \$\lessdot b} and $b\gtrdot \$$ {\displaystyle b\gtrdot \$}. If we remove all nonterminals and place the correct precedence relation: $\lessdot$ {\displaystyle \lessdot }, $\doteq$ {\displaystyle \doteq }, $\gtrdot$ {\displaystyle \gtrdot } between the remaining terminals, there remain strings that can be analyzed by an easily developed bottom-up parser.

Example

[edit ]

For example, the following operator precedence relations can be introduced for simple expressions:^[4]

{\begin{array}{c|cccc}&\mathrm {id} &+&*&\$\\\hline \mathrm {id} &&\gtrdot &\gtrdot &\gtrdot \\+&\lessdot &\gtrdot &\lessdot &\gtrdot \\*&\lessdot &\gtrdot &\gtrdot &\gtrdot \\\$&\lessdot &\lessdot &\lessdot &\end{array}}

{\displaystyle {\begin{array}{c|cccc}&\mathrm {id} &+&*&\$\\\hline \mathrm {id} &&\gtrdot &\gtrdot &\gtrdot \\+&\lessdot &\gtrdot &\lessdot &\gtrdot \\*&\lessdot &\gtrdot &\gtrdot &\gtrdot \\\$&\lessdot &\lessdot &\lessdot &\end{array}}}

They follow from the following facts:^[5]

+ has lower precedence than * (hence $+\lessdot *$ {\displaystyle +\lessdot *} and $*\gtrdot +$ {\displaystyle *\gtrdot +}).
Both + and * are left-associative (hence $+\gtrdot +$ {\displaystyle +\gtrdot +} and $*\gtrdot *$ {\displaystyle *\gtrdot *}).

The input string^[4]

\mathrm {id} _{1}+\mathrm {id} _{2}*\mathrm {id} _{3}

{\displaystyle \mathrm {id} _{1}+\mathrm {id} _{2}*\mathrm {id} _{3}}

after adding end markers and inserting precedence relations becomes

\$\lessdot \mathrm {id} _{1}\gtrdot +\lessdot \mathrm {id} _{2}\gtrdot *\lessdot \mathrm {id} _{3}\gtrdot \$

{\displaystyle \$\lessdot \mathrm {id} _{1}\gtrdot +\lessdot \mathrm {id} _{2}\gtrdot *\lessdot \mathrm {id} _{3}\gtrdot \$}

Operator precedence parsing

[edit ]

Having precedence relations allows to identify handles as follows:^[4]

scan the string from left until seeing $\gtrdot$ {\displaystyle \gtrdot }
scan backwards (from right to left) over any $\doteq$ {\displaystyle \doteq } until seeing $\lessdot$ {\displaystyle \lessdot }
everything between the two relations $\lessdot$ {\displaystyle \lessdot } and $\gtrdot$ {\displaystyle \gtrdot }, including any intervening or surrounding nonterminals, forms the handle

It is generally not necessary to scan the entire sentential form to find the handle.

Operator precedence parsing algorithm

[edit ]

The algorithm below is from Aho et al.:^[6]

If $ is on the top of the stack and ip points to $ then return
else
 Let a be the top terminal on the stack, and b the symbol pointed to by ip
 if a  $\lessdot$ {\displaystyle \lessdot } b or a  $\doteq$ {\displaystyle \doteq } b then
 push b onto the stack
 advance ip to the next input symbol
 else if a  $\gtrdot$ {\displaystyle \gtrdot } b then
 repeat
 pop the stack
 until the top stack terminal is related by  $\lessdot$ {\displaystyle \lessdot } to the terminal most recently popped
 else error()
end

Precedence functions

[edit ]

An operator precedence parser usually does not store the precedence table with the relations, which can get rather large. Instead, precedence functions f and g are defined.^[7] They map terminal symbols to integers, and so the precedence relations between the symbols are implemented by numerical comparison: ⁠ $f(a)<g(b)$ {\displaystyle f(a)<g(b)}⁠ must hold if $a\lessdot b$ {\displaystyle a\lessdot b} holds, etc.

Not every table of precedence relations has precedence functions, but in practice for most grammars such functions can be designed.^[8]

Algorithm for constructing precedence functions

[edit ]

The below algorithm is from Aho et al.:^[9]

Create symbols f_a and g_a for each grammar terminal a and for the end of string symbol;
Partition the created symbols in groups so that f_a and g_b are in the same group if $a\doteq b$ {\displaystyle a\doteq b} (there can be symbols in the same group even if their terminals are not connected by this relation);
Create a directed graph whose nodes are the groups. For each pair ⁠ $(a,b)$ {\displaystyle (a,b)}⁠ of terminals do: place an edge from the group of g_b to the group of f_a if $a\lessdot b$ {\displaystyle a\lessdot b}, otherwise if $a\gtrdot b$ {\displaystyle a\gtrdot b} place an edge from the group of f_a to that of g_b;
If the constructed graph has a cycle then no precedence functions exist. When there are no cycles, let ⁠ $f(a)$ {\displaystyle f(a)}⁠ be the length of the longest path from the group of f_a and let ⁠ $g(a)$ {\displaystyle g(a)}⁠ be the length of the longest path from the group of g_a.

Example

[edit ]

Consider the following table (repeated from above):^[10]

{\begin{array}{c|cccc}&\mathrm {id} &+&*&\$\\\hline \mathrm {id} &&\gtrdot &\gtrdot &\gtrdot \\+&\lessdot &\gtrdot &\lessdot &\gtrdot \\*&\lessdot &\gtrdot &\gtrdot &\gtrdot \\\$&\lessdot &\lessdot &\lessdot &\end{array}}

{\displaystyle {\begin{array}{c|cccc}&\mathrm {id} &+&*&\$\\\hline \mathrm {id} &&\gtrdot &\gtrdot &\gtrdot \\+&\lessdot &\gtrdot &\lessdot &\gtrdot \\*&\lessdot &\gtrdot &\gtrdot &\gtrdot \\\$&\lessdot &\lessdot &\lessdot &\end{array}}}

Using the algorithm leads to the following graph:

 gid
 \
 fid f*
 \ /
 g*
 /
 f+ 
 | \
 | g+
 | |
 g$ f$

from which we extract the following precedence functions from the maximum heights in the directed acyclic graph:

id	+	*	$
f	4	2	4	0
g	5	1	3	0

Operator-precedence languages

[edit ]

The class of languages described by operator-precedence grammars, i.e., operator-precedence languages, is strictly contained in the class of deterministic context-free languages, and strictly contains visibly pushdown languages.^[11]

Operator-precedence languages enjoy many closure properties: union, intersection, complementation,^[12] concatenation,^[11] and they are the largest known class closed under all these operations and for which the emptiness problem is decidable. Another peculiar feature of operator-precedence languages is their local parsability,^[13] that enables efficient parallel parsing.

There are also characterizations based on an equivalent form of automata and monadic second-order logic.^[14]

Notes

[edit ]

^ Aho, Sethi & Ullman 1988, p. 203
^ Aho, Sethi & Ullman 1988, pp. 203–204
^ Aho, Sethi & Ullman 1988, pp. 205–206
^ ^a ^b ^c Aho, Sethi & Ullman 1988, p. 205
^ Aho, Sethi & Ullman 1988, p. 204
^ Aho, Sethi & Ullman 1988, p. 206
^ Aho, Sethi & Ullman 1988, pp. 208–209
^ Aho, Sethi & Ullman 1988, p. 209
^ Aho, Sethi & Ullman 1988, pp. 209–210
^ Aho, Sethi & Ullman 1988, p. 210
^ ^a ^b Crespi Reghizzi & Mandrioli 2012
^ Crespi Reghizzi, Mandrioli & Martin 1978
^ Barenghi et al. 2015
^ Lonati et al. 2015

References

[edit ]

Aho, Alfred V.; Sethi, Ravi; Ullman, Jeffrey D. (1988). Compilers — Principles, Techniques, and Tools. Addison-Wesley.
Crespi Reghizzi, Stefano; Mandrioli, Dino (2012). "Operator precedence and the visibly pushdown property". Journal of Computer and System Sciences. 78 (6): 1837–1867. doi:10.1016/j.jcss.201112006 .
Crespi Reghizzi, Stefano; Mandrioli, Dino; Martin, David F. (1978). "Algebraic Properties of Operator Precedence Languages". Information and Control. 37 (2): 115–133. doi:10.1016/S0019-9958(78)90474-6 .
Barenghi, Alessandro; Crespi Reghizzi, Stefano; Mandrioli, Dino; Panella, Federica; Pradella, Matteo (2015). "Parallel parsing made practical". Science of Computer Programming. 112 (3): 245–249. doi:10.1016/j.scico.201509002 . hdl:11311/971391 .
Lonati, Violetta; Mandrioli, Dino; Panella, Federica; Pradella, Matteo (2015). "Operator Precedence Languages: Their Automata-Theoretic and Logic Characterization". SIAM Journal on Computing. 44 (4): 1026–1088. doi:10.1137/140978818. hdl:2434/352809 .

External links

[edit ]

Nikolay Nikolaev: IS53011A Language Design and Implementation, Course notes for CIS 324, 2010.

v t e Parsing algorithms
Top-down	Earley LL Recursive descent Tail recursive
Bottom-up	Precedence Simple Operator Shunting-yard LR Simple Look-ahead Canonical Generalized CYK Recursive ascent Shift-reduce
Mixed, other	Combinator Chart Left corner Statistical
Related topics	PEG Definite clause grammar Deterministic parsing Dynamic programming Memoization Parser generator LALR Parse tree AST Scannerless parsing History of compiler construction Comparison of parser generators Operator-precedence grammar

Retrieved from "https://en.wikipedia.org/w/index.php?title=Operator-precedence_grammar&oldid=1184119302"

Precedence relations

Example

Operator precedence parsing

Operator precedence parsing algorithm

Precedence functions

Algorithm for constructing precedence functions

Example

Operator-precedence languages

Notes

References

Further reading

External links