In Polish postfix notation, the operators follow their operands. For example, to add 3 and 4 together, the expression is 3 4 + rather than 3 + 4. The conventional notation expression 3 − 4 + 5 becomes 3 4 − 5 + in Polish postfix notation: 4 is first subtracted from 3, then 5 is added to it.
Polish prefix notation, on the other hand, require that its operators precede the operands they work on. 3 + 4 would then be as + 3 4.
The algorithm used for parsing the postfix expression is simple:
while not end of file
read token into var
if (var is num)
push var
else if (var is operator)
if (operands >= 2)
pop into rhs
pop into lhs
push lhs operator rhs
else
throw exception
else
return fail
pop into var
return var
The code uses a stack to implement it. Prefix notation is parsed in the same manner, except that the tokens are reversed and the operands are swapped for each operation.
Parentheses are not supported.
Code:
#!/usr/bin/env python3
from typing import Union # For older Python versions.
OPCODES = {
"+": lambda lhs, rhs: lhs + rhs,
"-": lambda lhs, rhs: lhs - rhs,
"*": lambda lhs, rhs: lhs * rhs,
"/": lambda lhs, rhs: lhs / rhs,
"//": lambda lhs, rhs: lhs // rhs,
"%": lambda lhs, rhs: lhs % rhs,
"^": lambda lhs, rhs: lhs ** rhs,
}
def _eval_op(
lhs: Union[int, float], rhs: Union[int, float], opcode: str
) -> Union[int, float]:
return OPCODES[opcode](lhs, rhs)
def eval_postfix_expr(expr: str) -> Union[int, float]:
"""
Evaluate a postfix expression.
Supported operators:
- Addition (+)
- Subtraction (-)
- Multiplication (*)
- Division (/)
- Floor Division (//)
- Modulo (%)
- Exponentiation (^)
Raises:
ValueError: If 'expr' is an invalid postfix expression.
"""
tokens = expr.split()
if tokens[-1] not in OPCODES.keys() or not tokens[0].isdigit():
raise ValueError("Invalid expression.")
stack = []
for tok in tokens:
if tok.isdigit():
stack.append(int(tok))
elif tok in OPCODES.keys():
if len(stack) >= 2:
rhs = stack.pop()
lhs = stack.pop()
stack.append(_eval_op(lhs, rhs, tok))
else:
raise ValueError("Invalid expression.")
return stack.pop()
def eval_prefix_expr(expr: str) -> Union[int, float]:
"""
Evaluate a prefix expression.
Supported operators:
- Addition (+)
- Subtraction (-)
- Multiplication (*)
- Division (/)
- Floor Division (//)
- Modulo (%)
- Exponentiation (^)
Raises:
ValueError: If 'expr' is an invalid prefix expression.
"""
tokens = expr.split()
if tokens[0] not in OPCODES.keys() or not tokens[-1].isdigit():
raise ValueError("Invalid expression.")
tokens = reversed(tokens)
stack = []
for tok in tokens:
if tok.isdigit():
stack.append(int(tok))
elif tok in OPCODES.keys():
if len(stack) >= 2:
lhs = stack.pop()
rhs = stack.pop()
stack.append(_eval_op(lhs, rhs, tok))
else:
raise ValueError("Invalid expression.")
return stack.pop()
def run_tests(func, test_data) -> None:
func_len = len(func.__name__)
max_expr_len = max(len(repr(expr)) for expr, _ in test_data)
for expr, res in test_data:
func_name = func.__name__
expr_repr = repr(expr)
print(
f"func: {func_name:<{func_len}}, expr: {expr_repr:<{max_expr_len}}",
end="",
)
try:
rv = func(expr)
assert rv == res, f"Expected: {res}, Received: {rv}"
except Exception as e:
rv = e
print(f", result: {repr(rv)}")
def main() -> None:
post_test_data = [
("3 4 +", 7),
("3 4 + 5 4 ^ +", 632),
("+ 10a 29", 0),
("10 5 * 6 2 + /", 6.25),
("10 7 2 3 * + -", -3),
("icaoscasjcs", 0),
("* 10 5 * 7 3 + // 2 -", 3),
]
pre_test_data = [
("+ 3 4", 7),
("+ ^ 5 4 + 3 4", 632),
("+ 10a 29", 0),
("/ * 10 5 + 6 2", 6.25),
("- 10 + 7 * 2 3", -3),
("icaoscasjcs", 0),
("1038 - // * 10 5 + 7 3 2", 3),
]
run_tests(eval_postfix_expr, post_test_data)
print()
run_tests(eval_prefix_expr, pre_test_data)
if __name__ == "__main__":
main()
Prints:
func: eval_postfix_expr, expr: '3 4 +' , result: 7
func: eval_postfix_expr, expr: '3 4 + 5 4 ^ +' , result: 632
func: eval_postfix_expr, expr: '+ 10a 29' , result: ValueError('Invalid expression.')
func: eval_postfix_expr, expr: '10 5 * 6 2 + /' , result: 6.25
func: eval_postfix_expr, expr: '10 7 2 3 * + -' , result: -3
func: eval_postfix_expr, expr: 'icaoscasjcs' , result: ValueError('Invalid expression.')
func: eval_postfix_expr, expr: '* 10 5 * 7 3 + // 2 -', result: ValueError('Invalid expression.')
func: eval_prefix_expr, expr: '+ 3 4' , result: 7
func: eval_prefix_expr, expr: '+ ^ 5 4 + 3 4' , result: 632
func: eval_prefix_expr, expr: '+ 10a 29' , result: ValueError('Invalid expression.')
func: eval_prefix_expr, expr: '/ * 10 5 + 6 2' , result: 6.25
func: eval_prefix_expr, expr: '- 10 + 7 * 2 3' , result: -3
func: eval_prefix_expr, expr: 'icaoscasjcs' , result: ValueError('Invalid expression.')
func: eval_prefix_expr, expr: '1038 - // * 10 5 + 7 3 2', result: ValueError('Invalid expression.')
Review Request:
There's some duplication in the docstrings. How can that be helped?
Are there any bugs in my code? Did I miss something?
I should have liked to implement eval_prefix_expr()
with eval_postfix_expr()
, but I did not see a way to swap the operands for each operator. Do you?
General coding comments, style, naming, et cetera.
-
1\$\begingroup\$ Thank you for your answers. I've added the reinventing-the-wheel given the response to (3) to indicate you don't want answer which are anywhere near BNF. \$\endgroup\$Peilonrayz– Peilonrayz ♦2024年02月20日 15:57:05 +00:00Commented Feb 20, 2024 at 15:57
-
\$\begingroup\$ I wouldn't be opposed to learning something new albeit. So feel free to talk of BNF, or something else. \$\endgroup\$Madagascar– Madagascar2024年02月20日 16:21:26 +00:00Commented Feb 20, 2024 at 16:21
2 Answers 2
operator module
OPCODES = {
"+": lambda lhs, rhs: lhs + rhs, ...
The lambdas are nice enough, they get the job done. We could have just mentioned operator.add, floordiv, and so on.
ints are floats
def _eval_op(
lhs: Union[int, float], ...
Well, ok, clearly an int
is not a float
, nor does it inherit.
But rather than writing e.g. lhs: int | float,
Pep-484
explains that
when an argument is annotated as having type
float
, an argument of typeint
is acceptable
So prefer to shorten such annotations, as int
is implicit.
short docstring
The docstring for eval_postfix_expr()
is lovely.
You lamented the repetition of longish docstrings.
Maybe include the operators by reference?
That is, refer the interested reader
over to OPCODES
, which could have its own docstring.
And then a one-liner of "+ - * / // % ^" suffices.
It goes on to comment on ValueError, suggesting that's the only Bad Thing that could happen. Maybe any error that gets raised is self explanatory, and doesn't need a mention? Or maybe enumerate the others here, such as ZeroDivisionError.
Too bad there's no i
operator.
This turns out to be a very poor substitute:
print((-1) ** .5)
(6.123233995736766e-17+1j)
unary minus
if tok.isdigit():
It makes me sad that "3 -1 +"
won't parse,
since "-".isdigit() == False
I agree with @vnp's EAFP suggestion.
Directly accepting an input operand of "-3.14"
seems simple enough,
without making me divide 314 by 100.
silent error
elif tok in OPCODES.keys():
if len(stack) >= 2:
rhs = stack.pop()
lhs = stack.pop()
stack.append(_eval_op(lhs, rhs, tok))
else:
When presented with a short stack (bad input expression!) we just silently move on without comment?!?
Worse, this could happen in the middle of an expression, leading to a debugging nightmare. (A dozen properly nested items, then short stack, followed by another dozen properly nested items. Where's the bad entry, we wonder?)
long stack
Evaluating "1 2 3 +"
will return 5
but will leave 1
on the stack.
I feel this should be a fatal error reported to the user.
DRY
eval_prefix_expr()
is Too Long.
It should reverse, and then call eval_postfix_expr()
.
Or both should call a common _helper()
.
I did not see a way to swap the operands for each operator.
Yeah, I see what you mean.
Define a postfix_swap(a, b)
helper which does nothing
(identity function), and a prefix_swap(a, b)
helper
which swaps its args, and pass in the swapper function
as part of the call.
Or maybe pass in (0, 1)
and (1, 0)
arg_selector
tuples,
the indexes of a
and b
.
simple tests
The custom run_tests()
routine is nice.
We could tighten up a few things, like {func_name:<{func_len}}
is just {func_name}
, and we could assign func_name
just once
if we need it at all.
Consider phrasing this test in terms of from unittest import TestCase
.
assert rv == res
This equality test works for now.
But soon you'll want self.assertAlmostEqual()
, I think.
commutativity
I really do appreciate the emphasis on "simple" here.
post_test_data = [ ...
("10 5 * 6 2 + /", 6.25),
...
pre_test_data = [ ...
("/ * 10 5 + 6 2", 6.25),
Those say nearly the same thing. But the one is not a reversal of the other.
In mathematics, addition over the reals commutes. Similarly for multiplication.
In IEEE-754, addition is not commutative, nor is multiplication. Order of evaluation matters. We worry about catastrophic cancellation. One way to avoid cumulative rounding errors in a long sum is to order FP operands by magnitude and sum the little ones before the big ones.
So consider ditching pre_test_data
,
in favor of just reversing post_test_data
.
hypothesis
Consider writing a postfix_to_infix()
converter,
whose parenthesized output you can just hand to eval()
.
Now you have an "oracle" that knows the right answer.
With that in hand you can invite hypothesis to dream up random expressions of limited length, and then you verify that the evaluations match. I predict you will learn new things about your code and about FP.
This code achieves most of its design goals.
I would be willing to delegate or accept maintenance tasks on it.
Testing the first and last tokens is very dubious. It rejects a perfectly valid expression
42
, and accepts definitely invalid expressions like2 *
and 1 2 3 +`.I strongly recommend to ditch this test, and handle problems as the evaluation goes on.
isdigit
test is redundant (and it also reject negative operants). The Grace Hopper's principle suggeststry: stack.append(int(tok)) except ValueError: # tok is not an integer, must be an operator
Similarly, consider
try: rhs = stack.pop() lhs = stack.pop() stack.append(eval_op(lhs, rhs, tok)) except IndexError: # handle stack underrun except KeyError: # handle unknown operator except ZeroDivisionError: # which BTW is not handled in your code
stack
is too generic to my taste. I'd rather name itoperands
.
-
\$\begingroup\$
# handle unknown operator
==> Doeselif tok in OPCODES.keys()
not take care of that? How do you suggest handlingIndexError
andZeroDivisionError
? \$\endgroup\$Madagascar– Madagascar2024年02月21日 04:08:26 +00:00Commented Feb 21, 2024 at 4:08 -
\$\begingroup\$ @Harith Of course it does. The point is that EAFP is more Pythonic. \$\endgroup\$vnp– vnp2024年02月21日 04:10:51 +00:00Commented Feb 21, 2024 at 4:10
Explore related questions
See similar questions with these tags.