Here's what I've come up with:
def bf_interpreter(code: str) -> str:
"""
Interprets BrainF**k code
Examples:
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
Hello World!
+++++>++++[<+>-]++++++++[<++++++>-]<.
9
"""
outputs = []
ptr = 0
values = [0]
length = 1
brackets = []
index = 0
code_length = len(code)
while index < code_length:
char = code[index]
while length <= ptr:
length += 1
values.append(0)
if char == '>': ptr += 1
if char == '<': ptr -= 1
if char == '+': values[ptr] = (values[ptr] + 1) % 256
if char == '-': values[ptr] = (values[ptr] - 1) % 256
if char == '[':
brackets.append(index)
if char == ']':
if values[ptr] == 0:
brackets.pop()
else:
index = brackets[-1]
if char == '.':
outputs.append(chr(values[ptr]))
if char == ',':
values[ptr] = ord(input())
index += 1
return ''.join(outputs)
How do I make this look better? (More consistent, as well as strictly following PEP 8 conventions)
2 Answers 2
All of your if char == '>': ptr += 1
and similar checks should use elif
after the first check. By using if
for all of the checks, you're forcing them all to run, even once you've found a match. This is wasteful because the checks are necessarily exclusive of each other. Once one check is true, none of the other checks can be. For example, you should have:
if char == '>': ptr += 1
elif char == '<': ptr -= 1
elif char == '+': values[ptr] = (values[ptr] + 1) % 256
elif char == '-': values[ptr] = (values[ptr] - 1) % 256
Now the checks stop once a match is found which prevents unnecessary equality checks.
I'd also try to break this down into a few functions to help testing. Right now, you can only test bf_interpreter
as one whole. You could have a function that takes the current character, and the state of the program (the brackets
, ptr
, outputs
...), and returns a new state. That way you can easily test for a given state if a certain command produces a correct new state.
I'm assuming this loop is just to add padding so you don't go off the end of the slots?:
while length <= ptr:
length += 1
values.append(0)
You could make that a little neater by just using math and some concatenation. You could also just get rid of length
and use len(values)
:
needed = ptr - len(values)
values += [0] * needed
ptr - len(values)
calculates how many slots are needed, then [0] * needed
produces that many 0
s, and +=
adds them to values
. If needed
is negative, [0] * needed
will produce []
, and essentially cause no change.
If you want to avoid the temporary list that [0] * needed
creates, you could replace that with:
values += (0 for _ in range(needed))
Now +=
just pulls from a generator that produces values as needed.
And then, just like how you don't need length
, you don't need code_length
either. len(code)
is fine; len
runs in constant time. You don't need to cache it for performance reasons.
Here's some timings to show the difference in runtime this can cause:
import timeit
TEST_CODE = "++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++."
>>> timeit.timeit(lambda: bf_interpreter_orig(TEST_CODE), number=int(2e5)) # Two hundred thousand tests
77.3481031 # Takes 77 seconds
>>> timeit.timeit(lambda: bf_interpreter_len(TEST_CODE), number=int(2e5))
88.93794809999997
Where bf_interpreter_orig
is your original code, and bf_interpreter_len
is your original code but using len
.
Yes, there's a difference. Note though, that's a ~11 second difference across 200,000 calls. That works out to roughly 58 microseconds per call to the interpreting function.
Unless you're calling bf_interpreter
hundreds of thousands of times in a tight loop, the difference is unlikely to matter. This also likely has nothing to do with the fact that you're requesting a length, and more to do with one extra function call. Function calls aren't super fast in Python. Likely any extra call to any function would have similar effects.
-
\$\begingroup\$ Yes, it would be possible to use
len(values)
, but wouldn't it be more time consuming? If having a variable to calculate the length takes4.78
seconds, usinglen(values)
takes7.549
seconds. \$\endgroup\$Sriv– Sriv2019年12月02日 17:23:55 +00:00Commented Dec 2, 2019 at 17:23 -
\$\begingroup\$ @Srivaths See my edit at the bottom. The time difference isn't as big as your findings show. You're likely using too small of a sample, or a poor benchmarking method. \$\endgroup\$Carcigenicate– Carcigenicate2019年12月02日 18:06:57 +00:00Commented Dec 2, 2019 at 18:06
-
\$\begingroup\$ I did not exactly use the
bf_interpreter
function to time the difference. See pastebin.com/uZNR3ArW \$\endgroup\$Sriv– Sriv2019年12月02日 18:16:28 +00:00Commented Dec 2, 2019 at 18:16 -
\$\begingroup\$ @Srivaths When I run that exact test, I get 13.8 seconds for
code1
, and 11.2 seconds forcode2
. That's in the neighborhood of the results that I got with my full test (81% as fast for my test vs 86% as fast with your test). \$\endgroup\$Carcigenicate– Carcigenicate2019年12月02日 18:30:16 +00:00Commented Dec 2, 2019 at 18:30 -
\$\begingroup\$ Your tests seem abnormally slow. Do you have something running in the background that could be messing with tests? \$\endgroup\$Carcigenicate– Carcigenicate2019年12月02日 18:35:10 +00:00Commented Dec 2, 2019 at 18:35
+1 on specifying argument code: str
and return type -> str
.
I implemented a class seen below which has the advantage of seeing chars and their corresponding functions at a glance. Adding new chars if need be is incredibly easy.
self.actions = {
'>': self.greater_than,
'<': self.less_than,
'+': self.plus,
'-': self.minus,
'[': self.left_bracket,
']': self.right_bracket,
'.': self.dot,
',': self.comma
}
You can use a
try:
...
except KeyError:
...
to detect unrecognised chars.
Complete Class
class BFInterpreter:
def __init__(self):
self.outputs = []
self.ptr = 0
self.values = [0]
self.length = 1
self.brackets = []
self.index = 0
def greater_than(self):
self.ptr += 1
def less_than(self):
self.ptr -= 1
def plus(self):
self.values[self.ptr] = (self.values[self.ptr] + 1) % 256
def minus(self):
self.values[self.ptr] = (self.values[self.ptr] - 1) % 256
def left_bracket(self):
self.brackets.append(self.index)
def right_bracket(self):
if self.values[self.ptr] == 0:
self.brackets.pop()
else:
self.index = self.brackets[-1]
def dot(self):
self.outputs.append(chr(self.values[self.ptr]))
def comma(self):
self.values[self.ptr] = ord(input())
def evaluate(self, code):
self.code = code
self.code_length = len(self.code)
self.actions = {
'>': self.greater_than,
'<': self.less_than,
'+': self.plus,
'-': self.minus,
'[': self.left_bracket,
']': self.right_bracket,
'.': self.dot,
',': self.comma
}
while self.index < self.code_length:
char = self.code[self.index]
while self.length <= self.ptr:
self.length += 1
self.values.append(0)
self.actions[char]()
self.index += 1
return ''.join(self.outputs)
Usage
bf = BFInterpreter()
print(bf.evaluate('+++++>++++[<+>-]++++++++[<++++++>-]<.'))
code has been specified as an argument in the evaluate method rather than in the constructor to be able to evaluate without creating new objects.
-
5\$\begingroup\$ Now change
BFInterpreter
toBFCode(str)
and changeevaluate(self, code)
to__call__(self)
. Boom. \$\endgroup\$Richard Neumann– Richard Neumann2019年12月05日 13:12:05 +00:00Commented Dec 5, 2019 at 13:12 -
1\$\begingroup\$ "...to be able to evaluate without creating new objects." And what exactly is the point of that? You cannot reuse the object after calling
evaluate
anyways. Your concept of__init__
+evaluate
is bad for readability and usability of your code. I would either make a wrapper method that creates aBFInterpreter
, then callsevaluate
on it and returns the result; or create what I call a "function-like class": a class with__new__
method in which you create an instance, call some method(s) on it (such asevaluate
), and return the result (instead of returning an instance of the class). \$\endgroup\$kyrill– kyrill2019年12月09日 22:44:01 +00:00Commented Dec 9, 2019 at 22:44 -
\$\begingroup\$ In the spirit of you once forge a spoon and use it may times, not creating a spoon each time you want to use it. You create an object then use a method many times versus creating a new instance each time you want to evaluate \$\endgroup\$Abdur-Rahmaan Janhangeer– Abdur-Rahmaan Janhangeer2019年12月10日 05:36:50 +00:00Commented Dec 10, 2019 at 5:36
-
\$\begingroup\$ Have you read and understood my comment? You cannot use your instance multiple times, since you don't reset the variables
self.outputs
,self.values
, etc. in theevaluate
method. You could use the object multiple times if you did reset those variables in theevaluate
method, but then there would be absolutely no point in creating and reusing the object, since there would be no initialization of the object done in the constructor. You might as well changeevaluate
to aclassmethod
and create a new instance in that method. \$\endgroup\$kyrill– kyrill2019年12月10日 12:02:05 +00:00Commented Dec 10, 2019 at 12:02 -
\$\begingroup\$ Oh and by the way, your implementation is neither thread- nor exception-safe. It would be, however, if you did what I told you. PS. if you still believe that your object is reusable, just try to reuse it on these inputs:
'+++++>++++[<+>-]++++++++[<++++++>-]<.'
and''
(an empty string). The first should (and does) print9
; however the second should print nothing but prints9
instead. And one more thing, your program fails for inputs which contain other characters than the 8 control ones, which makes it neither a valid BF interpreter, nor equivalent with OP's original program. \$\endgroup\$kyrill– kyrill2019年12月10日 12:14:23 +00:00Commented Dec 10, 2019 at 12:14
Explore related questions
See similar questions with these tags.
parsing rules
. Could you please elaborate? \$\endgroup\$char == '['
, you're supposed to check whether the value at the pointer is zero; if it is, you need to jump to after the corresponding]
. \$\endgroup\$