CodeWars mathematical expresion evaluator

Question 1

I recently wrote the following code for a codewars.com kata for evaluating mathematical expressions. I've been writing ruby for years, but almost everything I do is for personal use and isn't shared, so I've gotten essentially zero feedback on my ruby style. The following block of code (with more comments than usual) is fairly typical of my style, but it never feels as ruby-ish as the code snippets I see online.

What can be changed to make this code closer to ideal idiomatic ruby?

#parse the initial expression, break it up into an array 
# of tokens (numbers, parens, operations, etc)
# this also detects negation and cheges the symbol from '-' to 'n'
# and finally, this converts numbers to floats
def tokenize_expression(expression)
 #remove spaces
 s = expression.gsub(/\s/,'')
 #convert negations to 'n' character
 while md = s.match(%r{(?<![\d\)])-}) do
 s[md.begin(0)] = 'n'
 end
 #iterate through string
 #if number, get full number and add to array
 #otherwise grab just the first character which will be an operation or parenthesis
 tokens = []
 while not s.empty?
 if not s.match(%r{^\d}).nil? #first char is digit
 md = s.match %r{[\d\.]+}
 s = md.post_match
 tokens << md[0].to_f
 else
 tokens << s[0] #first char is parenthesis or operation
 s = s[1..-1] #everything but first char
 end
 end
 tokens
end
#take the array and make sub arrays based on the parentheses
# e.g. ['(', '1', '+', '(', '2', '+', '3', ')', ')'] -> ['1', '+', ['2', '+', '3']]
def nest_parens(tokens)
 result = []
 stack = []
 first, *rest = *tokens
 while not first.nil? do
 case first #look at first token
 when '(' 
 stack.push result #store current partial result on stack if open parens
 result = [] #start new result
 when ')'
 child = result #store result in temp var
 result = stack.pop #get previous partial result
 result << child #add temp result to current result
 else 
 result << first #add this token to the current result
 end 
 first, *rest = *rest
 end
 throw "Unclosed parenthesis" if not stack.empty?
 result
end
#find all the neagtions and convert them to nested postfix
# e.g. '5-n6' becomes '5-[n 6]'
def postfix_negation(tokens)
 return tokens if not tokens.is_a? Array
 tokens = tokens.map{ |t| postfix_negation(t) } #recursively process everything below the current level
 result = []
 first, *rest = *tokens
 while not first.nil? 
 case first
 when 'n'
 second, *rest = *rest
 result << [first, second] #e.g. [n 6]
 else
 result << first
 end
 first, *rest = *rest
 end
 result
end
#find all operations (mult/div or plus/minus) and convert to nested postfix
# e.g. '1+2*3' becomes '[+ 1 [* 2 3]]'
def postfix_ops(tokens, ops=['/','*'])
 return tokens if not tokens.is_a? Array
 tokens = tokens.map{ |t| postfix_ops(t, ops) } #recursively process everything below the current level
 result = []
 first, *rest = *tokens
 while not first.nil?
 second = rest.first #if there is an operator, second will contain it
 if ops.include? second 
 second, third, *rest = *rest
 first = [second, first, third] #[op, arg1, arg2]. This now becomes first and is compared again to the following tokens,
 next #which will handle cases like 1+2+3 --> [+ [+ 1 2] 3]
 else
 result << first
 end
 first, *rest = *rest
 end
 result
end
#take a fully processed, postfix tree and recursively evaluate the expressions
def eval(tree)
 return tree if not tree.is_a? Array #if this isn't an array, return it
 tree = tree.map {|n| eval(n) } #recursively process everything below the current level
 return tree.first if tree.length == 1 #sometimes we end up with a single value as an array, e.g. [5], so just return the inner value
 first, second, third = tree #process arguments
 case first
 when 'n' then return -second
 when '+' then return second + third
 when '-' then return second - third
 when '*' then return second * third
 when '/' then return second / third
 else raise "Unkown Op: #{first}"
 end 
end
#wrapper to call all the needed steps for processing
def calc(expression)
 tokens = tokenize_expression(expression)
 tokens = nest_parens(tokens)
 tokens = postfix_negation(tokens)
 tokens = postfix_ops(tokens, ['/','*'])
 tokens = postfix_ops(tokens, ['+','-'])
 eval(tokens)
end
#test input
[
"2/(2+3)*4", 
"2-(-(3*-2))",
"3*((4*5)*6*(7*8))",
"-(55--(-(1+2))--12)"
].each do |s|
 puts "\nString: #{s}"
 puts "Eval: #{calc(s)}"
end

Question 2

I don't fully understand the code, but I feel like you could use .each to iterate over lists instead of your whiles...

Question 3

@Caridorc I'm not advancing over the lists in an even fashion, sometimes I iterate over the same item multiple times, sometimes items are skipped

Question 4

I understand, it is not so straightforward as it seems

Question 5

If you are curious, you can add print statements between every line in calc() to see how the string is slowly converted to postifx (this was how I debugged it)

Question 6

I've rolled your question back. Please read What should I do when someone answers my question?

Question 7

Single responsibility principle

#parse the initial expression, break it up into an array 
# of tokens (numbers, parens, operations, etc)
# this also detects negation and cheges the symbol from '-' to 'n'
# and finally, this converts numbers to floats
def tokenize_expression(expression)

Your function tokenize_expression does:

break [the initial expression] up into an array
detects negation and changes the symbol from '-' to 'n'
converts numbers to floats

This should be three functions:

def negation_to_n(expression)
 s = expression.gsub(/\s/,'')
 while md = s.match(%r{(?<![\d\)])-}) do
 s[md.begin(0)] = 'n'
 end
 s
end
def split_tokens(expression)
 ### Implement
end
def numbers_to_floats(tokens)
 tokens.map{|t| t.match(%r{^\d}) ? t.to_f : t}
end

Use functional programming when practical

You should not feel forced to stick to imperative programming even when it becomes so hard to follow, for example negation_to_n can be written just like (please note that the string should be passed in already without spaces):

def matching_indexes(s, regex)
 s
 .enum_for(:scan, regex)
 .map { Regexp.last_match.begin(0) }
end
def negation_to_n(s)
 (0...s.length)
 .map{|i| matching_indexes(s, %r{(?<![\d\)])-}).include?(i) ? 'n' : s[i]}
end

Another example is split_tokens:

def split_tokens(s)
 s
 .chars
 .chunk {|d| is_digit?(d)}
 .map{|_, xs| xs}
 .map {|g| g.all? {|ch| is_digit?(ch)} ? g.join : g}
 .flatten
end

Surely other parts can be similarly simplified.

Question 8

I like what you have suggested with matching_indexes and negation_to_n. I'm not really familiar with ".enum_for", so I'll have to look into incorporating that in the future. I see what you are trying to suggest with break_into_array and numbers_to_floats but I question the usefulness. Detecting the full number and breaking it out is the hard part and requires the bulk of the work. Making a whole new function just to avoid appending ".to_f" on the end seems questionable. I suppose the "to_f" could be added to the operands in eval, but it would look messy

Question 9

I see where you are coming from: Making a whole new function you feel like writing a function more is a chore or a difficulty, but I think that more functions make it easier to write code because you can test them one by one.

Question 10

For what it's worth, tokenize_expression was originally a full class with a next_token() and to_a() methods, but it seemed like overkill and needlessly complicated, especially for purposes of soliciting feedback, so I merged it into a single function for posting.

Question 11

Instead of everything in your split_token method, might I suggest s.scan /\d+|./?

Question 12

@Zack please don't simplify your code when posting here. We want your real code, precisely to avoid the "but my real code looks like this" comments. Imagine how you'd feel if you took the time to offer assistance to someone and they replied with that.

Question 13

Updated Code After Feedback

I've remodeled my code after Caridorc's example. It took a lot of time to refactor the code, possibly longer than I spent writing it in the first place. I'd guess it is a little more readable and idiomatic than it was before, although I'm not sure it was worth the time investment. Regardless, I've at least learned several new ruby tricks out of all of this. Any thought on new code?

def apply_negation_symbol(s)
 copy = s.clone
 s
 .enum_for(:scan, %r{(?<![\d\)])-})
 .map { Regexp.last_match.begin(0) }
 .each { |ix| copy[ix] = 'n' }
 copy
end
def nest_parenthesis(s)
 stack = []
 s
 .chars
 .inject Array.new do |arr, char|
 case(char)
 when '(' then stack.push arr; Array.new
 when ')' then stack.pop << arr
 else arr << char
 end
 end
end
def is_numerical?(c)
 not (c =~ %r{[\d\.]}).nil?
end
def digits_to_floats(tree)
 return tree if not tree.is_a? Array
 tree = tree.map {|t| digits_to_floats(t) }
 tree
 .chunk {|d| is_numerical?(d)}
 .map{|_, xs| xs}
 .flat_map {|g| g.all? {|ch| is_numerical?(ch)} ? g.join.to_f : g}
end
def negation_to_postfix(tree)
 return tree if not tree.is_a? Array
 tree = tree.map{ |t| negation_to_postfix(t) }
 tree
 .slice_before{|item| item == 'n'}
 .flat_map { |arr| arr.first == 'n' ? [arr.first(2), arr[2..-1]] : arr}
 .delete_if { |item| item.is_a? Array and item.empty?}
end
def operations_to_postfix(tree, ops=['/','*'])
 return tree if not tree.is_a? Array
 tree = tree.map{ |t| operations_to_postfix(t, ops) }
 tree
 .map.with_index{ |item, ix| ops.include?(item) ? ix : nil}
 .compact
 .reverse
 .each do |ix|
 arg1, op, arg2 = tree.slice!(ix-1..ix+1)
 tree[ix-1] = [op, arg1, arg2]
 end
 tree
end
def calc(expression)
 s = apply_negation_symbol(expression)
 tree = nest_parenthesis(s)
 tree = digits_to_floats(tree)
 tree = negation_to_postfix(tree)
 tree = operations_to_postfix(tree, ops=['/','*'])
 tree = operations_to_postfix(tree, ops=['+','-'])
 eval(tree)
end

Question 14

I don't want to be negative for the sake of being negative, but while this is more functional, which I like, I find it way less readable than your original code.

Question 15

Those were more or less my feelings. Some of it is better, some of it is worse, and it took a whole lot of time. Changing negation_to_postfix() to use the same array replacement logic I used in operations_to_postfix() would help a little for readability, but altering arrays in place (even though it is a local variable) feels like it violates the spirit of functional programming.

Question 16

Overall I think your code is great. Almost zero WTFs. Of course, I wouldn't be posting if I didn't have suggestions. I've broken it up into methods below, with a "Miscellaneous" section at the end for general advice.

tokenize_expression

I'll take this in two sections:

Convert negations to n

I think you overthought this part a bit. This:

#convert negations to 'n' character
while md = s.match(%r{(?<![\d\)])-}) do
 s[md.begin(0)] = 'n'
end

...is better expressed like this:

s.gsub!(/(?<![\d\)])-/, 'n')

Consume tokens

Your code:

while not s.empty?
 if not s.match(%r{^\d}).nil? #first char is digit
 md = s.match %r{[\d\.]+}
 s = md.post_match
 tokens << md[0].to_f
 else
 tokens << s[0] #first char is parenthesis or operation
 s = s[1..-1] #everything but first char
 end
end

Firstly, this line:

if not s.match(%r{^\d}).nil?

...would be better expressed as:

if s.match(/^\d/) # `nil` is falsey in Ruby

...but since you're not using the MatchData object returned by match, you should use the =~ operator (it returns a Fixnum, which is always truthy, or nil):

if s =~ /^\d/

Secondly, this is a waste:

if not s.match(%r{^\d}).nil? #first char is digit
 md = s.match %r{[\d\.]+}
 s = md.post_match

You're throwing away the MatchData object from the first line and then performing a very similar match on the second line. You can do this with one match operation:

if md = s.match(/^\d+(\.\d+)?/)
 s = md.post_match

This Regexp matches one or more digits optionally followed by a decimal point and one or more digits. (This also solves a bug in your code that would have matched 1.2.3 as a valid number.)

We can improve this further, though:

until s.empty?
 tokens <<
 if num = s.slice!(/^\d+(\.\d+)?/)
 num.to_f
 else
 s.slice!(0)
 end
end

Instead of matching the expression, checking the result, adding the matched part to tokens and setting s to everything after it, we do all of those things at once by passing the expression to String#slice!. If nothing is matched slice! returns nil, so we do s.slice!(0) to chop off just the first character instead.

Or further, although we're getting somewhat occult now:

while s.slice!(/^\d+(?:\.\d+)?|(.)/)
 tokens << (1ドル || $&.to_f)
end

This is basically the same except I've added an alternation that, if a number wasn't matched, captures the first character in capture group 1 (1ドル). If a number was matched, 1ドル will be nil and so we call to_f on the whole match ($&.to_f).

nest_parens

This method is really solid. I only have one suggestion, which is to use Enumerable#reduce or Enumerable#each_with_object instead of a while loop:

result = tokens.reduce([]) do |res, token|
 case token
 when '(' then stack.push(res); []
 when ')' then stack.pop << res
 else res << token
 end
end

postfix_negation

This one is also really solid. Unlike the above I don't think reduce works well here (since you're sometimes, but not always, consuming a second element).

One thing that's slightly smelly to me is having first, *rest = tokens before your loop and first, *rest = rest at the bottom:

first, *rest = *tokens
while not first.nil?
 case first
 when 'n'
 second, *rest = *rest
 result << [first, second] #e.g. [n 6]
 else
 result << first
 end
 first, *rest = *rest
end

Also, I think a case expression is overkill when you only have one condition and an else. I'd probably rewrite your loop like this:

while first = tokens.shift
 if first == 'n'
 result << [ first, tokens.shift ]
 else
 result << first
 end
end

postfix_ops

Similar to the above:

while first = tokens.shift
 if ops.include?(tokens.first)
 second, third = tokens.shift(2)
 tokens.unshift([ second, first, third ])
 else
 result << first
 end
end

As you can see, I prefer a, b = rest.shift(2) to a, b, *rest = rest.

eval

Don't shadow Kernel#eval. I'd rename this method to evaluate.

Otherwise, this method is very good. One thing I'd do is remove all of the returns from your case expression. The case is the last expression in the method, so whichever then is reached will be the return value of the method:

case first
 when 'n' then -second
 when '+' then second + third
 # ...
 else raise "Unknown Op: #{first}"
end

If you want to get fancy, you could replace this with the following:

if first == 'n'
 first, second, third = 0, '-', second
end
first.send(second, third)

However, this way you lose error handling for unknown operators. You could add it back in simply enough, but by then it might not be worth it.

Miscellaneous

Use consistent indentation. Two spaces is standard for Ruby.
Don't use do after a while (or for). It's implied.
Use unless instead of if not.
while not first.nil? do should either be while first or until first.nil?. Similarly, I recommend until s.empty? instead of while not s.empty?.
You don't need the splat before tokens in first, *rest = *tokens, because tokens is already an array.
Regexp "percent notation" (%r{...}) is good for reducing visual noise when your expression contains slashes, because you don't have to escape them (%r{a/b/c} vs. /a\/b\/c/); however, when you don't have any slashes, I find it actually adds visual noise and prefer normal slash notation.

Question 17

Thanks for your feedback, Jordan. A few of your suggestions I've already done on the improved code. Your suggestion about using gsub() for swapping out the negation symbol made me smack my forehead; I really overthought that one. With regards to Misc#5, the splat before tokens is required for what I want (a Lisp style first() and rest() destructuring); it doesn't work without it. As for Misc #6, I left it %r{} to keep the regexes consistent, rather than mixing escape syntax. Thank you again for your feedback

Question 18

You're mistaken about the splat. Take a look: ideone.com/PtwYaQ A splat on the right-hand side of an assignment just invokes to_ary, on it; since tokens is an array this is a no-op.

Question 19

Hmm. According to this article (devblog.avdi.org/2010/01/31/first-and-rest-in-ruby), it was required for proper handling of empty lists, but maybe it's changed in more recent versions of ruby (which testing seems to confirm)

Caridorc Caridorc 28k7 gold badges54 silver badges137 bronze badges · Accepted Answer · 2016-02-10 20:50:43Z

Single responsibility principle

#parse the initial expression, break it up into an array 
# of tokens (numbers, parens, operations, etc)
# this also detects negation and cheges the symbol from '-' to 'n'
# and finally, this converts numbers to floats
def tokenize_expression(expression)

Your function tokenize_expression does:

break [the initial expression] up into an array
detects negation and changes the symbol from '-' to 'n'
converts numbers to floats

This should be three functions:

def negation_to_n(expression)
 s = expression.gsub(/\s/,'')
 while md = s.match(%r{(?<![\d\)])-}) do
 s[md.begin(0)] = 'n'
 end
 s
end
def split_tokens(expression)
 ### Implement
end
def numbers_to_floats(tokens)
 tokens.map{|t| t.match(%r{^\d}) ? t.to_f : t}
end

Use functional programming when practical

You should not feel forced to stick to imperative programming even when it becomes so hard to follow, for example negation_to_n can be written just like (please note that the string should be passed in already without spaces):

def matching_indexes(s, regex)
 s
 .enum_for(:scan, regex)
 .map { Regexp.last_match.begin(0) }
end
def negation_to_n(s)
 (0...s.length)
 .map{|i| matching_indexes(s, %r{(?<![\d\)])-}).include?(i) ? 'n' : s[i]}
end

Another example is split_tokens:

def split_tokens(s)
 s
 .chars
 .chunk {|d| is_digit?(d)}
 .map{|_, xs| xs}
 .map {|g| g.all? {|ch| is_digit?(ch)} ? g.join : g}
 .flatten
end

Surely other parts can be similarly simplified.

I like what you have suggested with matching_indexes and negation_to_n. I'm not really familiar with ".enum_for", so I'll have to look into incorporating that in the future. I see what you are trying to suggest with break_into_array and numbers_to_floats but I question the usefulness. Detecting the full number and breaking it out is the hard part and requires the bulk of the work. Making a whole new function just to avoid appending ".to_f" on the end seems questionable. I suppose the "to_f" could be added to the operands in eval, but it would look messy
I see where you are coming from: Making a whole new function you feel like writing a function more is a chore or a difficulty, but I think that more functions make it easier to write code because you can test them one by one.
For what it's worth, tokenize_expression was originally a full class with a next_token() and to_a() methods, but it seemed like overkill and needlessly complicated, especially for purposes of soliciting feedback, so I merged it into a single function for posting.
Instead of everything in your split_token method, might I suggest s.scan /\d+|./?
@Zack please don't simplify your code when posting here. We want your real code, precisely to avoid the "but my real code looks like this" comments. Imagine how you'd feel if you took the time to offer assistance to someone and they replied with that.

Stack Exchange Network

CodeWars mathematical expresion evaluator

3 Answers 3

Single responsibility principle

Use functional programming when practical

Updated Code After Feedback

tokenize_expression

nest_parens

postfix_negation

postfix_ops

eval

Miscellaneous

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

CodeWars mathematical expresion evaluator

3 Answers 3

Single responsibility principle

Use functional programming when practical

Updated Code After Feedback

tokenize_expression

nest_parens

postfix_negation

postfix_ops

eval

Miscellaneous

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions