Balanced smileys check algorithm (part 2)

Question 1

This is a follow-up to my previous question.

Problem

Your friend John uses a lot of emoticons when you talk to him on Messenger. In addition to being a person who likes to express himself through emoticons, he hates unbalanced parentheses so much that it makes him go :(

Sometimes he puts emoticons within parentheses, and you find it hard to tell if a parenthesis really is a parenthesis or part of an emoticon.

A message has balanced parentheses if it consists of one of the following:

An empty string ""

One or more of the following characters: 'a' to 'z', ' ' (a space) or ':' (a colon)

An open parenthesis '(', followed by a message with balanced parentheses, followed by a close parenthesis ')'.

A message with balanced parentheses followed by another message with balanced parentheses.

A smiley face ":)" or a frowny face ":("

Write a program that determines if there is a way to interpret his message while leaving the parentheses balanced.

I'm working on this balanced smileys checking algorithm, and my current solution is very naive, with just two rules:

At any point, the number of ) (close) should be less than the number of ( (open) + number of :( (frown)
At the end, the number of ( (open) should be less than the number of ) (close) and :) (smile)

I'm wondering if there are any bugs in my checking logic. Any advice on improving the algorithm's time complexity or the code style is highly appreciated as well.

def check_balance(source):
    left = 0
    right = 0
    frown = 0
    smile = 0
    for i,c in enumerate(source):
        if c == '(':
            left += 1
            if i > 0 and source[i-1] == ':': # :(
                left -= 1
                frown += 1
        elif c == ')':
            right += 1
            if i > 0 and source[i-1] == ':': # :)
                right -= 1
                smile += 1
                if left + frown < right:
                    return False
    if left > right + smile:
        return False
    return True

if __name__ == "__main__":
    raw_smile_string = 'abc(b:)cdef(:()'
    print check_balance(raw_smile_string)

Question 2

You've asked a lot of questions on Code Review. Please do a better job picking tags for your questions. In this case, since you're asking a follow-up question, it should have been obvious which tags to use.

Question 3

@200_success, I would love to follow your advice. For a follow-up question, what is the suggested tag?

Question 4

Well, I added balanced-delimiters to the previous question, so obviously it should apply to this question too.

Question 5

@200_success, for sure, thanks. Is it possible for me to add some tags, so that when I post a different version of the code, I can refer to the old post by tag from the new post?

Question 6

In the verbal description of the algorithm, "should be less than" implies "not equal", which would not be correct. A wording such as "must be at most" would be totally clear.
Your code has a bug because you are not quite following rule 1. The rule says "at any point...", but you only check the rule after a smiley. You miss the case when a lone ) breaks the rule.

Instead of canceling += 1 with -= 1 here...

left += 1
if i > 0 and source[i-1] == ':': # :(
    left -= 1
    frown += 1

... it would be clearer to use else:

if i > 0 and source[i-1] == ':': # :(
    frown += 1
else:
    left += 1

Instead of if i > 0 and source[i-1] == ':' it would be simpler to remember the previous character in a variable:

previous_char = None
for char in source:
    if char == '(':  # and likewise for ')'
        if previous_char == ':':
            ...
    previous_char = char
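
Putting the three suggestions together, the function might look like the sketch below. It keeps the question's two counting rules and its names, checks rule 1 after every ')' rather than only after ':)', and is written to run on both Python 2 and 3. It is not a complete checker: counting alone still accepts ':)(' for example, where the final '(' can never be closed.

```python
def check_balance(source):
    left = right = frown = smile = 0
    previous_char = None
    for char in source:
        if char == '(':
            if previous_char == ':':  # ":(" may be a frown
                frown += 1
            else:
                left += 1
        elif char == ')':
            if previous_char == ':':  # ":)" may be a smile
                smile += 1
            else:
                right += 1
            # Rule 1, now checked after every ')', not only after ":)".
            if right > left + frown:
                return False
        previous_char = char
    # Rule 2: every definite '(' needs a ')' or a ":)" to close it.
    return left <= right + smile
```

With this change a lone closing parenthesis, as in 'a)' or ')(', is rejected, while the question's sample string is still accepted.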

Question 7

Instead of manually storing the previous character in a variable, it would be clearer to use the pairwise recipe from the itertools documentation.

Question 8

@MathiasEttinger Good idea, but then checking if the first character of the string is ( or ) would be a special case.

Question 9

Not necessarily, you just need to adapt pairwise: a, b = tee(iterable); yield next(a), None; yield from zip(a, b). Then you iterate with for char, previous in ....
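
A runnable version of that adapted recipe could look like the sketch below. The name with_previous is made up for illustration, and the explicit StopIteration handling is an addition: on Python 3.7+ a bare next(a) inside a generator turns an empty input into a RuntimeError (PEP 479).

```python
from itertools import tee


def with_previous(iterable):
    """Yield (item, previous_item) pairs; the first item is paired with None."""
    current, previous = tee(iterable)
    try:
        first = next(current)
    except StopIteration:
        return  # empty input: yield nothing
    yield first, None
    # 'current' is now one step ahead of 'previous'.
    yield from zip(current, previous)


for char, previous in with_previous(":)"):
    print(repr(char), repr(previous))
```

The first character is paired with None, so a leading ( or ) needs no special case in the loop body.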

Question 10

Thanks Janne, I posted a new thread (codereview.stackexchange.com/questions/154783/…) to address your comments; if you have advice on that, it would be great. I have marked your reply as the answer.

Question 11

Some notes:

Variable naming
- Can be greatly improved by the use of an IDE like IDEA which supports trivial variable renaming.
- Don't use single-character variable names; index and character (or even char) are much more readable than i and c.
- It's not immediately obvious what any of the variables are.
You only have one test case. There should be test cases including and excluding each of the types of content of a message. If you get started with the simplest one you can even use TDD to get to a working but simple design.
Knowing the index of the unbalanced parenthesis would be nice.
Checking the number of parentheses is not enough; you have to check whether they match up. For example, your algorithm fails (reports True when it shouldn't) on strings like )(.
The assignment calls for a recursive solution:

A message has balanced parentheses if it consists of one of the following: [...] An open parenthesis '(', followed by a message with balanced parentheses, followed by a close parenthesis ')'.

This means that within each pair of parentheses there must be a valid message, meaning all the rules apply within them.
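
One way to follow the grammar directly is to try both readings of every ':(' and ':)' and accept the message if any reading balances. The sketch below does that with a memoized recursive helper and includes a handful of test cases of the kind suggested above. The names is_balanced and check are illustrative, the allowed character set (a to z, space, colon) is assumed rather than verified, and the recursion depth grows with the message length, so very long messages would need an iterative rewrite or a higher recursion limit.

```python
from functools import lru_cache


def is_balanced(message):
    """Return True if some reading of ':(' and ':)' balances the parentheses."""

    @lru_cache(maxsize=None)
    def check(index, depth):
        # depth = number of '(' opened and not yet closed in this reading.
        if depth < 0:
            return False  # a ')' closed something that was never opened
        if index == len(message):
            return depth == 0  # every '(' must have been closed
        char = message[index]
        if char == ':' and message[index + 1:index + 2] in ('(', ')'):
            # Reading 1: the two characters form an emoticon, so skip both.
            if check(index + 2, depth):
                return True
            # Reading 2: plain colon; the next call handles the parenthesis.
            return check(index + 1, depth)
        if char == '(':
            return check(index + 1, depth + 1)
        if char == ')':
            return check(index + 1, depth - 1)
        return check(index + 1, depth)  # letters, spaces, lone colons

    return check(0, 0)


if __name__ == "__main__":
    cases = {
        "": True,
        "(:)": True,
        "abc(b:)cdef(:()": True,
        ")(": False,
        ":((": False,
        ":)(": False,
    }
    for text, expected in cases.items():
        assert is_balanced(text) == expected, text
```

Because the helper tracks nesting depth position by position, it rejects inputs such as )( that pure counting lets through.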

Question 12

Hi l0b0, I have the same question as the one from @JanneKarila; if you could help to clarify, that would be great.

Question 13

Hi l0b0, are there any logic bugs in my posted code? By a logic bug, I mean a case where my code returns the wrong True/False conclusion about whether the message is balanced.

Question 14

I did post an example of input leading to the wrong conclusion. And which "same question" do you refer to?

Question 15

Thanks l0b0, I think you mean the example of )(?

Question 16

BTW l0b0, I posted a new thread (codereview.stackexchange.com/questions/154783/…) to address your comments; if you have advice on that, it would be great. I have upvoted all of your comments.

Janne Karila · Accepted Answer · 2017-02-09 08:29:41Z
