Python program to remove invalid parentheses

Question 1

Remove the minimum number of invalid parentheses in order to make the input string valid. Return all possible results.

Note -

The input string may contain letters other than the parentheses ( and ).

Here is my solution to this challenge -

def remove_invalid_parentheses(s):
 removed = 0
 results = {s}
 count = {"(": 0, ")": 0}
 for i, c in enumerate(s):
 if c == ")" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(i - removed + 1):
 if result[j] == ")":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 removed += 1
 else:
 if c in count:
 count[c] += 1
 count = {"(": 0, ")": 0}
 i = len(s)
 ll = len(s) - removed
 for ii in range(ll - 1, -1, -1):
 i-=1
 c = s[i]
 if c == "(" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(ii, ll):
 if result[j] == "(":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 ll -= 1
 else:
 if c in count:
 count[c] += 1
 return list(results)

Explanation -

Scan from left to right, make sure count["("] >= count[")"].
Then scan from right to left, make sure count["("] <= count[")"].

Here are some inputs/outputs -

print(remove_invalid_parentheses("()())()"))
>>> ["()()()", "(())()"]
print(remove_invalid_parentheses("(a)())()"))
>>> ["(a)()()", "(a())()"]
print(remove_invalid_parentheses(")("))
>>> [""]

Here are the times taken for each output -

%timeit remove_invalid_parentheses("()())()")
6.54 μs ± 176 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit remove_invalid_parentheses("(a)())()")
8.43 μs ± 1.29 μs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit remove_invalid_parentheses(")(")
3.69 μs ± 67.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So, I would like to know whether I could make this program shorter and more efficient.

Any help would be highly appreciated.

Question 2

Test, consistent behavior & data structure

Trying to add tests for the code based on the example of input/output provided led to the following result: the behavior is not consistent : in particular the order of the elements in the returned list is not always the same.

The root cause for this is that we are using sets in the logic only to convert to list at the end.

A solution would be to return sorted(results) at the end. An alternative would be to just return a set instead of trying to convert to list.

Then, we get the following automated tests:

print(remove_invalid_parentheses("()())()") == {"(())()", "()()()"})
print(remove_invalid_parentheses("(a)())()") == {"(a())()", "(a)()()"})
print(remove_invalid_parentheses(")(") == {""})

It would make sense to try to rewrite it as unit-tests using a proper unit-test framework but I will stop here as this is enough for me to keep working.

More explicit loop

I tend to find loops based on for ii in range(ll - 1, -1, -1) very error-prone/hard to understand because of the number of tweaks on both sides of the range. I'd suggest a much more explicit alternative: for ii in reversed(range(ll)):.

Similar indices

As ii will go over ]ll, 0] which is ]len(s) - removed, 0], i will go over ]len(s), removed].

The difference between the indices is always removed, we have ii + removed == i: we can get rid of the i variable. We can take this chance to rename ii into i.

At this stage, we have:

def remove_invalid_parentheses(s):
 removed = 0
 results = {s}
 count = {"(": 0, ")": 0}
 for i, c in enumerate(s):
 if c == ")" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(i - removed + 1):
 if result[j] == ")":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 removed += 1
 elif c in count:
 count[c] += 1
 count = {"(": 0, ")": 0}
 ll = len(s) - removed
 for i in reversed(range(ll)):
 c = s[i + removed]
 if c == "(" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(i, ll):
 if result[j] == "(":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 ll -= 1
 elif c in count:
 count[c] += 1
 return results
print(remove_invalid_parentheses("()())()") == {"(())()", "()()()"})
print(remove_invalid_parentheses("()())))()") == {"(())()", "()()()"})
print(remove_invalid_parentheses("(a)())()") == {"(a())()", "(a)()()"})
print(remove_invalid_parentheses(")(") == {""})

(I took this chance to add a new test case)

More explicit loop again

We want to iterate in reversed order over the beginning of the s string. Here again, we can try to make things more explicit and remove some computation as we go:

 for i, c in enumerate(reversed(s[removed:])):
 if c == "(" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(len(s) - i - removed - 1, ll):
 if result[j] == "(":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 ll -= 1
 elif c in count:
 count[c] += 1

In order to test this more throughly, I took this chance to add the following test case:

print(remove_invalid_parentheses("((((((((()(") == {"()"})

More beautiful loops

Instead of using a combination of while and pop to iterate over the elements of a set, you can just use for:

 for result in results:

Work in progress: I have to go. I may continue in the future.

SylvainD SylvainD 29.8k1 gold badge49 silver badges93 bronze badges · Accepted Answer · 2019-05-29 09:24:38Z

Test, consistent behavior & data structure

Trying to add tests for the code based on the example of input/output provided led to the following result: the behavior is not consistent : in particular the order of the elements in the returned list is not always the same.

The root cause for this is that we are using sets in the logic only to convert to list at the end.

A solution would be to return sorted(results) at the end. An alternative would be to just return a set instead of trying to convert to list.

Then, we get the following automated tests:

print(remove_invalid_parentheses("()())()") == {"(())()", "()()()"})
print(remove_invalid_parentheses("(a)())()") == {"(a())()", "(a)()()"})
print(remove_invalid_parentheses(")(") == {""})

It would make sense to try to rewrite it as unit-tests using a proper unit-test framework but I will stop here as this is enough for me to keep working.

More explicit loop

I tend to find loops based on for ii in range(ll - 1, -1, -1) very error-prone/hard to understand because of the number of tweaks on both sides of the range. I'd suggest a much more explicit alternative: for ii in reversed(range(ll)):.

Similar indices

As ii will go over ]ll, 0] which is ]len(s) - removed, 0], i will go over ]len(s), removed].

The difference between the indices is always removed, we have ii + removed == i: we can get rid of the i variable. We can take this chance to rename ii into i.

At this stage, we have:

def remove_invalid_parentheses(s):
 removed = 0
 results = {s}
 count = {"(": 0, ")": 0}
 for i, c in enumerate(s):
 if c == ")" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(i - removed + 1):
 if result[j] == ")":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 removed += 1
 elif c in count:
 count[c] += 1
 count = {"(": 0, ")": 0}
 ll = len(s) - removed
 for i in reversed(range(ll)):
 c = s[i + removed]
 if c == "(" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(i, ll):
 if result[j] == "(":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 ll -= 1
 elif c in count:
 count[c] += 1
 return results
print(remove_invalid_parentheses("()())()") == {"(())()", "()()()"})
print(remove_invalid_parentheses("()())))()") == {"(())()", "()()()"})
print(remove_invalid_parentheses("(a)())()") == {"(a())()", "(a)()()"})
print(remove_invalid_parentheses(")(") == {""})

(I took this chance to add a new test case)

More explicit loop again

We want to iterate in reversed order over the beginning of the s string. Here again, we can try to make things more explicit and remove some computation as we go:

 for i, c in enumerate(reversed(s[removed:])):
 if c == "(" and count["("] == count[")"]:
 new_results = set()
 while results:
 result = results.pop()
 for j in range(len(s) - i - removed - 1, ll):
 if result[j] == "(":
 new_results.add(result[:j] + result[j + 1:])
 results = new_results
 ll -= 1
 elif c in count:
 count[c] += 1

In order to test this more throughly, I took this chance to add the following test case:

print(remove_invalid_parentheses("((((((((()(") == {"()"})

More beautiful loops

Instead of using a combination of while and pop to iterate over the elements of a set, you can just use for:

 for result in results:

Work in progress: I have to go. I may continue in the future.

Stack Exchange Network

Python program to remove invalid parentheses

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Python program to remove invalid parentheses

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions