My solution to the Coderbyte task 'Vowel Square'. It's a simple task, but I'm sure there's a more Pythonic solution than mine.
The problem:
Vowel Square
Have the function VowelSquare(strArr)
take the strArr
parameter being passed
which will be a 2D matrix of some arbitrary size filled with letters from
the alphabet, and determine if a 2x2 square composed entirely of vowels
exists in the matrix. For example: strArr
is ["abcd", "eikr", "oufj"]
then
this matrix looks like the following:
a b c d
e i k r
o u f j
Within this matrix there is a 2x2 square of vowels starting in the second row and first column, namely, ei, ou. If a 2x2 square of vowels is found your program should return the top-left position (row-column) of the square, so for this example your program should return 1-0. If no 2x2 square of vowels exists, then return the string not found. If there are multiple squares of vowels, return the one that is at the most top-left position in the whole matrix. The input matrix will at least be of size 2x2.
Examples:
- Input:
["aqrst", "ukaei", "ffooo"]
- Output: 1-2,
- Input:
["gg", "ff"]
- Output: not found
My solution:
VOWELS = "aeiouAEIOU"
def find_vowel_square(strs: list):
"""Return the top left grid ref of any 2x2 sq composed of vowels only
If more than one 2x2 sq exists. Return that which is at the most top-left
position.
"""
height, width = len(strs), len(strs[0])
# Ensure all strings within strs are of equal length
assert all(len(s) == width for s in strs)
for string_i in range(height-1):
for i in range(width-1):
if strs[string_i][i] in VOWELS and strs[string_i][i+1] in VOWELS:
if strs[string_i+1][i] in VOWELS and strs[string_i+1][i+1] in VOWELS:
return f"{i}-{string_i}"
return "Not found"
assert find_vowel_square(strs=["aqree", "ukaei", "ffooo"]) == "3-0"
assert find_vowel_square(strs=["aqrst", "ukaei", "ffooo"]) == "2-1"
assert find_vowel_square(strs=["gg", "ff"]) == "Not found"
-
\$\begingroup\$ What does "most top-left" mean? Minimum value of x+y, or something else? \$\endgroup\$Toby Speight– Toby Speight2020年01月13日 16:51:23 +00:00Commented Jan 13, 2020 at 16:51
4 Answers 4
These are all relatively minor notes:
You can use
typing.List
to provide a better type definition for the parameter (List[str]
), precisely defining what it's a list of.I always try to avoid giving variables names that are just a variation on the Python type to tell me what its type is (that's the type annotation's job); the problem description calls this the "matrix" so I'll use that.
If the problem description doesn't say what to do with invalid input, I'd assume it's fine (and preferable) to just let the code raise an exception if any assumptions are violated.
If you're checking that a bunch of iterable conditions are all true, I think it generally looks nicer to use the
all
function than to do a bunch ofand
s.For iterating over indices (i.e. where it's super obvious from context what the variable represents and a longer name only serves to distract) I always prefer using short, generic variable names like
i, j
orx, y
.
from typing import List
VOWELS = "aeiouAEIOU"
def find_vowel_square(matrix: List[str]):
"""Return the top left grid ref of any 2x2 sq composed of vowels only
If more than one 2x2 sq exists. Return that which is at the most top-left
position.
"""
for y in range(len(matrix) - 1):
for x in range(len(matrix[y]) - 1):
if all (matrix[i][j] in VOWELS
for i, j in [(y, x), (y+1, x), (y, x+1), (y+1, x+1)]):
return f"{x}-{y}"
return "Not found"
We can use doctest
instead of the asserts:
import doctest
def find_vowel_square(strs: list):
"""Return the top left grid ref of any 2x2 sq composed of vowels only.
>>> find_vowel_square(strs=["aqree", "ukaei", "ffooo"])
'3-0'
>>> find_vowel_square(["aqrst", "ukaei", "ffooo"])
'2-1'
>>> find_vowel_square(strs=["gg", "ff"])
'Not found'
"""
# (snip implementation)
if __name__ == "__main__":
import doctest
doctest.testmod()
I'd write a function with a more conventional interface: return positions as a list of two integers, and None
when not found. It's easy to provide a simple adapter from that to the strings that the problem statement requires.
Consider how you would extend this to find a N×ばつN block of vowels in the grid. Hint: for efficiency, we can start by reading only every Nth line and only when we find a suitable run of vowels, look up and down in the matrix.
Toby & Sam both make excellent points; I won't repeat them. I would like the add the following:
Use sets with in
You are repeatedly testing whether a letter is in
the string VOWELS
. This requires a linear search through the string, checking each substring to see if it matches.
If instead, you declared VOWELS
as a set
:
VOWELS = set("aeiouAEIOU")
then the in
operation should become a much faster \$O(1)\$ lookup.
Repeated tests
You may be checking strs[x][y] in VOWELS
up to 4 times. Once for strs[string_i][i]
, once for strs[string_i][i+1]
, once for strs[string_i+1][i]
, and once for strs[string_i+1][i+1]
.
You could perform the check exactly once per character:
vowel = [[ch in VOWELS for ch in line] for line in strs]
With ["abcd", "eikr", "oufj"]
, this would yield the matrix:
[[True, False, False, False],
[True, True, False, False],
[True, True, False, False]]
Then testing vowel[x][y]
, vowel[x+1][y]
, vowel[x][y+1]
, vowel[x+1][y+1]
would be simple lookup operations, instead of more complicated in
containment tests.
An alternate approach
Using this matrix of vowel flags, you could compute whether pairs of vowels exist in a row, by AND-ing each adjacent pair of boolean values together:
vowel_pairs = [[a and b for a, b in zip(row[:-1], row[1:])] for row in vowels]
resulting in
[[False, False, False],
[True, False, False],
[True, False, False]]
Then, you could AND each pair of adjacent rows:
vowel_square = [[a and b for a, b in zip(row_a, row_b)]
for row_a, row_b in zip(vowel_pairs[:-1], vowel_pairs[1:])]
resulting in
[[False, False, False],
[True, False, False]]
The True
in the row 1, column 0 means a 2x2 vowel square occurs there. Assuming "most top-left" means first in top-to-bottom, then left-to-right ordering, we can extract the locations of all vowel squares using a generator expression:
locations = (f"{x}-{y}"
for x, row in enumerate(vowel_square) for y, flag in enumerate(row)
if flag)
and extract only the first if it exists:
return next(locations, "Not found")
Note: this matrix row/column AND'ing does a complete analysis of the matrix, finding any and all 2x2 vowel matrices. It does not stop when the first one is found, so may be slower if the required 2x2 matrix is found early on. In the case of no 2x2 vowel matrix existing, it may perform the fastest, since every possibility would need to be checked. However, it would not scale well to larger submatrices. Toby's recommendation of searching every Nth line would be better.
-
\$\begingroup\$ Note that the vowels string is pretty short. It might be worth testing the speed of string vs set. \$\endgroup\$JollyJoker– JollyJoker2020年01月14日 10:30:19 +00:00Commented Jan 14, 2020 at 10:30
-
4\$\begingroup\$ @JollyJoker I did test this: str was ~35% slower on my setup. \$\endgroup\$Dave– Dave2020年01月14日 17:58:20 +00:00Commented Jan 14, 2020 at 17:58
Taking into account the three answers(at the time of posting) posted, I've come up with the below answer(hope I didn't miss anything).
I have attempted to use every search every nth line as per Toby's suggestion but ran into issues with it finding a solution that was not the top, leftmost. I will hopefully get a chance to pursue it further and do some speed tests.
import doctest
from typing import List
VOWELS = set("aeiouAEIOU")
def _make_coderbyte_compliant(result: None or List[int]) -> str:
"""Make Coderbyte compliant, the result of find_vowel_square func call
Helper function.
>>> _make_coderbyte_compliant(result=None)
'Not found'
>>> _make_coderbyte_compliant(result=[1, 2])
'1-2'
"""
if result == None:
return "Not found"
return f"{result[0]}-{result[1]}"
def find_vowel_square(matrix: List[str], n: int) -> list:
"""Return the top left grid ref of any N x N sq composed of vowels only
If more than one N x N sq exists, return that which is at the most top-left
position.
:param matrix: the matrix as a list of strings, all equal length
:param n: int. The width and height of the vowel square for which to search
:returns: None or list of top, left index coordinated of the found vowel square
>>> find_vowel_square(matrix=["aqree", "ukaei", "ffooo"], n=2)
[3, 0]
>>> find_vowel_square(matrix=["aqrst", "ukaei", "ffooo"], n=2)
[2, 1]
>>> find_vowel_square(matrix=["aqiii", "ukaei", "ffooo"], n=3)
[2, 0]
>>> find_vowel_square(matrix=["aqtyt", "rtrtt", "aaaaa", "ukaei", "ffooo"], n=3)
[2, 2]
>>> find_vowel_square(matrix=["aqtyt", "aaaaa", "aaaaa", "uuuuu", "oooo"], n=4)
[0, 1]
>>> find_vowel_square(matrix=["gg", "ff"], n=2)
"""
height, width = len(matrix), len(matrix[0])
# True if vowel else False
bool_matrix = [[l in VOWELS for l in line] for line in matrix]
for y in range(height-(n-1)):
for x in range(width-(n-1)):
if all(cell for row in bool_matrix[y:y+n]
for row in bool_matrix[x:x+n]):
return [x, y]
return None
if __name__ == "__main__":
import doctest
doctest.testmod()
assert _make_coderbyte_compliant(
result=find_vowel_square(matrix=["aqree", "ukaei", "ffooo"], n=2)) == "3-0"
assert _make_coderbyte_compliant(
result=find_vowel_square(matrix=["gg", "ff"], n=2)) == "Not found"
-
1\$\begingroup\$ Now you don’t need the first
if all(...)
; it is completely covered by the second. \$\endgroup\$AJNeufeld– AJNeufeld2020年01月14日 19:12:10 +00:00Commented Jan 14, 2020 at 19:12 -
1\$\begingroup\$ Avoid the slow Python addition and indexing. Use
all(cell for row in bool_matrix[y:y+n] for cell in row[x:x+n])
. My testing shows a 2.7x speed up for n = 99, when all cells are True. \$\endgroup\$AJNeufeld– AJNeufeld2020年01月14日 19:23:01 +00:00Commented Jan 14, 2020 at 19:23 -
1\$\begingroup\$
find_vowel_square
returns aList[int]
, not astr
. Your type hint is wrong. \$\endgroup\$AJNeufeld– AJNeufeld2020年01月15日 14:10:16 +00:00Commented Jan 15, 2020 at 14:10
Explore related questions
See similar questions with these tags.