\$\begingroup\$

Here is a problem from CodingBat:

Given 2 strings, a and b, return the number of the positions where they contain the same length 2 substring. So "xxcaazz" and "xxbaaz" yields 3, since the "xx", "aa", and "az" substrings appear in the same place in both strings.
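
To make the example concrete, here is a minimal sketch that lists the matching positions instead of only counting them. The helper name `matching_positions` is hypothetical and introduced purely for illustration.

```python
def matching_positions(a, b):
    """Return (index, substring) pairs where a and b share a length-2 substring."""
    shorter = min(len(a), len(b))
    # Only indices 0 .. shorter-2 can start a full length-2 substring in both strings.
    return [(i, a[i:i+2]) for i in range(shorter - 1) if a[i:i+2] == b[i:i+2]]


print(matching_positions("xxcaazz", "xxbaaz"))
# [(0, 'xx'), (3, 'aa'), (4, 'az')]
```

The three pairs correspond to the count of 3 given in the problem statement.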

There are several possible answers, but it may be hard to choose which one is preferable, such as:

# Solution 1
# Using for loop
def strmatch_forloop(a, b):
    shorter = min(len(a), len(b))
    count = 0
    for i in range(shorter-1):
        a_sub = a[i:i+2]
        b_sub = b[i:i+2]
        if a_sub == b_sub:
            count = count + 1
    return count

# Solution 2
# Using list comprehension
def strmatch_listcomp(a, b):
    shorter = min(len(a), len(b))
    return [a[i:i+2] == b[i:i+2] for i in range(shorter-1)].count(True)

# Solution 3
# Using generator
def strmatch_gen(a, b):
    shorter = min(len(a), len(b))
    return sum(a[i:i+2] == b[i:i+2] for i in range(shorter-1))

Note that "preferable" is subjective; it may refer to speed, memory use, or coding style. For instance, their speeds are reported as:

%timeit strmatch_forloop
10000000 loops, best of 3: 21.7 ns per loop
%timeit strmatch_listcomp
10000000 loops, best of 3: 22.9 ns per loop
%timeit strmatch_gen
10000000 loops, best of 3: 21.8 ns per loop

According to these results, there may be no difference in speed between the approaches. For memory use, %memit shows similar results. Coding style, however, is too subjective to measure. How should I choose among them?
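
One caveat about the measurements above: `%timeit strmatch_forloop` without parentheses never calls the function, so it most likely times only the name lookup, which would explain why all three figures are nearly identical. In IPython, timing a real call would look like `%timeit strmatch_forloop(a, b)`. Below is a minimal sketch of the same idea with the standard `timeit` module. The input strings and the repeat count are assumptions chosen for illustration, and no particular timings are implied.

```python
import timeit


def strmatch_forloop(a, b):
    shorter = min(len(a), len(b))
    count = 0
    for i in range(shorter - 1):
        if a[i:i+2] == b[i:i+2]:
            count += 1
    return count


def strmatch_listcomp(a, b):
    shorter = min(len(a), len(b))
    return [a[i:i+2] == b[i:i+2] for i in range(shorter - 1)].count(True)


def strmatch_gen(a, b):
    shorter = min(len(a), len(b))
    return sum(a[i:i+2] == b[i:i+2] for i in range(shorter - 1))


# Assumed sample inputs: the problem's example strings, repeated to make them longer.
a = "xxcaazz" * 100
b = "xxbaaz" * 100
runs = 200  # assumed repeat count, kept small so the sketch finishes quickly

results = {}
for func in (strmatch_forloop, strmatch_listcomp, strmatch_gen):
    # The lambda actually invokes the function on every timed iteration.
    seconds = timeit.timeit(lambda: func(a, b), number=runs)
    results[func.__name__] = seconds
    print("{}: {:.2f} us per call".format(func.__name__, seconds / runs * 1e6))
```

Timings measured this way depend on the input length and the Python version, so they are worth re-running on representative data before drawing conclusions.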

Jamal
asked Sep 10, 2014 at 5:40
\$\endgroup\$
  • \$\begingroup\$ In general (depending on how efficient this has to be) I would choose whichever one looks the most natural to me. \$\endgroup\$ Commented Sep 10, 2014 at 15:32

1 Answer

\$\begingroup\$

Since memory and performance metrics are similar, coding style is possibly the only differentiating factor.

There is nothing wrong with solution 1, but it takes longer to read, unless you are paid by the number of lines you write! I would say it is Pythonic to write code in fewer lines as long as readability is not compromised.

I prefer solution 3. It reads well and is terse. Solution 2 is just as good, but it does extra work by creating a list and then counting its entries. Why do that when you can sum the matches directly?
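
Summing the comparisons directly works because Python's `bool` is a subclass of `int`, so each `True` contributes 1 to the total. Here is a small sketch comparing the two approaches on the question's example strings; the variable names are only illustrative.

```python
a, b = "xxcaazz", "xxbaaz"
shorter = min(len(a), len(b))

# Solution 2 style: build a full list of booleans, then count the True entries.
flags = [a[i:i+2] == b[i:i+2] for i in range(shorter - 1)]
count_from_list = flags.count(True)

# Solution 3 style: sum() consumes the comparisons one at a time; no list is built.
count_from_gen = sum(a[i:i+2] == b[i:i+2] for i in range(shorter - 1))

print(flags)                            # [True, False, False, True, True]
print(count_from_list, count_from_gen)  # 3 3
```

Both give the same count; the generator version simply skips the intermediate list.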

answered Nov 2, 2014 at 14:55
\$\endgroup\$
