- 29.4k
- 3
- 49
- 98
Once again, this can be optimizeoptimized by using zip
instead of retrieving letters with their indices in the string:
for _ in range(T):
patient, virus = input().split()
matches = 0
for index in analyze(patient, virus):
print(index, end=' ')
matches += 1
if not matches:
print('No match!')
else:
print()
Or you can convert the matches on the fly:
for _ in range(T):
patient, virus = input().split()
matches = ' '.join(map(str, analyze(patient, virus)))
if not matches:
matches = 'No match!'
print(matches)
Once again, this can be optimize by using zip
instead of retrieving letters with their indices in the string:
for _ in range(T):
patient, virus = input().split()
matches = 0
for index in analyze(patient, virus):
print(index, end=' ')
matches += 1
if not matches:
print('No match!')
else:
print()
Once again, this can be optimized by using zip
instead of retrieving letters with their indices in the string:
for _ in range(T):
patient, virus = input().split()
matches = 0
for index in analyze(patient, virus):
print(index, end=' ')
matches += 1
if not matches:
print('No match!')
else:
print()
Or you can convert the matches on the fly:
for _ in range(T):
patient, virus = input().split()
matches = ' '.join(map(str, analyze(patient, virus)))
if not matches:
matches = 'No match!'
print(matches)
Various nitpicks
range(0, x)
is better written asrange(x)
print(str(i) + " ", end='')
is equivalent toprint(i, end=' ')
but slowerif matches == 0
reads better asif not matches
- You compute
len(V)
several times, just store its value in a variable. - Given that you don't use
sep
, and thath
should already be a string,P,sep,V = str(h).partition(' ')
is better written asP, V = h.split()
#Special cases aren't special enough to break the rules
Checking for equality between the patient sub-string and the virus is just a sub-case of counting the differences between them (and finding it is 0). Same for comparing the entire patient string and the virus. All are just sub-cases of:
for i in range(0,upperlimit):
Psub = P[i:i+len(V)]
for j in range(0,len(V)): # If we hit this code then, we're about to see how many characters mismatch in the substring
if Psub[j] != V[j]:
mismatches += 1
if mismatches>=2: # No point checking the substrings alphabet by alphabet any further, more than 1 mismatch present here.
break
Besides, it is to be expected that there will be much more mismatch or fuzzy-matching when checking the substrings than there will be exact matches. So the exact comparison is just slowing you down.
Moreover, you can abuse the fact that booleans are integers and use the sum
builtins to compute the number of differences faster:
for i in range(upperlimit):
Psub = P[i:i+len(V)]
mismatches = sum(Psub[j] != V[j] for j in range(len(V)))
Once again, this can be optimize by using zip
instead of retrieving letters with their indices in the string:
for i in range(upperlimit):
Psub = P[i:i+len(V)]
mismatches = sum(p != v for p, v in zip(Psub, V))
The advantage of using zip
being that the iteration will stop when reaching the end of the shortest string. We can use that to simplify the writting to:
for i in range(upperlimit):
mismatches = sum(p != v for p, v in zip(P[i:], V))
#Separation of concerns
As it stand, your code does 3 things at once:
- iterate over the input data
- compute the index of "matching" substrings
- print such indexes
You should make functions to separate concerns and make it easier to re-use/test. A basic layout could look like:
def analyze(patient, virus):
# build indexes where there is a match
def main():
# iterate over lines in input
for line in input_string:
patient, virus = line.split()
matches = analyze(patient, virus)
# print output based on `matches`
It makes it more clear to follow what is going on. You then just have to figure out how to return meaningful values from analyze
. First thoughts should be to build a list and return it:
def analyze(patient, virus):
matches = []
for i in range(len(patient) - len(virus) + 1):
if sum(p != v for p, v in zip(patient[i:], virus)) < 2:
matches.append(i)
return matches
def main():
# iterate over lines in input
for line in input_string:
patient, virus = line.split()
matches = analyze(patient, virus)
if not matches:
print('No match!')
else:
print(' '.join(map(str, matches)))
Or you could turn analyze
into a generator and convert its computation into a list when calling it. It let you avoid explicitly calling append
:
Proposed improvements
def analyze(patient, virus):
for begin in range(len(patient) - len(virus) + 1):
if sum(p != v for p, v in zip(patient[begin:], virus)) < 2:
yield begin
if __name__ == '__main__':
T = int(input())
for _ in range(T):
patient, virus = input().split()
matches = list(analyze(patient, virus))
if not matches:
print('No Match!')
else:
print(' '.join(map(str, matches)))
The advantage of using a generator is that you don't even have to convert it to a list, your original print
s work as well:
for _ in range(T):
patient, virus = input().split()
matches = 0
for index in analyze(patient, virus):
print(index, end=' ')
matches += 1
if not matches:
print('No match!')
else:
print()