T9 search function

Question 1

I wrote this code based on a problem from a Codewars event I joined last week, which I couldn't solve during the event.

The question is:

T9 Autocomplete

The file contacts.txt has names and phone numbers.

The format is:

column 1 = name
column 2 = phone number

Your program should receive a sequence of numbers and return the possible contacts that it could match on a phone keypad.

The letters for T9 autocomplete are:

[[2, "ABC"], [3, "DEF"], [4, "GHI"], [5, "JKL"], [6, "MNO"], [7, "PQRS"], [8, "TUV"], [9, "WXYZ"]]

For simplicity:

- Ignore 1's and 0's for name matching.
- Assume that matches always start at the beginning of the name or phone number.

Output the autocomplete contacts for the following inputs:

728

5203

273

2738

I have pasted the contact list (contacts.txt) here.

My idea:

1. Create a dictionary that maps each digit to its characters.
2. Translate the number into a list of character sets, e.g. 728 --> ['PQRS', 'ABC', 'TUV'].
3. Loop through contacts.txt and check whether the first character of each line is in the first element of the list from step 2. If it is, increase the match count by 1, then check the second character against the second element in the same way. If a character does not match, reset the count and move on to the next line.
4. If the match count equals the length of the list when the inner loop finishes, the line satisfies the search condition, so return that line.

    t9_dict = {
        2: "ABC",
        3: "DEF",
        4: "GHI",
        5: "JKL",
        6: "MNO",
        7: "PQRS",
        8: "TUV",
        9: "WXYZ",
    }

    def t9_input(number):
        list_text = []
        for i in str(number):
            if int(i) != 0:
                list_text.append(t9_dict[int(i)])
        return list_text

    def read_contact(list_text):
        filename = 'contacts.txt'
        count = 0
        f = open(filename, "r")
        for line in f:
            for j in range(len(list_text)):
                if count < len(list_text):
                    if line.upper()[count] in list_text[count]:
                        count += 1
                    else:
                        count = 0
                        break
                else:
                    count = 0
                    break
            if count == len(list_text):
                f.close()
                return(line)
        f.close()
        return "no result"

    #print(t9_input(728))
    #print(t9_input(5203))
    print(read_contact(t9_input(728))) #return Patricia Adkins 741-256-2766
    print(read_contact(t9_input(5203))) #return Kadyn Giles 597-981-0606
    print(read_contact(t9_input(273))) #return Brennan Rosales 930-238-6553
    print(read_contact(t9_input(2738))) #return no result

The functions above return the correct answers, but I would like to know what I can improve further.

Question 2

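An alternative would be to build a regex from the t9_input. E.g., 728 would become something like "^([PQRS][ABC][TUV].*)". The file could then be read in as one string and searched using the regex (compiled with re.MULTILINE, so that ^ matches at the start of every line rather than only at the start of the string).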
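A minimal sketch of that regex idea, assuming the whole file has already been read into one string. The names T9_LETTERS, t9_regex and find_contacts are hypothetical, and the sample contacts are made up for illustration; they are not taken from contacts.txt.

```python
import re

T9_LETTERS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}


def t9_regex(number):
    """Build a per-line pattern such as ^[PQRS][ABC][TUV].*$ from 728."""
    # Digits 0 and 1 have no letters and are skipped for name matching.
    classes = "".join(
        "[" + T9_LETTERS[d] + "]" for d in str(number) if d in T9_LETTERS
    )
    # MULTILINE makes ^ and $ match at every line boundary;
    # IGNORECASE lets the upper-case classes match lower-case letters.
    return re.compile("^" + classes + ".*$", re.MULTILINE | re.IGNORECASE)


def find_contacts(number, text):
    """Return every line of text whose first letters fit the key presses."""
    return t9_regex(number).findall(text)


# Made-up sample data, not the real contacts.txt.
sample = (
    "Pat Example 555-0100\n"
    "Raul Sample 555-0101\n"
    "Anna Lee 555-0102\n"
)
print(find_contacts(728, sample))
```

Unlike the loop in the question, this returns all matching lines rather than only the first one. With the real file, the text could come from open('contacts.txt').read().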

Question 3

Toward optimization and reorganization

t9_dict is turned into a constant, T9_DICT.

t9_input function:

- The function would better reflect its intention if named number_to_chars_t9 (or number_to_t9chars).
- The question says "Ignore 1's and 0's". Instead of testing a filter condition on each iteration, 0 and 1 can be stripped in one go with str.translate: str(number).translate({48: None, 49: None}).
- Prefer a comprehension over constructing a list and appending to it.
- Going further, return an immutable tuple of T9 charsets so that callers cannot modify them by accident.
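To illustrate the str.translate step in isolation: 48 and 49 are the code points of "0" and "1", and mapping a code point to None deletes that character. This is only a small sketch; the names DROP_0_AND_1 and strip_unmapped_digits are hypothetical.

```python
# ord("0") == 48 and ord("1") == 49; mapping a code point to None deletes it.
DROP_0_AND_1 = {ord("0"): None, ord("1"): None}


def strip_unmapped_digits(number):
    """Return the digits of number as a string with every 0 and 1 removed."""
    return str(number).translate(DROP_0_AND_1)


print(strip_unmapped_digits(5203))  # 523
# str.maketrans builds an equivalent table from the characters themselves.
print(str.maketrans("", "", "01") == DROP_0_AND_1)  # True
```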

read_contact function:

- The list_text argument is better renamed to the more meaningful t9_charsets.
- filename is better defined as a keyword argument with a default value: def read_contact(t9_charsets, filename='contacts.txt'):.

Instead of the verbose open(...) and f.close() calls, use the with context manager to handle the file resource automatically.

The nested for loops, with their noisy conditions, the count variable dragged around and the break statements, are redundant.
The whole job can be done with the builtins zip (which yields pairs of corresponding items until the shortest iterable is exhausted) and all.
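A small sketch of how zip and all combine for the prefix check, using made-up input strings rather than the real file; matches_prefix is a hypothetical helper name. One consequence of zip stopping at the shorter input is that a line shorter than the charset tuple is only checked up to its own length, so the length guard shown at the end is an optional extra, not something this answer's implementation does.

```python
def matches_prefix(line, t9_charsets):
    """True if each leading character of line is in the matching charset."""
    return all(c.upper() in chars for c, chars in zip(line, t9_charsets))


charsets = ("PQRS", "ABC", "TUV")  # the key presses 7, 2, 8

print(matches_prefix("Patricia Adkins", charsets))  # True
print(matches_prefix("Brennan Rosales", charsets))  # False
# zip truncates to the shorter input, so a two-character line still passes.
print(matches_prefix("Pa", charsets))  # True
# Optional stricter variant: require the line to be long enough first.
print(len("Pa") >= len(charsets) and matches_prefix("Pa", charsets))  # False
```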

Finally, here's the full optimized implementation:

    T9_DICT = {
        2: "ABC",
        3: "DEF",
        4: "GHI",
        5: "JKL",
        6: "MNO",
        7: "PQRS",
        8: "TUV",
        9: "WXYZ",
    }


    def number_to_chars_t9(number):
        return tuple(T9_DICT[int(num)]
                     for num in str(number).translate({48: None, 49: None}))


    def read_contact(t9_charsets, filename='contacts.txt'):
        with open(filename) as f:
            for line in f:
                if all(c.upper() in t9_chars
                       for c, t9_chars in zip(line, t9_charsets)):
                    return line.strip()
        return "no result"

Tests:

    print(read_contact(number_to_chars_t9(728)))
    print(read_contact(number_to_chars_t9(5203)))
    print(read_contact(number_to_chars_t9(273)))
    print(read_contact(number_to_chars_t9(2738)))

The output (consecutively):

Patricia Adkins 741-256-2766
Kadyn Giles 597-981-0606
Brennan Rosales 930-238-6553
no result
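
The task statement asks for the possible contacts (plural) and allows a match at the start of the phone number as well as the name, while the code so far returns only the first name match. A hedged sketch of that variant follows. It assumes that the phone number is the last whitespace-separated field on each line and that phone matching compares the raw digits (including 0 and 1) with hyphens removed; both are assumptions about contacts.txt. The name find_all_contacts is hypothetical and the sample lines are made up.

```python
T9_DICT = {2: "ABC", 3: "DEF", 4: "GHI", 5: "JKL",
           6: "MNO", 7: "PQRS", 8: "TUV", 9: "WXYZ"}


def find_all_contacts(number, lines):
    """Return every line whose name or phone number starts with the input."""
    digits = str(number)
    # 0 and 1 carry no letters, so they are dropped for name matching only.
    charsets = tuple(T9_DICT[int(d)] for d in digits if d not in "01")
    found = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the phone number is the last whitespace-separated field.
        phone = line.split()[-1].replace("-", "")
        # Guard against an empty charset tuple and against short lines,
        # because all() over an empty or truncated zip would be True.
        name_ok = (bool(charsets) and len(line) >= len(charsets) and
                   all(c.upper() in cs for c, cs in zip(line, charsets)))
        if name_ok or phone.startswith(digits):
            found.append(line)
    return found


# Made-up sample data, not the real contacts.txt.
sample = ["Pat Example 555-010-0000", "Ann Sample 728-555-0100"]
print(find_all_contacts(728, sample))  # first matches by name, second by phone
```

Because the function only iterates over lines, an open file object can be passed in place of the list, for example inside a with open('contacts.txt') as f: block.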

Question 4

If the keys of T9_DICT were characters instead of ints, the int() conversion in number_to_chars_t9() would not be needed, e.g. {'2': 'ABC', ...}.
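
A sketch of that suggestion, reusing the names from the answer. With string keys, each digit character of str(number) indexes the dict directly.

```python
T9_DICT = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}


def number_to_chars_t9(number):
    """Map each digit of number (skipping 0 and 1) to its T9 letters."""
    stripped = str(number).translate({48: None, 49: None})
    # No int() conversion: each digit character is already a valid key.
    return tuple(T9_DICT[digit] for digit in stripped)


print(number_to_chars_t9(5203))  # ('JKL', 'ABC', 'DEF')
```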

Question 5

Thanks, RomanPerekhrest, for an amazing reply. There are several things I can study further during the holiday. <3

score 2 · Accepted Answer · 2019-12-17 06:33:09Z

