Find a best fuzzy match for a string

Asked 6 years, 6 months ago


\$\begingroup\$

I am trying to find the best match for a name within a list of names.

I came up with the code below, which works, but I suspect that building the intermediate ratios list is not the best approach performance-wise.

Could it be replaced somehow? Would reduce be a good fit here? (I have tried it, but could not get the index of the best match and its value at the same time.)

from fuzzywuzzy import fuzz
name_to_match = 'john'
names = ['mike', 'james', 'jon', 'jhn', 'jimmy', 'john']
ratios = [fuzz.ratio(name_to_match, name) for name in names]
best_match = names[ratios.index(max(ratios))]
print(best_match) # output is 'john'

edited Mar 11, 2019 at 8:38 by 200_success

asked Mar 11, 2019 at 8:33 by barciewicz

\$\endgroup\$

Someone else asked about this on Stack Overflow once before, and I suggested they try installing python-levenshtein, since the GitHub page suggests it may speed up execution by 4-10x. They later confirmed that it did in fact speed up their solution, so you may want to try that as well.
– Dillon Davis, commented Mar 11, 2019 at 8:43


2 Answers

\$\begingroup\$

You should take advantage of the key argument to max():

The key argument specifies a one-argument ordering function like that used for list.sort().

best_match = max(names, key=lambda name: fuzz.ratio(name_to_match, name))
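
If the index and the score of the best match are needed as well (the part that was awkward with reduce), one option is to pass max() a generator of tuples, so each name is scored once and no intermediate list is built. This is only a sketch: similarity is a hypothetical stand-in based on the standard library's difflib so that the example runs without fuzzywuzzy, and its scores may differ from those of fuzz.ratio.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Hypothetical stand-in scorer (0-100); swap in fuzz.ratio if fuzzywuzzy is installed."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


name_to_match = 'john'
names = ['mike', 'james', 'jon', 'jhn', 'jimmy', 'john']

# Lazily yield (score, index, name) so nothing is stored besides the current best.
scored = (
    (similarity(name_to_match, name), index, name)
    for index, name in enumerate(names)
)

# Compare on the score only; on ties max() keeps the first candidate it saw,
# which matches the behaviour of ratios.index(max(ratios)).
best_score, best_index, best_match = max(scored, key=lambda item: item[0])

print(best_match, best_index, best_score)
```

Comparing on item[0] alone, rather than on the whole tuple, is deliberate: it keeps tie-breaking identical to the original list-based version.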

answered Mar 11, 2019 at 8:46 by 200_success

\$\endgroup\$


\$\begingroup\$

fuzzywuzzy already includes functionality to return only the best match. Unless you need it sorted like 200_success's solution, this would be the easiest:

from fuzzywuzzy import process
choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
process.extractOne("cowboys", choices)
# returns ("Dallas Cowboys", 90)
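
To make the shape of that result concrete, here is a rough sketch of a helper that returns a (choice, score) pair and gives up below a cutoff. It is not fuzzywuzzy's implementation: extract_one and similarity are hypothetical names, the scorer is a plain difflib ratio on lowercased strings, and the scores it produces will generally differ from those of process.extractOne, whose default scorer is more elaborate.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Hypothetical stand-in scorer (0-100), case-insensitive, built on difflib."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def extract_one(query, choices, score_cutoff=0):
    """Hypothetical helper: return (best_choice, score), or None if nothing reaches score_cutoff."""
    best = max(
        ((choice, similarity(query, choice)) for choice in choices),
        key=lambda pair: pair[1],
        default=None,  # covers an empty list of choices
    )
    if best is None or best[1] < score_cutoff:
        return None
    return best


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]

result = extract_one("cowboys", choices)
no_match = extract_one("zzz", choices, score_cutoff=50)

print(result)
print(no_match)
```

A cutoff is worth considering whenever the query might not correspond to any of the choices, because max() on its own always returns some candidate, however poor.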

answered Mar 12, 2019 at 8:11 by JanM

\$\endgroup\$
