How do I get the probability of a string being similar to another string in Python?
I want to get a decimal value like 0.9 (meaning 90%), preferably using only Python and its standard library.
e.g.
similar("Apple","Appel") #would have a high prob.
similar("Apple","Mango") #would have a lower prob.
There is a built-in for this: SequenceMatcher, in the standard library's difflib module.
from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()
Using it:
>>> similar("Apple","Appel")
0.8
>>> similar("Apple","Mango")
0.0
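For context on where these numbers come from: ratio() is documented as 2.0*M/T, where M is the number of matched characters and T is the combined length of both strings, and the comparison is case-sensitive. The sketch below recomputes that by hand to show why "Apple"/"Appel" gives 0.8 and "Apple"/"Mango" gives 0.0. The helper name ratio_by_hand is hypothetical and exists only for illustration.

```python
from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

def ratio_by_hand(a, b):
    """Recompute ratio() as 2 * M / T from the matching blocks (illustrative helper)."""
    blocks = SequenceMatcher(None, a, b).get_matching_blocks()
    matched = sum(block.size for block in blocks)  # M; the final block is a size-0 sentinel
    total = len(a) + len(b)                        # T
    return 2.0 * matched / total if total else 1.0

# "App" plus one more character match: M = 4, T = 10
print(similar("Apple", "Appel"), ratio_by_hand("Apple", "Appel"))

# No shared characters, because "A" and "a" are treated as different
print(similar("Apple", "Mango"))

# Lowercasing first is one way to make the comparison case-insensitive
print(similar("Apple".lower(), "Mango".lower()))
```

Whether to lowercase (or otherwise normalize) the inputs first depends on whether case differences should count as mismatches in your application.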
SequenceMatcher vs. the python-Levenshtein module: stackoverflow.com/questions/6690739/… - Feb 09, 2015 at 13:06

get_close_matches is also built in, although I found sorted(... key=lambda x: difflib.SequenceMatcher(None, x, search).ratio(), ...) more reliable, with custom sorted(... .get_matching_blocks())[-1] > min_match checks. - Sep 15, 2016 at 19:51

get_close_matches is a convenience function that may be what you are looking for, AKA read the docs! In my particular application I was doing some basic error checking and reporting for users who provided bad input, and this answer lets me report the potential matches and what the "similarity" was. If you don't need to display the similarity, though, definitely check out get_close_matches. - Sep 03, 2017 at 22:54
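The two approaches mentioned in these comments can be sketched side by side. This is only an illustration under assumptions: the candidate list is made-up sample data, ranked_matches is a hypothetical helper, and its min_ratio threshold is a simple stand-in rather than a reconstruction of the min_match check described above.

```python
import difflib

candidates = ["Apple", "Appel", "Mango", "Maple"]  # made-up sample data
search = "Appel"

# Built-in convenience function: returns up to n candidates whose ratio is
# at least cutoff, best match first, without exposing the scores.
close = difflib.get_close_matches(search, candidates, n=3, cutoff=0.6)
print(close)

def ranked_matches(search, candidates, min_ratio=0.6):
    """Rank candidates by SequenceMatcher ratio and keep the scores (hypothetical helper)."""
    scored = [
        (difflib.SequenceMatcher(None, candidate, search).ratio(), candidate)
        for candidate in candidates
    ]
    # Drop weak matches, then sort with the highest score first
    return sorted((pair for pair in scored if pair[0] >= min_ratio), reverse=True)

for score, candidate in ranked_matches(search, candidates):
    print(f"{candidate}: {score:.2f}")
```

The manual version is mainly useful when the similarity value itself has to be shown to the user; otherwise get_close_matches is shorter.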