python probability similarity metric
Updated Sun, 04 Sep 2022 20:07:05 GMT

# Find the similarity metric between two strings

How do I get the probability of a string being similar to another string in Python?

I want to get a decimal value like 0.9 (meaning 90%). Preferably using only Python's standard library.

e.g.

``````similar("Apple","Appel") #would have a high prob.
similar("Apple","Mango") #would have a lower prob.
``````

## Solution

There is a built-in for this: `SequenceMatcher` from the standard-library `difflib` module.

``````from difflib import SequenceMatcher
def similar(a, b):
    # ratio() returns a float in [0, 1]; 1.0 means the two strings are identical
    return SequenceMatcher(None, a, b).ratio()
``````

Using it:

``````>>> similar("Apple","Appel")
0.8
>>> similar("Apple","Mango")
0.0
``````
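
To see where those numbers come from: the `difflib` documentation defines `ratio()` as `2.0 * M / T`, where `M` is the number of matched characters and `T` is the combined length of both strings. The sketch below recomputes that value by hand from `get_matching_blocks()`. The helper name `ratio_by_hand` is made up for this illustration and is not part of `difflib`. It also shows that the comparison is case-sensitive, which is why "Apple" and "Mango" share no characters at all.

```python
from difflib import SequenceMatcher

def ratio_by_hand(a, b):
    """Recompute SequenceMatcher.ratio() as 2*M/T (hypothetical helper)."""
    matcher = SequenceMatcher(None, a, b)
    # Each matching block is a (a, b, size) triple; the last one is a
    # zero-size sentinel, so it does not affect the sum.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    total = len(a) + len(b)
    # difflib treats two empty sequences as identical
    return 2.0 * matched / total if total else 1.0

print(ratio_by_hand("Apple", "Appel"))   # 4 matched chars out of 10 -> 0.8
print(ratio_by_hand("Apple", "Mango"))   # 'A' != 'a', nothing matches -> 0.0

# Lowercasing first is one way to make the comparison case-insensitive.
print(SequenceMatcher(None, "Apple".lower(), "APPLE".lower()).ratio())  # 1.0
```

Whether to normalize case (or whitespace) before comparing depends on the application; `SequenceMatcher` itself does no normalization.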

### Comments (5)

• +0 – See this great answer comparing `SequenceMatcher` vs `python-Levenshtein` module. stackoverflow.com/questions/6690739/… — Feb 09, 2015 at 13:06
• +1 – Interesting article and tool: chairnerd.seatgeek.com/… — Jan 05, 2016 at 19:04
• +7 – I would highly recommend checking out the whole difflib doc (docs.python.org/2/library/difflib.html). There is a `get_close_matches` built in, although I found `sorted(... key=lambda x: difflib.SequenceMatcher(None, x, search).ratio(), ...)` more reliable, with custom `sorted(... .get_matching_blocks())[-1] > min_match` checks — Sep 15, 2016 at 19:51
• +2 – @ThorSummoner brings attention to a very useful function (`get_close_matches`). It's a convenience function that may be what you are looking for, so read the docs! In my particular application I was doing some basic error checking and reporting for users who provide bad input, and this answer lets me report the potential matches along with their "similarity". If you don't need to display the similarity, though, definitely check out `get_close_matches` — Sep 03, 2017 at 22:54
• +0 – This worked perfectly. Simple and effective. Thank you :) — May 09, 2020 at 16:39
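
The two approaches mentioned in the comments can be sketched side by side. This is only an illustration: the candidate list and the query are made-up sample data, and the `n` and `cutoff` values shown are simply the documented defaults of `get_close_matches`. The first call returns only the matching strings; the second keeps the score next to each candidate, which is useful when the similarity has to be shown to the user.

```python
from difflib import SequenceMatcher, get_close_matches

candidates = ["Apple", "Appel", "Mango", "Maple"]  # made-up sample data
query = "Aple"

# Convenience function: up to n best matches scoring at least cutoff,
# returned best first, without the scores themselves.
close = get_close_matches(query, candidates, n=3, cutoff=0.6)
print(close)

# Manual ranking: keep (score, candidate) pairs so the score can be reported.
scored = sorted(
    ((SequenceMatcher(None, query, c).ratio(), c) for c in candidates),
    reverse=True,
)
for score, candidate in scored:
    print(f"{candidate}: {score:.2f}")
```

Note that `SequenceMatcher` is not guaranteed to be symmetric in its two arguments, so scores from the manual ranking can differ slightly from the ones `get_close_matches` uses internally.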