Let's say I have a string "Hello" and a list:
words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'Hallo', 'format']
How can I find the n words in the list that are closest to "Hello"?
In this case, the result would be something like ['hello', 'hallo', 'Hallo', 'hi', 'format'...]
So the strategy is to sort the list from the closest word to the furthest and then take the first n entries.
I thought about something like this:
word = 'Hello'
for i, item in enumerate(words):
    # Compare case-insensitively (str.lower() is a method; there is no lower() builtin)
    if item.lower() > word.lower():
        ...
but it's very slow on large lists.
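To make the "sort from closest to furthest" idea concrete, here is a minimal sketch. It assumes that "closest" means the smallest case-insensitive edit (Levenshtein) distance, which is only one possible definition; the names `edit_distance` and `closest_words` are illustrative and not from any library.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_words(target, candidates, n):
    """Return the n candidates with the smallest case-insensitive edit distance."""
    target_l = target.lower()
    ranked = sorted(candidates, key=lambda w: edit_distance(target_l, w.lower()))
    return ranked[:n]


words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo',
         'question', 'Hallo', 'format']
print(closest_words('Hello', words, 4))
```

This still scores every word in the list on each search, so it illustrates the strategy but does not solve the speed problem on a large list.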
UPDATE
difflib works, but it is also very slow. The words list has 630000+ words (sorted, one per line), so every search for the closest words takes 5 to 7 seconds!
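For reference, a difflib-based lookup could look roughly like the sketch below. The exact call is not shown above, so this is an assumption: it uses the standard `difflib.get_close_matches(word, possibilities, n, cutoff)` function, and the wrapper name `close_matches_ignore_case` and the `cutoff` value are illustrative only.

```python
import difflib


def close_matches_ignore_case(target, candidates, n=3, cutoff=0.6):
    """Return up to n candidates similar to target, ignoring case."""
    # Group the original spellings under their lowercased form.
    by_lower = {}
    for c in candidates:
        by_lower.setdefault(c.lower(), []).append(c)

    # get_close_matches scores target against every possibility it is given
    # and returns the best ones, most similar first.
    hits = difflib.get_close_matches(target.lower(), list(by_lower),
                                     n=n, cutoff=cutoff)

    # Expand each lowercased hit back into the original spellings.
    result = []
    for h in hits:
        result.extend(by_lower[h])
    return result[:n]


words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo',
         'question', 'Hallo', 'format']
print(close_matches_ignore_case('Hello', words, n=3))
```

Because the whole list is compared on each call, the cost grows with the number of words, which is consistent with the multi-second lookups on the 630000+ word list.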