I have a fairly long list of phrases, so I also have a text box that users can type into as a search bar. Right now, any phrase that does not contain the exact text typed into the search bar is filtered out. However, I would like it to offer a few suggestions for what the user might be looking for.
Note: I am not looking for a "Did you mean..." or spell checking algorithm like the ones here or here or here (though this image from the first link seems good). I want an algorithm that can suggest the best match for an incomplete word or phrase; e.g. for the input "bat", the word "battery" should be a better match than the word "car".
It would also be impractical to use Google's method of returning the most common strings that start with (approximately) the same letters, because, as far as I know, every element in my list is as common as any other.
Also, I would like to do this in Java (8); however, answers in other languages are acceptable, as long as they do not use built-in functions for which Java has no equivalent. In case it is useful, I wrote a modified version of Levenshtein distance (below) which pads the search string with asterisks signifying "any character." This works for single words, e.g. "mud" is a perfect match for "muddy", but it isn't good enough, given that people may type "car" to search for "race car".
/**
 * <ul>
 * <b><i>searchDistance</i></b><br>
 * <br>
 * <code> public static int searchDistance(String key, String match, boolean useAsterisk)</code><br>
 * <br>
 * Gets the Levenshtein distance between <code>key</code> and <code>match</code>. <br>
 * If <code>useAsterisk</code> is true, then the following applies: If <code>key</code> is shorter than <code>match</code>, the asterisk <code>'*'</code> is appended to it until the lengths are equal. Asterisks can be used in <code>key</code> to signify 'any character.'
 * @param key - The text to search for
 * @param match - The text to compare <code>key</code> against
 * @param useAsterisk - Whether or not to use asterisks for the purpose described above
 * @return the Levenshtein distance between <code>key</code> and <code>match</code>.
 * </ul>
 */
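public static int searchDistance(String key, String match, boolean useAsterisk) {
    if (useAsterisk) {
        // Pad the key with wildcards so that a prefix of match costs nothing.
        StringBuilder padded = new StringBuilder(key);
        while (padded.length() < match.length()) {
            padded.append('*');
        }
        key = padded.toString();
    }
    int[][] matrix = new int[key.length() + 1][match.length() + 1];
    // First column and first row: distance from or to the empty string.
    for (int i = 0; i < matrix.length; i++) {
        matrix[i][0] = i;
    }
    for (int i = 0; i < matrix[0].length; i++) {
        matrix[0][i] = i;
    }
    for (int a = 1; a < matrix.length; a++) {
        for (int b = 1; b < matrix[0].length; b++) {
            char k = key.charAt(a - 1);
            // An asterisk only acts as a wildcard when useAsterisk is set.
            boolean same = k == match.charAt(b - 1) || (useAsterisk && k == '*');
            matrix[a][b] = Math.min(Math.min(matrix[a - 1][b] + 1, matrix[a][b - 1] + 1), matrix[a - 1][b - 1] + (same ? 0 : 1));
        }
    }
    return matrix[matrix.length - 1][matrix[0].length - 1];
}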
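To make the limitation concrete, here is a small self-contained sketch that reproduces the behaviour described above. The class name `SearchDistanceDemo` and the method `wildcardDistance` are made up for this illustration; the method is a condensed two-row restatement of `searchDistance` with asterisks always enabled, not the exact code above.

```java
public class SearchDistanceDemo {

    // Hypothetical condensed form of searchDistance(key, match, true):
    // pads key with '*' wildcards, then runs Levenshtein keeping two rows.
    public static int wildcardDistance(String key, String match) {
        StringBuilder padded = new StringBuilder(key);
        while (padded.length() < match.length()) {
            padded.append('*');
        }
        int[] prev = new int[match.length() + 1];
        for (int b = 0; b <= match.length(); b++) {
            prev[b] = b; // distance from the empty key prefix
        }
        for (int a = 1; a <= padded.length(); a++) {
            int[] curr = new int[match.length() + 1];
            curr[0] = a; // distance to the empty match prefix
            char k = padded.charAt(a - 1);
            for (int b = 1; b <= match.length(); b++) {
                // A wildcard or an equal character costs nothing to substitute.
                int cost = (k == '*' || k == match.charAt(b - 1)) ? 0 : 1;
                curr[b] = Math.min(Math.min(prev[b] + 1, curr[b - 1] + 1), prev[b - 1] + cost);
            }
            prev = curr;
        }
        return prev[match.length()];
    }

    public static void main(String[] args) {
        System.out.println(wildcardDistance("mud", "muddy"));    // 0
        System.out.println(wildcardDistance("bat", "battery"));  // 0
        System.out.println(wildcardDistance("bat", "car"));      // 2
        System.out.println(wildcardDistance("car", "cat"));      // 1
        System.out.println(wildcardDistance("car", "race car")); // 2
    }
}
```

So a prefix search behaves as intended, but "car" scores "race car" no better than "bat" scores the unrelated "car", and worse than "car" scores "cat".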
TL;DR: Is there a good way to give completion suggestions for search terms?
Thanks in advance!