Consider the following strings:
- he llo
- goodbye
- hello
- = (goodbye)
- (he)(llo)
- good bye
- helium
I'm trying to sort these so that similar words end up next to each other. I know that:
- plain alphanumeric sorting is not an option
- removing special characters (",-_ etc.) and then comparing certainly helps, but the results won't be as good as I'm hoping for.
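To make the second point concrete, here is a minimal sketch of the "strip special characters, then compare" step. The class name `Normalizer` and the helper `normalize` are hypothetical names chosen for illustration, and treating everything that is not a letter or digit as "special" is an assumption.

```java
import java.util.Locale;

public class Normalizer {
    // Hypothetical helper: drop every character that is not a letter or a
    // digit (spaces included), then lower-case the result.
    static String normalize(String s) {
        return s.replaceAll("[^\\p{L}\\p{N}]", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("(he)(llo)"));   // prints hello
        System.out.println(normalize("= (goodbye)")); // prints goodbye
        System.out.println(normalize("he llo"));      // prints hello
    }
}
```

Sorting on this key puts exact variants of the same word together, but it is still an alphabetical sort on the cleaned text, so it says nothing about how close two different words such as hello and helium are.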
NOTE: there may be several acceptable outputs for this; one of them is:
DESIRED OUTPUT:
- hello
- he llo
- (he)(llo)
- helium
- goodbye
- good bye
- = (goodbye)
So my question is: is there a Java package that compares strings by similarity and ultimately sorts them based on that?
I've heard of terms such as n-gram and skip-gram but didn't quite understand them, and I'm not sure whether they would be useful for me at all.
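As far as I understand it, a character n-gram is just a run of n consecutive characters, so one way to score two strings is to compare their sets of 2-grams (bigrams). Below is a rough sketch of that idea using the Dice coefficient. The class and method names are made up for illustration, and the choice of n = 2 and of Dice rather than some other measure are assumptions, not something I know to be the best fit.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class BigramSimilarity {
    // Character bigrams of the cleaned string, e.g. "hello" -> {he, el, ll, lo}.
    static Set<String> bigrams(String s) {
        String t = s.replaceAll("[^\\p{L}\\p{N}]", "").toLowerCase(Locale.ROOT);
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 2 <= t.length(); i++) {
            grams.add(t.substring(i, i + 2));
        }
        return grams;
    }

    // Dice coefficient: 2 * (shared bigrams) / (bigrams in a + bigrams in b).
    // The result lies between 0.0 (nothing shared) and 1.0 (same bigram set).
    static double similarity(String a, String b) {
        Set<String> x = bigrams(a);
        Set<String> y = bigrams(b);
        if (x.isEmpty() && y.isEmpty()) {
            return 1.0; // two strings with no bigrams are treated as identical
        }
        Set<String> common = new HashSet<>(x);
        common.retainAll(y);
        return 2.0 * common.size() / (x.size() + y.size());
    }

    public static void main(String[] args) {
        System.out.println(similarity("hello", "he llo"));  // prints 1.0
        System.out.println(similarity("hello", "helium"));  // 4/9, about 0.44
        System.out.println(similarity("hello", "goodbye")); // prints 0.0
    }
}
```

With this score, hello and helium share the bigrams he and el, which is the kind of partial closeness that plain normalization cannot express.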
UPDATE: finding similarities is certainly part of my question, but the main problem is the sorting part.
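To illustrate why the sorting part is the hard bit: a pairwise similarity score is not a total order, so it cannot simply be plugged into a `Comparator`. One possible workaround, sketched below, is a greedy chain that starts from the first string and keeps appending whichever remaining string is most similar to the one placed last. Everything here (the class `SimilarityOrder`, the method `order`, the bigram Dice score, and the greedy strategy itself) is an assumption for illustration, not a known-good solution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SimilarityOrder {
    // Character bigrams of the string after removing non-alphanumerics.
    static Set<String> bigrams(String s) {
        String t = s.replaceAll("[^\\p{L}\\p{N}]", "").toLowerCase(Locale.ROOT);
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 2 <= t.length(); i++) {
            grams.add(t.substring(i, i + 2));
        }
        return grams;
    }

    // Dice coefficient on bigram sets, between 0.0 and 1.0.
    static double similarity(String a, String b) {
        Set<String> x = bigrams(a);
        Set<String> y = bigrams(b);
        if (x.isEmpty() && y.isEmpty()) {
            return 1.0;
        }
        Set<String> common = new HashSet<>(x);
        common.retainAll(y);
        return 2.0 * common.size() / (x.size() + y.size());
    }

    // Greedy chain: take the first input as the start, then repeatedly move
    // the remaining string most similar to the last placed one to the result.
    // Ties go to the string that appears earliest in the remaining list.
    static List<String> order(List<String> input) {
        List<String> remaining = new ArrayList<>(input);
        List<String> result = new ArrayList<>();
        if (remaining.isEmpty()) {
            return result;
        }
        String current = remaining.remove(0);
        result.add(current);
        while (!remaining.isEmpty()) {
            int best = 0;
            double bestScore = -1.0;
            for (int i = 0; i < remaining.size(); i++) {
                double score = similarity(current, remaining.get(i));
                if (score > bestScore) {
                    bestScore = score;
                    best = i;
                }
            }
            current = remaining.remove(best);
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("he llo", "goodbye", "hello",
                "= (goodbye)", "(he)(llo)", "good bye", "helium");
        order(words).forEach(System.out::println);
    }
}
```

On the seven strings above this keeps the hello variants and helium together and the goodbye variants together, though not in exactly the order of the desired output, and the result depends on which string comes first. It also does O(n^2) comparisons, so it is only a starting point for small lists.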
 
     
    