Is there a tool that can detect the language of the text of several paragraphs?
Viewed 2,392 times
2 Answers
1
The file utility uses a set of heuristics to guess file types, including one that reports "English text". I don't know whether it recognizes other human languages, but it could certainly be extended to tell them apart.
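To give an idea of what such a heuristic might look like, here is a minimal sketch in Python that counts common function words ("stopwords") per language and picks the best match. This is not how file works internally; the `STOPWORDS` table and the `guess_language` function are names made up for this illustration, and the word lists are far too small for real use.

```python
import re

# Hypothetical, deliberately tiny stopword lists (an assumption for this sketch).
# A usable detector would need much larger lists or character n-gram profiles.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "french": {"le", "la", "et", "est", "les", "des", "que", "un"},
    "german": {"der", "die", "und", "ist", "das", "nicht", "ein", "zu"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often, or None if none occur."""
    # Split into lowercase runs of letters (Unicode-aware, digits and underscores excluded).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Score each language by how many words of the text are in its stopword list.
    scores = {
        lang: sum(1 for word in words if word in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

On several paragraphs of text a count like this tends to be more reliable than on a single short sentence, simply because there are more function words to count.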
1
There are many tools around to do this. The first one that I can think of is Google's own: http://code.google.com/apis/ajax/playground/#language_detect
- In Java, there is http://textcat.sourceforge.net/
- In Ruby, there is https://github.com/peterc/whatlanguage
- In Perl, there is http://search.cpan.org/~ambs/Lingua-Identify-0.29/lib/Lingua/Identify.pm etc.
Hope it helps.
Mortimer