I need to hide email addresses and phone numbers in a string. Replacing well-formatted emails and numbers is easy with a regex, but what about other formats? Here is an example:
Input:
Email addresses like email@example.com or email AT example DOT com should be replaced. Phone numbers like 347 323 4567 or three four seven, three two three four five six seven should also be replaced.
Output:
Email addresses like (email hidden) or (email hidden) should be replaced. Phone numbers like (phone hidden) or (phone hidden) should also be replaced.
Airbnb's messaging system is really good at this. Apparently it used to work like this:
It looks for @ symbols, spellings of “this is me AT whatever DOT com” and series of numbers with at least 7 digits (telephone number) with some sensitivity to separators.
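A naive version of that heuristic could be sketched in Python as follows. This is only a guess at the idea, not Airbnb's actual implementation: the patterns, the function name `hide_contacts` and the replacement strings are all assumptions made for illustration.

```python
import re

# Assumption: every pattern below is an illustrative guess, not a known production rule.

# Ordinary addresses such as email@example.com
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Spelled-out addresses such as "email AT example DOT com".
# Requiring at least one "dot" part limits false positives on the plain word "at".
SPELLED_EMAIL_RE = re.compile(
    r"\b[\w.+-]+\s+at\s+[\w-]+(?:\s+dot\s+[\w-]+)+",
    re.IGNORECASE,
)

# At least 7 digits, optionally separated by spaces, dots, dashes or parentheses
DIGIT_PHONE_RE = re.compile(r"\d(?:[\s().-]*\d){6,}")

# At least 7 spelled-out digits ("three four seven, ...")
_WORD = r"(?:zero|one|two|three|four|five|six|seven|eight|nine)"
WORD_PHONE_RE = re.compile(
    rf"\b{_WORD}(?:[\s,.-]+{_WORD}){{6,}}\b",
    re.IGNORECASE,
)


def hide_contacts(text: str) -> str:
    """Replace anything that looks like an email address or phone number."""
    # Emails go first so that digits inside an address are not seen as a phone number.
    text = EMAIL_RE.sub("(email hidden)", text)
    text = SPELLED_EMAIL_RE.sub("(email hidden)", text)
    text = DIGIT_PHONE_RE.sub("(phone hidden)", text)
    text = WORD_PHONE_RE.sub("(phone hidden)", text)
    return text


if __name__ == "__main__":
    print(hide_contacts("Write to email AT example DOT com or call 347 323 4567."))
```

This covers the formats in the example, but it is easy to fool (for instance with "email [at] example (dot) com"), and it will also hide any long run of digits that is not a phone number.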
What would be the best way to do the same thing? Writing complex regexes? Using a natural language processing library?
 
     
    