tl;dr
grep -P -i "[\w+.-]+@[\w-]+\.[a-z]{2,}$" file.txt
-P - enables Perl-compatible regular expressions (needed to use \w inside a bracket expression)
-i - ignore case (matches both @xyz.com and @xyz.COM)
For input file: file.txt
example@gmail.com
john@smith.org
example@example.co.gr
john@zcal.co
john.smith@zcal.co
john.smith@zcal.com.br
john.de-smith@zcal.com.br
john.de-smith@hotmail.com
john_de_smith@hotmail.com
john.de-smith@hotmail.AG
john.de-smith@dept.hotmail.AG
Result (addresses whose domain contains more than one dot, such as example.co.gr, are not matched):
example@gmail.com
john@smith.org
john@zcal.co
john.smith@zcal.co
john.de-smith@hotmail.com
john_de_smith@hotmail.com
john.de-smith@hotmail.AG
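To reproduce the result above, here is a minimal self-contained sketch. It writes the sample addresses to a temporary file (the path comes from mktemp and is not part of the original example) and counts the matching lines with grep -c. The hyphen is written last inside each bracket so that it is read as a literal character; GNU grep built with PCRE support is assumed.

```shell
# Recreate the sample input in a temporary file (mktemp picks the path).
input=$(mktemp)
cat > "$input" <<'EOF'
example@gmail.com
john@smith.org
example@example.co.gr
john@zcal.co
john.smith@zcal.co
john.smith@zcal.com.br
john.de-smith@zcal.com.br
john.de-smith@hotmail.com
john_de_smith@hotmail.com
john.de-smith@hotmail.AG
john.de-smith@dept.hotmail.AG
EOF

# Single quotes keep the shell away from the backslashes and the $.
pattern='[\w+.-]+@[\w-]+\.[a-z]{2,}$'

# Print the matching lines, then count them with -c.
grep -P -i "$pattern" "$input"
count=$(grep -c -P -i "$pattern" "$input")
echo "matched $count lines"

rm -f "$input"
```

This should report 7 matching lines, the same seven addresses listed above.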
No fancy characters, please.
In order to answer your question it's important to make some assumptions.
- E-mail regexes are tricky, and you have already read this answer on Stack Overflow (1), as well as this article on Wikipedia (2).
- The local part (a.k.a. user name) of your e-mail addresses contains only the following characters: letters A-Z and a-z, digits 0 to 9, the special characters +-_ (a very reduced subset of the allowed set), and dots . in the middle.
- No fancy UTF-8 or UTF-16 characters. Not even Latin ones (e.g. ç, ñ).
These assumptions cover 99.73% of all e-mail addresses known so far.
Allowed chars
username_allowed_chars = [A-Za-z0-9_+.-]
In fact, I assume you're using GNU grep, so you can use grep -P (Perl-style regex) and the shorthand \w, which is equivalent to [A-Za-z0-9_]; hence:
username_allowed_chars = [\w+.-]
As for the domain part, remove + and the dot ., hence:
domain_allowed_chars = [\w-]
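One detail worth checking when writing these sets by hand: inside a bracket expression, a hyphen placed between two characters denotes a range, so +-. covers every character from + to . in ASCII order, which includes the comma. Writing the hyphen last keeps it literal. The sketch below only illustrates that point; the address a,b@x.io is made up for the test, and GNU grep with -P is assumed.

```shell
# Made-up input with a comma in the user name.
sample='a,b@x.io'

# Hyphen between + and . forms the range "+ , - ." so the comma is accepted.
range_hits=$(printf '%s\n' "$sample" | grep -c -P '^[\w+-.]+@' || true)

# Hyphen written last is literal, so the comma is rejected.
literal_hits=$(printf '%s\n' "$sample" | grep -c -P '^[\w+.-]+@' || true)

echo "range form: $range_hits line(s), literal form: $literal_hits line(s)"
```

The "|| true" only keeps the script going when grep finds nothing, since grep exits with status 1 in that case.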
Finally, we will use + for one or more repetitions of the allowed chars.
grep -P -i "[\w+.-]+@[\w-]+\.[a-z]{2,}$" file.txt
I'll break this regex into parts. First, the character set \w, which is used extensively.
\w - Translates to [A-Za-z0-9_], the "word" characters, a.k.a. the characters allowed in variable names, in programming parlance. In practice it disallows punctuation and other unusual characters in the e-mail user name;
\. - literal dot .;
[\w+.-]+ - One or more of these characters, including the period or dot in user names, e.g. john.doe@gmail.com;
@ - literal @ separating the user name from the domain name;
[\w-]+ - One or more word characters or hyphens, i.e. the domain name;
[a-z]{2,}$ - No fewer than two letters (upper or lower case, because of -i) running up to the end of the line (marked by $).
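To see each part of the pattern at work, single lines can be fed to grep one at a time. The following is a rough sketch under the same assumptions (GNU grep with -P); the helper name check and the test addresses are hypothetical and not part of the original answer.

```shell
# Same pattern as above, single-quoted so the shell leaves it untouched.
pattern='[\w+.-]+@[\w-]+\.[a-z]{2,}$'

# Hypothetical helper: prints "match" or "no match" for one input line.
check() {
  if printf '%s\n' "$1" | grep -q -P -i "$pattern"; then
    echo "match"
  else
    echo "no match"
  fi
}

r1=$(check 'john.doe@gmail.com')      # every part present
r2=$(check 'john@zcal.c')             # final part shorter than two letters
r3=$(check 'john@zcal.co and more')   # trailing text, so $ cannot match
r4=$(check 'john@dept.zcal.co')       # second dot in the domain is not covered
printf '%s\n' "$r1" "$r2" "$r3" "$r4"
```

Only the first line prints "match". The other three fail on the {2,} quantifier, the $ anchor, and the single \. respectively.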
References
(1) Stack Overflow
(2) Wikipedia