Is the regular expression standard used in grep POSIX + ASCII, or is something else mixed in?
1 Answer
That all depends on which flags you pass to grep.
The normal flagless grep (which is the same as passing -G) uses "Basic regular expressions":
-G, --basic-regexp
Interpret PATTERN as a basic regular expression (BRE, see
below). This is the default.
If you specify -E it uses "Extended" regular expressions:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE,
see below). (-E is specified by POSIX.)
And then you have -P for Perl-compatible regular expressions (PCRE):
-P, --perl-regexp
Interpret PATTERN as a Perl regular expression. This is highly
experimental and grep -P may warn of unimplemented features.
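To make the difference between the flags concrete, here is a minimal sketch, assuming GNU grep and a POSIX shell. The sample words are made up for illustration, and the -P part is guarded because some builds of grep ship without PCRE support.

```shell
# Made-up sample input, one word per line.
input=$(printf 'color\ncolour\ncolouur\n')

# BRE (default, same as -G): a bare ? is an ordinary character,
# so the pattern looks for the literal text "colou?r".
# grep -c exits non-zero when the count is 0, hence the "|| true".
bre=$(printf '%s\n' "$input" | grep -c 'colou?r' || true)

# ERE (-E): ? means "zero or one of the preceding item".
ere=$(printf '%s\n' "$input" | grep -E -c 'colou?r' || true)

echo "BRE matches: $bre"
echo "ERE matches: $ere"

# PCRE (-P): only try it if this grep build accepts the flag.
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
    pcre=$(printf '%s\n' "$input" | grep -P -c 'colou?r' || true)
    echo "PCRE matches: $pcre"
fi
```

With GNU grep this should report 0 matching lines for the BRE run and 2 (color and colour) for the ERE run, since only -E treats the bare ? as a quantifier.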
Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
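As a rough illustration of that rule, the sketch below runs the same three made-up input lines through + and \+ in both modes. It assumes GNU grep: the backslashed \+ in a basic regular expression is a GNU extension rather than something POSIX defines, so other grep implementations may behave differently.

```shell
# Made-up sample input, one string per line.
lines=$(printf 'ab\naab\na+b\n')

# BRE: a bare + is a literal plus sign, so only "a+b" matches.
bre_plain=$(printf '%s\n' "$lines" | grep 'a+b' || true)

# BRE (GNU): \+ means "one or more", so "ab" and "aab" match.
bre_escaped=$(printf '%s\n' "$lines" | grep 'a\+b' || true)

# ERE: the roles are swapped. A bare + is the quantifier...
ere_plain=$(printf '%s\n' "$lines" | grep -E 'a+b' || true)

# ...and \+ is the literal plus sign.
ere_escaped=$(printf '%s\n' "$lines" | grep -E 'a\+b' || true)

printf 'BRE a+b:   %s\n' "$(echo $bre_plain)"
printf 'BRE a\\+b:  %s\n' "$(echo $bre_escaped)"
printf 'ERE a+b:   %s\n' "$(echo $ere_plain)"
printf 'ERE a\\+b:  %s\n' "$(echo $ere_escaped)"
```

In other words, grep 'a\+b' and grep -E 'a+b' select the same lines, and so do grep 'a+b' and grep -E 'a\+b'.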
Traditional egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {.
GNU grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
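The difference between the GNU-tolerated form and the portable form can be sketched as follows. This assumes GNU grep for the first command; the input strings are invented for the example, and stderr is discarded in case a given grep version chooses to warn about the pattern.

```shell
# Made-up sample input: only the first line contains the text "{1".
input=$(printf 'x{1y\nx1y\n')

# GNU grep -E: "{1" cannot start a valid interval, so { is taken literally.
# Other egrep implementations may reject or reinterpret this pattern.
gnu_style=$(printf '%s\n' "$input" | grep -E '{1' 2>/dev/null || true)

# Portable spelling: a bracket expression always matches a literal {.
portable=$(printf '%s\n' "$input" | grep -E '[{]1' || true)

echo "GNU-style pattern matched: $gnu_style"
echo "Portable pattern matched:  $portable"
```

Under GNU grep both commands should select the same single line, x{1y, but only the [{] form is safe to rely on in scripts that may run against a different grep.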
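So although grep strives to stay as close to POSIX as possible, GNU grep still deviates from it in places: the backslashed \?, \+, and \| in basic regular expressions and the lenient handling of { under -E are GNU extensions, and -P is not part of POSIX at all.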