The following regex a[bcd]*b matches the longest substring (because * is greedy):
- a: starting with a
- [bcd]*: followed by any number of characters from the set (b, c, d); zero is allowed, so this part can match the empty string
- b: ending with b
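As a minimal sketch of the three parts above, assuming Python's standard re module (the test strings are made up for illustration):

```python
import re

pattern = re.compile(r'a[bcd]*b')

# fullmatch succeeds only if the whole string fits the pattern.
# "ab" shows that [bcd]* may match the empty string.
samples = ["ab", "acdb", "abcb", "ac", "b"]
full_results = {s: bool(pattern.fullmatch(s)) for s in samples}
print(full_results)

# search returns the first match; because * is greedy,
# the match extends as far as it can while still ending in b.
longest = pattern.search("xxabcbdbzz").group()
print(longest)  # abcbdb
```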
EDIT: following a comment, backtracking occurs in the following example:
>>> import re
>>> re.findall(r'a[bcd]*b', "abcxb")
['ab']
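- abc matches a[bcd]*, but the next character x is not the expected b
- the engine backtracks: ab matches a[bcd]*, but the next character c is not the expected b
- a alone also matches a[bcd]* (because the empty string matches [bcd]*), and the next character is b
- so the match finally returned is ab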
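The backtracking steps above can be imitated by hand. This is only a rough sketch of the idea, not how the re engine is implemented; trace_greedy is a hypothetical helper that handles just this one pattern, anchored at the start of the string:

```python
import re

def trace_greedy(text):
    """Imitate a[bcd]*b at the start of text (hypothetical helper)."""
    if not text.startswith("a"):
        return None
    # Longest run of b/c/d characters right after the leading 'a'.
    run = re.match(r'[bcd]*', text[1:]).group()
    # Greedy: try the longest run first, then give back one character at a time.
    for n in range(len(run), -1, -1):
        rest = text[1 + n:]
        ok = rest.startswith("b")
        print(f"[bcd]* takes {run[:n]!r}, next is {rest[:1]!r}: {'match' if ok else 'backtrack'}")
        if ok:
            return "a" + run[:n] + "b"
    return None

result = trace_greedy("abcxb")
print(result)  # ab
```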
Concerning greediness: the metacharacter * after a single character, a character set or a group means any number of times, matching as much as possible. Some regexp engines accept the sequence *?, which changes the behavior to match as little as possible. For example:
>>> r2 = r'a[bcd]*?b'
>>> re.findall(r2,"abcbde")
['ab']
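For comparison, a short sketch running the greedy and the non-greedy pattern on the same input (the variable names are only for illustration):

```python
import re

text = "abcbde"

# Greedy: [bcd]* takes as much as it can, then backtracks to the last usable b.
greedy = re.findall(r'a[bcd]*b', text)
# Non-greedy: [bcd]*? takes as little as it can, so it stops at the first b.
lazy = re.findall(r'a[bcd]*?b', text)

print(greedy)  # ['abcb']
print(lazy)    # ['ab']
```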