I have some text containing lowercase letters, dots, parentheses, and greater-than and less-than signs. I want to match substrings on each line that (1) begin with a period, (2) contain any number of letters, and (3) contain any number of either parentheses or </> signs, but not both. For example, given this text:
foobar.hello(world)
foobar.hello<world>
foobar.hello>>>world<>(baz)
I want to match .hello(world) on the first line, .hello<world> on the second line, and .hello>>>world<> on the third (since I can't mix parentheses and </> signs).
I could use two regular expressions to match my desired strings, \.[a-z()]+ and \.[a-z<>]+. However, since my understanding is that combining similar patterns into one regex is usually more efficient, I tried to merge them into a single regex with a logical OR |:
\.(?:[a-z()]+|[a-z<>]+)
When I tried this online, the regex matched my desired substring on the first line, but on the second and third lines it matched only .hello. When I switch the order of the two alternatives, the opposite happens: the first line is matched as .hello, and the second and third lines are matched as desired. This surprised me, since I wouldn't expect order to matter with an OR operator. What's happening here?
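In case it helps, here is a minimal Python sketch that reproduces the behavior. It assumes Python's built-in re module behaves the same way as the online tester I used. The names parens_first, angles_first, and first_match are just mine, for illustration.

```python
import re

lines = [
    "foobar.hello(world)",
    "foobar.hello<world>",
    "foobar.hello>>>world<>(baz)",
]

# The same combined pattern, with the two alternatives in either order.
parens_first = re.compile(r"\.(?:[a-z()]+|[a-z<>]+)")
angles_first = re.compile(r"\.(?:[a-z<>]+|[a-z()]+)")


def first_match(pattern, text):
    """Return the first match of pattern in text, or None if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else None


parens_first_results = [first_match(parens_first, s) for s in lines]
angles_first_results = [first_match(angles_first, s) for s in lines]

print(parens_first_results)  # ['.hello(world)', '.hello', '.hello']
print(angles_first_results)  # ['.hello', '.hello<world>', '.hello>>>world<>']
```

So the result depends on which alternative is listed first, exactly as I saw online.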