How to write a regular expression over the alphabet {a, b, c} that matches only strings not containing the contiguous substring baa?
        Shangchih Huang
        
- 319
- 3
- 11
1 Answer
If your regex flavor supports negative lookaheads, then it's relatively simple. E.g. in PHP it looks like this:
^(?:(?!baa)[abc])*$
Demo here.
Explanation:
- `^...$` makes sure we match the entire line
- `[abc]` is a character class that defines the alphabet
- `(?!baa)` is the negative lookahead. At every position, it checks whether the text starting there is `baa`. If it is, the next character cannot be consumed, so the line as a whole fails to match
- finally, we group these two with a non-capturing group, `(?:...)`, and repeat it as many times as fits into the line: `(?:...)*`
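The pieces explained above can be tried out with a short sketch. It uses Python's `re` module rather than PHP, purely as an assumption for illustration; the names `NO_BAA` and `avoids_baa` are made up for this example. `fullmatch` is used so that a trailing newline is not silently accepted by `$`.

```python
import re

# Pattern from the answer: before consuming each character, the lookahead
# refuses to continue if the text starting at that position is "baa".
NO_BAA = re.compile(r"^(?:(?!baa)[abc])*$")


def avoids_baa(s: str) -> bool:
    """Return True if s consists only of a, b, c and never contains "baa"."""
    return NO_BAA.fullmatch(s) is not None


if __name__ == "__main__":
    for sample in ["", "abc", "ba", "bab", "baa", "abaac", "abd"]:
        print(repr(sample), avoids_baa(sample))
```

The empty string is accepted because the group may repeat zero times; `abd` is rejected by the character class, not by the lookahead.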
Update
Updated the demo and the regex according to ClasG's comment. Indeed, to make sure it fails for a simple baa, the lookahead must come first, then the character class.
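Why the order matters can be checked with a small comparison. This is a sketch in Python's `re` module (an assumption, standing in for PHP), and the names `char_first`, `lookahead_first` and `accepts` are hypothetical. With the character class first, the lookahead is only tested after a character has been consumed, so a `baa` at the very start of the line is never inspected.

```python
import re

# Character class first: the lookahead runs only AFTER a character is consumed,
# so position 0 is never checked.
char_first = re.compile(r"^(?:[abc](?!baa))*$")

# Lookahead first: every position, including position 0, is checked
# before a character is consumed.
lookahead_first = re.compile(r"^(?:(?!baa)[abc])*$")


def accepts(pattern: re.Pattern, s: str) -> bool:
    """Return True if the pattern matches the whole string s."""
    return pattern.fullmatch(s) is not None


if __name__ == "__main__":
    for sample in ["baa", "abaa", "abab"]:
        print(repr(sample), accepts(char_first, sample), accepts(lookahead_first, sample))
```

Both variants reject `abaa`, since the `baa` there does not start at position 0; only the lookahead-first variant also rejects a bare `baa`.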
- [`^(?:[abc](?!baa))*$`](https://regex101.com/r/2jIewo/1/) matches `baa`. – Wiktor Stribiżew Nov 17 '17 at 09:35
- Swap the tests - `^(?:(?!baa)[abc])*$` and it'll work. I.e. look-ahead prior to character class. – SamWhan Nov 17 '17 at 09:38
- @WiktorStribiżew The question isn't a dup of the Q you say (though it lacks a lot). To use the answer, it may be helpful though. – SamWhan Nov 17 '17 at 09:40
- @ClasG thanks, updated – Tamas Rev Nov 17 '17 at 09:52
 
    