I want to match the last group, which is enclosed in [], but may contain one or more [] inside of itself in a nested structure.
I managed, although not elegantly, to get the nested [] matching going using Python's regex module. This solution works for some cases (such as s1) but not for s2 or s3, where there are multiple such groups: it only matches the first one.
Any suggestions? A better regular expression? Or are regular expressions not the way to go? Thanks a lot!
In [116]:
import regex  # third-party module (pip install regex), not the built-in re

s1 = 'AAA [BBB [CCC]]'
s2 = 'AAA [DDD] [EEE]'
s3 = 'AAA [BBB [CCC]] [EEE]'
for s in [s1, s2, s3]:
    result = regex.search(r'(?<rec>\[(?:[^\[\]]++|(?&rec))*\])', s, flags=regex.VERBOSE)
    print(result.captures('rec'))
['[CCC]', '[BBB [CCC]]'] #I know it is not perfect, but I can take the last one in the list
['[DDD]'] #This is the first one, I want the last one, which is [EEE]
['[CCC]', '[BBB [CCC]]'] #same problem as above
Edit:
Thanks a lot for the help; if I had 15 rep I would upvote you all. Sorry for not including the intended results, which should be:
'AAA [BBB [CCC]]' -> '[BBB [CCC]]'
'AAA [DDD] [EEE]' -> '[EEE]'
'AAA [BBB [CCC]] [EEE]' -> '[EEE]'
'000 [[aaa] xxx [yyy [zzz ]]' -> '[[aaa] xxx [yyy [zzz ]]'
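For comparison, here is a rough regex-free sketch of what "the last top-level group" could mean: a single left-to-right scan with a depth counter. This is a swapped-in technique, not the recursive regex above. The function name last_bracket_group is made up for illustration, and the sketch assumes the brackets are balanced; what it returns for unbalanced input is not meant to be meaningful.

```python
def last_bracket_group(text):
    """Return the last top-level [...] group in text, or None if there is none.

    Hypothetical helper for illustration; assumes balanced brackets.
    """
    depth = 0      # current nesting level
    start = None   # index where the current top-level group opened
    last = None    # most recently completed top-level group
    for i, ch in enumerate(text):
        if ch == '[':
            if depth == 0:
                start = i  # a new top-level group opens here
            depth += 1
        elif ch == ']' and depth > 0:
            depth -= 1
            if depth == 0:
                last = text[start:i + 1]  # a top-level group just closed
    return last


for s in ['AAA [BBB [CCC]]', 'AAA [DDD] [EEE]', 'AAA [BBB [CCC]] [EEE]']:
    print(repr(s), '->', repr(last_bracket_group(s)))
```

On the first three examples above this should give '[BBB [CCC]]', '[EEE]' and '[EEE]'. The scan keeps overwriting the result each time a top-level group closes, so whatever remains at the end is the last one.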