The task: I need to find all abbreviations of address object identifiers in a string, using a list of those abbreviations, so that I can delete them later. The real abbreviation list is in another language and is much bigger (200+ elements), so a foreach loop is out of the question, on the grounds that "complex regex beats foreach in speed".
The problem:
Regex like this (?:[^\w\d]|\A)(?:street|str|c|city|state|st|apt)([^\w\d]|\Z)
works on a string like this: Klutc state, Beast st, apt c5
and correctly gives state, st, apt.
But on the string state Klutc, Beast st,apt c5 it returns state and st, but not apt, because the comma that should satisfy the leading [^\w\d] before apt has already been consumed by the previous match, st.
I also cannot use just the left side, (?:[^\w\d]|\A)(?:street|str|c|city|state|st|apt), because on Klutc state, Beast st, apt c5 it additionally gives c from c5.
Nor can I use only the right side, (?:street|str|c|city|state|st|apt)([^\w\d]|\Z), because on the string Klutc state, Beast st, apt c5 it additionally returns st from Beast and c from Klutc.
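To make the three behaviours above easy to reproduce, here is a small sketch in Python. The choice of Python, the names ABBREVIATIONS, find_both_sides, find_left_only and find_right_only, and the extra capture group around the abbreviation are my assumptions for illustration; the patterns are otherwise the ones quoted above.

```python
import re

# Short sample list from the question; the real list has 200+ entries.
ABBREVIATIONS = ["street", "str", "c", "city", "state", "st", "apt"]
ALTERNATION = "|".join(ABBREVIATIONS)

# Group 1 captures the abbreviation so findall() returns just that part.
BOTH_SIDES = re.compile(r"(?:[^\w\d]|\A)(" + ALTERNATION + r")(?:[^\w\d]|\Z)")
LEFT_ONLY = re.compile(r"(?:[^\w\d]|\A)(" + ALTERNATION + r")")
RIGHT_ONLY = re.compile(r"(" + ALTERNATION + r")(?:[^\w\d]|\Z)")


def find_both_sides(text):
    """Delimiters on both sides are consumed, so adjacent matches cannot share one."""
    return BOTH_SIDES.findall(text)


def find_left_only(text):
    """Only the left delimiter is checked, so 'c' in 'c5' slips through."""
    return LEFT_ONLY.findall(text)


def find_right_only(text):
    """Only the right delimiter is checked, so word endings like 'Beast' match."""
    return RIGHT_ONLY.findall(text)


if __name__ == "__main__":
    for sample in ("Klutc state, Beast st, apt c5", "state Klutc, Beast st,apt c5"):
        print(sample)
        print("  both :", find_both_sides(sample))
        print("  left :", find_left_only(sample))
        print("  right:", find_right_only(sample))
```

Running it on the second string shows apt missing from the two-sided result, which is the problem described above.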
The question:
How should I rewrite the regex so that it correctly returns only the abbreviations? (That is, make st not steal the , from ,apt, so that st and apt can both use the same ,.) Test inputs are:
Klutc state, Beast st, apt c5
state Klutc, Beast st,apt c5
Klutc State,Beast st,c5 apt
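One direction that seems to fit the "both use the same ," requirement is to replace the consuming delimiters with zero-width lookarounds, so the comma is only inspected and never becomes part of a match. The following is a minimal Python sketch of that idea, not a verified solution for the full 200+ list. The function name find_abbreviations is hypothetical, and case-insensitive matching is an assumption made only because the third test input contains State.

```python
import re

# Short sample list from the question; the real list has 200+ entries.
ABBREVIATIONS = ["street", "str", "c", "city", "state", "st", "apt"]

# (?<!\w) and (?!\w) assert "no word character here" without consuming anything,
# so one delimiter can serve as the right edge of one match and the left edge
# of the next. re.escape guards against regex metacharacters in the list.
PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")(?!\w)",
    re.IGNORECASE,  # assumption: "State" should match "state"
)


def find_abbreviations(text):
    """Return every standalone abbreviation found in text, in order."""
    return PATTERN.findall(text)


if __name__ == "__main__":
    for sample in (
        "Klutc state, Beast st, apt c5",
        "state Klutc, Beast st,apt c5",
        "Klutc State,Beast st,c5 apt",
    ):
        print(sample, "->", find_abbreviations(sample))
```

Because the lookahead rejects a partial hit such as st inside street, the order of the alternatives should not matter here, but that has only been checked against the three test inputs.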