I have a piece of JavaScript code with several regexes that use \b to match word boundaries. They don't work as expected on input containing non-ASCII letters. They behave more or less like this:
',.,.Michał /#@$^Øystein(*()'.match(/\b.+?\b/g) // => ["Micha", "ł /#@$^Ø", "ystein"]
I would like the expression above to return [ "Michał", " /#@$^", "Øystein" ].
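As far as I can tell, the cause is that \b is defined in terms of \w, and \w only covers ASCII letters, digits and underscore. A small check of that assumption (the variable names are just for illustration):

```javascript
// \w is ASCII-only ([A-Za-z0-9_]), so 'ł' and 'Ø' count as non-word characters.
const asciiWord = /\w/;
const isWordChar = ['a', 'ł', 'Ø', 'y'].map(ch => asciiWord.test(ch));
console.log(isWordChar); // [true, false, false, true]

// Because of that, \b reports a boundary in the middle of the name,
// between the ASCII letter 'a' and the non-ASCII letter 'ł'.
const boundaryInsideName = /a\bł/.test('Michał');
console.log(boundaryInsideName); // true
```

So the regex engine places a boundary between "a" and "ł" and between "Ø" and "y", which is exactly where the matches above get split.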
The expressions between the \b anchors are actually more complicated than .+?, and some of them are generated, so changing them is quite tricky. Ideally, I would like to keep these expressions unchanged and replace \b with something that matches zero-width word boundaries in a Unicode-aware way.
Is it possible at all? If it is, how can I do it? If it is not, how can I do it in a way that requires the fewest changes to the expressions between the \b anchors?
I hoped ES6 could help, but it doesn't: the new u flag does not change the behaviour of \b.
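This is the quick check I used, on a shortened input and assuming an engine that supports the ES6 u flag:

```javascript
// Same lazy pattern as above, with and without the ES6 `u` flag.
const withoutU = 'Michał'.match(/\b.+?\b/g);
const withU = 'Michał'.match(/\b.+?\b/gu);

// In both cases the match stops before 'ł', because \b still uses
// the ASCII-only definition of a word character.
console.log(withoutU); // ["Micha"]
console.log(withU);    // ["Micha"]
```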