Introduction/Question:
I have been studying the use of Regular Expressions (using VBA/Excel), and so far I cannot understand how I would isolate a <space> (or " ") using regexp from other white space characters that are included in \s. I thought that I would be able to use \p{Zs}, but in my testing so far, it has not worked out. Could someone please correct my misunderstanding? I appreciate any helpful input.
To offer proper credit, I modified some code that started off as a very helpful post by @Portland Runner that is found here: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
This has been my approach/study so far:
Using the string "14z-16z Flavored Peanuts", I've been trying to write a RegExp that removes "14z-16z " and leaves only "Flavored Peanuts". I initially used ^[0-9](\S)+ as strPattern in the following sub procedure:
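    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    ' (VBA editor: Tools > References) for the early-bound RegExp type.
    Sub REGEXP_TEST_SPACE()
        Dim strPattern As String
        Dim strReplace As String
        Dim strInput As String
        Dim regEx As New RegExp

        strInput = "14z-16z Flavored Peanuts"
        strPattern = "^[0-9](\S)+"   ' a leading digit followed by one or more non-whitespace characters
        strReplace = ""              ' replace the match with nothing, i.e. delete it

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = True
            .Pattern = strPattern
        End With

        ' Write to A1 only if the pattern matches the input
        If regEx.Test(strInput) Then
            Range("A1").Value = regEx.Replace(strInput, strReplace)
        End If
    End Sub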
This approach gave me an A1 value of " Flavored Peanuts" (note the leading <space> in that string).
I then changed strPattern = "^[0-9](\S)+(\s)" (added the (\s)), which gave me the desired A1 value of "Flavored Peanuts". Great!!! I got the desired output!
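To make the difference between the two patterns easier to reproduce, here is a minimal sketch that runs both against the same input and prints the results. The names StripMatch and COMPARE_PATTERNS are hypothetical helpers introduced only for this illustration, and the sketch uses late binding (CreateObject) so that it runs without setting a library reference.

```vba
' Hypothetical helper (not part of the procedure above): replaces every
' match of strPattern in strInput with an empty string.
Function StripMatch(ByVal strInput As String, ByVal strPattern As String) As String
    ' Late binding, so no reference to the RegExp library is needed here.
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = strPattern
    End With
    ' Replace returns the input unchanged when nothing matches.
    StripMatch = regEx.Replace(strInput, "")
End Function

Sub COMPARE_PATTERNS()
    Const SAMPLE As String = "14z-16z Flavored Peanuts"
    ' Brackets make a leading space visible in the Immediate window.
    Debug.Print "[" & StripMatch(SAMPLE, "^[0-9](\S)+") & "]"      ' [ Flavored Peanuts]
    Debug.Print "[" & StripMatch(SAMPLE, "^[0-9](\S)+(\s)") & "]"  ' [Flavored Peanuts]
End Sub
```

Running COMPARE_PATTERNS and looking at the Immediate window (Ctrl+G) shows the leading space left behind by the first pattern and removed by the second.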
But as I understand it, \s represents all white-space characters, equivalent to [ \f\n\r\t\v]. In this case, I know that the character is just a normal, single space; I don't need to match carriage returns, horizontal tabs, etc. So I tried to isolate the <space> character in regex using the Unicode category "space separator", which I believe is \p{Zs} (e.g., strPattern = "^[0-9](\S)+(\p{Zs})"). With this pattern, however, there is no match at all, never mind removal of the leading space. I also tried the more general \p{Z} (all Unicode separators), but that didn't work either.
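One way to narrow this down might be to test several candidate spellings of "a single space" side by side. The sketch below is written under an assumption that this post does not confirm: that the VBScript RegExp engine uses ECMAScript-style syntax, in which a literal space, the class [ ], and the hex escape \x20 each denote U+0020, and in which Unicode property classes such as \p{Zs} may simply not be supported. MatchesPattern and TEST_SPACE_ONLY_PATTERNS are hypothetical names used only for this illustration.

```vba
' Hypothetical helper: True if strPattern matches anywhere in strInput.
Function MatchesPattern(ByVal strInput As String, ByVal strPattern As String) As Boolean
    ' Late binding, so no reference to the RegExp library is needed here.
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True
    regEx.Pattern = strPattern
    MatchesPattern = regEx.Test(strInput)
End Function

Sub TEST_SPACE_ONLY_PATTERNS()
    Const SAMPLE As String = "14z-16z Flavored Peanuts"
    Dim candidates As Variant
    Dim i As Long

    ' Candidate ways of writing "exactly one space" after the prefix.
    ' The last entry is the \p{Zs} attempt described above, kept for comparison.
    candidates = Array( _
        "^[0-9](\S)+ ", _
        "^[0-9](\S)+[ ]", _
        "^[0-9](\S)+\x20", _
        "^[0-9](\S)+\p{Zs}")

    ' Print each pattern with whether it matched the sample string.
    For i = LBound(candidates) To UBound(candidates)
        Debug.Print candidates(i) & "  ->  " & MatchesPattern(SAMPLE, CStr(candidates(i)))
    Next i
End Sub
```

If the first three candidates report True while the \p{Zs} variant reports False, that would suggest the problem lies in the engine's handling of \p{...} rather than in the rest of the pattern.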
Clearly I have missed something in my study. Help is both desired and appreciated. Thank you.