I have a regex pattern that works perfectly in Python and several other languages, but it fails to capture the submatches I need when I use it with the VBScript regex engine (which is apparently almost identical to JavaScript's). The pattern in question is as follows:
"Sincerely,[\s\n]+([\w\.]+)\s+(\w+)\s+(.+)[\s\n]+(\d+\s.+)[\s\n]+(.+)"
An example test case is as follows:
email received 3/30/17:
Dear Sir,
Hello
Sincerely,
Mr. Robert Thomas
1104 Madison Avenue
New York, NY 10021
email received 3/30/17:
Dear Sir,
Hello
Sincerely,
Ms. Angela Carraway
402 Arlington Drive
Concord, MA 01742
The objective is a global regex that extracts 5 subgroups from each match after a variable keyword, which here is "Sincerely,". For the second email, the subgroups should be Ms. (first subgroup), Angela (second subgroup), Carraway (third subgroup), 402 Arlington Drive (fourth subgroup), and Concord, MA 01742 (fifth subgroup). In a Python regex tester, the pattern captures all 5 groups perfectly, yet in VBScript (the JavaScript-like engine) it matches the entire string as one match without splitting it into the subgroups I expect. As a result, when I write the submatches to cells from an Excel VBA macro, the text ends up jumbled together in a couple of cells.
What am I doing wrong? Is there some character I am missing that breaks the capturing of subgroups? If so, what is the critical difference between these two engines, so that I can avoid this in the future, and how could this pattern be fixed for this test case? I've tried reading about the differences online, yet everything I found describes only small differences that should not cause the issue I am having. Any help would be greatly appreciated, because I cannot seem to isolate the problem. Thank you!
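For reference, this is roughly how the Python behavior described above can be reproduced. It is a minimal sketch, assuming the sample text uses plain "\n" line feeds between lines (which may not be what Word returns); the names sample and matches are only for illustration.

```python
import re

# Same pattern as above, written as a raw string.
pattern = r"Sincerely,[\s\n]+([\w\.]+)\s+(\w+)\s+(.+)[\s\n]+(\d+\s.+)[\s\n]+(.+)"

# Assumption: lines are separated by "\n" only.
sample = (
    "email received 3/30/17:\n"
    "Dear Sir,\n"
    "Hello\n"
    "Sincerely,\n"
    "Mr. Robert Thomas\n"
    "1104 Madison Avenue\n"
    "New York, NY 10021\n"
    "email received 3/30/17:\n"
    "Dear Sir,\n"
    "Hello\n"
    "Sincerely,\n"
    "Ms. Angela Carraway\n"
    "402 Arlington Drive\n"
    "Concord, MA 01742"
)

# findall returns one tuple of the five captured groups per match.
matches = re.findall(pattern, sample)
for groups in matches:
    print(groups)
```

With this input, each signature block comes back as its own tuple of five groups, which is the result I am trying to get in VBA.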
Edit: The following is the VBA code that utilizes the regex:
Sub regex()
    Dim docxinput As String
    Dim keyword As Variant
    Dim patterninput As Variant
    Dim pattern As String
    Dim regex As New RegExp
    docxinput = Application.GetOpenFilename(Title:="Step #1: Enter Word Document Input File Name")
        Dim wrdApp As Word.Application
        Dim wrdDoc As Word.Document
        Dim strInput As String
        Set wrdApp = CreateObject("Word.Application")
        wrdApp.Visible = False
        Set wrdDoc = wrdApp.Documents.Open(docxinput)
        strInput = wrdDoc.Range.Text
        Debug.Print (strInput)
        wrdDoc.Close 0
        Set wrdDoc = Nothing
        wrdApp.Quit
        Set wrdApp = Nothing
    pattern = "Sincerely,[\s\n]+([\w\.]+)\s+(\w+)\s+(.+)[\s\n]+(\d+\s.+)[\s\n]+(.+)"
    Dim objMatches As MatchCollection
    With regex
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .pattern = pattern
    End With
    Set objMatches = regex.Execute(strInput)
    Dim row As Long
    Dim objMatch As Match
    row = 2
    ' Write the five submatches of each match to its own row
    For Each objMatch In objMatches
        Cells(row, 1).Value = objMatch.SubMatches(0)
        Cells(row, 2).Value = objMatch.SubMatches(1)
        Cells(row, 3).Value = objMatch.SubMatches(2)
        Cells(row, 4).Value = objMatch.SubMatches(3)
        Cells(row, 5).Value = objMatch.SubMatches(4)
        row = row + 1
    Next
End Sub
This is a picture of the results. As you can see, the first two subgroups work, but then the regex (or at least I think it is the regex) runs into a grouping error and dumps almost all of the other content into the next column. It then moves on to the fourth column, running into errors there as well. Is this an issue with how the code iterates or with the regex itself? I have tried to troubleshoot the code and cannot find a reason why it fails to break the text up correctly, other than the regex being at fault. Any thoughts?