I wrote a parser for a custom subset of BBCode in JavaScript and have now translated it to C#. This custom BBCode can be parsed line by line, so I have a regex that lets me "pop" the first line from the BBCode string:
/(^.*$|^.*\r?\n)/
It matches an empty string. The first part, ^.*$, matches a simple string like "Simple string" (a single line without CrLf at the end).
The second part ^.*\r?\n matches the first line ending with CrLf.
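To make the "pop" step concrete, here is a minimal JavaScript sketch of how the pattern could be used. The helper name popFirstLine and the shape of its return value are hypothetical and only for illustration; my real parser is structured differently.

```javascript
// The pattern from the question: either a whole single-line string,
// or the first line including its line ending.
const firstLinePattern = /(^.*$|^.*\r?\n)/;

// Hypothetical helper: returns the first line (with its line ending,
// if any) and the remaining text.
function popFirstLine(text) {
  const m = text.match(firstLinePattern);
  if (m === null) {
    return { line: null, rest: text };
  }
  return { line: m[0], rest: text.slice(m[0].length) };
}

console.log(popFirstLine("line1\nline2"));
console.log(popFirstLine("Simple string"));
console.log(popFirstLine(""));
```

Because the match is anchored at the start of the input, slicing by the match length leaves exactly the unparsed remainder.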
This works perfectly in JavaScript. But while running the unit tests in C#, I noticed a difference.
Assume we have "line1\n" as input.
The regex in JavaScript will match it as follows:
^.*$ won't match because . is any character except a line terminator and we have \n at the end.
^.*\r?\n will match as we have string starting with 0 or more symbols and \n at the end.
Now in C# it works differently:
^.*$ will match (why?), but only the line1 part. Thus the whole /(^.*$|^.*\r?\n)/ will also match only line1, and the \n goes missing.
Could someone please explain? Is there a way to force C# regex to behave like the Javascript regex in the sense described above?
The simplest workaround would be to change the order of the alternatives in the pattern, /(^.*$|^.*\r?\n)/ -> /(^.*\r?\n|^.*$)/, which solves the problem,
but I would still like to know the reason behind that difference.
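As a sanity check on the JavaScript side only, the following sketch compares the original and the reordered pattern on a few inputs. The helper firstMatch and the sample inputs are hypothetical additions for illustration, and this does not exercise the C# behavior at all.

```javascript
const original = /(^.*$|^.*\r?\n)/;
const reordered = /(^.*\r?\n|^.*$)/;

// Hypothetical helper: the matched text, or null if there is no match.
function firstMatch(pattern, input) {
  const m = input.match(pattern);
  return m ? m[0] : null;
}

const inputs = ["line1\n", "line1\r\nline2", "Simple string", ""];

for (const input of inputs) {
  // JSON.stringify makes \r and \n visible in the log output.
  console.log(
    JSON.stringify(input),
    "original:", JSON.stringify(firstMatch(original, input)),
    "reordered:", JSON.stringify(firstMatch(reordered, input))
  );
}
```

In JavaScript both orderings should give the same match for each of these inputs, so the reordering does not change the behavior I already rely on there.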
Click here for the C# test code ...
For JavaScript, see below:
const first_line_pattern = /(^.*$|^.*\r?\n)/
const single_string_pattern = /^.*$/
const line_pattern = /^.*\r?\n/
const input4 = "line1\n"
// Log whether the pattern matches input4 and, if so, the matched text.
function log(pattern) {
  let m4 = input4.match(pattern)
  console.log('~~~~~~~~' + pattern.toString() + '~~~~~~~~~')
  console.log("'line1\\n':':" + (m4 != null) + ":value: /" + (m4 ? m4[0] : 'no match') + "/")
}
log(first_line_pattern)
log(single_string_pattern)
log(line_pattern)
Thank you for your time!