I'm checking line by line in C#
Example data:
bob jones,123,55.6,,,"Hello , World",,0
jim neighbor,432,66.5,,,Andy "Blank,,1
john smith,555,77.4,,,Some value,,2
A regex that picks commas outside of quotes is the closest I've found, but it doesn't resolve the second line.
Try the following regex:
(?!\B"[^"]*),(?![^"]*"\B)
The second line is a problem because the " you inserted does not have a closing quotation mark.
It will also not work for values such as ,r"a string",10 because the letter on the edge of the " will create a word boundary, rather than a non-word boundary.
(".*?,.*?"|.*?(?:,|$))
This will match the content and the commas and is compatible with values that are full of punctuation marks
The regexes below parse each field in a line, not an entire line.
Apply the methodical and desperate regex technique: Divide and conquer
[^,"]*(,|$)
[^,"]*"[^"]*"[^,"]*(,|$)
[^,"]*"[^,"]*$
[^,"]*"[^,"]*,(?!.*")
Now that we have all the cases, we then '|' everything together and enjoy the resultant monstrosity.
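As a rough sketch of how the combined pattern could be used, the Python snippet below joins four case patterns with '|' and walks along a line one field at a time. The exact quantifiers in the two single-quote cases and the helper name split_fields are assumptions made for this illustration, not something confirmed by the answer above. The same pattern text should in principle carry over to C#'s Regex class, but that is not tested here.

```python
import re

# The four cases, tried in this order at the start of each field.
CASES = [
    r'[^,"]*(?:,|$)',                 # field without any quote
    r'[^,"]*"[^"]*"[^,"]*(?:,|$)',    # field with exactly two quotes
    r'[^,"]*"[^,"]*$',                # one stray quote, last field on the line
    r'[^,"]*"[^,"]*,(?!.*")',         # one stray quote, no later quote on the line
]
FIELD = re.compile("|".join(CASES))


def split_fields(line):
    """Split one line into fields by matching one field at a time."""
    fields = []
    pos = 0
    while True:
        match = FIELD.match(line, pos)
        if match is None:
            raise ValueError(f"no case matches at position {pos}")
        text = match.group(0)
        if not text.endswith(","):
            # The field ran to the end of the line, so it is the last one.
            fields.append(text)
            return fields
        fields.append(text[:-1])  # drop the trailing delimiter
        pos = match.end()


if __name__ == "__main__":
    for sample in (
        'bob jones,123,55.6,,,"Hello , World",,0',
        'jim neighbor,432,66.5,,,Andy "Blank,,1',
        'john smith,555,77.4,,,Some value,,2',
    ):
        print(split_fields(sample))
```

Matching field by field, rather than calling findall, avoids the extra empty match that Python reports at the end of the string. A field that fits none of the cases (for example one with three quotes) raises an error instead of being split silently.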
The best answer written by Vasili Syrakis does not work with negative numbers inside quotation marks such as:
bob jones,123,"-55.6",,,"Hello , World",,0
jim neighbor,432,66.5
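To reproduce the problem, here is a small Python sketch that splits on the earlier pattern with re.split; the helper name split_line is only for illustration. The quote that opens "-55.6" is followed by -, which is not a word character, so \B matches there and the commas in front of that quote are treated as if they were inside a quoted value.

```python
import re

# Pattern from the earlier answer: a comma that is not judged to be inside quotes.
COMMA = re.compile(r'(?!\B"[^"]*),(?![^"]*"\B)')


def split_line(line):
    """Split a line on every comma the pattern accepts as a delimiter."""
    return COMMA.split(line)


# Unquoted number: the line splits into eight fields as intended.
print(split_line('bob jones,123,55.6,,,"Hello , World",,0'))
# ['bob jones', '123', '55.6', '', '', '"Hello , World"', '', '0']

# Quoted negative number: the first three fields stay glued together.
print(split_line('bob jones,123,"-55.6",,,"Hello , World",,0'))
# ['bob jones,123,"-55.6"', '', '', '"Hello , World"', '', '0']
```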
The following regex works for this purpose:
,(?!(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$))
But I was not successful with this part of input:
,Andy "Blank,
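A short Python sketch of both outcomes, using re.split with the pattern above; the helper name split_line is made up for this example. The lookahead counts the quotes that remain after each comma, and an odd count means the comma sits inside a quoted value. With the stray quote in Andy "Blank there is exactly one quote on the line, so every comma before it is seen as being inside quotes.

```python
import re

# Split on commas that are NOT followed by an odd number of double quotes.
COMMA = re.compile(r',(?!(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$))')


def split_line(line):
    """Split a line on commas that lie outside balanced quotes."""
    return COMMA.split(line)


# Balanced quotes, including a quoted negative number: eight fields.
print(split_line('bob jones,123,"-55.6",,,"Hello , World",,0'))
# ['bob jones', '123', '"-55.6"', '', '', '"Hello , World"', '', '0']

# One unmatched quote: everything up to the quote collapses into one field.
print(split_line('jim neighbor,432,66.5,,,Andy "Blank,,1'))
# ['jim neighbor,432,66.5,,,Andy "Blank', '', '1']
```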
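import re
string = 'bob jones,123,"-55.6",,,"Hello , World",,0'  # sample line from the data above
# Remove every comma followed by an odd number of quotes, i.e. a comma inside a quoted value.
print(re.sub(r',(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$)', "", string))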