I have a file containing multiple entries. Each entry is of the following form:
"field1","field2","field3","field4","field5"
None of the fields is ever going to contain a quote character, but they can contain commas. The problem is that field4 can be split across multiple lines, so an example file can look like:
"john","male US","done","Some sample text
across multiple lines. There
can be many lines of this","foo bar baz"
"jane","female UK","done","fields can have , in them","abc xyz"
I want to extract the fields using Python. If the field were never split across multiple lines, this would be simple: extract the strings from between the quotation marks on each line. But I can't find a simple way to do this in the presence of multiline fields.
EDIT: There are actually five fields. Sorry for any confusion. The question has been edited to reflect this.
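For reference, one possible direction, offered as a rough sketch rather than a definitive answer: the format above looks like ordinary quoted CSV, and Python's standard `csv` module treats a newline inside a quoted field as part of that field. The helper name `parse_entries` and the in-memory `SAMPLE` string are made up for illustration; a real file would presumably be opened with `open(path, newline="")` instead.

```python
import csv
import io

# The sample data from the question, held in memory for illustration.
SAMPLE = (
    '"john","male US","done","Some sample text\n'
    'across multiple lines. There\n'
    'can be many lines of this","foo bar baz"\n'
    '"jane","female UK","done","fields can have , in them","abc xyz"\n'
)


def parse_entries(stream):
    """Hypothetical helper: return each entry as a list of field strings.

    csv.reader keeps reading physical lines until the quoted field is
    closed, so a multiline field4 ends up in a single list element.
    """
    return list(csv.reader(stream, delimiter=",", quotechar='"'))


# newline="" leaves line endings untranslated, as the csv docs recommend.
rows = parse_entries(io.StringIO(SAMPLE, newline=""))

for row in rows:
    print(len(row), row)
```

This assumes every field is quoted and no field contains a quote character, which matches the guarantees stated above. The line breaks inside field4 are kept as `\n` characters in the extracted string.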