I am trying to use regex to match from the start of one report to the start of the next report further down the same file, capture the report as a whole, then use that to search for duplicates and remove them.

The reports are separated by CRLFs, and I thought I was being smart by using (\r\n).*(\r\n) to capture a report, find it, delete it, and repeat for the next report.

When I do (\r\n).*(\r\n) it captures from the next CRLF to the last CRLF in the file.

I cannot for the life of me figure out how to limit the search to just one instance of the first line of the report, the ~30 lines of the body, then the end of the report.

DavidPostill
  • 162,382

1 Answer

Your problem is that the dot is matching newlines. Try unticking the 'dot matches newline' box. In Notepad++ it should not be hard to find (see the bottom left-hand corner of the Find dialog box). I won't include a picture, because you didn't put Notepad++ in your title and I think it's better if the answer isn't unnecessarily Notepad++-centric. Other programs that support regex also have a 'dot matches newline' option that can be switched on or off.
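To see the effect outside any particular editor, here is a minimal sketch in Python, where the re.DOTALL flag plays the role of the 'dot matches newline' box. The sample text is made up for illustration, and Python's regex engine is only a stand-in: Notepad++ uses a different engine, so details may differ.

```python
import re

# Hypothetical sample: four CRLF-terminated lines standing in for report text.
text = "report A\r\nline 1\r\nline 2\r\nreport B\r\n"
pattern = r"\r\n.*\r\n"

# 'Dot matches newline' ticked: .* runs across line breaks, so the match
# stretches from the first CRLF to the last CRLF in the text.
with_dotall = re.search(pattern, text, re.DOTALL).group(0)

# Unticked: .* cannot cross the line feed, so the match stops at the next CRLF.
without_dotall = re.search(pattern, text).group(0)

print(repr(with_dotall))
print(repr(without_dotall))
```

With the flag on, the match swallows everything up to the final CRLF, which is the behaviour described in the question. With it off, the match covers a single line.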

You could experiment with other searches and see whether they work. Some will work regardless of the dot setting, e.g. if they don't use the dot at all, or if they use it as .*?, where the *? operator stops it from matching too much. Other patterns only work when 'dot matches newline' is unticked. So you may as well untick it, and only tick it to see what difference it makes. You can try ^.*$ with 'dot matches newline' unticked, or your own pattern with it unticked.

Or see what happens with a pattern of the form [^X]*X. That is a good way of avoiding the problem that in .*X the .* can run past an X, which you don't want. Instead you specify everything that is not X, repeated with *, followed by X. Examples are \r\n[^\r\n]*\r\n or [^\r\n]*\r\n or ^[^\r\n]*\r\n. Note that a caret inside square brackets means 'not', while a ^ outside square brackets matches the position at the beginning of a line.

Another way is the lazy quantifier *?, specifically .*?, e.g. \r\n.*?\r\n. The .*? matches as few characters as possible, so .*?X matches as few characters as possible up to X.
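As a rough illustration of the negated-class and lazy-quantifier patterns, here is a small Python sketch. The sample text is invented, and Python's re module is used only as a convenient stand-in for whatever regex engine your editor has.

```python
import re

# Hypothetical sample: four CRLF-terminated lines standing in for report text.
text = "report A\r\nline 1\r\nline 2\r\nreport B\r\n"

# Lazy quantifier: even with dot matching newlines, .*? stops at the first CRLF.
lazy = re.search(r"\r\n.*?\r\n", text, re.DOTALL).group(0)

# Negated class: [^\r\n]* cannot cross a line break, whatever the dot setting is.
negated = re.search(r"\r\n[^\r\n]*\r\n", text).group(0)

# Anchored form: the first line, including its CRLF.
first_line = re.search(r"^[^\r\n]*\r\n", text).group(0)

# Without the leading CRLF, the pattern matches one line at a time.
lines = re.findall(r"[^\r\n]*\r\n", text)

print(repr(lazy))
print(repr(negated))
print(repr(first_line))
print(lines)
```

Both the lazy and the negated-class pattern stop at the first CRLF after the starting one, so each match covers a single line rather than the rest of the file.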

barlop
  • 25,198