My sample text is this:

This is a demo . Of all the places you want to go here! Superb work !
Is your name John ? Or is it that you are hiding your name? Seems to be a good work.But as you know this punctuation is not correct. ( this is not correct) but (this is correct). Similarly (this is not correct ) and this also ( is not correct ) .
The thing is, that it may seems to be inappropriate , but this process is good.
Here quotes are also weird. " Some are ok some are weird ". Also, "there should not" be any "incorrect spelling ". This is semi - boiled.

You can follow these steps: step 1, step 2 , step 3, step4,step5

Or these steps : New step 1, new step2,new step 3, new step5 Select this/or that. Otherwise select New\ old or select one / two.

My expected output is:

This is a demo. Of all the places you want to go here! Superb work!
Is your name John? Or is it that you are hiding your name? Seems to be a good work.But as you know this punctuation is not correct. (this is not correct) but (this is correct). Similarly (this is not correct) and this also (is not correct).
The thing is, that it may seems to be inappropriate, but this process is good.
Here quotes are also weird. "Some are ok some are weird". Also, "there should not" be any "incorrect spelling". This is semi-boiled.

You can follow these steps: step 1, step 2 , step 3, step4, step5

Or these steps: New step 1, new step2, new step 3, new step5 Select this/or that. Otherwise select New\old or select one/two.

My goal is to remove all spaces before and after a full stop, question mark, comma, colon, semicolon, and any type of bracket. Then I would like to add a space after every punctuation mark except the / and \ slashes, to follow the rules of typography.

This regex is not working:

\p{po}(?!\x{2F}\x{3b})(?!\x20)

I also tried it this other way, but it does not skip the slashes:

(?!\x{2F}\x{3b})\p{po}(?!\x20)
Destroy666
  • 12,350
Shahid
  • 147

1 Answer

This isn't something you can do in a single regex, since there's an "if" in your requirement: conditionally putting a space after the punctuation character means you will need at least two regexes. The first handles punctuation that needs a space after it:

\h*([.,;:?!-])\h*

Replace with (there's a space at the end):

$1 

Then the rest:

\h*([()\/\\[\]])\h*

Replace with (no space at the end):

$1

Here \h matches horizontal whitespace, and the characters inside the brackets are what gets captured into group 1.
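If you'd rather script this than use an editor's find-and-replace, here is a rough Python sketch of the same multi-pass idea. It is only an illustration under a few assumptions: Python's re module has no \h, so [ \t] stands in for it; the function name fix_spacing is made up for this example; and, unlike the two patterns above, it keeps hyphens tight (so "semi - boiled" becomes "semi-boiled") and treats opening and closing brackets separately so that the space in "but (this" survives. Quotes are deliberately left alone, because a straight " doesn't tell you whether it opens or closes.

```python
import re

def fix_spacing(text: str) -> str:
    """Hypothetical helper: normalise spacing around punctuation in several passes."""
    # Pass 1: marks that take no space before and exactly one space after.
    text = re.sub(r"[ \t]*([.,;:?!])[ \t]*", r"\1 ", text)
    # Pass 2: slashes and hyphens take no space on either side.
    # (Assumption: every hyphen is a word-joining hyphen, not a dash.)
    text = re.sub(r"[ \t]*([/\\-])[ \t]*", r"\1", text)
    # Pass 3: no space right after an opening bracket...
    text = re.sub(r"([(\[{])[ \t]+", r"\1", text)
    # ...and no space right before a closing bracket.
    text = re.sub(r"[ \t]+([)\]}])", r"\1", text)
    # Pass 4: drop the trailing space that pass 1 leaves at line ends.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    return text

if __name__ == "__main__":
    sample = "This is a demo . Of all the places you want to go here! Superb work !"
    print(fix_spacing(sample))
```

The order matters: pass 1 may leave a space before a closing bracket (for example after "?" inside parentheses), and pass 3 cleans that up afterwards.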

Note that none of this is fully reliable for text processing. Some quick, random examples of perfectly valid uses of . and : that would get wrongly "corrected":

2.0
Hello...
e.g.
20:45
S.T.A.L.K.E.R.
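To make the damage concrete, here is a small Python sketch that runs the first pattern over those strings. It assumes [ \t] as a stand-in for \h (which Python's re module doesn't have), and naive_pass is just a made-up name for this illustration.

```python
import re

def naive_pass(text: str) -> str:
    """Hypothetical helper: the first pattern, applied with no exceptions."""
    return re.sub(r"[ \t]*([.,;:?!-])[ \t]*", r"\1 ", text)

if __name__ == "__main__":
    for example in ["2.0", "Hello...", "e.g.", "20:45", "S.T.A.L.K.E.R."]:
        # repr() makes the added trailing spaces visible.
        print(repr(example), "->", repr(naive_pass(example)))
```

Every one of them comes out changed: "2.0" turns into "2. 0", "20:45" into "20: 45", and the ellipsis gets a space after each dot.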

All of those would need their own exceptions, which aren't worth spending time on. You really should use a proper punctuation-correction tool instead. For example, pasting the text into ChatGPT and asking it to correct the punctuation will succeed in 95+% of cases. This is your text as corrected by it:

This is a demo. Of all the places you want to go, here! Superb work!
Is your name John? Or is it that you are hiding your name? Seems to be a good work. But as you know, this punctuation is not correct. (This is not correct) but (this is correct). Similarly, (this is not correct) and this also (is not correct).
The thing is, it may seem to be inappropriate, but this process is good.
Here, quotes are also weird. "Some are okay, some are weird." Also, "there should not" be any "incorrect spelling". This is semi-boiled.

You can follow these steps: Step 1, step 2, step 3, step 4, step 5

Or these steps: New step 1, new step 2, new step 3, new step 5 Select this/or that. Otherwise, select new/old or select one/two.

Alternatively, use a Word-like tool that has this sort of functionality built in, often even as autocorrection.

Destroy666
  • 12,350