I need to remove strings that both start and end with "===" in Python (for example, "===Links===" should be replaced with an empty string). The catch is that the string can start with three "=", four, or any number of them. I tried the regex re.sub('[=]*.*?[=]*', '', string), but when run on "===Refs===" it returns "Refs" instead of an empty string. Can you suggest a fix?
0
Arkistarvh Kltzuonstev
Your description is a little confusing, but have you looked into [non-capturing groups?](https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-what-does-do) – vmg Mar 14 '19 at 10:33
3 Answers
2
import re

string = '===Ref==='
# Match the whole string: one or more '=', some text, then one or more '='.
pattern = r'^=+.+=+$'
string = re.sub(pattern, '', string)  # string is now ''
RnD
@BhanuPrakashReddy glad to help, don't forget to accept the answer if this solved your problem – RnD Mar 14 '19 at 10:42
1
Too late :-(
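import re

# Renamed from `str` to avoid shadowing the built-in type.
text = '===Links=== are great, but ===Refs=== bla bla == blub ===blub'
# Raw string so that \w reaches the regex engine unescaped.
pattern = re.compile(r'=+\w+=+')
replaced = pattern.sub('', text)
print(replaced)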
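One caveat: \w+ does not match spaces, so a heading such as "===See also===" would be left in place. A possible variant, assuming a heading may contain anything except "=", is sketched below; `HEADING` and `remove_headings` are illustrative names, not from the answer above.

```python
import re

# Assumption: a heading is a run of '=', then one or more non-'=' characters,
# then another run of '='.
HEADING = re.compile(r'=+[^=]+=+')

def remove_headings(text):
    """Remove every '=...=' wrapped span found anywhere in the text."""
    return HEADING.sub('', text)

print(repr(remove_headings('===See also=== text')))
print(repr(re.sub(r'=+\w+=+', '', '===See also=== text')))  # \w+ version leaves it alone
```

The trade-off is that this variant is looser: in the sample string above it would also remove "== blub ===", which the \w+ version keeps.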
Xenobiologist
Honestly, this is the best one posted thus far because it's designed with scalability in mind. – FailSafe Mar 14 '19 at 15:21
0
.? suggests that you are only accepting zero or one character between your =s. Try changing it to .* to match multiple characters between them.
Perhaps you can use str.startswith() and str.endswith() to find out if the string starts/ends with ===?
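A minimal sketch of that idea without any regex, assuming a single "=" on each side is enough to count (the question allows any number of them); `strip_heading` is a hypothetical helper name.

```python
def strip_heading(text):
    """Return '' for strings wrapped in '=' on both sides, else return the text unchanged."""
    # text.strip('=') removes the '=' runs from both ends; requiring it to be non-empty
    # keeps strings made only of '=' from being treated as headings.
    if text.startswith('=') and text.endswith('=') and text.strip('='):
        return ''
    return text

print(repr(strip_heading('===Links===')))
print(repr(strip_heading('===Links')))
```

To require at least three "=" as in the example, the checks could be changed to text.startswith('===') and text.endswith('===').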
Moberg