I have a text file containing the pattern [Chinese character]\nRT Journal. I want to find this pattern and replace it with [the original Chinese character]\n\nRT Journal. I tried the code below, but [the original Chinese character] becomes the character \x01.
import re
x = "据\nRT Journal"
print(re.sub('([\u4e00-\u9fff])\nRT','\1\n\nRT',x))
It returns '\x01\n\nRT Journal' rather than '据\n\nRT Journal'. But if I replace the 据 in x with an a, I get what I want. Can you please explain why this happens and how to solve it? Thanks!
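A minimal sketch of what seems to be going on, assuming Python 3: in an ordinary string literal, '\1' is an octal escape, so the replacement string already contains the character \x01 before re.sub ever sees it. Writing the pattern and replacement as raw strings keeps the backslash, so re.sub receives the backreference \1. The variable name fixed is only illustrative.

```python
import re

x = "据\nRT Journal"

# In a normal (non-raw) string literal, '\1' is an octal escape: chr(1).
assert '\1' == '\x01'

# With raw strings the backslash survives, so re.sub sees the group
# reference \1. re.sub itself converts \n in the replacement to a newline.
fixed = re.sub(r'([\u4e00-\u9fff])\nRT', r'\1\n\nRT', x)
print(repr(fixed))  # '据\n\nRT Journal'
```

Escaping the backslash ('\\1\n\nRT') or using the explicit group syntax (r'\g<1>\n\nRT') should behave the same way.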
