I'm trying to validate a username. My code:
import re
regex = r'^[\w.@+-]+\Z'
result = re.match(regex,'名字')
In Python 2.7, result is None.
In Python 3.7, it matches '名字'.
It's because of the different definitions for \w in Python 2.7 versus Python 3.7.
In Python 2.7, we have:
When the LOCALE and UNICODE flags are not specified, matches any alphanumeric character and the underscore; this is equivalent to the set [a-zA-Z0-9_].
(emphasis and hyperlink and formatting added)
However, in Python 3.7, we have:
For Unicode (str) patterns: Matches Unicode word characters; this includes most characters that can be part of a word in any language, as well as numbers and the underscore. If the ASCII flag is used, only [a-zA-Z0-9_] is matched.
(emphasis and formatting added)
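To see the Python 3 behaviour described in that quote, here is a minimal sketch that runs the same pattern with and without the ASCII flag. It assumes Python 3 (re.ASCII does not exist in Python 2.7), and the variable names are only illustrative.

```python
import re

PATTERN = r'^[\w.@+-]+\Z'

# Default for str patterns in Python 3: \w matches Unicode word characters.
unicode_match = re.match(PATTERN, '名字')

# With re.ASCII, \w is restricted to [a-zA-Z0-9_], like the Python 2.7 default.
ascii_match = re.match(PATTERN, '名字', re.ASCII)

print(unicode_match is not None)  # True
print(ascii_match is not None)    # False
```

So the 2.7 result can be reproduced in Python 3 by passing re.ASCII, which is handy if you actually want ASCII-only usernames.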
So, if you want it to work in both versions, you can do something like this:
# -*- coding: utf-8 -*-
import re
regex = re.compile(r'^[\w.@+-]+\Z', re.UNICODE)
match = regex.match(u'名字')
if match:
    print(match.group(0))
else:
    print("not matched!")
output:
名字
I ran this under both Python 2.7 and Python 3.7, and it prints the same output in both.
Note the differences:
I added # -*- coding: utf-8 -*- at the top of the script, because without it Python 2.7 raises a SyntaxError saying
Non-ASCII character '\xe5' on line 3, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
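Instead of using result = re.match(pattern, string), I used regex = re.compile(pattern, flags) and match = regex.match(string) so that I can specify flags in one place. (re.match(pattern, string, flags) would work too.)
I used the re.UNICODE flag, because without it, \w in Python 2.7 only matches [a-zA-Z0-9_]. In Python 3 the flag is redundant for str patterns, but harmless.
I used u'名字' instead of '名字', because in Python 2.7 a plain '名字' is a byte string, and you need a unicode literal for \w to see the actual characters. Python 3.3 and later accept the u prefix as well.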
Also, while answering your question, I found out that print("not matched!") works in Python 2.7 as well. That makes sense: print is a statement there, so the parentheses just wrap a single expression and are effectively ignored. I didn't know that, so that was fun.
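If you want to reuse this for the actual username check, one way is to wrap it in a small helper. This is only a sketch: is_valid_username and USERNAME_RE are names I made up, and it assumes callers pass unicode strings (u'...' in Python 2.7, plain str in Python 3).

```python
# -*- coding: utf-8 -*-
import re

# Compiled once; re.UNICODE makes \w behave the same in Python 2.7 and 3.x.
USERNAME_RE = re.compile(r'^[\w.@+-]+\Z', re.UNICODE)


def is_valid_username(name):
    """Return True if name consists only of word characters and . @ + -"""
    return USERNAME_RE.match(name) is not None


print(is_valid_username(u'名字'))                   # True
print(is_valid_username(u'user.name+tag@example'))  # True
print(is_valid_username(u'bad name'))               # False, contains a space
print(is_valid_username(u'name\n'))                 # False, \Z rejects the trailing newline
print(is_valid_username(u''))                       # False, + needs at least one character
```

Note that \Z (rather than $) is what keeps a trailing newline from slipping through.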
