I would like to decode MACCYRILLIC code, for example "%EE%F2_%E4%EE%E1%F0%E0_%E4%EE%E1%F0%E0_%ED%E5_%E8%F9%F3%F2". How can I do it using Python2?
phrase.decode("MACCYRILLIC") has no effect.
I would like to decode MACCYRILLIC code, for example "%EE%F2_%E4%EE%E1%F0%E0_%E4%EE%E1%F0%E0_%ED%E5_%E8%F9%F3%F2". How can I do it using Python2?
phrase.decode("MACCYRILLIC") has no effect.
 
    
    urllib — Open arbitrary resources by URL
urllib.unquote(string)Replace
%xxescapes by their single-character equivalent.Example:
unquote('/%7Econnolly/')yields'/~connolly/'.
==> py -2
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import urllib
>>> MACCYRILLIC = "%EE%F2_%E4%EE%E1%F0%E0_%E4%EE%E1%F0%E0_%ED%E5_%E8%F9%F3%F2"
>>> print urllib.unquote(MACCYRILLIC).decode('cp1251')
от_добра_добра_не_ищут
>>>
Edit. Another approach (step by step):
==> py -2
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> import codecs
>>> MACCYRILLIC = '%EE%F2_%E4%EE%E1%F0%E0_%E4%EE%E1%F0%E0_%ED%E5_%E8%F9%F3%F2'
>>> #print
... x = urllib.unquote(MACCYRILLIC) #.decode('cp1251')
>>> print repr(x)
'\xee\xf2_\xe4\xee\xe1\xf0\xe0_\xe4\xee\xe1\xf0\xe0_\xed\xe5_\xe8\xf9\xf3\xf2'
>>> y = codecs.decode(x, 'cp1251')
>>> print y
от_добра_добра_не_ищут
>>>
All above would work on the following requirement:
>>> import sys
>>> sys.stdout.encoding
'utf-8'
>>> print sys.stdout.encoding
utf-8
>>>
Unfortunately, the example in http://rextester.com/XAX79891 shows sys.stdout.encoding None (and I don't know a way of changing it to utf-8). Read more in Lennart Regebro's answer to Stdout encoding in python:
A better generic solution under Python 2 is to treat
stdoutas what it is: An 8-bit interface. And that means that anything you print to tostdoutshould be 8-bit. You get the error when you are trying to print Unicode data, because print will then try to encode the Unicode data to the encoding ofstdout, and if it's None it will assumeASCII, and fail, unless you setPYTHONIOENCODING.
