I am using the Python interpreter in a Windows 7 terminal.
I am trying to wrap my head around Unicode and encodings.
I type:
>>> s='ë'
>>> s
'\x89'
>>> u=u'ë'
>>> u
u'\xeb'
Question 1: Why is the encoding used in the string s different from the one used in the unicode string u?
I continue, and type:
>>> us=unicode(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0x89 in position 0: ordinal not in range(128)
>>> us=unicode(s, 'latin-1')
>>> us
u'\x89'
Question 2: I tried the latin-1 encoding on a guess to turn the string into a Unicode string (actually, I tried a bunch of other ones first, including utf-8). How can I find out which encoding the terminal used to encode my string?
Question 3: How can I make the terminal print ë as ë instead of '\x89' or u'\xeb'? Hmm, stupid me: print(s) does the job.
I already looked at this related SO question, but found no clues there: Set Python terminal encoding on Windows
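To make Question 2 concrete, here is a minimal sketch of the kind of check I have in mind. It asks the interpreter which encoding it reports for the console, then tests which of a few code pages decode the byte 0x89 to ë (U+00EB). The candidate list is only my assumption of likely Windows code pages, not something taken from the session above.

```python
import sys

# Encodings the interpreter reports for the console streams.
# These may be None when input or output is redirected.
print(getattr(sys.stdin, "encoding", None))
print(getattr(sys.stdout, "encoding", None))

raw = b'\x89'  # the single byte shown by repr(s) in the session above

# Assumed candidates: two OEM code pages, the Windows ANSI code page, latin-1.
candidates = ['cp437', 'cp850', 'cp1252', 'latin-1']

# Keep only the encodings in which this byte decodes to U+00EB.
matches = [name for name in candidates
           if raw.decode(name, 'replace') == u'\xeb']
print(matches)
```

Of these candidates, only cp437 and cp850 map 0x89 to ë, which would also explain why decoding with latin-1 gave u'\x89' rather than u'\xeb'.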