I am very confused with the way charset and encoding work in SQLAlchemy. I understand (and have read) the difference between charsets and encodings, and I have a good picture of the history of encodings.
I have a table in MySQL in latin1_swedish_ci (Why? Possible because of this). I need to create a pandas dataframe in which I get the proper characters (and not weird symbols). Initially, this was in the code:
connect_engine = create_engine('mysql://user:password@1.1.1.1/db')
sql_query = "select * from table1"
df = pandas.read_sql(sql_query, connect_engine)
We started having troubles with the Š character (corresponding to the u'\u0160' unicode, but instead we get '\x8a'). I expected this to work:
connect_engine = create_engine('mysql://user:password@1.1.1.1/db', encoding='utf8') 
but, I continue getting '\x8a', which, I realized, makes sense given that the default of the encoding parameter is utf8. So, then, I tried encoding='latin1' to tackle the problem:
connect_engine = create_engine('mysql://user:password@1.1.1.1/db', encoding='latin1')
but, I still get the same '\x8a'. To be clear, in both cases (encoding='utf8' and encoding='latin1'), I can do mystring.decode('latin1') but not  mystring.decode('utf8').
And then, I rediscovered the charset parameter in the connection string, i.e. 'mysql://user:password@1.1.1.1/db?charset=latin1'. And after trying all possible combinations of charset and encoding, I found that this one work:
connect_engine = create_engine('mysql://user:password@1.1.1.1/db?charset=utf8')
I would appreciate if somebody can explain me how to correctly use the charset in the connection string, and the encoding parameter in the create_engine?
 
     
     
     
    