I observed some strange behaviour of the column names after using the to_flat_index() method.
Starting with a DataFrame that has MultiIndex columns:
import pandas as pd
a = [0, .25, .5, .75]
b = [1, 2, 3, 4]
c = [5, 6, 7, 8]
d = [1, 2, 3, 5]
df = pd.DataFrame(data={('a', 'a'): a, ('b', 'b'): b, ('c', 'c'): c, ('d', 'd'): d})
This produces the following DataFrame:
      a  b  c  d
      a  b  c  d
0  0.00  1  5  1
1  0.25  2  6  2
2  0.50  3  7  3
3  0.75  4  8  5
I then use .to_flat_index() to flatten the column index:
df.columns = df.columns.to_flat_index()
This produces the following DataFrame:
   (a, a)  (b, b)  (c, c)  (d, d)
0    0.00       1       5       1
1    0.25       2       6       2
2    0.50       3       7       3
3    0.75       4       8       5
If I try to select a column using df['(a, a)'], I get a KeyError. If I try to clean up the column names using df.columns = df.columns.str.lower().str.rstrip() (or any other .str method), I get NaN instead of the column names:
    NaN  NaN  NaN  NaN
0  0.00    1    5    1
1  0.25    2    6    2
2  0.50    3    7    3
3  0.75    4    8    5
What am I doing wrong? How can I select a column after using to_flat_index()?
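
For reference, here is a minimal sketch of what appears to be going on. It assumes that to_flat_index() leaves each column label as a Python tuple, so the label is the tuple ('a', 'a') rather than the string '(a, a)'. That would explain both the KeyError and the .str methods having no strings to work on. The names flat, joined, col_a and col_b are illustrative only, and joining the levels with an underscore is just one possible way to get string labels.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        ("a", "a"): [0, 0.25, 0.5, 0.75],
        ("b", "b"): [1, 2, 3, 4],
        ("c", "c"): [5, 6, 7, 8],
        ("d", "d"): [1, 2, 3, 5],
    }
)

# Flatten the MultiIndex columns on a copy, as in the question.
flat = df.copy()
flat.columns = flat.columns.to_flat_index()

# Each flattened label is a tuple, not the string "(a, a)".
first_label = flat.columns[0]
label_type = type(first_label)

# Selecting with the tuple itself works.
col_a = flat[("a", "a")]

# Alternative (an assumption about what is wanted): build real string
# labels by joining the levels, after which .str methods apply.
joined = df.copy()
joined.columns = ["_".join(levels) for levels in joined.columns]
joined.columns = joined.columns.str.upper()
col_b = joined["B_B"]
```

With string labels such as "a_a" in place, the usual df['a_a'] selection and the .str accessor should both behave as expected.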
 
     
    