I can find the n largest values in each row of a numpy array (link) but doing so loses the column information which is what I want. Say I have some data:
import pandas as pd
import numpy as np
np.random.seed(42)
data = np.random.rand(5,5)
data = pd.DataFrame(data, columns = list('abcde'))
data
          a         b         c         d         e
0  0.374540  0.950714  0.731994  0.598658  0.156019
1  0.155995  0.058084  0.866176  0.601115  0.708073
2  0.020584  0.969910  0.832443  0.212339  0.181825
3  0.183405  0.304242  0.524756  0.431945  0.291229
4  0.611853  0.139494  0.292145  0.366362  0.456070
I want the names of the largest contributors in each row. So for n = 2 the output would be:
0  b  c
1  c  e
2  b  c
3  c  d
4  a  e
I can do it by looping over the dataframe but that would be inefficient. Is there a more pythonic way?