I have this dataframe:
import numpy as np
import pandas as pd
df = pd.DataFrame([['137', 'earn'], ['158', 'earn'], ['144', 'ship'], ['111', 'trade'], ['132', 'trade']], columns=['value', 'topic'])
print(df)
  value  topic
0   137   earn
1   158   earn
2   144   ship
3   111  trade
4   132  trade
And I want an additional numeric column like this:
  value  topic  topic_id
0   137   earn         0
1   158   earn         0
2   144   ship         1
3   111  trade         2
4   132  trade         2
So basically I want to generate a column which encodes a string column to a numeric value. I implemented this solution:
topics_dict = {}
topics = np.unique(df['topic']).tolist()
for i in range(len(topics)):
    topics_dict[topics[i]] = i
df['topic_id'] = [topics_dict[l] for l in df['topic']]
However, I am quite sure there is a more elegant, idiomatic pandas way to solve this, but I couldn't find anything on Google or SO. I read about pandas' get_dummies, but it creates a separate indicator column for each distinct value in the original column rather than a single numeric column.
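To illustrate why get_dummies is not what I am after, here is a minimal sketch using the same data as above. The variable name `dummies` is only for illustration, and the exact dtype of the indicator columns may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    [['137', 'earn'], ['158', 'earn'], ['144', 'ship'],
     ['111', 'trade'], ['132', 'trade']],
    columns=['value', 'topic'],
)

# get_dummies one-hot encodes the column: one indicator column per
# distinct topic, instead of a single integer column.
dummies = pd.get_dummies(df['topic'])

print(dummies.columns.tolist())
print(dummies.shape)
```

So for three distinct topics I get three columns, with exactly one of them set in each row, whereas I want one column holding a single id per row.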
I would be thankful for any help or a pointer in the right direction!