TL;DR: What's the most concise way to encode ordered categories as numbers using a specific mapping (i.e. one that preserves the order of the categories)?
["Weak","Normal","Strong"] --> [0,1,2]
Assuming I have an ordered categorical variable, similar to the example from here:
import pandas as pd
raw_data = {'patient': [1, 1, 1, 2, 2], 
        'obs': [1, 2, 3, 1, 2], 
        'treatment': [0, 1, 0, 1, 0],
        'score': ['strong', 'weak', 'normal', 'weak', 'strong']} 
df = pd.DataFrame(raw_data, columns = ['patient', 'obs', 'treatment', 'score'])
df
   patient  obs  treatment   score
0        1    1          0  strong
1        1    2          1    weak
2        1    3          0  normal
3        2    1          1    weak
4        2    2          0  strong
I can create a function and apply it across my dataframe to get the desired conversion:
def score_to_numeric(x):
    if x=='strong':
        return 3
    if x=='normal':
        return 2
    if x=='weak':
        return 1
df['score_num'] = df['score'].apply(score_to_numeric)
df
   patient  obs  treatment   score  score_num
0        1    1          0  strong          3
1        1    2          1    weak          1
2        1    3          0  normal          2
3        2    1          1    weak          1
4        2    2          0  strong          3
My question: Is there any way I can do this inline (without having to specify a separate "score_to_numeric" function)?
Maybe using some kind of lambda or replace functionality? Alternatively, this SO article suggests that Sklearn's LabelEncoder() is pretty powerful, and by extension may somehow have a way of handling this, but I haven't figured it out...
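To make the "inline" idea concrete, here is a rough sketch of the kind of thing I have in mind. It is not a confirmed best practice, just two candidates: a plain dict passed to `Series.map`, and an ordered `pd.Categorical` whose integer codes follow the order I list. The names `score_map`, `order`, and `score_code` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'score': ['strong', 'weak', 'normal', 'weak', 'strong']})

# Hypothetical mapping dict; the 1-3 values mirror score_to_numeric above.
score_map = {'weak': 1, 'normal': 2, 'strong': 3}
df['score_num'] = df['score'].map(score_map)

# Alternative: state the category order explicitly and take the 0-based codes.
order = ['weak', 'normal', 'strong']
df['score_code'] = pd.Categorical(df['score'], categories=order, ordered=True).codes

print(df)
```

With `map`, any value missing from the dict becomes NaN; with `pd.Categorical`, any value missing from `categories` gets the code -1, so both variants assume the list of labels is complete.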
 
     
    