I have a DataFrame with 2,000 rows, and mapping its 2,000 IDs to the corresponding names takes about 5 minutes, which is far too long. I'm looking for a way to reduce the mapping time. One option I want to try is mapping each ID as I build the dictionary, so that the dictionary never holds more than one entry at a time.
Here is the process I'm using:
df_IDs=
studentID  grade  age
123        12th   18
432        11th   17
421        11th   16
I want to replace the values in the 'studentID' column with the student names, which I can look up in a mapping file (student_directory_df).
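For anyone who wants to reproduce this, the two frames could be built roughly as follows. This is only a sketch: the contents of student_directory_df are an assumption inferred from the desired output further down, and the IDs are stored as strings because the lookup code uses the .str accessor on that column.

```python
import pandas as pd

# The frame to be relabelled (values taken from the example above).
# IDs are kept as strings, since the lookup uses the .str accessor.
df_IDs = pd.DataFrame({
    "studentID": ["123", "432", "421"],
    "grade": ["12th", "11th", "11th"],
    "age": [18, 17, 16],
})

# Hypothetical stand-in for the mapping file: the real directory is not
# shown, so these rows are inferred from the desired output.
student_directory_df = pd.DataFrame({
    "studentID": ["123", "432", "421"],
    "name": ["sally", "joe", "sarah"],
})

print(df_IDs)
print(student_directory_df)
```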
I created functions that will make a dictionary and map the IDs to the names:
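import numpy as np

dicts = {}  # shared ID -> name lookup, filled in by create_dict

def search_tool_student(df, column, accession):
    # Scan the whole column for rows whose value contains `accession`
    # (str.contains does a substring/regex match, not an exact match).
    index = np.where(df[column].str.contains(accession, na=False))[0]
    if len(index) == 0:
        return ""  # no match; len("") == 0 signals this to the caller
    return df.iloc[index]

def create_dict(x):
    # One full scan of student_directory_df per ID.
    result_df = search_tool_student(student_directory_df, 'studentID', x)
    if len(result_df) == 0:
        print('bad match found: ' + x)
    else:
        dicts[x] = result_df['name'].iloc[0]
    return dicts

def map_ID(df_IDs):
    studentIDs = df_IDs['studentID']
    # Every call returns the same shared dict, so the first element
    # already holds the full mapping once map() has been consumed.
    new_dict = list(map(create_dict, studentIDs))[0]
    df_IDs['studentID'] = df_IDs['studentID'].map(new_dict).fillna(df_IDs['studentID'])
    return df_IDs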
Desired output:
studentID    grade  age
sally        12th   18
joe          11th   17
sarah        11th   16
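One direction that might cut the time is to build the lookup a single time from the directory and map the whole column in one pass, instead of scanning the directory once per ID. The sketch below assumes the IDs match exactly (the functions above use a substring match via str.contains, so results could differ if partial matches are intended). The function name map_ids_once and the sample directory rows are hypothetical.

```python
import pandas as pd

def map_ids_once(df_ids, directory_df):
    """Replace studentID with name using one lookup built up front.

    Assumes exact ID matches; unmatched IDs are reported and left as-is.
    """
    # Series.map needs a unique index, so keep the first row per ID.
    id_to_name = (
        directory_df.drop_duplicates(subset="studentID")
        .set_index("studentID")["name"]
    )
    mapped = df_ids["studentID"].map(id_to_name)

    # Report IDs that have no entry in the directory.
    for bad_id in df_ids.loc[mapped.isna(), "studentID"]:
        print("bad match found: " + bad_id)

    out = df_ids.copy()
    out["studentID"] = mapped.fillna(df_ids["studentID"])
    return out

# Hypothetical sample data mirroring the example in the question.
df_IDs = pd.DataFrame({
    "studentID": ["123", "432", "421"],
    "grade": ["12th", "11th", "11th"],
    "age": [18, 17, 16],
})
student_directory_df = pd.DataFrame({
    "studentID": ["123", "432", "421"],
    "name": ["sally", "joe", "sarah"],
})

result = map_ids_once(df_IDs, student_directory_df)
print(result)
```

The lookup here is a hash-based index, so each ID costs roughly constant time rather than a full scan of the directory.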
 
    