I have a large pandas DataFrame of emails (90,000 rows) and I want to create a new DataFrame in which the emails are grouped by subject. For example, if three emails have the subject 'hello', the new DataFrame would have one row with 'hello' in the subject column and a list of the three corresponding email IDs in the other column. So far I have:
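index = 0  # next free row in bindf
for i in range(df.shape[0]):
    found = False
    # scan every row of bindf for a matching subject
    for x in range(bindf.shape[0]):
        if df['Subject'][i] == bindf['Subject'][x]:
            bindf['emailID'][x].append(df['Message-ID'][i])
            found = True
    if not found:
        # first time this subject is seen: start a new row for it
        bindf.iloc[index] = [df['Subject'][i], df['Message-ID'][i]]
        # wrap the ID string in a one-element list (maxsplit=0 performs no splits)
        bindf['emailID'][index] = bindf['emailID'][index].split(' ', maxsplit=0)
        index += 1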
This works, but it is extremely slow: running it on the full DataFrame would take multiple hours.
NOTE: every email has a subject. In the original DataFrame the email ID is a string; in the new DataFrame I want each ID to be an element of a list.
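For reference, the result I am after can probably be expressed with a single `groupby` instead of the nested loops. The following is only a minimal sketch on made-up sample data (the four rows and their IDs are hypothetical); it assumes the column names `Subject` and `Message-ID` from the loop above and uses `emailID` as the name of the list column:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the loop above.
df = pd.DataFrame({
    "Subject": ["hello", "meeting", "hello", "hello"],
    "Message-ID": ["id1", "id2", "id3", "id4"],
})

# Group rows by subject and collect each group's Message-IDs into a list.
# sort=False keeps the subjects in order of first appearance.
bindf = (
    df.groupby("Subject", sort=False)["Message-ID"]
      .agg(list)
      .reset_index()
      .rename(columns={"Message-ID": "emailID"})
)

print(bindf)
```

Each row of `bindf` then holds one subject and a list of the IDs of all emails with that subject, which avoids rescanning `bindf` for every email.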
 
    