I am working with a pandas DataFrame whose Group column holds either a single item or several comma-separated items:
| Name | Group | Average Age | 
|---|---|---|
| Mary | A, D, T, F | 10 | 
| Lukas | A, D, T, F | 20 | 
| John | A, D, T, F | 5 | 
| Mary | B, G, Y, Z | 15 | 
| Lukas | B, G, Y, Z | 25 | 
| John | B, G, Y, Z | 50 | 
| Mary | K | 12 | 
| Lukas | L | 23 | 
| John | M | 56 | 
I have a group list with:
group_list = ['D', 'Y', 'K', 'L', 'M']
I want the Average Age value for every name across the groups in this list, but first I'd like to split the Group column so that each row keeps only the item that appears in group_list.
I've tried:
if ',' in df['Group']:
    new_df['Group'] = df['Group'].str.split(",").apply(lambda x: list(set(x).intersection(set(group_list)))[0])
else:
    new_df['Group'] = df['Group']
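
A note on the `if` check in this attempt: as far as I understand, `in` applied to a pandas Series tests the index labels rather than the values, so `',' in df['Group']` never looks at the strings themselves. A minimal sketch of the difference, using a hypothetical two-row Series `groups` as a stand-in for the real column:

```python
import pandas as pd

# Hypothetical stand-in for df['Group'], with the default integer index
groups = pd.Series(["A, D, T, F", "K"])

# `in` on a Series checks the index labels (0 and 1 here), not the values
comma_in_series = "," in groups

# An element-wise check returns one boolean per row instead
has_comma = groups.str.contains(",", regex=False)
```

Here `comma_in_series` is `False` even though the first value contains a comma, while `has_comma` flags the rows individually.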
I also tried:
 new_df['Group'] = df['Group'].str.split(",").apply(lambda x: [list(set(x).intersection(set(group_list)))[0]] for ',' in df['Group'] else df['Group'])
But I am not able to run either attempt; the kernel always crashes.
Does anyone know how to solve this?
Thanks!
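
One possible approach, as a minimal sketch rather than a definitive answer. It rebuilds the table above, then uses a hypothetical helper `pick_group` to keep the first item of each cell that appears in `group_list`. Two assumptions are baked in: the items are separated by a comma followed by a space (so each piece is stripped before comparison, otherwise `' D'` would not match `'D'`), and "the Average Age value for all names" means the mean per name.

```python
import pandas as pd

# Rebuild the table from the question
df = pd.DataFrame({
    "Name": ["Mary", "Lukas", "John"] * 3,
    "Group": ["A, D, T, F"] * 3 + ["B, G, Y, Z"] * 3 + ["K", "L", "M"],
    "Average Age": [10, 20, 5, 15, 25, 50, 12, 23, 56],
})
group_list = ["D", "Y", "K", "L", "M"]


def pick_group(cell):
    """Hypothetical helper: first item of a comma-separated cell found in group_list, else None."""
    # Strip whitespace so "A, D" yields "A" and "D" rather than "A" and " D"
    items = [item.strip() for item in cell.split(",")]
    matches = [item for item in items if item in group_list]
    # Return None instead of indexing an empty list, which would raise IndexError
    return matches[0] if matches else None


new_df = df.copy()
new_df["Group"] = df["Group"].apply(pick_group)

# Assumption: the wanted value is the mean Average Age per name over the matched groups
mean_age = (
    new_df.dropna(subset=["Group"])
    .groupby("Name")["Average Age"]
    .mean()
)
```

Single-item cells such as `K` pass through `pick_group` unchanged, so no separate branch for rows without a comma is needed.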
