I have a DataFrame that looks like the one below, where col4 holds a unique ID. There are posts on concatenating strings, but I need to concatenate integers, a case those posts do not cover, and my attempt to convert them with str(int) throws an error.

| col1 | col2 | col3 | col4 | col5 | 
|---|---|---|---|---|
| 1999 | ABC | ggg | 1 | kogyk | 
| 1999 | ABC | ggg | 2 | hfu | 
| 1989 | CAT | ppp | 3 | gl | 
| 1999 | ABC | uyt | 4 | klyif | 
| 1989 | CAT | ppp | 5 | gil | 

I want to merge the values of col4 for rows where col1, col2 and col3 match, and add a count of the merged rows. The output must look like this:

| col1 | col2 | col3 | col4 | count | 
|---|---|---|---|---|
| 1999 | ABC | ggg | 1,2 | 2 | 
| 1989 | CAT | ppp | 3,5 | 2 | 
| 1999 | ABC | uyt | 4 | 1 | 

I got the required output with: `df.groupby(['col1', 'col2', 'col3']).agg(col4=('col4', lambda s: ','.join([str(v) for v in s])), count=('col4', 'size')).reset_index()`. `str.join` only accepts strings, so each integer has to be converted with `str()` individually before joining.
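
For anyone who wants to try this, here is a minimal, self-contained sketch of the approach above. The DataFrame is rebuilt from the sample table in the question. Two things are my own assumptions rather than part of the original one-liner: `sort=False`, which keeps the groups in order of first appearance so the rows come out in the same order as the desired table, and `s.astype(str)`, which is just another way to convert the integers to strings before joining.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "col1": [1999, 1999, 1989, 1999, 1989],
    "col2": ["ABC", "ABC", "CAT", "ABC", "CAT"],
    "col3": ["ggg", "ggg", "ppp", "uyt", "ppp"],
    "col4": [1, 2, 3, 4, 5],
    "col5": ["kogyk", "hfu", "gl", "klyif", "gil"],
})

result = (
    # sort=False (an assumption, not in the original one-liner) keeps
    # groups in order of first appearance instead of sorting the keys.
    df.groupby(["col1", "col2", "col3"], sort=False)
      .agg(
          # Convert the integer IDs to strings, then join them with commas.
          col4=("col4", lambda s: ",".join(s.astype(str))),
          # Number of rows in each group.
          count=("col4", "size"),
      )
      .reset_index()
)

print(result)
```

Without `sort=False`, `groupby` sorts by the group keys, so the 1989 row would come first. The joined values and counts are the same either way.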
