This is response from one sql query in pandas. I want to concatenate, "Label column", together based on Issue and client. I tried Group By, but is only works with integer values. Any idea, how I can do this? Pandas based solution should also be fine.
I tried .groupby, in pandas too, command and output below, it only gives me a subset of needed dataframe.
Is it possible to update the label column against each Issue in first dataframe and removed duplicates and get output as expected below?
SQL Version is:
Microsoft SQL Server 2014
Output:
| Issue | Subject | type | Team | Sub Team | Client | Priority | CreatedOn | Label | BuiltOn | CreatedBy | Status | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | ABCABC | Bug | Develop | Automation | Andy | 0 | 2021-01-11 00:00:00 | Enhancement | None | John | InProgress | 
| 1 | 2 | DEFDEF | Bug | Develop | Automation | Judy | 0 | 2021-01-10 00:00:00 | Feature | None | Andre | New | 
| 2 | 3 | HIGHIG | Bug | Develop | Testing123 | Cathy | 2 | 2021-02-11 00:00:00 | Feature | None | Keith | New | 
| 3 | 3 | HIGHIG | Bug | Develop | Testing123 | Cathy | 2 | 2021-02-11 00:00:00 | Internal | None | Keith | New | 
| 4 | 4 | XYZXYZ | Bug | Develop | Automation | Jack | 1 | 2021-05-11 00:00:00 | Enhancement | None | Maya | Analysis | 
| 5 | 4 | XYZXYZ | Bug | Develop | Automation | Jack | 1 | 2021-05-11 00:00:00 | Internal | None | Maya | Analysis | 
| 6 | 4 | XYZXYZ | Bug | Develop | Automation | Larry | 1 | 2021-05-11 00:00:00 | Enhancement | None | Maya | Analysis | 
| 7 | 4 | XYZXYZ | Bug | Develop | Automation | Larry | 1 | 2021-05-11 00:00:00 | Internal | None | Maya | Analysis | 
| 8 | 4 | XYZXYZ | Bug | Develop | Automation | Colin | 1 | 2021-05-11 00:00:00 | Enhancement | None | Maya | Analysis | 
| 9 | 4 | XYZXYZ | Bug | Develop | Automation | Colin | 1 | 2021-05-11 00:00:00 | Internal | None | Maya | Analysis | 
| 10 | 4 | XYZXYZ | Bug | Develop | Automation | Nitin | 1 | 2021-05-11 00:00:00 | Enhancement | None | Maya | Analysis | 
| 11 | 4 | XYZXYZ | Bug | Develop | Automation | Nitin | 1 | 2021-05-11 00:00:00 | Internal | None | Maya | Analysis | 
| 12 | 4 | XYZXYZ | Bug | Develop | Automation | Lisa | 1 | 2021-05-11 00:00:00 | Enhancement | None | Maya | Analysis | 
| 13 | 4 | XYZXYZ | Bug | Develop | Automation | Lisa | 1 | 2021-05-11 00:00:00 | Internal | None | Maya | Analysis | 
Expected (Note the label column):
| Issue | Subject | Issue_type | Team | Sub Team | Client | Priority | CreatedOn | Label | BuiltOn | CreatedBy | Status | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | ABC | Bug | Develop | Automation | Andy | 0 | 2021-01-11 00:00:00 | Enhancement | None | John | InProgress | 
| 1 | 2 | DEF | Bug | Develop | Automation | Judy | 0 | 2021-01-10 00:00:00 | Feature | None | Andre | New | 
| 2 | 3 | HIG | Bug | Develop | Testing | Cathy | 2 | 2021-02-11 00:00:00 | Feature, Internal | None | Keith | New | 
| 3 | 4 | XYZ | Bug | Develop | Automation | Jack | 1 | 2021-05-11 00:00:00 | Enhancement, Internal | None | Maya | Analysis | 
| 4 | 4 | XYZ | Bug | Develop | Automation | Larry | 1 | 2021-05-11 00:00:00 | Enhancement, Internal | None | Maya | Analysis | 
| 5 | 4 | XYZ | Bug | Develop | Automation | Colin | 1 | 2021-05-11 00:00:00 | Enhancement, Internal | None | Maya | Analysis | 
| 6 | 4 | XYZ | Bug | Develop | Automation | Nitin | 1 | 2021-05-11 00:00:00 | Enhancement, Internal | None | Maya | Analysis | 
| 7 | 4 | XYZ | Bug | Develop | Automation | Lisa | 1 | 2021-05-11 00:00:00 | Enhancement, Internal | None | Maya | Analysis | 
Update: This is they query:
SELECT I.Issue,
       I.Subject,
       I.type, 
       P.Team, 
       P.Subteam,
       CR.Client,
       I.Priority,
       I.CreatedOn,
       L.Label,
       I.BuiltOn,
       I.CreatedBy,
       I.Status
 FROM master.IssueRequests AS I 
 JOIN master.Participants AS P 
   ON P.Issue = I.Issue 
 JOIN master.ClientRecords AS CR 
   ON CR.Issue = I.Issue 
 JOIN master.IssueLabels AS L
   ON L.Issue = I.Issue
 WHERE I.Issue IN ('2652523', '2703670', '2984120')
Update2
Output of df.groupby:
df.groupby(['Issue', 'Client'])['Label'].apply(','.join).reset_index()
Output:
| Issue | Client | Label | |
|---|---|---|---|
| 0 | 1 | Andy | Enhancement | 
| 1 | 2 | Judy | Feature | 
| 2 | 3 | Cathy | Feature,Internal | 
| 3 | 4 | Colin | Enhancement,Internal | 
| 4 | 4 | Jack | Enhancement,Internal | 
| 5 | 4 | Larry | Enhancement,Internal | 
| 6 | 4 | Lisa | Enhancement,Internal | 
| 7 | 4 | Nitin | Enhancement,Internal | 
Clarification: Merging on all columns except Label will not work since in some cases, some of the other data might be "null" or different, which can cause data to be missed altogether. If the data is different in other columns, I can keep the first instance of that data.
 
    