I have a Dataframe where I want to groupby a couple conditions and then sum a few fields. When I sum the values I want to flatten the values
With an input of:
UserID    Site    First Name    Last Name    May Hours    June Hours     July Hours
6385         A          John          Doe            8             9             10
6385         A          John          Doe            3             7              8
4756         A          Mark         Mean            7             6              5
4756         B          Mark         Mean            3             2              3
and grouping by UserID and Site, I want an output of:
UserID    Site    First Name    Last Name    May Hours    June Hours     July Hours
6385         A          John          Doe           11            16             18
4756         A          Mark         Mean            7             6              5
4756         B          Mark         Mean            3             2              3
It can be assumed that besides the UserID, Site, and the Monthly Hours, all other column values are the same.
My current idea was to do:
for group_name, group_df in df.groupby(by=['UserID','Site']):
    if len(group_df) > 1: # We have rows that need to be summed
        sums = group_df[['May Hours','June Hours','July Hours']].sum()
        # Assign this sum to the first row in the grouped_df
        # Remove all following rows in grouped_df (we already got their sum)
        # Append this to a new dataframe
    else:
        # Append this to a new dataframe
While this solution may work, it is definitely not efficient but I am unsure of the best way to flatten the dataframe
 
    