I have a dataset that looks like this:
   UserID    Query     Asthma    Stroke    
   142       abc dr    0         0
   142       asthma    1         0
   142       stroke    0         1
   145       stroke    0         1
   145       pizza     0         0
There are hundreds of thousands of UserIDs, and each user submitted a variable number of queries. For further analysis, I need the sum of "Asthma" and of "Stroke" for each UserID, i.e. one row per user with the two totals. Any advice? Can you recommend resources for dealing with this type of dataset?
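To make the goal concrete, here is a minimal sketch in plain Python of the per-user totals I'm after. The function name `sum_flags_by_user` and the hard-coded `rows` list are illustrative assumptions only; the real data would be read from a file rather than typed in.

```python
from collections import defaultdict

# Sample rows copied from the table above: (UserID, Query, Asthma, Stroke).
rows = [
    (142, "abc dr", 0, 0),
    (142, "asthma", 1, 0),
    (142, "stroke", 0, 1),
    (145, "stroke", 0, 1),
    (145, "pizza", 0, 0),
]


def sum_flags_by_user(rows):
    """Return {UserID: {"Asthma": total, "Stroke": total}} (hypothetical helper)."""
    totals = defaultdict(lambda: {"Asthma": 0, "Stroke": 0})
    for user_id, _query, asthma, stroke in rows:
        # Accumulate each flag column separately for this user.
        totals[user_id]["Asthma"] += asthma
        totals[user_id]["Stroke"] += stroke
    return dict(totals)


totals = sum_flags_by_user(rows)
for user_id, counts in sorted(totals.items()):
    print(user_id, counts["Asthma"], counts["Stroke"])
```

For the five sample rows this gives Asthma = 1, Stroke = 1 for user 142 and Asthma = 0, Stroke = 1 for user 145. If the data were loaded into a pandas DataFrame `df` instead, `df.groupby("UserID")[["Asthma", "Stroke"]].sum()` should produce the same totals.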
Thank you in advance... I'm very new to this.
 
    