I have a CSV file containing infection data for hospital patients from 2007 to the present, imported into a pandas DataFrame. Each entry contains standard patient details such as age and BMI. If the patient has an infection, the entry also records the date the infection was found, which microbe it is (number coded), and whether it is gram positive, gram negative, fungi or unknown. A patient can have at most 3 different microbes. Out of 18k entries, about 400 have infections. A simplified snapshot of the dataset is shown below (standard patient details have been removed as they're not relevant to my question).
PatientID  DateOp      DateOnset   1_CODE  1_TYPE    2_CODE  2_TYPE    3_CODE  3_TYPE
1          21/08/2011  28/08/2011  276     Gram Neg
2          31/08/2011  08/10/2011  922     Fungi
3          01/12/2011  15/12/2011  113     Gram Neg  780     Gram Pos
4          01/12/2011  08/12/2011  113     Gram Neg
5          20/03/2012  24/03/2012  424     Gram Neg  276     Gram Neg
6          16/12/2011  21/12/2011  221     Gram Neg
7          23/04/2012  26/04/2012  922     Gram Pos  311     Gram Neg  780     Gram Pos
8          19/05/2012  29/05/2012  772     Gram Pos
9          08/02/2012  13/02/2012          Unknown
I want to see how many infections of each microbe type (gram positive/gram negative/fungi/unknown) have occurred in each month since 2007 (using DateOnset as the date), and plot this in a suitable chart, for example a stacked bar chart coloured by microbe type.
I thought I could do this by creating new columns in ssidf: gram_pos, gram_neg, fungi, unknown. Then I would iterate through 1_TYPE, 2_TYPE and 3_TYPE and, if there is an infection in a row, add 1 to the corresponding column(s). This is how I thought I could achieve this:
for row in ssidf['1_TYPE']:
   if ssidf['1_TYPE'][row].str.contains('Gram Pos') is True:
      ssidf.gram_pos[row] == 1
   elif ssidf['1_TYPE'][row].str.contains('Gram Neg') is True:
      ssidf.gram_neg[row] == 1
   elif ssidf['1_TYPE'][row].str.contains('Fungi') is True:
      ssidf.fungi[row] == 1
   else:
      ssidf.unknown[row] == 1
This is my expected result:
DateOnset   gram_pos  gram_neg  fungi  unknown
28/08/2011  0         1         0      0
08/10/2011  0         0         1      0
15/12/2011  1         1         0      0
08/12/2011  0         1         0      0
24/03/2012  0         2         0      0
21/12/2011  0         1         0      0
26/04/2012  2         1         0      0
29/05/2012  1         0         0      0
13/02/2012  0         0         0      1
Then I would sum the totals of gram_pos, gram_neg, fungi and unknown per month, and represent them in a stacked bar chart.
However, I'm falling at the first hurdle: the for loop. I keep getting: KeyError: 'Gram Neg'. I haven't even tried to get it to iterate across 2_TYPE and 3_TYPE yet, because the first column isn't working. Can anyone see what the problem is and suggest how to fix it?
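For reference, iterating over a pandas Series yields its values rather than its index labels, so `row` in the loop is a string such as 'Gram Neg', and using it to index the column is what raises the KeyError. Note also that `==` only compares and does not assign. The sketch below shows one possible way to get the per-patient and per-month counts without a row loop. It is only an illustration under assumptions: the small `ssidf` is rebuilt by hand from the sample rows above, empty type cells are assumed to be missing values (NaN/None), and the `names` mapping and the `long`, `counts` and `monthly` variables are hypothetical names introduced here.

```python
import pandas as pd

# Stand-in for ssidf, typed in from the sample rows (assumption: empty cells are None/NaN).
ssidf = pd.DataFrame({
    "PatientID": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "DateOnset": ["28/08/2011", "08/10/2011", "15/12/2011", "08/12/2011",
                  "24/03/2012", "21/12/2011", "26/04/2012", "29/05/2012",
                  "13/02/2012"],
    "1_TYPE": ["Gram Neg", "Fungi", "Gram Neg", "Gram Neg", "Gram Neg",
               "Gram Neg", "Gram Pos", "Gram Pos", "Unknown"],
    "2_TYPE": [None, None, "Gram Pos", None, "Gram Neg", None, "Gram Neg", None, None],
    "3_TYPE": [None, None, None, None, None, None, "Gram Pos", None, None],
})
# Dates are day-first, so parse them with an explicit format.
ssidf["DateOnset"] = pd.to_datetime(ssidf["DateOnset"], format="%d/%m/%Y")

# Hypothetical mapping from the labels in the data to the desired column names.
names = {"Gram Pos": "gram_pos", "Gram Neg": "gram_neg",
         "Fungi": "fungi", "Unknown": "unknown"}

# Long format: one row per (patient, microbe slot); empty slots are dropped.
long = ssidf.melt(id_vars=["PatientID", "DateOnset"],
                  value_vars=["1_TYPE", "2_TYPE", "3_TYPE"],
                  value_name="type").dropna(subset=["type"])
long["type"] = long["type"].map(names)

# Per-patient counts of each type (the "expected result" layout).
counts = (long.groupby(["PatientID", "type"]).size()
              .unstack(fill_value=0)
              .reindex(columns=list(names.values()), fill_value=0))

# Monthly totals, indexed by calendar month of DateOnset.
monthly = (long.groupby([long["DateOnset"].dt.to_period("M"), "type"]).size()
               .unstack(fill_value=0)
               .reindex(columns=list(names.values()), fill_value=0))
print(monthly)
```

With matplotlib installed, `monthly.plot(kind="bar", stacked=True)` should then draw the stacked bar chart. Months without any infection do not appear in `monthly`; if they should show as empty bars, the index would need to be reindexed over the full range of months.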
 
    