I have a data frame with 99 columns dx1-dx99, 99 columns px1-px99, and one column named mort:
dx1 dx2 dx3 .   dx99    px1 px2 .   px99    mort
E10 I12 E10 N18 R18     0FY 0TY 0DN 0DN      1
E10 I12 I31 E44 N17     0FY 0TY 0FT 5A1      0
E10 I12 N17 T86 T86     0TY 0FY 0DT          0
I12 E10 N18 A04         0TY 0FY 0DT 0T7      1
E10 I12 E10 N18 Z99     0TY 0FY              0
E10 N18 Z76             0FY 0TY 04Q 0D1      1
E10 N18 Z99 N25 E78     0TY 0FY 0WP          0
I want to keep the values in dx1-dx99 and px1-px99 only in rows where mort = 1, and set them to zero in all other rows. After that, I want to count how often each remaining code occurs.
I tried this:
dx = df.loc[:,'dx1':'dx99']
X1pr = df.loc[:,'px1':'px99']
dx = dx.fillna(0)    
X1p = X1pr.fillna(0)
death = df.loc[:,'mort']
df1 = pd.concat([dx, X1p, death], axis=1)
N = len(df1.columns)
keep = df1.iloc[:,-(N-1):].isin(["1"]).values
df1.iloc[:,:N-1] = df1.iloc[:,:N-1].where(keep, 0)
X1d = df1[df1.columns[0:N-1]]
mat = X1d.as_matrix(columns=None)
values, counts = np.unique(mat.astype(str), return_counts=True)
matrix = []
for v,c in zip(values, counts):
    matrix.append( [v,c])
icd9_counted_d = pd.DataFrame(matrix, columns = ['ICD_code', 'DEATHS'])
I am getting nothing in the DEATHS column. Any idea why?
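A likely cause, offered as a sketch rather than a confirmed diagnosis: `df1.iloc[:,-(N-1):]` selects the last N-1 columns (dx2 through mort), not the mort column alone, and `isin(["1"])` only matches the string "1", so an integer mort never matches. The mask is then False almost everywhere and every code is replaced by 0. The example below builds the row mask from mort directly and counts the codes in the matching rows. The small frame, the variable names `code_cols`, `died`, `counts` and `icd_counted`, and the assumption that mort holds only 0/1 values (as integers or as strings) are mine, not from the original data. For counting, selecting the mort = 1 rows has the same effect as zeroing the other rows, so the zeroing step is skipped here.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame (3 dx and 3 px columns instead of 99 each).
df = pd.DataFrame({
    "dx1": ["E10", "E10", "I12"],
    "dx2": ["I12", "I12", "E10"],
    "dx3": ["E10", "I31", np.nan],
    "px1": ["0FY", "0FY", "0TY"],
    "px2": ["0TY", "0TY", "0FY"],
    "px3": ["0DN", np.nan, "0DT"],
    "mort": [1, 0, 1],
})

# Every column except mort holds a code.
code_cols = [c for c in df.columns if c != "mort"]

# Row mask built from mort itself; astype(int) also handles "1"/"0" strings.
died = df["mort"].astype(int).eq(1)

# Flatten the codes of the mort == 1 rows, drop empty cells, count occurrences.
values = df.loc[died, code_cols].to_numpy().ravel()
counts = pd.Series(values).dropna().value_counts()

# Same two-column layout as the frame built in the question.
icd_counted = counts.rename_axis("ICD_code").reset_index(name="DEATHS")
print(icd_counted)
```

In this toy frame, E10 is counted three times and I31 does not appear at all, because it only occurs in a row with mort = 0. On the real data the same lines should work with `code_cols` covering dx1-dx99 and px1-px99.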
 
    