Here is my DataFrame:
Tipo  Número  renal  dialisis
CC    260037  NULL   NULL
CC    260037  NULL   AAB
CC    165182  NULL   NULL
CC    165182  NULL   CCDE
CC    260039  NULL   NULL
CC    49740   XYZ    NULL
CC    260041  NULL   NULL
CC    259653  NULL   NULL
I want to determine, for each row in the DataFrame, whether the values in renal and dialisis are NULL or not. Rows where they are not NULL should be 1 in the survived list; rows where both are NULL should be 0.
My code is:
import pandas as pd

survival = pd.read_table('Sophia_Personalizado bien.txt', encoding='utf-16')
survived = []
numero_paciente = []
lista_pacienytes = survival['Número'].values.tolist()
lista_pacienytes = sorted(set(lista_pacienytes))
for e in lista_pacienytes:
    survival_i = survival.loc[survival['Número'] == e]
    renal = set(survival_i['renal'].values.tolist())
    dialisis = set(survival_i["dialisis"].values.tolist())
    print('dialisis', dialisis)
    print('renal', renal)
    if renal == 'nan' or dialisis == 'nan':
        survived.append(0)
        numero_paciente.append(e)
    else:
        survived.append(1)
        numero_paciente.append(e)
e = pd.DataFrame({'numero': numero_paciente,
                  'survival': survived})
Surprisingly, every entry in the result is 1, but as the DataFrame above shows, that is not correct. Also, the output of
print('dialisis',dialisis)
print('renal',renal)
is:
dialisis {nan, nan}
renal {nan}
which I expected to be a single nan, since I use set().
What am I missing? Thanks
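
A possible direction, as a minimal sketch rather than a definitive fix: `renal` and `dialisis` are sets, so comparing them to the string 'nan' is always False, and NaN does not compare equal to itself, so missing values are usually tested with `notna()` or `isna()` instead of `==`. The sketch below rebuilds the sample table by hand, assuming the NULL entries are read as NaN (which the printed output suggests), and gives each Número a 1 if any of its rows has a value in renal or dialisis. The names `row_flag` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table above; NULL is represented as NaN (assumption).
survival = pd.DataFrame({
    "Tipo": ["CC"] * 8,
    "Número": [260037, 260037, 165182, 165182, 260039, 49740, 260041, 259653],
    "renal": [np.nan, np.nan, np.nan, np.nan, np.nan, "XYZ", np.nan, np.nan],
    "dialisis": [np.nan, "AAB", np.nan, "CCDE", np.nan, np.nan, np.nan, np.nan],
})

# Per row: 1 if renal or dialisis holds a value, 0 if both are missing.
row_flag = survival[["renal", "dialisis"]].notna().any(axis=1).astype(int)

# Per patient: 1 if any of that patient's rows holds a value, else 0.
result = (
    row_flag.groupby(survival["Número"]).max()
    .rename("survival")
    .rename_axis("numero")
    .reset_index()
)
print(result)
```

If a per-row answer is what is needed instead, `row_flag` already holds it and the groupby step can be dropped.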