In a fictional patients dataset one might encounter the following table:
pd.DataFrame({
    "Patients": ["Luke", "Nigel", "Sarah"],
    "Disease": ["Cooties", "Dragon Pox", "Greyscale & Cooties"]
})
Which renders the following dataset:
Now, assuming that rows with multiple illnesses all follow the same pattern (values separated by a character, in this case an &) and that a complete list, diseases, of the illnesses exists, I have yet to find a simple way to apply the pandas.get_dummies one-hot encoder to this kind of column and obtain a binary vector for each patient.
How can I obtain, in the simplest possible manner, the following binary vectorization from the initial DataFrame?
pd.DataFrame({
    "Patients": ["Luke", "Nigel", "Sarah"],
    "Cooties": [1, 0, 1],
    "Dragon Pox": [0, 1, 0],
    "Greyscale": [0, 0, 1]
})
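For reference, one possible way to get this shape is the string accessor's own `Series.str.get_dummies`, which splits each cell on a separator before encoding. This is only a minimal sketch under some assumptions: the separator is always exactly " & " (with the surrounding spaces), and `diseases` stands in for the complete list mentioned above (its contents here are made up to match the example).

```python
import pandas as pd

df = pd.DataFrame({
    "Patients": ["Luke", "Nigel", "Sarah"],
    "Disease": ["Cooties", "Dragon Pox", "Greyscale & Cooties"],
})

# Assumed stand-in for the complete list of illnesses.
diseases = ["Cooties", "Dragon Pox", "Greyscale"]

# Split every cell on " & " and build one indicator column per distinct label.
dummies = df["Disease"].str.get_dummies(sep=" & ")

# Align the columns with the full list, so that illnesses absent from the
# data still get an all-zero column and the column order is predictable.
dummies = dummies.reindex(columns=diseases, fill_value=0)

# Put the patient names back next to the indicator columns.
result = pd.concat([df[["Patients"]], dummies], axis=1)
print(result)
```

If the separator is not consistent (for example "&" without spaces in some rows), the column would need to be normalised first, otherwise labels such as "Greyscale " with stray whitespace would end up as separate columns.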