I have a dataframe like the following:
In[8]: df = pd.DataFrame({'transport': ['Car;Bike;Horse','Car','Car;Bike', 'Horse;Car']})
df
Out[8]:
    transport
 0  Car;Bike;Horse
 1  Car
 2  Car;Bike
 3  Horse;Car
And I want to convert it to something like this:
In[9]: df2 = pd.DataFrame({'transport_car': [True,True,True,True],'transport_bike': [True,False,True,False], 'transport_horse': [True,False,False,True]} )
df2
Out[10]:
  transport_bike    transport_car   transport_horse
0   True                True            True
1   False               True            False
2   True                True            False
3   False               True            True
I got a solution, but it feels very 'hacked' and 'unpythonic'. (It works for my considerably small data set)
In[11]:
# get set of all possible values
new_columns = set()
for element in set(df.transport.unique()):
    for transkey in str(element).split(';'):
        new_columns.add(transkey)
print(new_columns)
# Use broadcast to initialize all columns with default value.
for col in new_columns:
    df['trans_'+str(col).lower()] = False
# Change cells appropiate to keywords
for index, row in df.iterrows():
    for key in new_columns:
        if key in row.transport:
            df.set_value(index, 'trans_'+str(key).lower(), True)
df
Out[11]:
    transport     trans_bike    trans_car   trans_horse
0   Car;Bike;Horse  True          True          True
1   Car             False         True          False
2   Car;Bike        True          True          False
3   Horse;Car       False         True          True
My goal is to use the second representation to perform some evaluation to answer questions like: "How often is car used?", "How often is car used together with horse", etc.
This and this answers suggest using pivot and eval might be the way to go, but I'm not sure.
So what would be the best way, to convert a DataFrame from first representation to the second?
 
     
    
 
     
    