Thank you in advance for reading.
I have a dataframe:
import pandas as pd
df = pd.DataFrame({'Words':[{'Sec': ['level']},{'Sec': ['levels']},{'Sec': ['level']},{'Und': ['ba ']},{'Pro': ['conf'],'ProAbb': ['cth']}],'Conflict':[None,None,None,None,'Match Conflict']})
         Conflict                                     Words
0            None                      {u'Sec': [u'level']}
1            None                     {u'Sec': [u'levels']}
2            None                      {u'Sec': [u'level']}
3            None                        {u'Und': [u'ba ']}
4  Match Conflict  {u'ProAbb': [u'cth'], u'Pro': [u'conf']}
I want to apply a routine that, for each element in 'Words', checks if Conflict = 'Match Conflict' and if so, applies some function to the value in 'Words'.
For instance, using the following placeholder function:
def func(x):
    x = x.clear()
    return x
I write:
df['Words'] = df[df['Conflict'] == 'Match Conflict']['Words'].apply(lambda x: func(x))
My expected output is:
         Conflict                                     Words
0            None                      {u'Sec': [u'level']}
1            None                     {u'Sec': [u'levels']}
2            None                      {u'Sec': [u'level']}
3            None                        {u'Und': [u'ba ']}
4  Match Conflict                                        None
Instead I get:
         Conflict Words
0            None   NaN
1            None   NaN
2            None   NaN
3            None   NaN
4  Match Conflict  None
The function is applied only to the row where Conflict = 'Match Conflict', but at the expense of the other rows (which all become NaN). I assumed the other rows would be left untouched; obviously this is not the case.
Can you explain how I might achieve my desired output without dropping all of the information in the Words column? I believe the answer may lie with np.where, but I have not been able to make this work; the attempt above was the best I could come up with.
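For reference, here is a minimal sketch of one approach that may give the desired output. It assumes the NaN values come from index alignment: the filtered Series only carries the label of the matching row, so assigning it to the whole column leaves every other label without a value. The name mask is introduced here for illustration, and func is rewritten slightly so that it returns None explicitly. Writing the result back through .loc with the same boolean mask should leave the other rows alone.

```python
import pandas as pd

df = pd.DataFrame({
    'Words': [{'Sec': ['level']}, {'Sec': ['levels']}, {'Sec': ['level']},
              {'Und': ['ba ']}, {'Pro': ['conf'], 'ProAbb': ['cth']}],
    'Conflict': [None, None, None, None, 'Match Conflict'],
})

def func(x):
    # Placeholder from the question: empty the dict in place.
    # dict.clear() returns None, so None is returned explicitly.
    x.clear()
    return None

# Boolean mask (hypothetical name) selecting only the conflicting rows.
mask = df['Conflict'] == 'Match Conflict'

# Assign only to the masked rows, so the other rows keep their values.
df.loc[mask, 'Words'] = df.loc[mask, 'Words'].apply(func)

print(df)
```

Depending on the pandas version, the cleared cell may be displayed as None or NaN; pd.isna treats both as missing.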
Any help much appreciated. Thanks.