I have a data frame like this:
>>> MCQ_DATA['q45']
10    [13, 14]
11    [13, 14]
12    [13, 12]
13    [13, 12]
14     [9, 12]
15      [2, 6]
16         [2]
17        [16]
18    [13, 12]
19    [13, 11]
So all values for column 'q45' are lists. I want to create a boolean filter for rows that contain 13, like:
>>> [MCQ_DATA['q45'] == 13]
10    True
11    True
12    True
13    True
14    False
15    False
16    False
17    False
18    True
19    True
Already tried these:
- [MCQ_DATA['q45'] == 13] returns False for everything.
- [MCQ_DATA['q45'].isin([13])] returns a TypeError. Also, this checks values against a list, whereas I want to check a nested list in a dataframe for a value. (From Use a list of values to select rows from a pandas dataframe)
- df.loc[df['q45'] == 13] returns an empty dataframe, because none of the values are 13. (From Select rows from a DataFrame based on values in a column in pandas)
- df['q45'].apply(lambda sublist: 13 in sublist) finally, this worked. But the source says this is not an efficient way to do it. (From Operating on tuples held within Pandas DataFrame column)
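For anyone who wants to reproduce this, here is a minimal sketch of the last attempt. The frame is rebuilt by hand from the ten rows printed above, and df is only a stand-in name for MCQ_DATA, so treat the setup as an assumption rather than my real data.

```python
import pandas as pd

# Assumed sample data: the ten rows of 'q45' shown in the question.
df = pd.DataFrame(
    {"q45": [[13, 14], [13, 14], [13, 12], [13, 12], [9, 12],
             [2, 6], [2], [16], [13, 12], [13, 11]]},
    index=range(10, 20),
)

# Element-wise membership test: True where the row's list contains 13.
mask = df["q45"].apply(lambda sublist: 13 in sublist)

# The boolean Series can then be used to select the matching rows.
matching = df[mask]
print(mask)
```

The mask keeps the original index, so it can be passed straight to df[...] or df.loc[...].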
I've been looking on SO for the prerequisite half-hour. If the last way is not the right way, what IS the right way?
Further testing found that this also works. I would say it is the best approach so far, since it uses the pandas framework and is easily readable:
- df['q45'].dropna().apply({13}.issubset) I'm guessing this is faster on a large scale, but I have not measured it, so maybe someone knows. (I needed the .dropna() because nan gives an error.)
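One thing to watch with .dropna() is that the resulting mask no longer covers the rows that held nan. A rough sketch of how the mask could be realigned is below; the four-row frame is made up for illustration, and the explode-based variant is a different technique from the one above that I have not benchmarked.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing value in 'q45'.
df = pd.DataFrame({"q45": [[13, 14], np.nan, [2, 6], [13]]})

# Set-based test on the non-missing rows, then realign to the full index,
# treating missing rows as "does not contain 13".
mask = (
    df["q45"].dropna().apply({13}.issubset)
    .reindex(df.index, fill_value=False)
    .astype(bool)
)

# Alternative sketch: flatten the lists, compare, and collapse back per row.
mask_explode = df["q45"].explode().eq(13).groupby(level=0).any()

print(df[mask])
```

Both masks share the index of df, so either can be used directly as a row filter.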
