Consider the following df (note that the df I am actually working with is raw data read in from a txt file, not the df below, which I created in Python just for this example):
import numpy as np
import pandas as pd
df = pd.DataFrame({'ID': ['12374', '19352', '21014', '2619', '2621', '9566', '9686', '61319', '68086', '69239', '69353', '69373', '69491', '69535', '69582', '69691', '174572', '174637', '174646', '175286', '175390'],
                   'Category': [' ', ' ', ' ', '???? ?????', '? ?', ' ', '?? ?', ' ', ' ', ' ', '?? ?', ' ', '? ?', '???? ????? ??? ', '? ?', '?? ?', 'A', 'A', 'B', 'B', 'C']})
I am trying to flag the rows where users entered question marks as the category. The line below does set the flag for every row that contains a question mark, but it also adds the Y flag to rows that are blank in that column.
df['?_Flag'] = np.where(df['Category'].str.contains(r"\?"), 'Y', '')
Do I need to use match instead?
This is the dataframe I get:
ID      Category    ?_Flag
12374                  Y
19352                  Y
21014                  Y
2619    ???? ?????     Y
2621    ? ?            Y
9566                   Y
9686    ?? ?           Y
61319                  Y
68086                  Y
69239                  Y
69353   ?? ?           Y
69373                  Y
69491   ? ?            Y
69535   ???? ????? ??? Y
69582   ? ?            Y
69691   ?? ?           Y
174572  A
174637  A
174646  B
175286  B
175390  C
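One possible explanation, offered only as a guess: if the blank cells in the txt file are read in as NaN (rather than as ' ' like in the example df), `str.contains` can return NaN for those rows, and `np.where` treats NaN as truthy. A minimal sketch of that scenario, using a small hypothetical df in place of the real file and passing `na=False` so that missing values count as "no match":

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the raw data: the blank cell is NaN, not ' '.
df = pd.DataFrame({'ID': ['12374', '2619', '174572'],
                   'Category': [np.nan, '???? ?????', 'A']})

# Default behaviour: depending on the pandas version and column dtype,
# the missing row may come back as NaN here instead of False.
mask_default = df['Category'].str.contains(r'\?')

# na=False makes missing values evaluate to False explicitly.
mask = df['Category'].str.contains(r'\?', na=False)
df['?_Flag'] = np.where(mask, 'Y', '')

print(df)
```

If that assumption holds, switching to `str.match` would probably not help on its own, since it also returns NaN for missing values by default. Passing `regex=False` with a plain `'?'` pattern is another way to avoid the escaping.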
Could it be related to the datatype?
df.info()
First_Name_E  197357 non-null object
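Since `object` dtype can hold both strings and missing values, one way to see what the "blank" cells really are is to count the NaNs and look at the raw representation of the values. This is only a sketch on a small hypothetical df; the column name `Category` is taken from the example above and may differ in the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample mixing a true missing value, a blank string and real text.
df = pd.DataFrame({'Category': [np.nan, ' ', '? ?', 'A']})

# Number of cells that are actually missing rather than blank strings.
n_missing = df['Category'].isna().sum()
print(n_missing)

# Python type of each cell (missing values show up as float).
print(df['Category'].map(type).value_counts())

# repr() makes whitespace-only strings and NaN easy to tell apart.
print(df['Category'].map(repr).unique())
```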
 
     
     
    