I have a large panel dataset in a pandas DataFrame. The example data can be found here:
import pandas as pd
df = pd.read_csv('example_data.csv')
df.head()
ID Year y DOB Year_of_death event
223725 1991 6 1975.0 2021 No
223725 1992 6 1975.0 2021 No
223725 1993 6 1975.0 2021 No
223725 1994 6 1975.0 2021 No
223725 1995 6 1975.0 2021 No
I want to change the values in the event column so that, for each ID, event is Yes only in the row where Year equals Year_of_death, and No in every other row.
For example, ID 68084329 died in 2012 but has the value Yes in every observation in the event column. I want to change it so that only the row with Year 2012 for this ID has Yes in event. The other event values for this ID should be changed to No.
df.loc[df['ID'] == '68084329']
ID Year y DOB Year_of_death event
68084329 1991 6 1942.0 2012 Yes
68084329 1992 5 1942.0 2012 Yes
68084329 1993 5 1942.0 2012 Yes
68084329 1994 6 1942.0 2012 Yes
68084329 1995 6 1942.0 2012 Yes
68084329 1996 5 1942.0 2012 Yes
68084329 1997 6 1942.0 2012 Yes
68084329 1998 5 1942.0 2012 Yes
68084329 1999 6 1942.0 2012 Yes
68084329 2000 6 1942.0 2012 Yes
68084329 2001 6 1942.0 2012 Yes
68084329 2002 5 1942.0 2012 Yes
68084329 2003 6 1942.0 2012 Yes
68084329 2004 5 1942.0 2012 Yes
68084329 2005 5 1942.0 2012 Yes
68084329 2006 6 1942.0 2012 Yes
68084329 2007 6 1942.0 2012 Yes
68084329 2008 6 1942.0 2012 Yes
68084329 2010 5 1942.0 2012 Yes
68084329 2011 5 1942.0 2012 Yes
68084329 2012 0 1942.0 2012 Yes
How do I make these changes for a large DataFrame with many IDs in accordance with the above conditions?
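One possible approach, sketched here under assumptions rather than tested on the real data: since the rule only compares two columns within the same row, a vectorized comparison with numpy.where may be enough, with no per-ID loop or groupby. The small frame below is hypothetical and only mirrors the column names shown above; it assumes Year and Year_of_death are numeric and that rows with a missing Year_of_death should end up as No.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the data above (illustrative values only)
df = pd.DataFrame({
    "ID": ["68084329"] * 3 + ["223725"] * 2,
    "Year": [2010, 2011, 2012, 1991, 1992],
    "Year_of_death": [2012, 2012, 2012, 2021, 2021],
    "event": ["Yes", "Yes", "Yes", "No", "No"],
})

# Compare the two columns row by row; a NaN in either column compares
# as False, so such rows are set to "No"
died_this_year = df["Year"] == df["Year_of_death"]

# Overwrite event for every row: Yes where the years match, No otherwise
df["event"] = np.where(died_this_year, "Yes", "No")

print(df)
```

Because the comparison is applied to whole columns at once, it should scale to a large DataFrame with many IDs. If the two year columns were read as strings rather than numbers, they would need converting (for example with pd.to_numeric) before comparing.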