I have six CSV files for six different years and I'd like to combine them into a single dataframe, with the column headers appropriately labelled.
Each raw CSV file looks like this (e.g. 2010.csv):
state,gender,population
FL,m,2161612
FL,f,2661614
TX,m,3153523
TX,f,3453523
...
And this is the structure I'd like to end up with:
state    gender    population_2010   population_2012   population_2014  .....
FL       m         2161612           xxxxxxx           xxxxxxx          .....
FL       f         2661614           xxxxxxx           xxxxxxx          .....
TX       m         3153523           xxxxxxx           xxxxxxx          .....
TX       f         3453523           xxxxxxx           xxxxxxx          .....
How can I do this efficiently? Currently I have this:
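import pandas as pd

df_2010 = pd.read_csv("2010.csv")
df_2012 = pd.read_csv("2012.csv")
...
# First merge: both frames have a "population" column, so the suffixes apply.
temp = df_2010.merge(df_2012, on=["state", "gender"], how="outer", suffixes=("_2010", "_2012"))
# Later merges: "population" no longer clashes with anything, so suffixes are
# ignored and the column has to be renamed explicitly.
temp1 = temp.merge(df_2014.rename(columns={"population": "population_2014"}), on=["state", "gender"], how="outer")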
... repeat for each remaining year to get the final dataframe
But I feel there must be a better way.
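One direction I have been sketching, in case it helps frame the question: rename the population column per year up front, then fold the outer merge over the list of frames instead of writing each merge by hand. This is only a rough sketch. The helper names `combine_years` and `load_and_combine` are my own (hypothetical), and it assumes every file is named `<year>.csv` and has exactly the columns `state`, `gender`, `population`.

```python
from functools import reduce
from pathlib import Path

import pandas as pd

KEYS = ["state", "gender"]


def combine_years(frames_by_year):
    """Outer-merge per-year frames on state/gender.

    frames_by_year: dict mapping a year to a DataFrame with the columns
    state, gender, population (assumed layout, as in the sample above).
    """
    # Rename up front so no two frames share a non-key column name.
    renamed = [
        df.rename(columns={"population": f"population_{year}"})
        for year, df in frames_by_year.items()
    ]
    # Fold the pairwise outer merge over the whole list.
    return reduce(
        lambda left, right: left.merge(right, on=KEYS, how="outer"),
        renamed,
    )


def load_and_combine(years, directory="."):
    """Read <year>.csv for each year (assumed naming) and combine them."""
    frames = {year: pd.read_csv(Path(directory) / f"{year}.csv") for year in years}
    return combine_years(frames)
```

With the files in the working directory this would be called as `load_and_combine([2010, 2012, 2014])`, extending the list with the remaining years. Is something like this reasonable, or is there a more direct pandas idiom?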
 
    