I have a data frame df and want to create multiple lags of column A.
I think I should be able to do this with the .assign() method and a dictionary comprehension.
However, with my solution below, every lag column ends up equal to the longest lag.
Also, can someone explain why I need the ** just before my dictionary comprehension?
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': np.arange(5)})
df.assign(**{'lag_' + str(i): lambda x: x['A'].shift(i) for i in range(1, 5+1)})
    A   lag_1   lag_2   lag_3   lag_4   lag_5
0   0   NaN     NaN     NaN     NaN     NaN
1   1   NaN     NaN     NaN     NaN     NaN
2   2   NaN     NaN     NaN     NaN     NaN
3   3   NaN     NaN     NaN     NaN     NaN
4   4   NaN     NaN     NaN     NaN     NaN
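The all-NaN columns above are consistent with Python's late binding of closures: each lambda looks up i only when .assign() calls it, by which point the comprehension has finished and i is 5. A minimal sketch of one common workaround, assuming that is the cause here, binds the current value of i as a default argument (the name lagged is only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.arange(5)})

# "i=i" freezes the current loop value inside each lambda,
# so every column keeps its own lag instead of the last one.
lagged = df.assign(**{
    'lag_' + str(i): lambda x, i=i: x['A'].shift(i)
    for i in range(1, 5 + 1)
})
print(lagged)
```

With this change, lag_1 through lag_4 should differ from one another, and only lag_5 stays entirely NaN because the frame has just five rows.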
Without the lambda, the dictionary comprehension on its own creates the correct lags:
{'lag_' + str(i): df['A'].shift(i) for i in range(1, 5+1)}
{'lag_1': 0    NaN
 1    0.0
 2    1.0
 3    2.0
 4    3.0
 Name: A, dtype: float64,
 'lag_2': 0    NaN
 1    NaN
 2    0.0
 3    1.0
 4    2.0
 Name: A, dtype: float64,
 'lag_3': 0    NaN
 1    NaN
 2    NaN
 3    0.0
 4    1.0
 Name: A, dtype: float64,
 'lag_4': 0    NaN
 1    NaN
 2    NaN
 3    NaN
 4    0.0
 Name: A, dtype: float64,
 'lag_5': 0   NaN
 1   NaN
 2   NaN
 3   NaN
 4   NaN
 Name: A, dtype: float64}
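On the ** question: .assign() accepts new columns only as keyword arguments (column_name=value), and ** unpacks a dictionary into exactly such keyword arguments. A small sketch of the equivalence, limited to two lags for brevity (the names explicit, lags, and unpacked are illustrative, not part of the original code):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.arange(5)})

# Writing the keyword arguments out by hand ...
explicit = df.assign(lag_1=df['A'].shift(1), lag_2=df['A'].shift(2))

# ... gives the same result as unpacking a dictionary with **.
lags = {'lag_' + str(i): df['A'].shift(i) for i in range(1, 2 + 1)}
unpacked = df.assign(**lags)

print(explicit.equals(unpacked))
```

Without the **, the dictionary would be passed as a single positional argument, which .assign() does not accept.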
 
    