I have a data set containing several kinds of financial data, including some fundamentals. For example, I have debt data dated April that actually belongs to, let's say, December. Because the figures are released at a later point in time, I have to lag them by approx. 4 months.
This is what my data looks like (illustration):
k <- c("gvkey1" , "gvkey1" , "gvkey1" , "gvkey1", "gvkey2", "gvkey2", "gvkey2", "gvkey2", "gvkey2", "gvkey3", "gvkey3")
l <- c("Date1", "Date2", "Date3", "Date4" , "Date5" , "Date6" , "Date7" , "Date8" , "Date9" , "Date10" , "Date11" )
m <- c(1:11)
y <- structure(list(a = l, b = k, c = m), .Names = c("Date", "gvkey" , "DLCQ"),
               row.names = c(NA, -11L), class = "data.frame")
     Date  gvkey DLCQ
1   Date1 gvkey1    1
2   Date2 gvkey1    2
3   Date3 gvkey1    3
4   Date4 gvkey1    4
5   Date5 gvkey2    5
6   Date6 gvkey2    6
7   Date7 gvkey2    7
8   Date8 gvkey2    8
9   Date9 gvkey2    9
10 Date10 gvkey3   10
11 Date11 gvkey3   11
This is the code I have already tried:
x <- shift(y$DLCQ, 4L)
However, this returns only a single vector and basically "drops" all the other columns (Date, gvkey):
[1] NA NA NA NA  1  2  3  4  5  6  7
The result should look something like this:
     Date  gvkey DLCQ
1   Date1 gvkey1   NA
2   Date2 gvkey1   NA
3   Date3 gvkey1   NA
4   Date4 gvkey1   NA
5   Date5 gvkey2    1
6   Date6 gvkey2    2
7   Date7 gvkey2    3
8   Date8 gvkey2    4
9   Date9 gvkey2    5
10 Date10 gvkey3    6
11 Date11 gvkey3    7
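A minimal sketch of one way to get this shape is to write the shifted vector back into the DLCQ column instead of storing it in a separate object. It assumes that shift is data.table::shift (the package is an assumption; the call above does not say where shift comes from), and it rebuilds the illustration data with data.frame for brevity.

```r
library(data.table)  # assumption: shift() is data.table::shift

# Rebuild the illustration data (same values as y above)
y <- data.frame(
  Date  = paste0("Date", 1:11),
  gvkey = rep(c("gvkey1", "gvkey2", "gvkey3"), times = c(4, 5, 2)),
  DLCQ  = 1:11,
  stringsAsFactors = FALSE
)

# Overwrite the column in place so that Date and gvkey are kept.
# shift() lags by default (type = "lag") and pads with NA (fill = NA).
y$DLCQ <- shift(y$DLCQ, 4L)
print(y)
```

This ignores the gvkey grouping entirely, so values still spill over from one gvkey into the next.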
Moreover, since my data is in long format, the lag should be applied to each gvkey separately (e.g. with by = gvkey).
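What I have in mind for the grouped version is roughly the following sketch. It assumes the data is converted to a data.table, that shift is data.table::shift, and that the rows are already sorted by date within each gvkey. The column name DLCQ_lag4 is made up for illustration. A lag of 4 rows only equals 4 months if the data is monthly without gaps, which is also an assumption.

```r
library(data.table)  # assumption: data.table is the package in use

# Illustration data as a data.table (same values as y above)
dt <- data.table(
  Date  = paste0("Date", 1:11),
  gvkey = rep(c("gvkey1", "gvkey2", "gvkey3"), times = c(4, 5, 2)),
  DLCQ  = 1:11
)

# Lag DLCQ by 4 rows within each gvkey. Rows that do not have
# 4 predecessors inside their own group become NA.
# DLCQ_lag4 is a hypothetical column name.
dt[, DLCQ_lag4 := shift(DLCQ, 4L), by = gvkey]
print(dt)
```

With the toy data only the last gvkey2 row gets a non-NA value, because no other row has four earlier observations in its own group.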
Thanks, Johannes
