I have a table as following:
id name amount year
001 A 10 2010
001 A 10 2011
001 A 12 2012
-----------------------
002 A 3 2012
002 A 4 2013
-----------------------
003 B 20 2011
003 B 20 2012
(Note two entities have the same name A but they are different, id is the unique identifier.)
I want to compute the increment in amount over the previous year, i.e. the result should look like:
id name increment year
001 A 0 2010
001 A 0 2011
001 A 2 2012
----------------------------
002 A 0 2012
002 A 1 2013
----------------------------
003 B 0 2011
003 B 0 2012
Note that the increment of the first year is considered "0".
In MSSQL, it can implemented by:
SELECT id,
name,
amount - LAG(amount, 1, amount) OVER (PARTITION BY id ORDER BY YEAR) as increment,
year
FROM table
I am trying to accomplish the task in R with data.table. I found an succinct example here:
DT[, increment := amount - shift(amount, 1), by=id]. But error was prompted: could not find function "shift".
The versions are:
- R: 3.2.0_1
- data.table: 1.9.4
The questions are:
- I found the
shiftfunction implemented on data.table's Github, why I failed to invoke the function? - I think that
byin data.table is equivalent toPARTITION BYin SQL, then what is the counterpart ofORDER BYin R? Do I have to set the key of data.table before carrying out any aggregation so the data.table is ordered?