I have a data set with sales numbers from individual items ordered on various dates. All items in a particular order share the same ID. I want to calculate order totals (i.e. sum of sales for all items in a particular order), while preserving the date associated with each order (we can assume all items part of an order share the same date). How do I sum sales with respect to ID, while preserving the date?
This question is different from others I've seen because I want to preserve and collapse the Date column while summing a different column, Sales, by ID.
Columns before: Date, ID, Sales
Columns after: Date, ID, Order.Total
The following code returns an error because it tries to sum every non-grouping column, including Date, and dates cannot be summed:
df[, lapply(.SD, sum), by = "ID"]
The following code runs, but it drops the Date column from the result altogether:
df[, lapply(.SD, sum), by = "ID", .SDcols = !"Date"]
For example, if my data set before is:
DATE  ID  SALES
1/2   01    1 
1/2   01    2
1/2   02    3
1/3   03    6
1/4   04    5
1/4   04    4
My data set after should be:
DATE  ID  ORDER.TOTAL
1/2   01    3 
1/2   02    3
1/3   03    6
1/4   04    9
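
A minimal reproducible sketch of the example above, assuming the data.table package and the column names Date, ID and Sales from the column listing (storing the dates as plain strings is an assumption made only for illustration). Since every item in an order is assumed to share one date, two possible ways to get the desired shape would be grouping by both Date and ID, or grouping by ID alone and keeping the first Date of each group:

```r
library(data.table)

# Example data copied from the "before" table. Column names follow the
# "Columns before" line; dates are kept as character strings (assumption).
df <- data.table(
  Date  = c("1/2", "1/2", "1/2", "1/3", "1/4", "1/4"),
  ID    = c("01", "01", "02", "03", "04", "04"),
  Sales = c(1, 2, 3, 6, 5, 4)
)

# Candidate 1: group by Date and ID together. Because Date is assumed to be
# constant within an ID, this yields one row per order and keeps the date.
totals <- df[, .(Order.Total = sum(Sales)), by = .(Date, ID)]

# Candidate 2: group by ID only and carry along the first Date in each group.
totals2 <- df[, .(Date = Date[1], Order.Total = sum(Sales)), by = ID]
setcolorder(totals2, c("Date", "ID", "Order.Total"))

print(totals)
```

If the shared-date assumption did not hold, the first candidate would return more than one row for an ID, while the second would silently keep only the first date, so the two are not interchangeable in general.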
 
    