I have a data.table of consumer products, each classified as 'low', 'high', or 'unknown' quality. The data are time series, and I want to smooth out some seasonality. If a product's raw classification (the output of the algorithm I used to determine quality) is 'low' in period X but was 'high' in period X-1, I reclassify that product as 'high' for period X. This is done separately within each product group.
To accomplish this, I've got something like the following:
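library(data.table)

# lag takes a column and lags it by one period,
# padding with NA of the same type
# (note: this masks stats::lag)
lag <- function(var) {
  # seq_len() copes with a single-row group, where
  # var[1:(length(var) - 1)] would return var[1] and make the result too long
  var[c(NA_integer_, seq_len(length(var) - 1L))]
}

set.seed(120)
foo <- data.table(group = c('A', rep(c('B', 'C', 'D'), 5)),
                  period = 1:16,
                  quality = c('unknown', sample(c('high', 'low', 'unknown'), 15, replace = TRUE)))

# lag quality within each product group
foo[, quality_lag := lag(quality), by = group]

# reclassify 'low' as 'high' when the previous period was 'high'
foo[, quality_1 := ifelse(quality == 'low' & quality_lag == 'high',
                          'high',
                          quality)]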
Taking a look at foo:
    group period quality quality_lag quality_1
 1:     A      1 unknown          NA   unknown
 2:     B      2     low          NA        NA
 3:     C      3    high          NA      high
 4:     D      4     low          NA        NA
 5:     B      5 unknown         low   unknown
 6:     C      6    high        high      high
 7:     D      7     low         low       low
 8:     B      8 unknown     unknown   unknown
 9:     C      9    high        high      high
10:     D     10 unknown         low   unknown
11:     B     11 unknown     unknown   unknown
12:     C     12     low        high      high
13:     D     13 unknown     unknown   unknown
14:     B     14    high     unknown      high
15:     C     15    high         low      high
16:     D     16 unknown     unknown   unknown
So quality_1 is mostly what I want. Where period X is 'low' and period X-1 is 'high', the value is reclassified to 'high', and otherwise quality is carried over unchanged. However, when quality_lag is NA, 'low' becomes NA in quality_1. This does not happen with 'high' or 'unknown'.
That is, the first four rows of foo should look like this:
   group period quality quality_lag quality_1
1:     A      1 unknown          NA   unknown
2:     B      2     low          NA       low
3:     C      3    high          NA      high
4:     D      4     low          NA       low
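To narrow this down, here is a minimal sketch that seems to reproduce the behaviour with plain vectors, without data.table. The names q, q_lag, cond and res are hypothetical stand-ins for the columns above, and q_lag is all NA to mimic the first row of each group. It suggests the NA enters through the condition passed to ifelse rather than through the grouping:

```r
# Hypothetical stand-ins for the quality and quality_lag columns,
# mimicking the first period of each group (no previous value).
q     <- c("low", "high", "unknown")
q_lag <- rep(NA_character_, 3)

# The same condition used for quality_1 above.
cond <- q == "low" & q_lag == "high"
print(cond)
# cond is NA for the 'low' element and FALSE for the other two

# ifelse() returns NA wherever its test is NA.
res <- ifelse(cond, "high", q)
print(res)
# res is NA for the 'low' element; 'high' and 'unknown' pass through
```

Only the 'low' element ends up NA here, which matches rows 2 and 4 of foo.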
Any thoughts on what is causing this?