What's the best way to build a table of conditions and evaluate each of them against a dataset?
For example, let's say I want to identify invalid rows in a dataset that looks like:
library("data.table")
# notional example -- some observations are wrong, some missing
set.seed(1)
n = 100 # Number of customers.
        # Also included are "non-customers" where values except cust_id should be NA.
cust <- data.table( cust_id = sample.int(n+1),
                    first_purch_dt =
                      c(sample(as.Date(c(1:n, NA), origin="2000-01-01"), n), NA),
                    last_purch_dt = 
                      c(sample(as.Date(c(1:n, NA), origin="2000-04-01"), n), NA),
                    largest_purch_amt = 
                      c(sample(c(50:100, NA), n, replace=TRUE), NA),
                    last_purch_amt = 
                      c(sample(c(1:65,NA), n, replace=TRUE), NA)
                    )
setkey(cust, cust_id)
For each observation, I want to flag three kinds of error: last_purch_dt < first_purch_dt, largest_purch_amt < last_purch_amt, and partial records where some, but not all, of the four purchase fields are missing. (All four missing is fine: that is a non-purchaser.)
Rather than a series of hard-coded expressions (which is getting really long and difficult to document/maintain), I just want to store the expressions as strings in a table of conditions:
checks <- data.table( cond_id = c(1L:3L),
                      cond_txt = c("last_purch_dt < first_purch_dt",
                                  "largest_purch_amt < last_purch_amt",
                                  paste("( is.na(first_purch_dt) + is.na(last_purch_dt) +",
                                          "is.na(largest_purch_amt) +",
                                          "is.na(last_purch_amt) ) %% 4 != 0") # TRUE when 1 to 3 of the 4 fields are NA
                                  ),
                      cond_msg = c("Error: last purchase prior to first purchase.",
                                   "Error: largest purchase less than last purchase.",
                                   "Error: partial transaction record.")
                     )
I know that I can loop through rows of conditions and rbindlist the resulting subsets, for example:
err_obs <- 
  rbindlist(
    lapply(seq_len(nrow(checks)), function(i) {
      err_set <- cust[eval( parse(text= checks[i,cond_txt]) ) ,  ]
      cbind(err_set, 
            checks[i, .(err_id = rep.int(cond_id, times = nrow(err_set)),
                        err_msg = rep.int(cond_msg, times = nrow(err_set))
                        )]
            )                
    } )
  )
print(err_obs) # returns desired result
This seems to work and to handle NAs correctly: rows where a condition evaluates to NA are not flagged.
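The NA behaviour relied on here can be checked in isolation. The sketch below uses a small hypothetical table (dt, with made-up columns a and b, not the cust data) and shows that a comparison involving NA yields NA, and that data.table treats NA in a logical i as FALSE. It also evaluates the same condition from a string with base R's eval() against the table's columns, which is one assumed alternative to the eval(parse(...)) call inside [ ].

```r
library(data.table)

# Hypothetical toy table, not the cust data from the question
dt <- data.table(id = 1:4,
                 a  = c(1, 5, NA, 2),
                 b  = c(3, 2, 4, NA))

# The raw comparison is NA wherever either side is NA
flag <- dt[, a > b]

# data.table treats NA in a logical i as FALSE, so NA rows are not returned
hits <- dt[a > b]

# Same condition stored as text and evaluated against the columns of dt;
# which() drops the NA results explicitly
cond <- "a > b"
idx <- which(eval(parse(text = cond), envir = dt))
hits_txt <- dt[idx]
```

Both hits and hits_txt keep only the row where the comparison is actually TRUE, so an incomplete row is left for the partial-record check rather than being reported twice.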
When I say "what's the best way", I'm asking:
- Is this the best approach, or is there a more efficient or idiomatic alternative to rbindlist(lapply(...))?
- Are there pitfalls in my current approach?
- Could this be written as a merge or join, something like cust inner join checks on eval(checks.condition(cust.values)) == TRUE?
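To make the third bullet concrete, here is a rough sketch of the shape I have in mind, on hypothetical miniature tables (cust_demo and checks_demo, with made-up columns first_dt and last_dt). It first builds one (cust_id, cond_id) row per failing combination and then joins the messages and customer values back on with merge(). It still loops over the conditions to evaluate them, so it only illustrates the join step and is not a loop-free solution.

```r
library(data.table)

# Hypothetical miniature stand-ins for cust and checks
cust_demo <- data.table(
  cust_id  = 1:4,
  first_dt = as.Date(c("2000-01-05", "2000-02-01", NA, "2000-03-01")),
  last_dt  = as.Date(c("2000-01-01", "2000-03-01", NA, NA))
)
checks_demo <- data.table(
  cond_id  = 1:2,
  cond_txt = c("last_dt < first_dt",
               "is.na(first_dt) != is.na(last_dt)"),
  cond_msg = c("last before first", "partial record")
)

# Step 1: evaluate each condition and keep the failing (cust_id, cond_id) pairs.
# which() drops rows where the condition evaluates to NA.
pairs <- rbindlist(lapply(seq_len(nrow(checks_demo)), function(k) {
  flag <- eval(parse(text = checks_demo$cond_txt[k]), envir = cust_demo)
  hit  <- which(flag)
  data.table(cust_id = cust_demo$cust_id[hit],
             cond_id = rep(checks_demo$cond_id[k], length(hit)))
}))

# Step 2: ordinary joins attach the message and the customer values
err <- merge(pairs, checks_demo[, .(cond_id, cond_msg)], by = "cond_id")
err <- merge(err, cust_demo, by = "cust_id")
print(err)
```

Keeping the failures as a narrow table of id pairs means the customer columns are only attached once at the end, instead of being copied inside every iteration.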
 
     
    