I have two data frames: raw2, which has 28,406 records, and raw3, which has 26,421 records.
The records in raw3 are a subset of those in raw2. In fact raw3 was derived using:
raw3<-setDT(raw2)[order(O_ID, Program_forsorting), head(.SD, 1), .(O_ID)]
I am now trying to use setdiff to pull the records that were not carried over from raw2 to raw3:
setdiff(raw2, raw3)
The result should have 1,985 records (28,406 - 26,421). However, it has 28,406 records, which is all of raw2. If I switch the arguments around to setdiff(raw3, raw2), the result contains 26,421 records, which is all of raw3.
What am I doing wrong?
Here is some sample data:
raw2<-as.data.frame(cbind("col1"=c("a","h","b","f","g"),"O_ID"=c(1,1,1,4,5), "Program_forsorting"=c("p1","p2","p2","p3","p1")))
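For reference, this is how the same derivation would look on the sample data. It is only a sketch that reuses the two lines already shown above and assumes the data.table package is installed and loaded. With this sample, raw3 keeps the first row per O_ID after sorting, so the difference I am after would be the two rows where col1 is "h" and "b".

```r
library(data.table)

# Sample data exactly as above; cbind() makes every column character.
raw2 <- as.data.frame(cbind("col1" = c("a", "h", "b", "f", "g"),
                            "O_ID" = c(1, 1, 1, 4, 5),
                            "Program_forsorting" = c("p1", "p2", "p2", "p3", "p1")))

# Same derivation as for the real data: sort, then keep the first row per O_ID.
# Note that setDT() converts raw2 to a data.table by reference.
raw3 <- setDT(raw2)[order(O_ID, Program_forsorting), head(.SD, 1), .(O_ID)]

nrow(raw2)  # 5
nrow(raw3)  # 3, so 2 rows were not carried over
```

In the result, O_ID becomes the first column of raw3 because it is the grouping column, so the column order of raw3 differs from that of raw2.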
 
    