I create a master data table, extract smaller tables from it, and then combine them into a new table. The process goes like this:
- Create a master table from some other data. Call it dt.master
- Create a copy of it and do some edits. Example script: dt.1 <- copy(dt.master); dt.1 <- dt.1[v1 %in% "cat1", ]
- Create other versions of dt.1 that are edited a bit. (Here's the code where the mistake enters.) - dt.2 <- dt.3 <- dt.1
- Edit each of the new versions as follows: dt.2[, v1 := "dt.2"]; unique(dt.2$v1); dt.3[, v1 := "dt.3"]; unique(dt.2$v1)
I know (and eventually remember) that dt.3 <- dt.1 doesn't create a new version of dt.1. But the first unique(dt.2$v1) in the code above returns "dt.2", while the later call returns "dt.3". I put my solution to this bad coding in the answer, but I would also be interested in knowing why unique(dt.2$v1) returns a different answer. Here is some example code that demonstrates this:
library(data.table)

dt.master <- data.table(v1 = c("cat1", "cat1", "cat2", "cat", "cat2"), v2 = c(1, 2, 3, 4, 5))
dt.1 <- copy(dt.master)
dt.1 <- dt.1[v1 %in% "cat1", ]
# Plain assignment: dt.2 and dt.3 are meant to be separate edited versions of dt.1
dt.2 <- dt.3 <- dt.1
dt.2[, v1 := "xxx"]
unique(dt.2$v1)
dt.3[, v1 := "yyy"]
unique(dt.3$v1)
print(dt.2)
v1 in dt.2 is supposed to be "xxx", but print(dt.2) shows "yyy".
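
For what it's worth, here is a minimal sketch of what I believe is going on, assuming that plain assignment with <- only binds a new name to the same data.table and that := then modifies that shared object in place. The variable same.object is just an illustrative name I added; address() and copy() are from data.table. Giving each table its own copy() keeps the edits separate:

```r
library(data.table)

dt.master <- data.table(v1 = c("cat1", "cat1", "cat2", "cat", "cat2"), v2 = c(1, 2, 3, 4, 5))
dt.1 <- dt.master[v1 %in% "cat1", ]

# Plain assignment: both names are bound to the same underlying object
dt.2 <- dt.3 <- dt.1
same.object <- address(dt.2) == address(dt.3)
print(same.object)

# Explicit copies: each name now has its own table
dt.2 <- copy(dt.1)
dt.3 <- copy(dt.1)
dt.2[, v1 := "xxx"]
dt.3[, v1 := "yyy"]

print(unique(dt.2$v1))  # edits to dt.3 no longer show up here
print(unique(dt.3$v1))
print(unique(dt.1$v1))  # dt.1 itself is left untouched
```

If that assumption is right, the first unique(dt.2$v1) in my example only looks correct because it runs before dt.3 is edited; once dt.3[, v1 := "yyy"] runs, every name bound to that table sees the new values.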
