I seem to have written some code where data.table makes a copy on assignment even when using `:=`. The toy example below illustrates the point: copy.A() takes a data.table DT (passed by reference) and an integer n. It prints the address of DT, copies column A to a new column via `:=`, and then prints the address of DT again. Nothing is returned; copy.A() is intended to operate purely by side effect.
library(data.table)

DT <- data.table(A = rnorm(10000))

copy.A <- function(DT, n) {
    address.start <- address(DT)
    DT[, sprintf('A.copy.%i', n) := A]   # assignment by reference
    address.final <- address(DT)
    cat(sprintf('%.3i) %s --> %s\n', n, address.start, address.final))
}
for(n in 1L:120L) 
    copy.A(DT, n)
Output:
001) 0x2d979d0 --> 0x2d979d0
002) 0x2d979d0 --> 0x2d979d0
...
098) 0x2d979d0 --> 0x2d979d0
099) 0x2d979d0 --> 0x2d979d0
100) 0x2d979d0 --> 0x6564820 # Copying starts to occur
101) 0x2d979d0 --> 0x2bcfa30
102) 0x2d979d0 --> 0x456cad0
103) 0x2d979d0 --> 0x4282570
...
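My guess (and it is only a guess) is that data.table pre-allocates a fixed number of column pointer slots, and the copy happens when they run out. If that is right, truelength() should reveal it; a minimal probe, assuming truelength() reports the allocated column slots as described in ?truelength:

```r
library(data.table)

DT <- data.table(A = rnorm(10000))

# Number of column pointer slots actually allocated (vs. length(DT),
# the number of columns in use); see ?truelength
truelength(DT)

# The over-allocation default is controlled by this option
getOption("datatable.alloccol")
```

On my reading, the address change at iteration 100 would then correspond to the allocated slots being exhausted, forcing a reallocation.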
At some point, the address changes when the assignment occurs. Based on this example, it seems that whenever we modify a data.table that was passed as an argument by reference, we must explicitly return that data.table (and reassign it in the caller), or else there is no guarantee the change will persist. I should add that this behavior is not all that surprising; I just hadn't realized it was happening until now.
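In the meantime, two workarounds seem to make the toy example behave; a sketch, assuming alloc.col() pre-allocates column slots as documented in ?truelength:

```r
library(data.table)

DT <- data.table(A = rnorm(10000))

# Workaround 1: return DT and reassign, so the caller always holds
# the (possibly reallocated) object
copy.A2 <- function(DT, n) {
    DT[, sprintf('A.copy.%i', n) := A]
    DT
}
for (n in 1L:120L)
    DT <- copy.A2(DT, n)

# Workaround 2: pre-allocate enough column slots up front so that
# no reallocation is needed inside the loop
DT2 <- data.table(A = rnorm(10000))
alloc.col(DT2, 1024L)
```

Both work in my testing, but I would still like to understand the underlying rule rather than rely on a workaround.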
This question is really just a request for more information. I haven't found anything in the documentation that really discusses it. Can someone shed some more light on this under-the-hood copying behavior or perhaps point to some documentation that explains this? Does data.table always pre-allocate the same amount of memory, and what are the rules for memory allocation as the size of the data.table increases?
