In R 2.15.0 and data.table 1.8.9:
library(data.table)
d = data.table(a = 1:5, value = 2:6, key = "a")
d[J(3), value]
# a value
# 3 4
d[J(3)][, value]
# 4
I expected both to produce the same output (the second one), and I believe they should.
To make clear that this is not a J syntax issue, the same expectation applies to the following expressions, which are equivalent to the ones above:
t = data.table(a = 3, key = "a")
d[t, value]
d[t][, value]
I would expect both of the above to return the exact same output.
So let me rephrase the question: why is data.table designed so that the key column is printed automatically in d[t, value]?
Update (based on answers and comments below): Thanks @Arun et al., I understand the design rationale now. The key is printed because there is a hidden by every time you do a data.table merge via the X[Y] syntax, and that by is on the key. The reason it's designed this way seems to be the following: since a by operation has to be performed when merging anyway, one might as well take advantage of it and not do a second by when the grouping you want is by the key of the merge.
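To make the hidden by concrete, here is a minimal sketch that spells the grouping out explicitly after the join. It only reuses d and t from above (with a = 3L so the key types match) and assumes nothing beyond the ordinary by argument; the names plain and grouped are mine, just for illustration.

```r
library(data.table)

d <- data.table(a = 1:5, value = 2:6, key = "a")
t <- data.table(a = 3L, key = "a")

# Join first, then select: no grouping involved, so j returns a plain vector.
plain <- d[t][, value]

# Join first, then group by the join key explicitly: the key column comes
# back next to value, which is the shape reported above for d[t, value].
grouped <- d[t][, value, by = a]

print(plain)
print(grouped)
```

If I understand the explanation correctly, d[t, value] in 1.8.9 behaves like the second form, with the by = a supplied implicitly by the merge.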
Now that said, I believe that's a syntax design flaw. The way I read the data.table syntax d[i, j, by = b] is: take d, apply the i operation (be that subsetting or merging or whatnot), and then do the j expression "by" b.
The by-without-by breaks this reading and introduces cases one has to think about specifically (am I merging on i? is the by just the key of the merge? etc.). I believe handling this should be data.table's job: the commendable effort to make data.table faster in one particular case of the merge, when the by is equal to the key, should be achieved in another way (e.g. by checking internally whether the by expression is actually the key of the merge).
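As a rough sketch of the kind of internal check I have in mind: the helper below is hypothetical and not part of data.table (by_matches_key is a name I made up). It only uses key(), which does exist, to test whether an explicitly requested by is exactly the key of the table, in which case the grouping already done for the merge could presumably be reused.

```r
library(data.table)

# Hypothetical helper, not a data.table function: TRUE when the requested
# grouping columns are exactly the key columns of x, in the same order.
by_matches_key <- function(x, by_cols) {
  identical(key(x), by_cols)
}

d <- data.table(a = 1:5, value = 2:6, key = "a")

by_matches_key(d, "a")      # TRUE: by equals the key
by_matches_key(d, "value")  # FALSE: an ordinary by would be needed
```

With something along these lines, d[t, value] could return the plain vector, and the fast path would only kick in when the user actually writes by = a.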