Can anyone explain what this line t[exists,][1:6,] is doing in the code below and how that subsetting works?
t<-trees
t[1,1]= NA
t[5,3]= NA
t[1:6,]
exists<-complete.cases(t)
exists
t[exists,][1:6,]
The complete.cases function checks the data frame and returns a logical vector in which TRUE marks a row with no missing data. The vector has one element per row of t, so its length equals nrow(t).
The t[exists,] part subsets the data so that only rows where exists is TRUE are kept; rows that have missing data are FALSE in exists and are dropped. The [1:6,] then takes the first 6 of the remaining rows, i.e. the first 6 rows with no missing data.
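To make the two steps concrete, here is a small sketch that splits the chained subset into intermediate results. The names complete_rows and first_six are made up for illustration and are not part of the original code.

```r
# Copy the built-in trees data set and introduce two missing values
t <- trees
t[1, 1] <- NA
t[5, 3] <- NA

# One logical value per row: TRUE when the row has no NA
exists <- complete.cases(t)
length(exists) == nrow(t)   # one element per row of t
which(!exists)              # positions of the incomplete rows

# Step 1: keep only the complete rows
complete_rows <- t[exists, ]

# Step 2: take the first six of those rows
first_six <- complete_rows[1:6, ]

# The chained form gives the same result as the two separate steps
identical(first_six, t[exists, ][1:6, ])

# Row names still refer to positions in the original data frame
rownames(first_six)
```

Note that the row names of the result skip the dropped rows, which is a quick way to see that the first subset ran before the second.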
In R, [ is a function like any other. R parses t[exists, ] as
`[`(t, exists, ) # don't forget the backticks, or the blank column argument!
Indeed you can always call [ with the backtick-and-parentheses syntax, or even crazier use it in constructions like
as.data.frame(lapply(t[exists, ], `[`, 1:6))
which, believe it or not, is (almost) equivalent to t[exists,][1:6,].
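As a rough check of the "almost", the sketch below compares the two forms. It passes only 1:6 to `[`, since each column handed to lapply is a plain vector; the names a and b are made up for illustration.

```r
t <- trees
t[1, 1] <- NA
t[5, 3] <- NA
exists <- complete.cases(t)

# Ordinary chained subsetting
a <- t[exists, ][1:6, ]

# Column-by-column version: apply `[` with index 1:6 to every column
b <- as.data.frame(lapply(t[exists, ], `[`, 1:6))

# The column contents are the same ...
identical(as.list(a), as.list(b))

# ... but the row names differ: a keeps the original row positions,
# while b is renumbered from 1
rownames(a)
rownames(b)
```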
The same is true for functions like [[, $, and more exotic stuff like names<-, which is a special replacement function that assigns its value argument to the names attribute of an object. We use functions like this all the time with syntax like
names(iris) <- tolower(names(iris))
without realizing that what we're really doing is
iris <- `names<-`(iris, tolower(names(iris)))
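A quick way to convince yourself of this is to run both forms on copies of the data and compare. This is only a sketch; the copies a and b are made up so that the built-in iris is left alone.

```r
# Work on copies so the built-in iris data set is not modified
a <- iris
b <- iris

# Usual replacement syntax
names(a) <- tolower(names(a))

# Explicit call to the replacement function `names<-`,
# with the result assigned back by hand
b <- `names<-`(b, tolower(names(b)))

# Both routes produce the same data frame
identical(a, b)
```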
And finally, you can type
?`[`
for documentation, or type
`[`
to return the definition, just like any other function.
What t[exists,][1:6,] does
The simple answer is that R parses t[exists,][1:6,] as something like:
Start with t.
Keep only the rows that correspond to TRUE elements of exists.
From that result, keep rows 1:6, i.e. rows 1 through 6.
The more complicated answer is that this is handled by the parser as:
`[`(`[`(t, exists, ), 1:6, ) # yes, this has blank arguments
which a human can interpret as
temporary_variable_1 <- `[`(t, exists, )
temporary_variable_2 <- `[`(temporary_variable_1, 1:6, )
print(temporary_variable_2) # implicitly, sending an object by itself to the console will `print` that object
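The equivalence between the bracket syntax and the nested function calls can be checked directly. A minimal sketch, with a and b as made-up names:

```r
t <- trees
t[1, 1] <- NA
t[5, 3] <- NA
exists <- complete.cases(t)

# Bracket syntax
a <- t[exists, ][1:6, ]

# The same operation written as nested calls to `[`; the trailing comma
# leaves the column argument blank, which means "all columns"
b <- `[`(`[`(t, exists, ), 1:6, )

identical(a, b)
```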
Interestingly, because you typically can't pass blank arguments in R, certain constructions are impossible with the bracket function, like eval(call("[", t, exists, )) which will throw an undefined columns selected error.