I am using by to apply a function to a range columns of a data frame based on a factor. Everything works perfectly well if I use mean() as the function but if I use median() I get an error of the type "Error in median.default(x) : need numeric data" even if I don't have NAs in the data frame.
The line that works using mean():
by(iris[,1:3], iris$Species, function(x) mean(x,na.rm=T))
> by(iris[,1:3], iris$Species, function(x) mean(x,na.rm=T))
iris$Species: setosa
Sepal.Length  Sepal.Width Petal.Length 
       5.006        3.428        1.462 
------------------------------------------------------------ 
iris$Species: versicolor
Sepal.Length  Sepal.Width Petal.Length 
       5.936        2.770        4.260 
------------------------------------------------------------ 
iris$Species: virginica
Sepal.Length  Sepal.Width Petal.Length 
       6.588        2.974        5.552 
Warning messages:
1: mean(<data.frame>) is deprecated.
 Use colMeans() or sapply(*, mean) instead. 
2: mean(<data.frame>) is deprecated.
 Use colMeans() or sapply(*, mean) instead. 
3: mean(<data.frame>) is deprecated.
 Use colMeans() or sapply(*, mean) instead. 
But if I use median() (note the na.rm=T option):
> by(iris[,1:3], iris$Species, function(x) median(x,na.rm=T))
Error in median.default(x, na.rm = T) : need numeric data
However if instead of choosing the range [,1:3] of columns I choose only one of the columns it works:
> by(iris[,1], iris$Species, function(x) median(x,na.rm=T))
iris$Species: setosa
[1] 5
------------------------------------------------------------ 
iris$Species: versicolor
[1] 5.9
------------------------------------------------------------ 
iris$Species: virginica
[1] 6.5
How can I achieve this behaviour while selecting a range of columns?
 
     
     
    