I would like to take the unique rows of a data frame and then join them with another data frame of attributes. I'd then like to count the number of varieties, e.g. the number of unique fruits of a particular type or origin.
The first data frame has my list of fruits:
fruits <- read.table(header=TRUE, text="shop    fruit
                    1   apple
                    2   orange
                    3   apple
                    4   pear
                    2   banana
                    1   banana
                    1   orange
                    3   banana")
The second data frame has my attributes:
fruit_class <- read.table(header=TRUE, text="fruit  type    origin
                      apple   pome    asia
                      banana  berry   asia
                      orange  citrus  asia
                      pear    pome    newguinea")
Here's my clumsy solution to the problem:
library(plyr) #join() and count() are assumed to come from plyr
fruit <- as.data.frame(unique(fruits[,2])) #get a list of unique fruits
colnames(fruit)[1] <- "fruit" #this won't rename the column and I don't know why...
fruit_summary <- join(fruit, fruit_class, by="fruit") #create a data frame that I can query
count(fruit_summary, "origin") #e.g. summarise the number of fruits of each origin
So my main question is: how can this be expressed more elegantly (i.e. a single line rather than 3)? Secondarily: why won't it allow me to rename the column?
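For reference, the kind of one-liner I have in mind might look roughly like the sketch below. It is written in base R as an assumption, with `merge` and `table` swapped in for plyr's `join` and `count`, and the names `origin_counts` and `type_counts` are made up for illustration. It also sidesteps the rename step, since subsetting with `fruits["fruit"]` keeps the column name.

```r
# Example data copied from the question.
fruits <- read.table(header = TRUE, text = "shop fruit
1 apple
2 orange
3 apple
4 pear
2 banana
1 banana
1 orange
3 banana")
fruit_class <- read.table(header = TRUE, text = "fruit type origin
apple pome asia
banana berry asia
orange citrus asia
pear pome newguinea")

# unique(fruits["fruit"]) stays a data frame and keeps the column name "fruit",
# so no rename is needed before merging on that shared column.
origin_counts <- table(merge(unique(fruits["fruit"]), fruit_class)$origin)
print(origin_counts)  # asia: 3, newguinea: 1

# The same pattern counts unique fruits per type.
type_counts <- table(merge(unique(fruits["fruit"]), fruit_class)$type)
print(type_counts)
```

Here `merge` joins on the column the two data frames share (`fruit`), so each unique fruit is counted once regardless of how many shops stock it.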
Thanks in advance