Is there a faster way to do this? I suspect my approach is unnecessarily slow and that a task like this can be accomplished with base functions.
library(plyr)
# For each id, add a column holding the sum of cand.perc within that id
df <- ddply(df, "id", function(x) cbind(x, perc.total = sum(x$cand.perc)))
I'm quite new to R. I have looked at by(), aggregate() and tapply(), but either couldn't get them to work at all or couldn't get the result I wanted. Rather than returning a shorter vector with one value per id, I want to attach the per-id sum to the original data frame as a new column. What is the best way to do this?
Edit: Here is a speed comparison of the answers applied to my data.
> # My original solution
> system.time( ddply(df, "id", function(x) cbind(x, perc.total = sum(x$cand.perc))) )
   user  system elapsed 
 14.405   0.000  14.479 
> # Paul Hiemstra
> system.time( ddply(df, "id", transform, perc.total = sum(cand.perc)) )
   user  system elapsed 
 15.973   0.000  15.992 
> # Richie Cotton
> system.time( with(df, tapply(df$cand.perc, df$id, sum))[df$id] )
   user  system elapsed 
  0.048   0.000   0.048 
> # John
> system.time( with(df, ave(cand.perc, id, FUN = sum)) )
   user  system elapsed 
  0.032   0.000   0.030 
> # Christoph_J
> system.time( df[ , list(perc.total = sum(cand.perc)), by="id"][df])
   user  system elapsed 
  0.028   0.000   0.028 
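
For anyone who wants to try the two base approaches from the comparison, here is a minimal sketch on a toy data frame. The values are made up for illustration and are not the data timed above; the column name perc.total2 is likewise just a placeholder for the second result.

```r
# Toy data (made up for illustration; not the data timed above)
df <- data.frame(
  id        = c("a", "a", "b", "b", "b"),
  cand.perc = c(10, 20, 5, 15, 30)
)

# ave() returns a vector as long as its input, with each element replaced
# by FUN applied to its group, so it can be assigned directly to a new column
df$perc.total <- ave(df$cand.perc, df$id, FUN = sum)

# tapply() returns one value per group, named by group; indexing by the id
# as a character string expands it back to one value per row
totals <- tapply(df$cand.perc, df$id, sum)
df$perc.total2 <- as.vector(totals[as.character(df$id)])

print(df)
```

Both new columns should hold the same per-id sums. I index the tapply() result with as.character(id) here because a numeric id would otherwise be treated as a position rather than a group name.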