I want to sum about 10000 columns like colSparseX on 1500 sparse rows of an dataframe. If I have the input:
(I tried on OriginalDataframe this:
coldatfra <- aggregate(. ~colID,datfra,sum)
and this:
coldatfra <- ddply(datfra, .(colID), numcolwise(sum))
But it doesn't work!)
colID <- c(rep(seq(1:6),2), rep(seq(1:2),3))
colSparse1 <- c(rep(1,5), rep(0,4), rep(1,2), rep(0,5), rep(1,2))
cPlSpars2 <- c(rep(1,3), rep(0,6), rep(1,2), rep(0,5), rep(1,2))
coMSparse3 <- c(rep(1,6), rep(0,3), rep(1,2), rep(0,5), rep(1,2))
colSpArseN <- c(rep(1,2), rep(0,7), rep(1,2), rep(0,5), rep(1,2))
(datfra <- data.frame(colID, colSparse1, cPlSpars2, coMSparse3, colSpArseN))
colID colSparse1 cPlSpars2 coMSparse3 colSpArseN
    1          1         1          1          1
    2          1         1          1          1
    3          1         1          1          0
    4          1         0          1          0
    5          1         0          1          0
    6          0         0          1          0
    1          0         0          0          0
    2          0         0          0          0
    3          0         0          0          0
    4          1         1          1          1
    5          1         1          1          1
    6          0         0          0          0
    1          0         0          0          0
    2          0         0          0          0
    1          0         0          0          0
    2          0         0          0          0
    1          1         1          1          1
    2          1         1          1          1
And want to sum the elements for each ID on all (10000 columns - requires some placeholder for colnames are very variable words) colSparses in order to get this:
colID colSparse1 cPlSpars2 coMSparse3 colSpArseN
    1          2         2          2          2
    2          2         2          2          2
    3          1         1          1          0
    4          2         1          2          1
    5          2         1          2          1
    6          0         0          1          0
Note: str(OriginalDataframe)
'data.frame':   1500 obs. of  10000 variables:
 $ someword                                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ anotherword                                             : num  0 0 0 0 0 0 0 0 0 0 ...
And on a smaller version (which was terminated) of the OriginalDataframe treated with ddply(datfra, .(colID), numcolwise(sum)) I get:
     colID colSparse1 cPlSpars2 coMSparse3 colSpArseN
1     0019          0         0          0          0
NA    <NA>         NA        NA         NA         NA
NA.1  <NA>         NA        NA         NA         NA
NA.2  <NA>         NA        NA         NA         NA
NA.3  <NA>         NA        NA         NA         NA
 
     
    