I am sorry if this is a silly question. I am looking to optimize my code, but I am new to R and do not know where to start.
I have a matrix X whose rows are labeled by the elements of y. The labels are numeric and take values in {1,...,K}. For each label i, I want to compute the column sums of the submatrix of rows with that label and store them in row i of M. To make this clearer, here is my current code:
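for (i in seq_len(K)) {
    # Logical mask selecting the rows of X labeled i
    cluster = (y == i)
    # Skip labels that do not occur in y; row i of M is left unchanged
    if (any(cluster)) {
        clusterRows = X[cluster, , drop = FALSE]
        M[i, ] = colSums(clusterRows)
    }
}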
Is there a better, more efficient way to do this? By efficient, I mean a shorter running time.
EDIT: Example.
Input:
set.seed(1)
X = matrix(rnorm(100*2), nrow = 100, ncol = 2)
y = rep(1:2, 50)
M = matrix(rep(0,4), 2)
K = 2
Output:
       [,1]      [,2]
[1,] 9.776280 -2.595435
[2,] 1.112457 -1.185373
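For comparison, here is a minimal sketch of one base-R candidate, rowsum(), checked against the loop on the example input above. The names M_loop, M_rowsum and S are only for illustration and are not part of my code. Since rowsum() returns one row per label that actually occurs in y, the sketch copies its result into a zero matrix by label so that absent labels keep a zero row, as in the loop. I have not measured whether this is faster for large X and K; that would need timing on real data.

```r
# Example input from the question
set.seed(1)
X <- matrix(rnorm(100 * 2), nrow = 100, ncol = 2)
y <- rep(1:2, 50)
K <- 2

# Reference result: the loop from the question
M_loop <- matrix(0, nrow = K, ncol = ncol(X))
for (i in seq_len(K)) {
  cluster <- (y == i)
  if (any(cluster)) {
    M_loop[i, ] <- colSums(X[cluster, , drop = FALSE])
  }
}

# Candidate: rowsum() adds up the rows of X within each group of y.
# Its row names are the labels present in y, so they are used as row
# indices into a K-row zero matrix (assumes labels are integers in 1..K).
S <- rowsum(X, y)
M_rowsum <- matrix(0, nrow = K, ncol = ncol(X))
M_rowsum[as.integer(rownames(S)), ] <- S

# Both approaches should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(M_loop, M_rowsum)))
```

Wrapping each variant in system.time() on a realistically sized X would show whether the difference matters in practice.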
EDIT 2: I am not using any libraries besides base.
Here is my sessionInfo():
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.3
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
loaded via a namespace (and not attached):
[1] microbenchmark_1.4-7 compiler_3.4.4       tools_3.4.4  
 
     
    