Using as input the matrix m derived from the built-in 11 by 8 data.frame anscombe:
# create test matrix m
m <- as.matrix(anscombe)
1) apply/tapply Try this:
t(apply(m, 1, tapply, gl(4, 1, ncol(m)), sum))
giving:
          1     2     3     4
 [1,] 18.04 19.14 17.46 14.58
 [2,] 14.95 16.14 14.77 13.76
 [3,] 20.58 21.74 25.74 15.71
 [4,] 17.81 17.77 16.11 16.84
 [5,] 19.33 20.26 18.81 16.47
 [6,] 23.96 22.10 22.84 15.04
 [7,] 13.24 12.13 12.08 13.25
 [8,]  8.26  7.10  9.39 31.50
 [9,] 22.84 21.13 20.15 13.56
[10,] 11.82 14.26 13.42 15.91
[11,] 10.68  9.74 10.73 14.89
2) tapply or this giving the same result:
do.call(cbind, tapply(1:ncol(m), gl(4, 1, ncol(m)), function(ix) rowSums(m[, ix])))
3) tapply - 2 or this which gives a similar result:
matrix(tapply(m, gl(4 * nrow(m), 1, length(m)), sum), nrow(m))
4) apply/array or this which additionally requires that there be the same number of input columns summed into each of the output columns:
apply(array(m, c(nrow(m), 4, ncol(m) / 4)), 1:2, sum)
Note that this is just apply(array(m, c(11, 4, 2), 1:2, sum) in the case of m.
5) for This alternative is based on a for loop:
res <- 0
for(i in seq(1, ncol(m), 4)) res <- res + m[, seq(i, length = 4)]
res
It would be possible to speed this up even more by setting res to m[, 1:4] and then starting i at 4+1 but the code gets a bit uglier so we will not bother.
6) Reduce
matrix(Reduce("+", split(m, gl(ncol(m) / 4, nrow(m) * 4))), nrow(m))
7) rowsum
t(rowsum(t(m), gl(4, 1, ncol(m))))
Note: Of the solutions tested below 
- (6), (5) and (4) are the fastest in descending order of speed (i.e. (6) is fastest).  These three also require that the number of columns of mbe an even multiple of 4. (2) is the fastest of the solutions that do not require an even multiple followed by (3), (7) and (1) where (1) is the slowest.
- (7) is the shortest, (1) is the next shortest and (4) is the third shortest
Here is the benchmark:
library(rbenchmark)
benchmark(
  one = t(apply(m, 1, tapply, gl(4, 1, ncol(m)), sum)),
  two = do.call(cbind, 
         tapply(1:ncol(m), gl(4, 1, ncol(m)), function(ix) rowSums(m[, ix]))),
  three = matrix(tapply(m, gl(4 * nrow(m), 1, length(m)), sum), nrow(m)), 
  four = apply(array(m, c(nrow(m), 4, ncol(m) / 4)), 1:2, sum),
  five = {res <- 0
          for(i in seq(1, ncol(m), 4)) res <- res + m[, seq(i, length = 4)]
          res },
  six = matrix(Reduce("+", split(m, gl(ncol(m) / 4, nrow(m) * 4))), nrow(m)),
  seven = t(rowsum(t(m), gl(4, 1, ncol(m)))),
  order = "relative", replications = 1000)[1:4]
giving:
   test replications elapsed relative
6   six         1000    0.12    1.000
5  five         1000    0.18    1.500
4  four         1000    0.30    2.500
2   two         1000    0.31    2.583
3 three         1000    0.39    3.250
7 seven         1000    0.58    4.833
1   one         1000    2.27   18.917