The apply functions are really just loops with a bow tie.
Here is a way to do it without apply. It includes some input checking and should work on a square matrix of any size.
off_diag = function(X)
{
    if(!is.matrix(X)) stop('Argument is not a matrix')
    n <- nrow(X)
    if(ncol(X) != n) stop('Matrix is not square')
    if(n<2) return(X)
    # Multiply by a 0/1 vector (applied in column-major order) that is 1 only on the minor diagonal
    Y <- X * c(0,rep(rep(c(0,1),c(n-2,1)),n),rep(0,n-1))
    return(Y)
}
Example with a numeric matrix (see the edit below for a version that also handles character matrices and NAs):
mat <-  matrix(1:16, 4, byrow = TRUE)
off_diag(mat)
#      [,1] [,2] [,3] [,4]
# [1,]    0    0    0    4
# [2,]    0    0    7    0
# [3,]    0   10    0    0
# [4,]   13    0    0    0
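To see why that multiplier picks out the minor diagonal, it can help to build the 0/1 vector on its own and lay it out as a matrix. This is only an illustrative sketch for n = 4; the names `n` and `mask` are mine and not part of the function above. R fills matrices column by column, so the 1s should land on the minor diagonal.

```r
n <- 4

# Same 0/1 vector as inside off_diag(): a leading 0, then n copies of
# (n - 2 zeros followed by a single 1), then n - 1 trailing zeros
mask <- c(0, rep(rep(c(0, 1), c(n - 2, 1)), n), rep(0, n - 1))

# Positions of the 1s in column-major order
which(mask == 1)
# [1]  4  7 10 13

# Laid out as a matrix, the 1s sit on the minor diagonal
matrix(mask, n, n)
```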
Edit: improvement
I realised my function will fail if there are NAs, since NA*0 is NA. It also will not work on character matrices, yet it never checks that the matrix is numeric. So instead I use the same setup to build a logical index vector and assign 0 to the selected elements:
minor_diag = function(X)
{
    if(!is.matrix(X)) stop('Argument is not a matrix')
    n <- nrow(X)
    if(ncol(X) != n) stop('Matrix is not square')
    if(n<2) return(X)
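    # TRUE everywhere except on the minor diagonal (column-major order)
    index = c(TRUE,rep(rep(c(TRUE,FALSE),c(n-2,1)),n),rep(TRUE,n-1))
    # Overwrite the selected elements; 0 is coerced to the matrix's type where needed (e.g. "0")
    X[index]=0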
    return(X)
}
mat <-  matrix(letters[1:16], 4, byrow = TRUE)
minor_diag(mat)
##      [,1] [,2] [,3] [,4]
## [1,] "0"  "0"  "0"  "d" 
## [2,] "0"  "0"  "g"  "0" 
## [3,] "0"  "j"  "0"  "0" 
## [4,] "m"  "0"  "0"  "0" 
minor_diag(matrix(NA,2,2))
##      [,1] [,2]
## [1,]    0   NA
## [2,]   NA    0
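As a sanity check on the logical index, the same positions can be described with `row()` and `col()`: an element lies on the minor diagonal exactly when its row and column numbers sum to n + 1. This is an alternative formulation, not the approach used in `minor_diag`, and `rc_index` is a name made up for this sketch. It compares the two masks for a few matrix sizes.

```r
for (n in 2:6) {
    X <- matrix(0, n, n)

    # Index vector built the same way as inside minor_diag()
    index <- c(TRUE, rep(rep(c(TRUE, FALSE), c(n - 2, 1)), n), rep(TRUE, n - 1))

    # Alternative: TRUE wherever row + column differs from n + 1
    rc_index <- as.vector(row(X) + col(X) != n + 1)

    # Stops with an error if the two masks ever disagree
    stopifnot(identical(index, rc_index))
}
```

If the loop finishes silently, both descriptions select the same elements for every size tried.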