I am trying to apply a function to a very large matrix. Eventually I want to create a 40,000 by 40,000 matrix (where only one side of the diagonal is filled in) or a list of the results.
The matrix looks like:
            obs 1     obs 2     obs 3     obs 4     obs 5     obs 6     obs 7     obs 8     obs 9
words 1 0.2875775 0.5999890 0.2875775 0.5999890 0.2875775 0.5999890 0.2875775 0.5999890 0.2875775
words 2 0.7883051 0.3328235 0.7883051 0.3328235 0.7883051 0.3328235 0.7883051 0.3328235 0.7883051
words 3 0.4089769 0.4886130 0.4089769 0.4886130 0.4089769 0.4886130 0.4089769 0.4886130 0.4089769
words 4 0.8830174 0.9544738 0.8830174 0.9544738 0.8830174 0.9544738 0.8830174 0.9544738 0.8830174
words 5 0.9404673 0.4829024 0.9404673 0.4829024 0.9404673 0.4829024 0.9404673 0.4829024 0.9404673
words 6 0.0455565 0.8903502 0.0455565 0.8903502 0.0455565 0.8903502 0.0455565 0.8903502 0.0455565
I apply the function with cosine(mat[, 3], mat[, 4]), which gives me a single number:
          [,1]
[1,] 0.7546113
I can do this for every pair of columns, but I want to know which columns each result came from, e.g. the calculation above came from columns 3 and 4, which are "obs 3" and "obs 4".
Expected output might be the results in a list or a matrix like:
          [,1]   [,2]   [,3]
[1,]        1      .      .
[2,]      0.75     1      .
[3,]      0.23    0.87    1
(The numbers here are made up.)
So the dimensions will be ncol(mat) by ncol(mat) (if I go with the matrix approach).
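As a rough illustration of that matrix layout, here is a minimal base R sketch. It assumes cosine similarity is the column dot product divided by the product of the column norms, and it uses a small made-up matrix rather than the real data. The names `norms` and `sim` are only illustrative. Because `crossprod()` keeps the column names, every cell is labelled with the two "obs" columns it came from.

```r
# Small made-up example matrix (6 words, 4 observations)
set.seed(42)
mat <- matrix(runif(6 * 4), nrow = 6, ncol = 4,
              dimnames = list(paste("words", 1:6), paste("obs", 1:4)))

# Euclidean norm of each column
norms <- sqrt(colSums(mat^2))

# All pairwise column dot products, scaled by the product of the norms
sim <- crossprod(mat) / outer(norms, norms)

# Keep the diagonal and lower triangle only
sim[upper.tri(sim)] <- NA

round(sim, 2)
```

If I read the lsa documentation correctly, calling `cosine(mat)` with a single matrix argument also returns all pairwise column similarities, so that may be the simpler route when lsa is already loaded.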
Data/Code:
#generate some data
# draw nrow * ncol values so that no values are recycled across columns
mat <- matrix(data = runif(100 * 20), nrow = 100, ncol = 20,
              dimnames = list(paste("words", 1:100), paste("obs", 1:20)))
mat
#calculate the following function
library(lsa)
cosine(mat[, 3], mat[, 4])
cosine(mat[, 4], mat[, 5])
cosine(mat[, 5], mat[, 6])
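For the list-style output, one possible sketch is to enumerate the column pairs with `combn()` and keep the two column names next to each result. The helper `cos_sim()` below is a hypothetical base R stand-in for `lsa::cosine()` on two vectors, and the small matrix is made up so the example runs on its own.

```r
# Hypothetical stand-in for lsa::cosine() on two numeric vectors
cos_sim <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Small made-up example matrix (6 words, 4 observations)
set.seed(42)
mat <- matrix(runif(6 * 4), nrow = 6, ncol = 4,
              dimnames = list(paste("words", 1:6), paste("obs", 1:4)))

# Every unordered pair of column names, one pair per column of `pairs`
pairs <- combn(colnames(mat), 2)

# One row per pair: the two column names and their cosine similarity
res <- data.frame(
  col_a  = pairs[1, ],
  col_b  = pairs[2, ],
  cosine = apply(pairs, 2, function(p) cos_sim(mat[, p[1]], mat[, p[2]]))
)

head(res)
```

This keeps the labels explicit, but it produces choose(ncol(mat), 2) rows, so it is probably only practical for a modest number of columns.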
Additional
I thought about doing the following:
- Creating an empty matrix and filling it in a for loop, but it's not working as expected, and creating a 40,000 by 40,000 matrix of 0's causes memory issues.
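# start from doubles (0, not 0L) so the matrix is not coerced on the first assignment
co <- matrix(0, nrow = ncol(mat), ncol = ncol(mat), dimnames = list(colnames(mat), colnames(mat)))
co
# fill only the lower triangle: row i, column j with j < i
for (i in 2:ncol(mat)) {
  for (j in 1:(i - 1)) {
    co[i, j] <- cosine(mat[, i], mat[, j])
  }
}
co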
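To put the memory problem in numbers, and as one possible way around it, the sketch below estimates the size of the full result and then computes the similarities one block of columns at a time. This is only a sketch under assumptions: `block_cosine()` and `block_size` are hypothetical names, the matrix is a small made-up example, and in practice each block would probably be written to disk or filtered rather than kept in a list.

```r
# Size of a dense double matrix with n = 40,000 columns compared pairwise
n <- 40000
full_gb <- n^2 * 8 / 1e9              # 12.8 GB for the full square matrix
tri_gb  <- n * (n - 1) / 2 * 8 / 1e9  # about 6.4 GB for one triangle only

# Hypothetical helper: cosine similarities computed block by block
block_cosine <- function(mat, block_size) {
  # Scale every column to unit length so a dot product equals the cosine
  unit <- sweep(mat, 2, sqrt(colSums(mat^2)), "/")
  starts <- seq(1, ncol(mat), by = block_size)
  lapply(starts, function(s) {
    cols <- s:min(s + block_size - 1, ncol(mat))
    # Rows: columns in this block; columns: all columns of mat
    crossprod(unit[, cols, drop = FALSE], unit)
  })
}

# Small made-up example matrix (6 words, 5 observations)
set.seed(42)
mat <- matrix(runif(6 * 5), nrow = 6, ncol = 5,
              dimnames = list(paste("words", 1:6), paste("obs", 1:5)))

blocks <- block_cosine(mat, block_size = 2)
lapply(blocks, dim)
```

Each block keeps its row and column names, so the source columns of any value can still be read off directly.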
I also tried putting the results into a list:
List <- list()
for (i in 1:ncol(mat)) {
  # this stores a full copy of mat in every element; cosine() is never called
  temp <- List[[i]] <- mat
}
res <- List[[1]]
res
This is also wrong.
So I am trying to write a function that applies cosine() to each pair of columns and stores the results together with the column names they came from.