I have a data set that looks like this:
library(tidyverse)
data <- tibble(id = 1:10,
               # one independent 25-element vector per document
               vectors = map(1:10, ~ rnorm(25)))
# A tibble: 10 x 2
      id vectors
   <int> <list>
 1     1 <dbl [25]>
 2     2 <dbl [25]>
 3     3 <dbl [25]>
 4     4 <dbl [25]>
 5     5 <dbl [25]>
 6     6 <dbl [25]>
 7     7 <dbl [25]>
 8     8 <dbl [25]>
 9     9 <dbl [25]>
10    10 <dbl [25]>
I'd like to use this data set to compute cosine similarity, where each row represents a document. The cosine function from the lsa package seems like a good, easy way to do this; however, it needs each document represented as a column. I'd like to simply do data %>% t() to get my desired result, but that's not working. I've also tried "spreading" the list column first using unnest and spread, and I've tried flatten, all to no avail. The first line of my desired output would look something like:
1 2 3 4 5 6 7 8 9 10
0.1 0.3 0.7 0.3 0.1 0.1 0.3 0.7 0.3 0.1
If there's a function from another package that handles data in this format, I would by all means just use that instead, though at this point I would also like to figure this out for curiosity's sake. I've looked at "R - list to data frame", but I'm not sure how to apply that to this situation.
The background to this is that I've performed doc2vec in Python with gensim, but due to our environment at work, if I want to build something interactive for a client, it would need to be in R.
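For reference, here is a minimal sketch of one way the reshaping described above might be done. It assumes every element of the list column has the same length (25 here). The names doc_matrix, col_norms and cos_sim are made up for illustration. The similarity is computed by hand in base R so the example does not depend on lsa; as far as I understand, lsa::cosine(doc_matrix) compares the columns of a matrix and should give an equivalent result.

```r
library(tibble)

set.seed(1)  # only to make the illustration reproducible

# Same shape as the data in the question: 10 documents, 25 values each.
data <- tibble(id = 1:10,
               vectors = lapply(1:10, function(i) rnorm(25)))

# Bind the list column into a 25 x 10 matrix, one column per document.
# This assumes all vectors have equal length.
doc_matrix <- do.call(cbind, data$vectors)
colnames(doc_matrix) <- data$id

# Cosine similarity between every pair of columns:
# dot products divided by the product of the column norms.
col_norms <- sqrt(colSums(doc_matrix^2))
cos_sim <- crossprod(doc_matrix) / outer(col_norms, col_norms)

dim(doc_matrix)  # 25 rows (vector dimensions) by 10 columns (documents)
dim(cos_sim)     # 10 by 10, one row and column per document
```

The first row of doc_matrix then has the layout shown in the desired output: one value per document, with the document ids as column names.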