I already have a term-document matrix and want to add a new document to it. In other words, I want to join two term-document matrices.
My current term-document matrix is:
Docs
Term 1
eat 7
food 2
run 2
sick 3
The new document is "watch football match and eat food".
After adding it, I want my term-document matrix to be:
Docs
Term 1 2
eat 7 1
food 2 1
run 2 0
sick 3 0
watch 0 1
football 0 1
match 0 1
and 0 1
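To make the desired result concrete, here is a minimal base-R sketch of the merge I am after. It does not use tm at all and assumes the counts are available as plain named vectors. The names `merge_counts`, `doc1` and `doc2` are made up for illustration only.

```r
# Hypothetical helper (not part of tm): combine two named term-count
# vectors into one term-by-document matrix, filling missing terms with 0.
merge_counts <- function(a, b) {
  # union() keeps the terms of a first, then the terms that only occur in b
  terms <- union(names(a), names(b))
  m <- matrix(0, nrow = length(terms), ncol = 2,
              dimnames = list(Terms = terms, Docs = c("1", "2")))
  m[names(a), "1"] <- a
  m[names(b), "2"] <- b
  m
}

# Counts from my existing matrix (document 1)
doc1 <- c(eat = 7, food = 2, run = 2, sick = 3)

# Count the words of the new document, keeping their order of appearance
words <- strsplit("watch football match and eat food", " ")[[1]]
tab <- table(factor(words, levels = unique(words)))
doc2 <- setNames(as.vector(tab), names(tab))

result <- merge_counts(doc1, doc2)
print(result)
```

This is only meant to show the shape of the output (terms missing from one document get a 0). What I actually need is the equivalent for TermDocumentMatrix objects.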
Here is what I've tried:
library("SnowballC")
library("NLP")
library("tm")
library("lsa")
# mytermdm is the term-document matrix I already have
text2 <- "watch football match and eat food"
myCorpus <- Corpus(VectorSource(text2))
# stopwords_en is the English stop-word list from the lsa package
tdm2 <- TermDocumentMatrix(myCorpus, control = list(
  removeNumbers = TRUE,
  removePunctuation = TRUE,
  stopwords = stopwords_en,
  stemming = TRUE
))
mytdm3 <- c(mytermdm,tdm2)
inspect(mytdm3)
I get this:
TermDocumentMatrix (terms: 7, document:2)
Error in `[.simple_triplet_matrix`(x,terms,doc)
Repeated indices currently not allowed.