Create document-term matrix
dtm <- DocumentTermMatrix(docs, control = params)
Error in nchar(rownames(m)) : invalid multibyte string, element 1
Does anyone know how to tackle this error? I'm working in RStudio.
 
    
     
    
    Sys.setlocale( 'LC_ALL','C' ) 
Run this in RStudio. It resets the locale to C, and it has worked for me many times.
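Note that `Sys.setlocale('LC_ALL', 'C')` changes every locale category for the rest of the session. A narrower variant, sketched here on the assumption that only the character-type category (`LC_CTYPE`) matters for this error, switches it temporarily and restores it afterwards. The helper name `with_c_ctype` and the sample string are made up for illustration.

```r
# Hypothetical helper: evaluate an expression under the C character-type
# locale, then restore whatever LC_CTYPE was set before.
with_c_ctype <- function(expr) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  Sys.setlocale("LC_CTYPE", "C")
  force(expr)  # the lazily passed expression is evaluated here, under "C"
}

# Made-up sample: "caf" followed by a lone Latin-1 byte (0xE9), not valid UTF-8.
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))

before <- Sys.getlocale("LC_CTYPE")
n <- with_c_ctype(nchar(x))   # counted byte by byte under the C locale
after <- Sys.getlocale("LC_CTYPE")
```

The same wrapper could be put around the `DocumentTermMatrix(docs, control = params)` call from the question. This only avoids the error; the text itself is left as it was.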
 
    
This happens when your input text isn't UTF-8 encoded. You can read about character encoding here.
Another good reference is this
I've found that the best way to handle these issues is to use stringr::str_conv.
mydocs <- c("doc1", "doc2", "doc3")
# str_conv() returns the converted vector, so assign the result back
mydocs <- stringr::str_conv(mydocs, "UTF-8")
Where you have non-UTF-8 characters, you'll get a warning, but the character vector that comes out the other side will be usable.
Do that to your docs vector before calling `DocumentTermMatrix`.
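If you want to see which documents are actually the problem before converting, base R can do that too. This is only a sketch: the sample vector `docs_raw` is made up, and Latin-1 is an assumed source encoding that you would replace with whatever your files really use.

```r
# Made-up input: the second element ends in a lone Latin-1 byte (0xE9),
# which is not valid UTF-8.
docs_raw <- c("plain ascii", rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9))))

# validUTF8() (base R >= 3.3.0) is TRUE for elements that are valid UTF-8.
bad <- !validUTF8(docs_raw)
which(bad)  # indices of the offending elements

# If the real source encoding is known (assumed Latin-1 here), convert from it.
docs_fixed <- docs_raw
docs_fixed[bad] <- iconv(docs_raw[bad], from = "latin1", to = "UTF-8")

# If it is unknown, dropping the invalid bytes is a lossy fallback.
docs_stripped <- iconv(docs_raw, from = "UTF-8", to = "UTF-8", sub = "")
```

Converting from the real encoding keeps the accented characters, while the `sub = ""` fallback silently deletes them, so it is worth finding out the encoding if you can.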
 
    
I encountered this error while trying to write a data frame to a SQL Server table. The function below helped: I used it to remove all non-UTF-8 characters from a data frame before writing it to the server. It's built off another post, linked below.
# Convert all character columns of a data frame to UTF-8,
# dropping any characters that can't be converted.
# Note: the input is assumed to be Windows-1252 encoded.
df_convert_utf8 <- function(df_data) {
  # Source: https://stackoverflow.com/questions/54633054/dbidbwritetable-invalid-multibyte-string
  is_chr <- vapply(df_data, is.character, logical(1))
  # lapply() keeps each column a separate vector (sapply() would build a matrix)
  df_data[is_chr] <- lapply(
    df_data[is_chr],
    iconv, from = "WINDOWS-1252", to = "UTF-8", sub = "")

  df_data
}
Example usage:
  # Convert all character strings to UTF8, removing any characters we can't use
  df_chunk <- df_convert_utf8(df_chunk)
