I am new to R and machine learning. I am trying to build a random forest classification model that predicts the priority of an incident ticket from its description. These are the steps I followed.
1) Reading the ticket data, including the description, from a CSV file.
library(tm)
library(SnowballC)
library(caTools)
library(randomForest)
incidents = read.csv("incident.csv", stringsAsFactors = FALSE)
> str(incidents)
'data.frame':  4265 obs. of  7 variables:
 $ number                : chr  "INC0031193" "INC0037867" "INC0159979" "INC0031446" ...
 $ u_detailed_description: chr  "Close & Ignore new Ticket New-Production SNOW Auto Routing test for XYZ SNOW ticketing in uat"  "" "" ""...
 $ priority              : chr  "3 - Moderate" "2 - High" "4 - Low" "3 - Moderate" ...
 $ state                 : chr  "Canceled" "Canceled" "Canceled" "Canceled" ...
 $ category              : chr  "Server" "Tools" "Server" "Server" ...
 $ assignment_group      : chr  "Windows" "Tools" "SNOC Support" "Windows" ...
2) Cleaning the data, creating a DocumentTermMatrix and converting it to a data frame.
incidentCorpus <- Corpus(VectorSource(incidents$u_detailed_description))
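# content_transformer() keeps the result a valid corpus in current versions of tm
incidentCorpus <- tm_map(incidentCorpus, content_transformer(tolower))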
incidentCorpus <- tm_map(incidentCorpus, removePunctuation)
incidentCorpus <- tm_map(incidentCorpus, removeWords, stopwords("english"))
incidentCorpus <- tm_map(incidentCorpus, stemDocument)
incidentDTM <- DocumentTermMatrix(incidentCorpus)
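The code that turns incidentDTM into the incidentSparse data frame used in the next step is not shown above. A minimal sketch of what that conversion typically looks like follows, run on a tiny made-up data frame so that it works on its own. The helper name dtmToFrame, the toy rows and the 0.99 sparsity threshold are all assumptions for illustration, not the values actually used.

```r
library(tm)

# Hypothetical helper: convert a DocumentTermMatrix into a data frame with
# one column per term plus the label column. The default threshold of 0.99
# is a placeholder, not the value used for the real data.
dtmToFrame <- function(dtm, labels, sparsity = 0.99) {
  dtm <- removeSparseTerms(dtm, sparsity)     # drop terms whose sparsity exceeds the threshold
  df <- as.data.frame(as.matrix(dtm))         # one row per document, one column per term
  colnames(df) <- make.names(colnames(df))    # make term names valid R column names
  df$priority <- labels                       # attach the label to predict
  df
}

# Toy stand-in for the incident data (made-up rows)
toy <- data.frame(
  u_detailed_description = c("server down", "server slow", "tool access request"),
  priority = c("2 - High", "3 - Moderate", "4 - Low"),
  stringsAsFactors = FALSE
)
toyDTM <- DocumentTermMatrix(Corpus(VectorSource(toy$u_detailed_description)))
toySparse <- dtmToFrame(toyDTM, toy$priority)
str(toySparse)
```

With the real data the call would presumably be incidentSparse <- dtmToFrame(incidentDTM, incidents$priority).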
3) Splitting data into train and test set using caTools.
set.seed(123)
split <- sample.split(incidentSparse$priority,SplitRatio = 0.7)
train <- subset(incidentSparse, split == TRUE)
test  <- subset(incidentSparse, split == FALSE)
# Convert the label column to a factor in both sets
train$priority <- as.character(train$priority)
train$priority <- as.factor(train$priority)
test$priority  <- as.character(test$priority)
test$priority  <- as.factor(test$priority)
4) Applying the randomForest() function to build the model, then using predict() to classify the test set.
incidentRandomF <- randomForest(priority ~ ., data = train, ntree = 200, mtry = 50, importance = TRUE, proximity = TRUE)
5) The accuracy of the model is about 84% on the training set (out-of-bag predictions) and about 88% on the test set.
baselineAccuracy <- sum(diag(table(predict(incidentRandomF, type="class"), train$priority)))/nrow(train)
> baselineAccuracy
[1] 0.8392498
predFinalTestSet_RF <- predict(incidentRandomF, newdata = test,  type="class")
FinalTestSetAccuracy <- sum(diag(table(test$priority,predFinalTestSet_RF)))/nrow(test)
> FinalTestSetAccuracy
[1] 0.8828125
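Both accuracy figures above are the sum of the confusion-matrix diagonal divided by the number of rows. A small illustration with made-up labels (not taken from the incident data) shows the arithmetic, and why both factors should share the same levels: if they do not, the table is not square and diag() no longer lines up actual and predicted classes.

```r
# Made-up labels, purely for illustration
actual    <- factor(c("High", "Low", "Low", "Moderate", "High", "Low"))
predicted <- factor(c("High", "Low", "Moderate", "Moderate", "Low", "Low"),
                    levels = levels(actual))  # same levels, so the table is square

confusion <- table(actual, predicted)         # rows: actual, columns: predicted
accuracy  <- sum(diag(confusion)) / length(actual)

print(confusion)
print(accuracy)   # 4 of the 6 labels match
```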
The classification model is now ready, and I need to use it to predict the priority for a new description supplied by the user.
How can I pass user input to the R script so that the model can classify it?
Your help would be highly appreciated. Thanks in advance.
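One possible approach, sketched under stated assumptions: read the description from the command line with commandArgs() (or with readline() in an interactive session), clean it with the same steps as the training corpus, and build its document-term row against the training vocabulary through the dictionary control option, so that the columns match what the model was trained on. The helpers cleanCorpus and predictPriority, the variable trainTerms and the toy training rows are hypothetical and exist only to make the sketch run on its own.

```r
library(tm)
library(SnowballC)
library(randomForest)

# Clean raw text with the same steps used for the training corpus.
cleanCorpus <- function(text) {
  corpus <- Corpus(VectorSource(text))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  tm_map(corpus, stemDocument)
}

# Hypothetical helper: classify one new description with a fitted forest.
# `terms` must be the raw training terms (before make.names is applied).
predictPriority <- function(description, model, terms) {
  dtm <- DocumentTermMatrix(cleanCorpus(description),
                            control = list(dictionary = terms))
  newData <- as.data.frame(as.matrix(dtm))
  colnames(newData) <- make.names(colnames(newData))
  as.character(predict(model, newdata = newData))
}

# Toy training data (made-up rows) standing in for the real incidents
toy <- data.frame(
  description = c("Server outage in production", "Production server crashed",
                  "Request new tool license", "Tool license renewal request",
                  "Server outage again", "License request for tool"),
  stringsAsFactors = FALSE
)
toyPriority <- factor(c("2 - High", "2 - High", "4 - Low",
                        "4 - Low", "2 - High", "4 - Low"))

trainDTM   <- DocumentTermMatrix(cleanCorpus(toy$description))
trainTerms <- Terms(trainDTM)                 # vocabulary the model knows about
trainData  <- as.data.frame(as.matrix(trainDTM))
colnames(trainData) <- make.names(colnames(trainData))
trainData$priority <- toyPriority

set.seed(123)
model <- randomForest(priority ~ ., data = trainData, ntree = 50)

# Take the description from the command line, e.g.
#   Rscript predict.R "Server outage in production"
# or prompt for it when running interactively.
args <- commandArgs(trailingOnly = TRUE)
description <- if (length(args) > 0) {
  paste(args, collapse = " ")
} else if (interactive()) {
  readline("Describe the incident: ")
} else {
  "Server outage in production"               # fallback so the sketch runs unattended
}
print(predictPriority(description, model, trainTerms))
```

With the real data, model would be incidentRandomF and trainTerms the terms of the document-term matrix that incidentSparse was built from. Words in the new description that never occurred in training are simply ignored, because the dictionary restricts the columns to the known vocabulary.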
 
     
    