After fitting a simple decision tree with rpart, I want to plot its performance using ROCR. When I change the avg= parameter of the plot, the ROC curve changes significantly.
When I apply the same plotting method, with the same changes to avg=, to a GLM model, the plot does not change. Why does this parameter only affect the tree's ROC plot, and in what way?
# load packages
library(rpart)
library(ROCR)
# create tree model
bsprp <- mean(df.sub.train$y)
target <- y_fact ~ age + gender + a + b + c + d
m.dt <- rpart(target,
              data = df.sub.train,
              parms = list(prior = c(bsprp, 1 - bsprp)),
              cp = 0.005)
# predict on df.sub.vld
dt.predicted <- predict(m.dt, newdata = df.sub.vld)
dt.pred <- prediction(dt.predicted[,2],df.sub.vld$y)
dt.perf <- performance(dt.pred, "tpr", "fpr")
# plot performance 
plot(dt.perf, avg= "threshold", col="red", lwd= 2, main= "ROC curve")
abline(0, 1, untf = FALSE, col = "lightgray", lty = 2)
# vs
plot(dt.perf, avg= "none", col="red", lwd= 2, main= "ROC curve")
abline(0, 1, untf = FALSE, col = "lightgray", lty = 2)
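One check that might help narrow this down: a tree returns one predicted probability per leaf, so it yields only a handful of distinct scores (and therefore cutoffs), while a GLM usually yields close to one distinct score per row. The sketch below only counts those distinct scores. It is an assumption on my part that this difference is what makes avg= matter. The kyphosis data shipped with rpart is a stand-in, not my data, and the names m.tree, m.glm, p.tree, p.glm are made up for the illustration.

```r
# Stand-in data: kyphosis ships with rpart; it is NOT the data from the question.
library(rpart)

data(kyphosis)

# Fit a classification tree and a logistic GLM on the same predictors
m.tree <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")
m.glm  <- glm(Kyphosis ~ Age + Number + Start, data = kyphosis, family = binomial)

# Predicted probability of the positive class ("present")
p.tree <- predict(m.tree, newdata = kyphosis, type = "prob")[, "present"]
p.glm  <- predict(m.glm, newdata = kyphosis, type = "response")

# Each distinct score is a candidate cutoff on the ROC curve
n.tree <- length(unique(p.tree))
n.glm  <- length(unique(p.glm))
cat("distinct tree scores:", n.tree, "\n")
cat("distinct GLM scores: ", n.glm, "\n")
```

On my own models the equivalent check would be length(unique(dt.predicted[, 2])) compared with the same count for the GLM predictions.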
An example of the dataset used:
   y_fact y      age gender bf2          a             b          c
5       1 1 71.11233   Male  40          6             0          0
10      1 1 51.83836   Male  11          5             3          0
13      1 1 70.14521 Female   7          3             1          0
15      1 1 40.00548   Male  64          6             0          0
16      1 1 55.81096   Male  55          8             1          0
19      1 1 54.45479   Male  13          3             1          0
Screenshots of the different plots:
