classification - ROC curve in R using rpart package?
I split my data into a training set and a test set.
I used the rpart package to fit a CART (classification tree) model in R, using only the training set, and I want to carry out a ROC analysis with the ROCR package.
The response variable is `n.use` (1 = yes, 0 = no). When I run my code I get:

    > pred2 = prediction(pred.cart, test$n.use)
    Error in prediction(pred.cart, test$n.use) :
      Format of predictions is invalid.

What is the problem, and what is the right type ("class" or "prob")? This is my code:

    library(rpart)
    train.cart = rpart(n.use ~ ., data = train, method = "class")
    pred.cart = predict(train.cart, newdata = test, type = "class")
    pred2 = prediction(pred.cart, test$n.use)
    roc.cart = performance(pred2, "tpr", "fpr")

The prediction() function from the ROCR package expects the predicted "success" probabilities and the observed factor of failures vs. successes. To obtain the former, you need to apply predict(..., type = "prob") to the rpart object (i.e., not "class"). However, this returns a matrix of probabilities with one column per response class, so you need to select the "success" class column.
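
To make the difference between the two prediction types concrete, here is a minimal sketch. It assumes the kyphosis data that ships with rpart as a stand-in for the data in the question, and the object names `fit`, `cls`, `prob` and `p_success` are made up for this illustration.

```r
library(rpart)

# Illustrative data set shipped with rpart (not the data from the question)
data("kyphosis", package = "rpart")
fit <- rpart(Kyphosis ~ ., data = kyphosis, method = "class")

# type = "class" returns a factor of predicted labels:
# there is no score that could be thresholded at different cutoffs
cls <- predict(fit, type = "class")
str(cls)

# type = "prob" returns a numeric matrix with one column per class level
prob <- predict(fit, type = "prob")
colnames(prob)

# Select the "success" column by name rather than by position
p_success <- prob[, "present"]
head(p_success)
```

The columns are named after the levels of the response, so with a 0/1 response such as `n.use` the analogous selection would be the column named "1".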
As your example is, unfortunately, not reproducible, I'm using the kyphosis data from the rpart package for illustration:

    library("rpart")
    data("kyphosis", package = "rpart")
    rp <- rpart(Kyphosis ~ ., data = kyphosis)

Then you can apply the prediction() function from ROCR. Here, I'm using the in-sample (training) data, but the same can be applied out of sample (test data):

    library("ROCR")
    pred <- prediction(predict(rp, type = "prob")[, 2], kyphosis$Kyphosis)

And you can visualize the ROC curve:

    plot(performance(pred, "tpr", "fpr"))
    abline(0, 1, lty = 2)

Or the accuracy across cutoffs:

    plot(performance(pred, "acc"))

Or any of the other plots and summaries supported by ROCR.
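
For the out-of-sample case raised in the question, a rough sketch might look like the following. The 70/30 stratified split, the seed, and the names `train`, `test`, `fit`, `p_test` and `auc` are assumptions made for this example, not part of the original question or answer. The split is stratified only so that both classes are guaranteed to appear in the test set, which ROCR needs.

```r
library(rpart)
library(ROCR)

data("kyphosis", package = "rpart")

# Hypothetical stratified split: hold out about 30% of each class
set.seed(1)
rows_by_class <- split(seq_len(nrow(kyphosis)), kyphosis$Kyphosis)
test_idx <- unlist(lapply(rows_by_class, function(i) {
  sample(i, size = round(0.3 * length(i)))
}))
train <- kyphosis[-test_idx, ]
test  <- kyphosis[test_idx, ]

# Fit on the training rows only
fit <- rpart(Kyphosis ~ ., data = train, method = "class")

# Predicted probability of the "present" class for the held-out rows
p_test <- predict(fit, newdata = test, type = "prob")[, "present"]

# ROC analysis on the test set
pred_test <- prediction(p_test, test$Kyphosis)
plot(performance(pred_test, "tpr", "fpr"))
abline(0, 1, lty = 2)

# Area under the ROC curve as a single-number summary
auc <- performance(pred_test, "auc")@y.values[[1]]
auc
```

With a data set this small the test-set curve will be coarse and will change noticeably with the seed, so the resulting numbers should be read as illustrative only.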
