Classification: ROC curve in R using the rpart package?
I split my data into a training set and a test set. I used the rpart package to fit a CART (classification tree) in R on the training set only, and I want to carry out an ROC analysis using the ROCR package.

The response variable is `n.use` (1 = yes, 0 = no):
    > pred2 = prediction(pred.cart, test$n.use)
    Error in prediction(pred.cart, test$n.use) : Format of predictions is invalid.
This is my code. What is the problem, and what is the right type ("class" or "prob")?
    library(rpart)
    train.cart = rpart(n.use~., data=train, method="class")
    pred.cart = predict(train.cart, newdata = test, type = "class")
    pred2 = prediction(pred.cart, test$n.use)
    roc.cart = performance(pred2, "tpr", "fpr")
The prediction() function from the ROCR package expects the predicted "success" probabilities and the observed factor of failures vs. successes. In order to obtain the former you need to apply predict(..., type = "prob") to the rpart object (i.e., not "class"). However, as this returns a matrix of probabilities with one column per response class, you need to select the "success" class column.
As your example is, unfortunately, not reproducible, I'm using the kyphosis data from the rpart package for illustration:
    library("rpart")
    data("kyphosis", package = "rpart")
    rp <- rpart(Kyphosis ~ ., data = kyphosis)
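To see why a column has to be selected, it can help to compare what the two prediction types return. The following is a minimal sketch on the same kyphosis data; the object names `cls`, `prob`, and `p_present` are chosen here for illustration only.

```r
library(rpart)

# Fit the same classification tree on the kyphosis data shipped with rpart.
data("kyphosis", package = "rpart")
rp <- rpart(Kyphosis ~ ., data = kyphosis)

# type = "class" returns a factor of predicted labels, which carries no
# scores that could be thresholded for an ROC curve.
cls <- predict(rp, type = "class")
str(cls)

# type = "prob" returns a numeric matrix with one column per response level.
prob <- predict(rp, type = "prob")
head(prob)
colnames(prob)

# Select the "success" column by name rather than by position
# (here "present" is treated as the success class).
p_present <- prob[, "present"]
```

Selecting by name is a small design choice: it keeps working if the factor levels are ever reordered, whereas `[, 2]` silently depends on level order.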
Then you can apply the prediction() function from ROCR. Here, I'm using the in-sample (training) data, but the same can be applied out of sample (test data):
    library("ROCR")
    pred <- prediction(predict(rp, type = "prob")[, 2], kyphosis$Kyphosis)
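For the out-of-sample case from the question, the same steps might look roughly as follows. This is only a sketch: the original train/test data are not available, so kyphosis is split here by taking every third row as a hypothetical test set (a deterministic choice made purely for reproducibility; a random split would be more usual), and the names `train`, `test`, `fit`, `p_test`, `pred_test`, and `roc_test` are illustrative.

```r
library(rpart)
library(ROCR)

data("kyphosis", package = "rpart")

# Hypothetical split: every third row is held out as the test set.
test_idx <- seq(3, nrow(kyphosis), by = 3)
train <- kyphosis[-test_idx, ]
test  <- kyphosis[test_idx, ]

# Fit the tree on the training set only.
fit <- rpart(Kyphosis ~ ., data = train, method = "class")

# Predicted probability of the "present" class for the held-out rows.
p_test <- predict(fit, newdata = test, type = "prob")[, "present"]

# Pair the scores with the observed test labels and compute the ROC points.
pred_test <- prediction(p_test, test$Kyphosis)
roc_test  <- performance(pred_test, "tpr", "fpr")
```

In the question's terms, this corresponds to replacing type = "class" with type = "prob" and passing the success column, together with test$n.use, to prediction().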
And you can visualize the ROC curve:
    plot(performance(pred, "tpr", "fpr"))
    abline(0, 1, lty = 2)
Or the accuracy across cutoffs:
    plot(performance(pred, "acc"))
Or any of the other plots and summaries supported by ROCR.
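As one example of such a summary, the area under the ROC curve can be requested through the "auc" measure, and a precision/recall curve through the "prec" and "rec" measures. The sketch below refits the in-sample model so that it runs on its own; the variable name `auc` is just an illustrative choice, and the value is in-sample, so it is likely optimistic.

```r
library(rpart)
library(ROCR)

data("kyphosis", package = "rpart")
rp <- rpart(Kyphosis ~ ., data = kyphosis)

# In-sample probabilities of the "present" class paired with observed labels.
pred <- prediction(predict(rp, type = "prob")[, "present"], kyphosis$Kyphosis)

# Area under the ROC curve; ROCR stores the value in the y.values slot.
auc <- performance(pred, "auc")@y.values[[1]]
auc

# Precision versus recall as another supported pair of measures.
plot(performance(pred, "prec", "rec"))
```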