R - Using ordinal variables in rpart and caret without converting to dummy categorical variables
I am trying to create an ordinal regression tree in R using rpart, with the predictors being ordinal data, stored as factors in R.
When I create the tree using rpart directly, the splits are expressed in terms of the factor values (e.g. a170 has labels ranging from -5 to 10).
However, when I use caret's train to fit the data using rpart and then extract the final model, the tree no longer has ordinal predictors. In that tree, the ordinal variable a170 appears to have been converted into multiple dummy variables; for example, a17010 in the second tree is a dummy for a170 taking the value 10.
So, is it possible to retain ordinal variables, instead of converting factor variables into multiple binary indicator variables, when fitting trees with the caret package?
Let's start with a reproducible example:
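set.seed(144)
# One factor predictor with six levels
dat <- data.frame(x=factor(sample(1:6, 10000, replace=TRUE)))
# Binary outcome whose rate depends on the level group (1-2, 3-4, 5-6)
dat$y <- ifelse(dat$x %in% 1:2, runif(10000) < 0.1,
         ifelse(dat$x %in% 3:4, runif(10000) < 0.4, runif(10000) < 0.7))*1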
Note that when training with the rpart function directly, factor levels are grouped together in the splits:
library(rpart)
rpart(y~x, data=dat)
I was able to reproduce the caret package splitting factors into their individual levels by using the formula interface to the train function:
library(caret)
train(y~x, data=dat, method="rpart")$finalModel
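A likely reason for this, as far as I understand caret's formula method, is that it builds the predictor matrix with model.matrix-style dummy coding before rpart ever sees the data. The sketch below uses only base R on a small made-up data frame (toy is hypothetical, not the data from the question) to show how a factor column turns into columns named variable-plus-level, the same pattern as a17010 above:

```r
# Toy stand-in for dat (hypothetical data, for illustration only)
toy <- data.frame(x = factor(c(1, 2, 3, 1, 2, 3)),
                  y = c(0, 0, 1, 0, 1, 1))

# model.matrix() expands the factor into one 0/1 column per non-reference level
dummy_cols <- colnames(model.matrix(y ~ x, data = toy))
print(dummy_cols)
```

Each of the resulting columns is a separate numeric predictor, so a tree fitted on them can only split on one level at a time.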
The solution I found to avoid splitting factors by level is to pass the raw data frames to the train function instead of using the formula interface:
train(x=data.frame(dat$x), y=dat$y, method="rpart")$finalModel
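Since the question is about ordinal predictors, it may also help to declare the factor as ordered. As I understand it, rpart handles ordered factors as ordinal predictors, splitting along the level order rather than on arbitrary groups of levels. A minimal sketch with made-up data (toy and its columns are hypothetical):

```r
library(rpart)

set.seed(1)
# Hypothetical toy data: x is an ordered factor with levels 1 < 2 < ... < 6
toy <- data.frame(x = factor(sample(1:6, 200, replace = TRUE), ordered = TRUE))
# Outcome is 1 for the upper three levels, 0 for the lower three
toy$y <- as.numeric(as.integer(toy$x) > 3)

# Called directly, rpart receives the ordered factor unchanged
fit <- rpart(y ~ x, data = toy)
print(fit)
```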