r - Using ordinal variables in rpart and caret without converting to dummy categorical variables
I am trying to create an ordinal regression tree in R using rpart, with the predictors being ordinal data, stored as factors in R.
When I created the tree using rpart, I got something like this:

where the values are the factor values (e.g. a170 has labels ranging from -5 to 10).
However, when I use caret to train on the data using rpart and then extract the final model, the tree no longer has ordinal predictors. See below for a sample output tree:

As you can see above, it seems the ordinal variable a170 has been converted into multiple dummy categorical variables, i.e. a17010 in the second tree is a dummy for a170 having the value 10.
So, is it possible to retain ordinal variables instead of converting factor variables into multiple binary indicator variables when fitting trees with the caret package?
Let's start with a reproducible example:
set.seed(144)
dat <- data.frame(x=factor(sample(1:6, 10000, replace=TRUE)))
dat$y <- ifelse(dat$x %in% 1:2, runif(10000) < 0.1,
         ifelse(dat$x %in% 3:4, runif(10000) < 0.4, runif(10000) < 0.7))*1
As you note, training with the rpart function groups factor levels together:
library(rpart)
rpart(y~x, data=dat)
I was able to reproduce the caret package splitting the factors into their individual levels by using the formula interface to the train function:
library(caret)
train(y~x, data=dat, method="rpart")$finalModel
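Names such as a17010 are what you get when a factor is expanded into indicator columns before the tree is fitted: the variable name with the level pasted onto it. The sketch below uses only base R's model.matrix to show that expansion on the example data. It assumes that caret's formula interface does something equivalent internally; it does not call caret itself, and the names mm and dummy_names are only for illustration.

```r
# Rebuild the example data (same seed and construction as above).
set.seed(144)
dat <- data.frame(x = factor(sample(1:6, 10000, replace = TRUE)))
dat$y <- ifelse(dat$x %in% 1:2, runif(10000) < 0.1,
         ifelse(dat$x %in% 3:4, runif(10000) < 0.4, runif(10000) < 0.7)) * 1

# Expanding the formula gives an intercept plus one 0/1 column per
# non-reference level of x; the factor itself is gone afterwards.
mm <- model.matrix(y ~ x, data = dat)
dummy_names <- colnames(mm)
print(dummy_names)
```

A tree fitted on columns like these can only split on one level at a time, which would explain why the grouping of levels seen with plain rpart disappears.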
The solution I found to avoid splitting factors by level is to pass the raw data frame to the train function instead of using the formula interface:
train(x=data.frame(dat$x), y=dat$y, method="rpart")$finalModel
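One small detail when building the predictor data frame this way: data.frame(dat$x) renames the column, so the variable would presumably show up in the fitted tree under the new name rather than as x. The base R sketch below compares that with subsetting by column name, which keeps both the name and the factor type. Passing dat["x"] as the x argument is a suggestion rather than something tested here, and the names auto_named and kept_name are only for illustration.

```r
# Same predictor as in the example above.
set.seed(144)
dat <- data.frame(x = factor(sample(1:6, 10000, replace = TRUE)))

auto_named <- data.frame(dat$x)  # column name is derived from the expression
kept_name  <- dat["x"]           # one-column data frame, original name kept

print(names(auto_named))
print(names(kept_name))
print(is.factor(kept_name$x))    # still a single factor column, not dummies
```

Either form hands over one factor column rather than a set of indicator columns, which is the point of avoiding the formula interface.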