r - Using ordinal variables in rpart and caret without converting to dummy categorical variables


I am trying to create an ordinal regression tree in R using rpart, with the predictors being ordinal data, stored as factors in R.

When I create the tree using rpart, I get a tree (screenshot not preserved) where the values are the factor values (e.g. a170 has labels ranging from -5 to 10).

However, when I use caret to train on the data with rpart and then extract the final model, the tree no longer has ordinal predictors. In the sample output tree (screenshot not preserved), the ordinal variable a170 seems to have been converted into multiple dummy categorical variables, i.e. a17010 in the second tree is a dummy for a170 taking the value 10.

So, is it possible to retain ordinal variables, instead of converting factor variables into multiple binary indicator variables, when fitting trees with the caret package?

Let's start with a reproducible example:

    set.seed(144)
    dat <- data.frame(x = factor(sample(1:6, 10000, replace = TRUE)))
    # y is 1 with probability 0.1, 0.4 or 0.7, depending on which pair of levels x falls in
    dat$y <- ifelse(dat$x %in% 1:2, runif(10000) < 0.1,
                    ifelse(dat$x %in% 3:4, runif(10000) < 0.4, runif(10000) < 0.7)) * 1

As you note, training with the rpart function groups factor levels together:

    library(rpart)
    rpart(y ~ x, data = dat)


I was able to reproduce the caret package splitting factors into their individual levels by using the formula interface to the train function:

    library(caret)
    train(y ~ x, data = dat, method = "rpart")$finalModel
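The indicator columns appear to come from how a formula is turned into a design matrix. As far as I understand, the formula interface to train builds its predictors in the style of model.matrix before rpart ever sees them, so a factor with k levels becomes k - 1 separate 0/1 columns. The following is a minimal base R sketch of that expansion, not caret's actual internals. The data is rebuilt so the snippet runs on its own, and mm and dummy_cols are names introduced here for illustration.

```r
# Rebuild the example data so this snippet is self-contained.
set.seed(144)
dat <- data.frame(x = factor(sample(1:6, 10000, replace = TRUE)))
dat$y <- ifelse(dat$x %in% 1:2, runif(10000) < 0.1,
                ifelse(dat$x %in% 3:4, runif(10000) < 0.4, runif(10000) < 0.7)) * 1

# model.matrix() expands the factor x into one indicator column per
# non-reference level (default treatment contrasts). A model fitted on
# this matrix only ever sees these separate 0/1 columns.
mm <- model.matrix(y ~ x, data = dat)
dummy_cols <- colnames(mm)
print(dummy_cols)
```

The column names are the variable name pasted onto the level label (x2, x3, and so on), which is the same pattern as a17010 for a170 at level 10.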


The solution I found to avoid splitting factors by level is to pass the raw data frames to the train function instead of using the formula interface:

    train(x = data.frame(dat$x), y = dat$y, method = "rpart")$finalModel
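If the aim is for the tree to respect the order of the levels, another option is to pass the integer level codes instead of the factor. This is a different technique from the one above, and it only makes sense under the assumption that the factor levels are already stored in their meaningful order. A numeric column is not expanded into indicators by a formula, so rpart can split it on a threshold. A rough sketch, where x_code is a name introduced here for illustration:

```r
library(rpart)

# Rebuild the example data so this snippet is self-contained.
set.seed(144)
dat <- data.frame(x = factor(sample(1:6, 10000, replace = TRUE)))
dat$y <- ifelse(dat$x %in% 1:2, runif(10000) < 0.1,
                ifelse(dat$x %in% 3:4, runif(10000) < 0.4, runif(10000) < 0.7)) * 1

# Assumption: levels(dat$x) are in ordinal order, so the codes 1..6 keep that order.
dat$x_code <- as.integer(dat$x)

# A numeric predictor stays a single column in the design matrix.
design_cols <- colnames(model.matrix(y ~ x_code, data = dat))
print(design_cols)

# rpart splits a numeric predictor on ordered thresholds.
fit <- rpart(y ~ x_code, data = dat)
print(fit)

# With caret installed, the formula interface could then be used unchanged:
# caret::train(y ~ x_code, data = dat, method = "rpart")$finalModel
```

Note that as.integer() on a factor returns the position of each level, not its label, so a factor labelled -5 to 10 would be coded 1 to 16.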
