How to encode categorical variables for modeling?

UserBird
Dataiker
Is it okay to use a categorical variable directly, or is it better to use one-hot encoding?
jrouquie
Dataiker Alumni
It's OK to use a categorical variable directly. The model will automatically apply one-hot encoding to it. This is also called dummification.
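
To make the idea concrete, here is a minimal plain-Python sketch of what one-hot encoding (dummification) does to a column. It is only an illustration, not the tool's actual implementation; the `one_hot` helper and the `colors` data are made up for this example.

```python
def one_hot(values):
    """Turn a list of category labels into 0/1 indicator rows.

    Hypothetical helper for illustration only: one column per distinct
    category (sorted alphabetically), with a 1 in the column that
    matches the row's category and 0 elsewhere.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Made-up example column
colors = ["red", "green", "red", "blue"]
categories, rows = one_hot(colors)

print(categories)  # ['blue', 'green', 'red']
for color, row in zip(colors, rows):
    print(color, row)
```

Each original value becomes a row with exactly one 1, so a column with N distinct categories turns into N numeric columns.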

You can choose in the project "Settings" between one-hot encoding and "impact encoding". For text variables, there are other options available: tf-idf, hashing, etc.
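
For comparison, impact encoding (often called target encoding elsewhere) is generally understood as replacing each category with a statistic of the target for that category, typically the mean. The sketch below assumes that plain-mean definition; the exact computation used by the tool may differ (for example, it may add smoothing), and the `impact_encode` helper and sample data are hypothetical.

```python
from collections import defaultdict


def impact_encode(categories, targets):
    """Replace each category by the mean target value seen for it.

    Hypothetical helper for illustration only; assumes an unsmoothed
    per-category mean, which may not match any particular tool.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    return [means[cat] for cat in categories], means


# Made-up example: a categorical column and a binary target
cities = ["paris", "lyon", "paris", "lyon", "nice"]
bought = [1, 0, 0, 0, 1]

encoded, means = impact_encode(cities, bought)
print(means)
print(encoded)
```

Unlike one-hot encoding, this yields a single numeric column regardless of how many categories there are, which can help with high-cardinality variables. In practice the means should be computed on the training data only, to avoid leaking the target into the features.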