How to encode categorical variables for modeling?
UserBird
Is it okay to use a categorical variable directly, or is it better to one-hot encode it first?
Answers
It's OK to use a categorical variable directly. The model will automatically apply one-hot encoding to it, which is also called dummification.
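To make the term concrete, here is a minimal sketch of what one-hot encoding (dummification) does to a single categorical column. It is plain Python for illustration only and is not the tool's actual implementation; the function name `one_hot_encode` and the sample `colors` data are made up for this example.

```python
def one_hot_encode(values):
    """Return (column_names, rows): one 0/1 indicator column per distinct category."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    # Each row gets a 1 in the column matching its category and 0 elsewhere.
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


# Hypothetical sample data: one categorical feature with three distinct values.
colors = ["red", "green", "red", "blue"]
columns, encoded = one_hot_encode(colors)

print(columns)
for row in encoded:
    print(row)
```

Each distinct value becomes its own 0/1 column, so a feature with many distinct values produces many columns.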
You can choose in the project "Settings" between one-hot encoding and "impact encoding". For text variables, there are other options available: tf-idf, hashing, etc.
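For comparison, here is a rough sketch of the idea behind impact encoding, assuming it means replacing each category with the average target value observed for that category (often called target encoding). This is an assumption about the general technique, not the tool's actual implementation, which may add smoothing or cross-validation. The function name `impact_encode` and the sample data are hypothetical.

```python
from collections import defaultdict


def impact_encode(categories, targets):
    """Replace each category with the mean target value seen for that category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    # Per-category mean of the target (no smoothing in this sketch).
    means = {category: sums[category] / counts[category] for category in sums}
    return [means[category] for category in categories], means


# Hypothetical sample data: a city feature and a binary target.
cities = ["paris", "lyon", "paris", "lyon", "nice"]
churned = [1, 0, 0, 0, 1]
encoded, mapping = impact_encode(cities, churned)

print(mapping)
print(encoded)
```

Unlike one-hot encoding, this yields a single numeric column regardless of how many distinct categories there are. The trade-off is that the encoding uses the target, so in practice it should be computed on training data only.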