How to encode categorical variables for modeling?

UserBird
UserBird Dataiker, Alpha Tester Posts: 535 Dataiker
Is it okay to use a categorical variable directly, or is it better to use one-hot encoding?

Answers

  • jrouquie
    jrouquie Dataiker Alumni Posts: 87 ✭✭✭✭✭✭✭
    It's OK to use a categorical variable directly. The model will automatically apply one-hot encoding, which is also called dummification.

    You can choose in the project "Settings" between one-hot encoding and "impact encoding". For text variables, there are other options available: tf-idf, hashing, etc.
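
    To make the difference between the two options concrete, here is a minimal plain-Python sketch. It is not how DSS implements them: the toy data and the helper names `one_hot_encode` and `impact_encode` are made up for illustration, and the impact encoding shown is the simplest form (the mean of the target per category, with no smoothing).

```python
from collections import defaultdict

# Hypothetical toy data: "colors" is the categorical feature, "target" the label.
colors = ["red", "blue", "red", "green", "blue", "red"]
target = [1, 0, 1, 0, 1, 0]


def one_hot_encode(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def impact_encode(values, targets):
    """Replace each category by the mean target observed for that category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    means = {c: sums[c] / counts[c] for c in counts}
    return [means[v] for v in values], means


categories, one_hot_rows = one_hot_encode(colors)
impact_values, category_means = impact_encode(colors, target)

print(categories)       # one column per category
print(one_hot_rows[0])  # encoding of the first value
print(category_means)   # per-category target means
```

    One-hot encoding produces one column per category, so it grows with the number of distinct values. Impact encoding always produces a single numeric column, which can help with high-cardinality variables but depends on the target.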