Added on July 7, 2023 10:55PM
Hi all,
I am trying to understand how new values for categorical columns are handled by ML models. By new values, I mean values that weren't included in the training data.
I am using one-hot encoding in the Visual ML settings, with the minimum samples and max values options set, so presumably there will be an Other category that collects all the infrequently occurring values in the training data.
Here's my specific question: are the new values put into the Other category when scoring new records?
Anyone have any insights on this?
Thanks,
Marlan
Operating system used: Red Hat
Hi,
Yes, categorical values not seen during training are put in the Other category.
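To make the behavior concrete, here is a minimal plain-Python sketch of the general idea. It is not the Visual ML implementation: the function names and the `min_samples` and `max_values` parameters are hypothetical stand-ins for the settings mentioned above. Categories are learned from the training data, and anything outside that list, whether it was too rare in training or never seen at all, is encoded in the Other column.

```python
from collections import Counter

OTHER = "Other"

def fit_categories(values, min_samples=2, max_values=3):
    """Keep values seen at least min_samples times, capped at max_values.

    Hypothetical helper for illustration only, not a Dataiku API.
    """
    counts = Counter(values)
    # most_common() orders values from most to least frequent
    frequent = [v for v, c in counts.most_common() if c >= min_samples]
    return frequent[:max_values]

def one_hot(value, categories):
    """Encode one value; anything not in categories goes to the Other column."""
    columns = categories + [OTHER]
    target = value if value in categories else OTHER
    return [1 if col == target else 0 for col in columns]

# Training data: "green" occurs only once, so it falls below min_samples
train = ["red", "red", "blue", "blue", "blue", "green"]
categories = fit_categories(train)

print(categories)                     # learned categories
print(one_hot("red", categories))     # known value gets its own column
print(one_hot("green", categories))   # rare in training, goes to Other
print(one_hot("purple", categories))  # never seen in training, goes to Other
```

In this sketch, a rare training value and a brand new value at scoring time end up with the same encoding, which matches the behavior described in the answer above.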