
How to set (or remove) default maximum number of categories in modeling?

For categorical variables, there is a "Max. Nb. Categories" field, which I think is set to 100 by default.

Is there any way to remove this default maximum number of categories or set the default to a different value?

Thanks.
Dataiker
Hi,

No, you can't. This limit helps avoid running out of RAM when you train models, since dummy encoding creates one column per category.
I suggest you use "hashing" encoding instead.

A sparse matrix will be built. Note that in scikit-learn only some algorithms accept sparse matrices as input.
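
To give a rough idea of what hashing encoding does, here is a minimal plain-Python sketch of the hashing trick. It is not how DSS implements it; the function name `hash_encode`, the `n_buckets` parameter, and the sample rows are made up for illustration. Each "column=category" pair is hashed into a fixed number of buckets, so the encoded width stays the same however many distinct categories show up, and only the non-zero entries are stored (a sparse row).

```python
import hashlib
from collections import Counter


def hash_encode(rows, n_buckets=16):
    """Hypothetical helper: encode each {column: category} row as a sparse
    {bucket_index: count} dict with at most n_buckets possible indices."""
    encoded = []
    for row in rows:
        buckets = Counter()
        for column, category in row.items():
            # Hash the "column=category" token with a stable hash so the
            # same pair always lands in the same bucket across runs.
            token = f"{column}={category}".encode("utf-8")
            digest = hashlib.md5(token).digest()
            index = int.from_bytes(digest[:8], "big") % n_buckets
            buckets[index] += 1
        # Only non-zero buckets are kept, which is what makes the row sparse.
        encoded.append(dict(buckets))
    return encoded


# Made-up sample data, for illustration only.
rows = [
    {"city": "Paris", "device": "mobile"},
    {"city": "Lyon", "device": "desktop"},
    {"city": "Paris", "device": "desktop"},
]
encoded = hash_encode(rows, n_buckets=16)
for sparse_row in encoded:
    print(sparse_row)
```

The trade-off is that two different categories can collide in the same bucket, so the number of buckets is a balance between memory use and lost information. In scikit-learn itself, `FeatureHasher` applies the same idea and returns a SciPy sparse matrix.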

To do it easily, you can sort your features by type, select all your categorical features, and click Hashing instead of Dummy-encode.

Matt
Mattsco