How to set (or remove) default maximum number of categories in modeling?
okiriza
For categorical variables, there is a "Max. Nb. Categories" field, which I think is set to 100 by default.
Is there any way to remove this default maximum number of categories or set the default to a different value?
Thanks.
Best Answer
Hi,
No, you can't. This limit is there to avoid running out of RAM when you train models.
Instead, I suggest using the "Hashing" encoding.
A sparse matrix will be built. Note that in scikit-learn only some algorithms accept sparse matrices.
To do this easily, sort your features by type, select all your categorical features, and click Hashing instead of Dummy-encode.
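For anyone curious what hashing encoding looks like outside the visual ML interface, here is a minimal sketch using scikit-learn's FeatureHasher. It is only an illustration of the general technique, not necessarily what DSS does internally. The column values, the labels, and the choice of 1024 output columns are made up for the example.

```python
from scipy import sparse
from sklearn.feature_extraction import FeatureHasher
from sklearn.linear_model import LogisticRegression

# Hypothetical high-cardinality categorical column: 500 distinct values
# over 2000 rows (well above a 100-category limit), plus made-up labels.
cities = [f"city_{i % 500}" for i in range(2000)]
labels = [i % 2 for i in range(2000)]

# Hash each category into a fixed number of columns (1024 is an arbitrary
# choice here). The output width does not grow with the number of categories.
hasher = FeatureHasher(n_features=2**10, input_type="string")

# With input_type="string", each sample is an iterable of strings,
# so each single value is wrapped in a one-element list.
X = hasher.transform([[c] for c in cities])

# The result is a SciPy sparse matrix: only non-zero entries are stored.
print(type(X).__name__, X.shape, "stored values:", X.nnz)

# LogisticRegression is one of the scikit-learn estimators that accepts
# sparse input directly, so no dense conversion is needed.
model = LogisticRegression(max_iter=1000)
model.fit(X, labels)
predictions = model.predict(X[:5])
```

The trade-off is that different categories can collide in the same hashed column, so a larger n_features reduces collisions at the cost of a wider (but still sparse) matrix.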
Matt