Numerical-to-categorical conversion in clustering: is it possible to get only 1 column (as with impact coding) instead of n dummy columns?

Marija
Level 1
I was able to convert a feature from numerical to categorical and choose impact coding from the dropdown list in supervised modeling. Can I do the same in clustering? More precisely, I am using k-means from the quick models in the Lab on a single dataset. The dropdown list offers 4 options, and I am not sure which one to use.



Thank you so much,

Marija
Liev
Dataiker Alumni

Hi Marija,



Impact coding (also called target encoding) is a preprocessing method in which a categorical feature is encoded using the target variable: each category is replaced by a statistic of the target (typically its mean) computed over the rows in that category. A little more info here: http://contrib.scikit-learn.org/categorical-encoding/targetencoder.html
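To make the dependence on the target concrete, here is a minimal sketch of the idea in plain Python. It is not the DSS implementation; the function name `impact_encode` and the toy data are made up for illustration, and real encoders usually add some smoothing on top of the raw per-category mean.

```python
from collections import defaultdict

def impact_encode(categories, target):
    """Replace each category with the mean target value observed for it.

    Hypothetical helper for illustration only (no smoothing, no train/test split).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, target):
        sums[cat] += y
        counts[cat] += 1
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    # One output column, whatever the number of distinct categories.
    return [means[cat] for cat in categories], means

# Made-up toy data: a binned numerical feature and a binary target.
age_bin = ["young", "young", "old", "old", "old", "mid"]
churned = [1, 0, 1, 1, 0, 0]

encoded, means = impact_encode(age_bin, churned)
print(means)
print(encoded)
```

Note that the function cannot be called without the `target` argument, which is the whole point: the single encoded column only exists because a target is available.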



Because clustering is unsupervised learning, where no target variable is known, this preprocessing option is not available for clustering in DSS.
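For contrast, dummy (one-hot) encoding only looks at the feature's own values, so it needs no target, at the cost of producing one column per category. A minimal sketch in plain Python, with a hypothetical `dummy_encode` helper and made-up data (again, not the DSS implementation):

```python
def dummy_encode(categories):
    """Return the sorted category levels and one 0/1 indicator row per input value.

    Hypothetical helper for illustration only.
    """
    levels = sorted(set(categories))
    rows = [[1 if cat == level else 0 for level in levels] for cat in categories]
    return levels, rows

# Made-up toy data: the same kind of binned numerical feature, no target needed.
age_bin = ["young", "young", "old", "old", "old", "mid"]

levels, rows = dummy_encode(age_bin)
print(levels)   # column names, one per distinct category
for row in rows:
    print(row)
```

With 3 distinct categories this yields 3 columns, which is the "n dummy" behaviour mentioned in the question.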



I hope this helps!
