
Numerical-to-categorical conversion in clustering: is it possible to get a single column (as with impact coding) instead of n dummy columns?

Marija
Level 1
I was able to convert a feature from numerical to categorical and choose impact coding from the dropdown list in supervised modeling. Can I do the same in clustering? More precisely, I am using k-means from the quick models in the Lab on a single dataset. The dropdown list there offers 4 options, and I am not sure which one to use.



Thank you so much,

Marija
Liev
Dataiker

Hi Marija,



Impact coding (also called target encoding) is a preprocessing method in which a categorical feature is encoded using the target variable: each category is replaced by a statistic of the target, typically its mean, computed over the rows that belong to that category. A little more info here: http://contrib.scikit-learn.org/categorical-encoding/targetencoder.html
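To make the dependence on the target concrete, here is a minimal sketch of the idea in plain Python. It is not how DSS implements impact coding (real implementations usually add smoothing and cross-fitting to limit leakage), and the names `target_encode`, `colors`, and `churned` are made up for illustration.

```python
from collections import defaultdict

def target_encode(categories, targets):
    """Replace each category with the mean target of the rows in that category.

    Hypothetical helper for illustration only: no smoothing, no cross-fitting.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    return [means[cat] for cat in categories]

# Toy data (made up): one categorical feature and a binary target.
colors = ["red", "blue", "red", "green", "blue"]
churned = [1, 0, 0, 1, 0]

encoded = target_encode(colors, churned)
print(encoded)  # a single numeric column, but it cannot be computed without `churned`
```

The result is one column instead of one column per category, but the second argument is required: without a target there is nothing to average.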



Because clustering is unsupervised learning, there is no target variable to encode against, which is why this preprocessing option is not available for clustering in DSS.
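By contrast, dummy encoding (the "n dummy" columns mentioned in the question) only looks at the feature itself, so it still works when there is no target. A rough sketch in plain Python, with a hypothetical helper `dummy_encode` that is not part of DSS:

```python
def dummy_encode(categories):
    """Build one 0/1 indicator column per distinct category; no target needed.

    Hypothetical helper for illustration only.
    """
    levels = sorted(set(categories))
    rows = [[1 if cat == level else 0 for level in levels] for cat in categories]
    return levels, rows

# Same toy feature as before (made up); note that no target is passed in.
levels, rows = dummy_encode(["red", "blue", "red", "green", "blue"])
print(levels)
for row in rows:
    print(row)
```

Each row has exactly one 1, in the column of its category, which is what makes this encoding usable as input to k-means.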



I hope this helps!
