Evaluate clustering models on test set
Hi everyone,
How do you evaluate clustering models on a test set with Dataiku plugins? ('Evaluate' seems to be only available for supervised tasks...)
Otherwise, how can you 'easily' get access to the inference workflow (reproducing all the feature handling steps, dimensionality reduction, etc., without coding from scratch)?
Thank you.
Answers
Hi,
Clustering is an unsupervised learning task: you're not actually trying to predict anything, you're just trying to understand the underlying groups and patterns within your dataset. Because of that, there is no concept of a train and test set in the supervised sense. You cluster one dataset, and if you deploy your clustering model as a "retrainable model to the flow", you can then feed in any new dataset and have its rows assigned to one of the existing clusters. Note that you can also name the cluster labels within the model results tab.
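To make the "assign new data to existing clusters" idea concrete, here is a minimal plain-Python sketch. It is not Dataiku's implementation or API, and the function names, the toy data, the simple k-means loop, and the mean squared distance check are all assumptions made up for illustration. The point is only that the preprocessing parameters and the centroids are learned once, then reused unchanged on the new dataset.

```python
from statistics import mean, pstdev

def fit_scaler(rows):
    """Learn per-column mean and standard deviation from the training rows."""
    params = []
    for column in zip(*rows):
        std = pstdev(column)
        params.append((mean(column), std if std > 0 else 1.0))
    return params

def scale(rows, params):
    """Apply already-fitted scaling parameters to any set of rows."""
    return [[(v - m) / s for v, (m, s) in zip(row, params)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(rows, centroids):
    """Label each row with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: sq_dist(row, centroids[i]))
        for row in rows
    ]

def fit_kmeans(rows, k, iterations=20):
    """Very small k-means; the first k rows are the (deterministic) initial centroids."""
    centroids = [list(row) for row in rows[:k]]
    for _ in range(iterations):
        labels = assign(rows, centroids)
        for i in range(k):
            members = [row for row, label in zip(rows, labels) if label == i]
            if members:  # keep the previous centroid if a cluster ends up empty
                centroids[i] = [mean(col) for col in zip(*members)]
    return centroids

# Hypothetical toy data: one dataset to cluster, one new dataset to assign.
train = [[1.0, 10.0], [9.0, 90.0], [1.2, 12.0], [8.8, 88.0], [0.8, 9.0], [9.1, 92.0]]
new = [[1.1, 11.0], [9.0, 89.0]]

# Fit the preprocessing and the clusters once, on the first dataset only.
scaler_params = fit_scaler(train)
centroids = fit_kmeans(scale(train, scaler_params), k=2)
train_labels = assign(scale(train, scaler_params), centroids)

# Reuse the fitted scaler and centroids on the new dataset; nothing is refit.
new_scaled = scale(new, scaler_params)
new_labels = assign(new_scaled, centroids)

# One possible sanity check: how close the new rows sit to their assigned centroid.
new_mean_sq_dist = mean(
    [sq_dist(row, centroids[label]) for row, label in zip(new_scaled, new_labels)]
)

print(new_labels, new_mean_sq_dist)
```

If you do want a number for how well the new rows fit the existing clusters, a distance-based measure like the one at the end is one option, but it describes compactness rather than accuracy, since there are no true labels to compare against.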
Katie