
Value Clustering Question

Hongstershine
Level 1

Hi, I find the Value Clustering feature under Field -> Analyze very helpful for normalizing data when, say, there are 100 different spellings of "United States". It makes it easy to map those variations to a single value. However, it won't map "USA" to "United States". Is there a way to train the system to make that mapping? This is just a simple example. We have more complex clustering and mapping use cases that would benefit from being able to train the system to improve its mappings.

Cheers, Hong

AdrienL
Dataiker

You could build a dataset that lists which values should be mapped to which, and apply it with a join recipe. To handle typos as well, you could look at the fuzzy join recipe.
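
To illustrate the idea outside of the visual recipes, here is a minimal sketch in plain Python (standard library only, not a Dataiku API). The mapping entries, the `normalize` helper and the 0.85 cutoff are all assumptions made up for this example. The exact lookup plays the role of the join on the mapping dataset, and the `difflib` fallback only approximates what a fuzzy join would do for typos.

```python
import difflib

# Hypothetical mapping table: known variant (lowercased) -> canonical value.
# In practice this would be the mapping dataset you maintain by hand.
MAPPING = {
    "usa": "United States",
    "u.s.a.": "United States",
    "us": "United States",
    "united states": "United States",
    "united states of america": "United States",
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
}


def normalize(value, mapping=MAPPING, cutoff=0.85):
    """Return the canonical value for `value`, or None if nothing matches."""
    key = value.strip().lower()

    # Exact lookup: the equivalent of a plain join on the mapping table.
    if key in mapping:
        return mapping[key]

    # Fuzzy fallback for typos: take the most similar known variant,
    # but only if its similarity ratio reaches the (assumed) cutoff.
    close = difflib.get_close_matches(key, list(mapping), n=1, cutoff=cutoff)
    return mapping[close[0]] if close else None


if __name__ == "__main__":
    for raw in ["USA", " United States ", "Unted States", "Germany"]:
        print(repr(raw), "->", normalize(raw))
```

Abbreviations such as "USA" are only resolved because they are listed explicitly in the mapping. String similarity alone would not link "USA" to "United States", which is why the mapping table is still needed alongside any fuzzy matching. Values that come back as `None` are candidates to review and add to the table.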
