What formula is used for the fuzzy values clustering in the prepare recipe on DSS?

JCB Registered Posts: 7 ✭✭✭

According to: https://knowledge.dataiku.com/latest/kb/data-prep/prepare-recipe/How-to-standardize-text-fields-using-fuzzy-values-clustering.html

You can choose a clustering strategy of “Fuzzy” or “Highly fuzzy” to cluster and merge similar text in the dataset. What is this fuzzy matching based on? Is it Damerau–Levenshtein distance? If so, what threshold does each strategy use? It would be great to understand the logic behind the fuzzy clustering before applying the prepare recipe step.
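To make the question concrete, here is a minimal sketch of the kind of logic I have in mind: values are grouped when their edit distance falls under a threshold. This is only an illustration, not DSS's actual implementation. The function names and the `max_distance` parameter are hypothetical, and the distance shown is the optimal string alignment variant of Damerau–Levenshtein.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and transpositions of
    two adjacent characters, each with a cost of 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def cluster_values(values, max_distance):
    """Greedy clustering (hypothetical): a value joins the first cluster
    whose first member is within max_distance, ignoring case."""
    clusters = []
    for value in values:
        for cluster in clusters:
            if osa_distance(value.lower(), cluster[0].lower()) <= max_distance:
                cluster.append(value)
                break
        else:
            clusters.append([value])
    return clusters


if __name__ == "__main__":
    sample = ["Paris", "Pairs", "paris", "London", "Londn"]
    # A stricter threshold could stand in for "Fuzzy" and a looser one
    # for "Highly fuzzy"; both values are assumptions, not DSS settings.
    print(cluster_values(sample, max_distance=1))
    print(cluster_values(sample, max_distance=2))
```

If the recipe works along these lines, knowing the actual distance measure and threshold per strategy would tell me which values are likely to be merged.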

Thanks in advance.


Operating system used: Windows

Answers
