I need to do a fuzzy match based on Jaro distance. I have two columns (X, Y), and the Y column has two unique values. The fuzzy match needs to take the shortest string from the X column and compare it with the X values that belong to the other Y value, and likewise for all the X column values. If a comparison satisfies the predefined threshold, the values should be stored in the output along with the match score and match key. Is there any way to do this in Dataiku?
Hi, I'm afraid I don't see any screenshots in your post. Also, the fuzzy join offers 4 different algorithms, so you would need to see which one is closest to your use case. Did you try them all? If none of these 4 algorithms matches your requirements, you will need to do this in a Python recipe and come up with your own algorithm.
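If you do end up going the Python recipe route, here is a rough sketch of what it could look like. This is only my reading of your description, not a built-in Dataiku feature: the function names (`jaro_similarity`, `fuzzy_match_groups`), the 0.9 threshold, the sample rows, and the choice of the shorter string as the match key are all assumptions for illustration. It uses plain Python so it does not depend on any extra package being installed in your code environment.

```python
from itertools import combinations


def jaro_similarity(s1, s2):
    """Return the Jaro similarity of two strings (0.0 = no match, 1.0 = identical)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    s1_matched = [False] * len1
    s2_matched = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not s2_matched[j] and s2[j] == ch:
                s1_matched[i] = True
                s2_matched[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if s1_matched[i]:
            while not s2_matched[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def fuzzy_match_groups(rows, threshold=0.9):
    """Compare X values across different Y groups and keep pairs at or above threshold.

    rows is a list of (x, y) tuples. Assumption: the match key is the
    shorter of the two compared strings.
    """
    groups = {}
    for x, y in rows:
        groups.setdefault(y, []).append(x)

    results = []
    for y_left, y_right in combinations(sorted(groups), 2):
        # Process the shortest strings first, as described in the question.
        for left in sorted(groups[y_left], key=len):
            for right in groups[y_right]:
                score = jaro_similarity(left, right)
                if score >= threshold:
                    results.append({
                        "x_left": left,
                        "x_right": right,
                        "match_score": round(score, 4),
                        "match_key": min(left, right, key=len),
                    })
    return results


# Made-up sample data: (X value, Y group)
rows = [("MARTHA", "A"), ("JONES", "A"), ("MARHTA", "B"), ("SMITH", "B")]
for match in fuzzy_match_groups(rows, threshold=0.9):
    print(match)
```

Note that this compares every X value in one group with every X value in the other, so it gets slow on large datasets. If your data is big, you would probably want to add some blocking first (for example, only comparing strings that share a first letter).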
Rather than posting your data, can you post how you configured the algorithms in the fuzzy join?