Distinct after Concat in one cell
Sylvie
Registered Posts: 4 ✭
Hello,
I'm trying to remove duplicate values within a cell after concatenation, in a Prepare recipe:
A,A,B,C,C,D --> A,B,C,D
A,A --> A
Thanks in advance
Answers
Konstantina Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 27 ✭✭✭✭✭
Hi,
You can use a Python function step in a Prepare recipe (see attached image). The function I created first converts each input (the cell value from the column of interest) from unicode to string; this step might not be needed in your case, depending on the format of your data. It then splits the string on ',', converts the resulting list to a set to drop duplicates, turns the set back into a list, and joins its elements with ','. Note that a set does not preserve the original order of the items.
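Since the attached image may not be visible to everyone, here is a minimal sketch of what such a function could look like. It is not the exact code from the screenshot: the column name "concat_col" and the helper distinct_values are hypothetical, and it assumes the Python function step runs in "cell" mode, where process(row) receives the row as a dictionary and returns the new cell value. Instead of a plain set, it tracks already-seen items so that the original order (A,B,C,D) is kept.

```python
def distinct_values(cell, sep=","):
    """Remove duplicate items from a delimited string, keeping first-occurrence order."""
    if cell is None:
        return cell  # leave empty cells untouched
    seen = set()
    unique = []
    # str() mirrors the unicode-to-string conversion; it may not be needed for your data
    for item in str(cell).split(sep):
        item = item.strip()
        if item and item not in seen:
            seen.add(item)
            unique.append(item)
    return sep.join(unique)


def process(row):
    # "concat_col" is a hypothetical column name; replace it with your own column
    return distinct_values(row["concat_col"])
```

With the examples from the question, distinct_values("A,A,B,C,C,D") gives "A,B,C,D" and distinct_values("A,A") gives "A".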
I hope this helps!
-Konstantina