Distinct after Concat in one cell

Sylvie Registered Posts: 4

Hello,

I'm trying to remove duplicate values from cells after concatenation:

A,A,B,C,C,D --> A,B,C,D

A,A --> A

in a prepare recipe

Thanks in advance

Answers

  • konathan Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 25 ✭✭

    Hi,

    You can use a Python function step in a Prepare recipe (see attached image). The function I created first converts each input (the cell value from the column of interest) from unicode to string; this step might not be needed in your case, depending on the format of your data. It then splits each string on ',', takes the set of the resulting list to drop duplicates, turns that set back into a list, and joins its elements with ','.

    I hope this helps!

    -Konstantina
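
Since the attached image is not reproduced here, the steps described in the answer can be sketched roughly as below. This is not Konstantina's original code, only a minimal illustration under a few assumptions: the column name `concat_col` is hypothetical, the Python function step is assumed to run in "cell" mode (where `process(row)` receives the row as a dictionary and returns the new cell value), and the helper `dedupe_cell` is a name made up for this example. One deliberate difference from the answer: a plain `set` does not guarantee order, so the sketch tracks already-seen items in a set while building an ordered list, which keeps the first occurrence of each value.

```python
def dedupe_cell(value, sep=","):
    """Remove repeated items from a delimited string, keeping first occurrences.

    Hypothetical helper for illustration; not part of the Dataiku API.
    """
    if value is None:
        return value  # leave empty cells untouched

    seen = set()   # items already emitted
    unique = []    # items in first-seen order
    # str() mirrors the unicode-to-string conversion mentioned in the answer;
    # it may be unnecessary depending on the data.
    for item in str(value).split(sep):
        item = item.strip()  # tolerate "A, A" as well as "A,A"
        if item and item not in seen:
            seen.add(item)
            unique.append(item)
    return sep.join(unique)


def process(row):
    # Assumed entry point of the Python function step in "cell" mode.
    # "concat_col" is a placeholder; replace it with the real column name.
    return dedupe_cell(row.get("concat_col"))


# Quick local check with the examples from the question
print(dedupe_cell("A,A,B,C,C,D"))  # A,B,C,D
print(dedupe_cell("A,A"))          # A
```

If the order of the values does not matter, the shorter form from the answer, joining `list(set(value.split(",")))` with `','`, gives the same set of values, though possibly in a different order.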
