
Split my dataset from key column auto-recognition

Solved!
shosho88
Level 2

Hello, Bonjour,

I have a dataset and I want to split it into 2 parts, based on the data in a key column (2000 rows).

Each value in this key column is a full sentence (for example: "hello how r u", "how r you", "hello fine").

When a keyword is detected (for example: "hello"), the row should automatically go to one new dataset, and when it is not detected, the row should go to a second new dataset.

Hope my English isn't too bad :D

Thanks for your help! Cheers

5 Replies
AlexGo
Dataiker

Hi,

You should be able to use the 'Split' visual recipe for this. You just need to select 'Define filters' and then use the 'contains' condition on the column. You can split into several different datasets based on terms, or add several conditions to each dataset if you want.

Everything else will go by default to the second dataset.
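
For anyone who prefers to see the logic spelled out, here is a minimal Python sketch of what such a split does. It is not how the Split recipe is implemented; the function name `split_by_keyword`, the column name `sentence`, and the case-insensitive matching are assumptions made for illustration only.

```python
def split_by_keyword(rows, column, keyword):
    """Return (matched, rest): rows whose `column` contains `keyword`, then all other rows."""
    matched, rest = [], []
    for row in rows:
        # Case-insensitive substring test. This is an assumption of the sketch;
        # the thread does not say whether the recipe's 'contains' is case-sensitive.
        if keyword.lower() in str(row[column]).lower():
            matched.append(row)
        else:
            # Everything that does not match falls through to the second output.
            rest.append(row)
    return matched, rest


# Sample rows taken from the sentences quoted in the question.
rows = [
    {"sentence": "hello how r u"},
    {"sentence": "how r you"},
    {"sentence": "hello fine"},
]

with_hello, without_hello = split_by_keyword(rows, "sentence", "hello")
```

With the three sample sentences, the two rows containing "hello" end up in `with_hello` and the remaining row in `without_hello`.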

shosho88
Level 2
Author

Thanks @AlexGo, it works perfectly!

I want to use several different keywords. Can I separate them with a space or a comma, or something else?

Thanks 🙂

AlexGo
Dataiker

No problem! You just need to change it to 'At least one of the following' in the drop-down at the top, and then add additional conditions.


You could also use a regular expression or a formula if you're comfortable with those.
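
As a rough illustration of the regular-expression route, the sketch below builds one pattern that matches if at least one keyword is present. It is plain Python rather than anything Dataiku-specific; the helper name `build_keyword_pattern`, the keyword list, and the choice of whole-word, case-insensitive matching are all assumptions. Note that whole-word matching is stricter than a simple 'contains' test.

```python
import re


def build_keyword_pattern(keywords):
    """Compile a pattern that matches when at least one keyword appears as a whole word."""
    # re.escape keeps each keyword literal, and "|" means "one or the other".
    alternatives = "|".join(re.escape(k) for k in keywords)
    # \b anchors on word boundaries so that "hello" does not match inside "othello".
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


# Hypothetical keyword list; the sentences are the ones quoted in the question.
pattern = build_keyword_pattern(["hello", "fine"])
sentences = ["hello how r u", "how r you", "hello fine"]

matched = [s for s in sentences if pattern.search(s)]
rest = [s for s in sentences if not pattern.search(s)]
```

The equivalent pattern written by hand would be `\b(?:hello|fine)\b`, so keywords are separated with `|` rather than a space or a comma.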

Jurre
Neuron

<edit> reaction removed, @AlexGo beat me on reaction time 🙂

shosho88
Level 2
Author

Thanks guys for the solutions 🙂