Split my dataset based on keyword detection in a key column
Hello, Bonjour,
I have a dataset (2000 rows) and I want to split it into 2 parts based on the contents of a key column.
Each value in this key column is a full sentence (examples: "hello how r u", "how r you", "hello fine").
When a keyword is detected (example: "hello"), the row should automatically go into one new dataset, and when it is not detected, the row should go into the second new dataset.
Hope my English isn't too bad :d
Thanks for your help! Cheers
Best Answer
-
AlexGo Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 18 Dataiker
No problem! You just need to change the drop-down at the top to 'At least one of the following', and then add additional conditions, one per keyword.
You could also use a regular expression or a formula if you're comfortable with those.
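To give a rough idea of the regular-expression route, here is a minimal sketch in plain Python, outside of DSS. It is only an illustration of the "at least one of the following" logic, not what the Split recipe runs internally. The function name `split_by_keywords` and the keyword list are hypothetical, and the whole-word, case-insensitive matching is an assumption (the visual 'contains' condition may behave differently).

```python
import re


def split_by_keywords(sentences, keywords):
    """Return (matched, unmatched) lists of sentences.

    A sentence is 'matched' if it contains at least one of the keywords.
    Assumption: whole-word, case-insensitive matching.
    """
    # One pattern with alternation: \b(?:hello|fine)\b
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    matched, unmatched = [], []
    for sentence in sentences:
        if pattern.search(sentence):
            matched.append(sentence)
        else:
            unmatched.append(sentence)
    return matched, unmatched


# Sample sentences taken from the question; the keywords are illustrative.
sentences = ["hello how r u", "how r you", "hello fine"]
matched, unmatched = split_by_keywords(sentences, ["hello", "fine"])
print(matched)    # ['hello how r u', 'hello fine']
print(unmatched)  # ['how r you']
```

The keywords are joined with `|` in the pattern, so separating them with spaces or commas inside a single term would not work here; each keyword is its own alternative.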
Answers
-
AlexGo Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 18 Dataiker
Hi,
You should be able to use the 'Split' visual recipe for this. You just need to select 'Define filters' and then use the 'contains' condition on your column. You can split to several different datasets based on terms, or add several conditions to each dataset if you want.
Everything else will go by default to the second dataset.
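If it helps to picture the behaviour, the logic is roughly the following. This is a plain-Python sketch of the idea only, not the recipe's actual implementation; the helper `split_on_contains` and the column name `key` are made up for the example.

```python
def split_on_contains(rows, column, term):
    """Send rows whose `column` value contains `term` to the first output;
    every other row falls through to the second output by default."""
    first, second = [], []
    for row in rows:
        if term in row[column]:
            first.append(row)
        else:
            second.append(row)
    return first, second


# Rows modelled on the sentences from the question.
rows = [
    {"key": "hello how r u"},
    {"key": "how r you"},
    {"key": "hello fine"},
]
with_hello, others = split_on_contains(rows, "key", "hello")
print(len(with_hello), len(others))  # 2 1
```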
-
Thanks @AlexGo
It works perfectly! I want to use several different keywords. Can I separate them with spaces or commas, or something else?
Thanks
-
Thanks guys for the solutions