Remove columns by pattern
Let's say I want to remove all the columns that contain the word "Spot". How would I do that? I cannot figure out the syntax for "Remove columns matching".
Best Answer
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi,
You could do this with a visual Prepare recipe by adding a step that uses the "Delete/Keep Columns by name" processor from the processor library.
Within that step you can use a regex to match the pattern you are looking for, for example: .*Spot.*
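To sanity-check a pattern like this outside DSS, here is a minimal Python sketch. It is not Dataiku's own implementation: the column names are made up for illustration, and it assumes the pattern has to match the whole column name, which is why the leading and trailing .* are there.

```python
import re

# Hypothetical column names, for illustration only
columns = ["Date", "Spot_Price", "FX_Spot", "Volume"]

# Same pattern as in the answer: "Spot" anywhere in the name
pattern = re.compile(r".*Spot.*")

# Keep only the columns whose full name does NOT match the pattern
kept = [c for c in columns if not pattern.fullmatch(c)]
print(kept)
```

Note that the match is case-sensitive here, so a column named "spot_rate" would be kept.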
Answers
-
Thank you for your help!
-
Hi @AlexT,
I tried to use your solution to remove all the columns that start with a common text phrase, but the formula did not work.
The text phrase is FR_AMER_, so I used the regex .*FR_AMER_.* but all the columns are still in the table preview.
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
If you want to remove columns that start with a string, you can use a slightly different regex:
^FR_AMER_.*
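As a rough illustration of the prefix-anchored pattern, here is a small Python sketch using the FR_AMER_ prefix from the question. The column names are hypothetical, and Python's re module stands in for whatever engine DSS uses. The prefix in the pattern has to match the column names character for character, otherwise nothing is removed.

```python
import re

# Hypothetical column names, for illustration only
columns = ["FR_AMER_sales", "FR_AMER_cost", "FR_EMEA_sales", "ID"]

# ^ anchors the match at the start of the column name
prefix_pattern = re.compile(r"^FR_AMER_.*")

removed = [c for c in columns if prefix_pattern.match(c)]
kept = [c for c in columns if not prefix_pattern.match(c)]
print("removed:", removed)
print("kept:", kept)
```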
-
Jurre Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered, Dataiku DSS Developer, Neuron 2022 Posts: 115 ✭✭✭✭✭✭✭
For building and testing regex https://regex101.com/ is a nice resource.
-
Doesn't work for me. I have attached a screenshot; can you please help me with that? TIA
-
If I wanted to extend this to match "spot" plus any single digit, would it be
.*_spot\d$.*, .*_spot\d.* or something else?
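One way to compare the candidates is to try them against sample names. The sketch below uses Python's re module with made-up column names and assumes whole-name matching, so treat it as a test harness rather than a statement about how DSS evaluates the pattern. In .*_spot\d$.* the trailing .* after $ adds nothing, so it is written here as .*_spot\d$.

```python
import re

# Hypothetical column names, for illustration only
columns = ["fx_spot1", "fx_spot1_adj", "fx_spot12", "fx_spot", "fx_spotx"]

# "_spot" followed by a digit anywhere in the name
anywhere = re.compile(r".*_spot\d.*")
# name must end with "_spot" followed by exactly one digit
at_end = re.compile(r".*_spot\d$")

match_anywhere = [c for c in columns if anywhere.fullmatch(c)]
match_at_end = [c for c in columns if at_end.fullmatch(c)]
print("anywhere:", match_anywhere)
print("at end:  ", match_at_end)
```

Which one is right depends on whether the digit is expected at the end of the column name or can be followed by more text. Both are case-sensitive in this sketch, so "_spot" would not match a column containing "_Spot".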