Switch formula with wildcards
Morning wonderful Dataiku geeks - aka people smarter than me
I'm sure this must be possible in the switch formula. Despite trying all sorts of options, I just can't find a syntax that works.
Could someone straighten me out please?
switch(Sex, "An.*", "Female", "Bil.*", "Male")
Thanks
Martin
Answers
-
Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker
Hi Martin,
The switch formula matches cannot be Regexs. There are multiple ways we can extract the relevant data for the formula to use. A couple of examples:
switch("True",contains(NamesColumn,"Bil."),"Male",contains(NamesColumn,"An."),"Female") switch("True",startsWith(NamesColumn,"Bil."),"Male",startsWith(NamesColumn,"An."),"Female")
You could similarly use an If block:
if(startsWith(NamesColumn,"Bil."),"Male",if(startsWith(NamesColumn,"An."),"Female","Nothing"))
If the name formats is "firstname.lastname" we can extract the relevant part of the name prior to the switch and use that for the case:
with(NamesColumn.split("."),a,switch(a[0],"Bil","Male","An","Female"))
-
Thanks Miguel
Some great suggestions.
I am intrigued though as when you enter half the expression into the formula, it fires back:
"Formula is invalid: Incorrect formula: Missing number, string, identifier, regex, or parenthesized expression (Parsing error at offset 47)"
Seems to strongly suggest that regex is allowed as an input.(By the way, the sample data is thoroughly fictitious - unlike my real data!)