How to get the value before a specific delimiter
If I have a string value formatted as firstname_middlename_lastname_initials_age_gender
and I only need everything before age, is there a formula I can use to give me everything before the fourth instance of my delimiter "_"?
I know I can probably use the split function to split them all apart and then concat together the ones I need, but I was wondering if there's a way to accomplish something like that with one step.
Operating system used: windows
Best Answer
-
Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
Hi @abalo006
,
You can indeed use a formula to accomplish this. If you are quite certain that you always want the first 4 strings, you could use a formula like this, where `line` is the column name:if(arrayLen(split(line, '_')) > 3, join(split(line, '_')[0,4], '_'), 'none' )
This will check and make sure that there indeed are at least 4 strings split by the "_" character, and then will re-join the first four strings back by a "_" character (though you could of course modify this as you like)You may wish to add a few additional if statements within the formula step to ensure that you are always returning expected results in case any data is missing etc.
Thanks,
Sarina