How to get the value before a specific delimiter

abalo006 Registered Posts: 29

If I have a string value formatted as firstname_middlename_lastname_initials_age_gender

and I only need everything before age, is there a formula I can use to give me everything before the fourth instance of my delimiter "_"?

I know I can probably use the split function to split them all apart and then concat together the ones I need, but I was wondering if there's a way to accomplish something like that with one step.

Operating system used: windows


Best Answer

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 315 Dataiker
    edited July 17 Answer ✓

    Hi @abalo006

    You can indeed use a formula to accomplish this. If you are quite certain that you always want the first 4 strings, you could use a formula like this, where `line` is the column name:

    if(arrayLen(split(line, '_')) > 3,
        join(split(line, '_')[0,4], '_'), 

    This will check and make sure that there indeed are at least 4 strings split by the "_" character, and then will re-join the first four strings back by a "_" character (though you could of course modify this as you like)

    Screenshot 2024-04-30 at 3.27.22 PM.png

    You may wish to add a few additional if statements within the formula step to ensure that you are always returning expected results in case any data is missing etc.


Setup Info
      Help me…