How to get the value before a specific delimiter

abalo006
abalo006 Registered Posts: 29

If I have a string value formatted as firstname_middlename_lastname_initials_age_gender

and I only need everything before age, is there a formula I can use to give me everything before the fourth instance of my delimiter "_"?

I know I can probably use the split function to split them all apart and then concat together the ones I need, but I was wondering if there's a way to accomplish something like that with one step.


Operating system used: windows

Tagged:

Best Answer

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 320 Dataiker
    edited July 2024 Answer ✓

    Hi @abalo006
    ,

    You can indeed use a formula to accomplish this. If you are quite certain that you always want the first 4 strings, you could use a formula like this, where `line` is the column name:

    if(arrayLen(split(line, '_')) > 3,
        join(split(line, '_')[0,4], '_'), 
        'none' 
    )


    This will check and make sure that there indeed are at least 4 strings split by the "_" character, and then will re-join the first four strings back by a "_" character (though you could of course modify this as you like)

    Screenshot 2024-04-30 at 3.27.22 PM.png

    You may wish to add a few additional if statements within the formula step to ensure that you are always returning expected results in case any data is missing etc.

    Thanks,
    Sarina 

Setup Info
    Tags
      Help me…