Survey banner
Switching to Dataiku - a new area to help users who are transitioning from other tools and diving into Dataiku! CHECK IT OUT

How to get the value before a specific delimiter

Solved!
abalo006
Level 2
How to get the value before a specific delimiter

If I have a string value formatted as firstname_middlename_lastname_initials_age_gender

and I only need everything before age, is there a formula I can use to give me everything before the fourth instance of my delimiter "_"? 

I know I can probably use the split function to split them all apart and then concat together the ones I need, but I was wondering if there's a way to accomplish something like that with one step.


Operating system used: windows

0 Kudos
1 Solution
SarinaS
Dataiker

Hi @abalo006,

You can indeed use a formula to accomplish this. If you are quite certain that you always want the first 4 strings, you could use a formula like this, where `line` is the column name:

if(arrayLen(split(line, '_')) > 3,
	join(split(line, '_')[0,4], '_'), 
    'none' 
)

 
This will check and make sure that there indeed are at least 4 strings split by the "_" character, and then will re-join the first four strings back by a "_" character (though you could of course modify this as you like)

Screenshot 2024-04-30 at 3.27.22โ€ฏPM.png

You may wish to add a few additional if statements within the formula step to ensure that you are always returning expected results in case any data is missing etc.

Thanks,
Sarinaโ€ƒ

View solution in original post

0 Kudos
1 Reply
SarinaS
Dataiker

Hi @abalo006,

You can indeed use a formula to accomplish this. If you are quite certain that you always want the first 4 strings, you could use a formula like this, where `line` is the column name:

if(arrayLen(split(line, '_')) > 3,
	join(split(line, '_')[0,4], '_'), 
    'none' 
)

 
This will check and make sure that there indeed are at least 4 strings split by the "_" character, and then will re-join the first four strings back by a "_" character (though you could of course modify this as you like)

Screenshot 2024-04-30 at 3.27.22โ€ฏPM.png

You may wish to add a few additional if statements within the formula step to ensure that you are always returning expected results in case any data is missing etc.

Thanks,
Sarinaโ€ƒ

0 Kudos

Labels

?
Labels (1)
A banner prompting to get Dataiku