Survey banner
The Dataiku Community is moving to a new home! Some short term disruption starting next week: LEARN MORE

How to get the value before a specific delimiter

Solved!
abalo006
Level 3
How to get the value before a specific delimiter

If I have a string value formatted as firstname_middlename_lastname_initials_age_gender

and I only need everything before age, is there a formula I can use to give me everything before the fourth instance of my delimiter "_"? 

I know I can probably use the split function to split them all apart and then concat together the ones I need, but I was wondering if there's a way to accomplish something like that with one step.


Operating system used: windows

0 Kudos
1 Solution
SarinaS
Dataiker

Hi @abalo006,

You can indeed use a formula to accomplish this. If you are quite certain that you always want the first 4 strings, you could use a formula like this, where `line` is the column name:

if(arrayLen(split(line, '_')) > 3,
	join(split(line, '_')[0,4], '_'), 
    'none' 
)

 
This will check and make sure that there indeed are at least 4 strings split by the "_" character, and then will re-join the first four strings back by a "_" character (though you could of course modify this as you like)

Screenshot 2024-04-30 at 3.27.22 PM.png

You may wish to add a few additional if statements within the formula step to ensure that you are always returning expected results in case any data is missing etc.

Thanks,
Sarina 

View solution in original post

0 Kudos
1 Reply
SarinaS
Dataiker

Hi @abalo006,

You can indeed use a formula to accomplish this. If you are quite certain that you always want the first 4 strings, you could use a formula like this, where `line` is the column name:

if(arrayLen(split(line, '_')) > 3,
	join(split(line, '_')[0,4], '_'), 
    'none' 
)

 
This will check and make sure that there indeed are at least 4 strings split by the "_" character, and then will re-join the first four strings back by a "_" character (though you could of course modify this as you like)

Screenshot 2024-04-30 at 3.27.22 PM.png

You may wish to add a few additional if statements within the formula step to ensure that you are always returning expected results in case any data is missing etc.

Thanks,
Sarina 

0 Kudos

Labels

?

Setup info

?
A banner prompting to get Dataiku