Regex function to return string between 2 characters

abalo006
abalo006 Registered Posts: 29

I'm trying to create a regex function that returns the string between two characters.

I have the string below:

word1_word2_word3_word4_word5_word6_word7_word8_length_string.txt

and I'm trying to return everything after the 7th instance of "_" and before ".txt"

Desired output: word8_length_string

Is there a way to use a regex function or regex tool to accomplish this?


Operating system used: Windows

Answers

  • louisbarjon
    louisbarjon Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 9 Dataiker
    edited July 17

    Hello,

    What exactly is your context?
    If you are using a Prepare recipe, you can add a formula step with this regular expression:

    match(your_column_name, '^(?:[^_]*_){7}(.*)\.txt$')[0]

    Note that this regexp does exactly what you describe: it skips past the first 7 instances of _, then returns everything before .txt

    More info about the formula processor here

    If you are in a Python code recipe, the same regular expression will work as well.

    Louis
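
For the Python case mentioned in the answer above, a minimal sketch using the standard `re` module might look like the following. The function name `extract_tail` is hypothetical and chosen only for illustration; the pattern is the one from the answer, and the sample string is the one from the question.

```python
import re

# Same pattern as in the formula: skip seven "<text>_" groups,
# capture what follows, and require a trailing ".txt".
PATTERN = re.compile(r'^(?:[^_]*_){7}(.*)\.txt$')


def extract_tail(filename):
    """Hypothetical helper: return the text after the 7th underscore
    and before '.txt', or None if the name does not match."""
    m = PATTERN.match(filename)
    return m.group(1) if m else None


sample = "word1_word2_word3_word4_word5_word6_word7_word8_length_string.txt"
print(extract_tail(sample))  # prints: word8_length_string
```

Because `[^_]*` cannot cross an underscore, the `{7}` repetition consumes exactly the first seven underscore-terminated segments. Names with fewer than seven underscores, or without the `.txt` suffix, return `None` rather than raising.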

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker

    Some alternatives:
