Regex to extract information from string

vkana Registered Posts: 4 ✭✭✭

Hi all,

I have the following city names, each ending with "CEDEX", optionally followed by a number:

MONACO CEDEX 15
PARIS LA DEFENSE CEDEX
AJACCIO CEDEX 1
CLERMONT FERRAND CEDEX 1
PARIS LA DEFENSE CEDEX
MARSEILLE CEDEX 07
DIJON CEDEX
TOURS CEDEX 9

I would like to keep only the city name.

How can I use Regex in Dataiku to obtain this result?

The result must be:

MONACO
PARIS LA DEFENSE
AJACCIO
CLERMONT FERRAND
PARIS LA DEFENSE
MARSEILLE
DIJON
TOURS

Could anyone help me please?

Thank you so much


Operating system used: Windows

Answers

  • Miguel Angel
    Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker

    Hi Vkana,

    You can use the 'Extract with regular expression' processor in a Prepare recipe to get the desired values.

    From your example, a valid regex would be: ^(.*?)\sCEDEX\s?\d*$

    You can use online resources such as https://regex101.com/ if you want to adapt the regex further to fit your business requirements.
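
To check the pattern against the sample values outside DSS, a short Python sketch like the one below may help. It assumes the regex behaves the same way under Python's `re` module as it does in the Prepare recipe. The helper name `extract_city` and the fallback of returning the input unchanged when there is no CEDEX suffix are illustrative choices, not part of the DSS processor.

```python
import re

# Pattern from the answer above: lazily capture everything before " CEDEX",
# then allow an optional whitespace character and optional digits up to the end.
CEDEX_PATTERN = re.compile(r"^(.*?)\sCEDEX\s?\d*$")


def extract_city(raw):
    """Return the city name without its CEDEX suffix (hypothetical helper)."""
    match = CEDEX_PATTERN.match(raw)
    # Assumption: values without a CEDEX suffix are returned unchanged.
    return match.group(1) if match else raw


samples = [
    "MONACO CEDEX 15",
    "PARIS LA DEFENSE CEDEX",
    "AJACCIO CEDEX 1",
    "CLERMONT FERRAND CEDEX 1",
    "MARSEILLE CEDEX 07",
    "DIJON CEDEX",
    "TOURS CEDEX 9",
]

cities = [extract_city(value) for value in samples]

for value, city in zip(samples, cities):
    print(f"{value!r} -> {city!r}")
```

The capture group `(.*?)` is what the processor would write to the output column; the lazy quantifier keeps it from swallowing the ` CEDEX` suffix.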

  • vkana
    vkana Registered Posts: 4 ✭✭✭

    Good! Thank you so much.
