Regex to extract information from string
Hi all,
I have the following city names ending with "CEDEX [0-9]":
MONACO CEDEX 15
PARIS LA DEFENSE CEDEX
AJACCIO CEDEX 1
CLERMONT FERRAND CEDEX 1
PARIS LA DEFENSE CEDEX
MARSEILLE CEDEX 07
DIJON CEDEX
TOURS CEDEX 9
I would like to keep only the city name.
How can I use Regex in Dataiku to obtain this result?
The result must be:
MONACO
PARIS LA DEFENSE
AJACCIO
CLERMONT FERRAND
PARIS LA DEFENSE
MARSEILLE
DIJON
TOURS
Could anyone help me please?
Thank you so much
Operating system used: Windows
Answers
-
Miguel Angel (Dataiker)
Hi Vkana,
You can use the 'Extract with regular expression' processor in a Prepare recipe to get the desired values.
From your example, a valid regex would be: ^(.*?)\sCEDEX\s?\d*$
You can use online resources such as https://regex101.com/ if you want to further modify the regex to fit with your business requirements.
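If you want to check the pattern against your sample values outside of DSS first, here is a minimal Python sketch. It only uses the standard `re` module; the helper name `strip_cedex` and the fallback for values without a CEDEX suffix are my own assumptions for illustration, not part of the Dataiku processor.

```python
import re

# Pattern from the answer above; capture group 1 holds the city name.
CEDEX_PATTERN = re.compile(r"^(.*?)\sCEDEX\s?\d*$")


def strip_cedex(name):
    """Return the city name without a trailing 'CEDEX [digits]' suffix.

    Hypothetical helper: values that do not match the pattern are
    returned unchanged (an assumption, adjust to your needs).
    """
    match = CEDEX_PATTERN.match(name)
    return match.group(1) if match else name


# Sample values taken from the question.
samples = [
    "MONACO CEDEX 15",
    "PARIS LA DEFENSE CEDEX",
    "AJACCIO CEDEX 1",
    "CLERMONT FERRAND CEDEX 1",
    "PARIS LA DEFENSE CEDEX",
    "MARSEILLE CEDEX 07",
    "DIJON CEDEX",
    "TOURS CEDEX 9",
]

cities = [strip_cedex(s) for s in samples]
for city in cities:
    print(city)
```

The lazy `(.*?)` stops at the first " CEDEX" that can be followed only by an optional space and digits up to the end of the string, so multi-word names such as PARIS LA DEFENSE are kept whole.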
-
Good! Thank you so much.