Regex to extract a substring from a large string
Hi
I wrote the following regex in Dataiku to extract the 8-digit number starting with 9.
Example: 93809629 or 93650953
Regex: [LCC|FACTURE\ |FACTURE][A-Z]{1,7}\s(\d\d\d\d\d\d\d\d)\
But it doesn't match in all of the situations below.
1) LBB2022 FACT 93809929
2) LBBF.90930153 - 06/12/2021
3) LBBING FACTURE NO90857587 DU 02/12/21AFFRANCHISSEMENT
4) LBB90945758 CLT 662063
5) LBBFACT N 90903643 DU 02/12/2021
6) LBB93856595 FACTUREDEPARTEMENT
7) LBBFACTURE 93720887FRAIS AFFRANCHISSEMENTUSSY
9) LBBFACTURE NO 93741852 DU 05/12/2022AFFRANCHISSEMENT
10) LBBN FACT : 93650972 REEUE LE 04/12/2022AFFRANCHISSEMENTS
Could anyone help me please?
Thank you so much
Operating system used: Windows
Best Answer
Hi, note that this has little to do with DSS itself; it is a general regex question.
You can try:
\D(9\d{7})(?!\d)
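- \D matches any single character that is not a digit, so the match cannot start in the middle of a longer number (it also means the number must be preceded by at least one character)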
- ( starts a capturing group for extraction
- 9 matches a 9 (duh)
- \d{7} matches the next 7 digits
- ) ends the capturing group
- (?!\d) is a negative lookahead ensuring the number is not followed by another digit
See it with your examples here: https://regex101.com/r/K30Fww/1
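If you want to check the pattern outside of regex101, here is a minimal sketch in Python, assuming the same syntax behaves the same way in your regex engine (it is plain enough that it should). The function name extract_invoice_number is just a hypothetical helper for illustration, and the sample strings are the ones from the question.

```python
import re

# Pattern from the answer: one non-digit, then a captured 8-digit number
# starting with 9, not followed by another digit.
PATTERN = re.compile(r"\D(9\d{7})(?!\d)")

# Sample strings copied from the question.
SAMPLES = [
    "LBB2022 FACT 93809929",
    "LBBF.90930153 - 06/12/2021",
    "LBBING FACTURE NO90857587 DU 02/12/21AFFRANCHISSEMENT",
    "LBB90945758 CLT 662063",
    "LBBFACT N 90903643 DU 02/12/2021",
    "LBB93856595 FACTUREDEPARTEMENT",
    "LBBFACTURE 93720887FRAIS AFFRANCHISSEMENTUSSY",
    "LBBFACTURE NO 93741852 DU 05/12/2022AFFRANCHISSEMENT",
    "LBBN FACT : 93650972 REEUE LE 04/12/2022AFFRANCHISSEMENTS",
]


def extract_invoice_number(text):
    """Return the first 8-digit number starting with 9, or None (hypothetical helper)."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(extract_invoice_number(sample), "<-", sample)
```

One thing to keep in mind: because \D has to consume a character, a number sitting at the very start of the string would not be matched. If that can happen in your data, replacing \D with the negative lookbehind (?<!\d) should cover it.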
Answers
It works correctly!!! Thank you so much.