Extract date in filename
J2U_45000
Partner, Registered Posts: 3 Partner
Hi,
I created a column in Dataiku that retrieves the filename:
get(SOURCEFILENAME, lastIndexOf(SOURCEFILENAME, "/")+1, length(SOURCEFILENAME))
I have different files, which are named this way:
ZMMFI 2021.12.P1.xlsx
ZMMFI 2021.12.P2.xlsx
ZMMFI 2021.12.P3.xlsx
ZMMFI 2021.12.P4.xlsx
ZMMFI_OESV 2018.12 [08.01.2019].xlsx
ZMMFI_OESV 2019.12 [13.01.2020].xlsx
ZMMFI_OESV 2020.12 [27.01.2021].xlsx
ZMMFI_OESV 2022.02.xlsx
ZMMFI_OESX 2018.12 [08.01.2019].xlsx
ZMMFI_OESX 2019.12 [13.01.2020].xlsx
ZMMFI_OESX 2020.12 [27.01.2021].xlsx
ZMMFI_OSFI 2018.03 [08.01.2019].xlsx
ZMMFI_OSFI 2018.06 [08.01.2019].xlsx
ZMMFI_OSFI 2018.10 [08.01.2019].xlsx
ZMMFI_OSFI 2018.12 [08.01.2019].xlsx
ZMMFI_OSFI 2019.12 [13.01.2020].XLSX
ZMMFI_OSFI 2020.12 [27.01.2021].xlsx
ZMMFI_OSFI 2022.02.xlsx
ZMMFI_RTMV 2018.02 [08.01.2019].xlsx
ZMMFI_RTMV 2018.04 [08.01.2019].XLSX
ZMMFI_RTMV 2018.06 [08.01.2019].XLSX
ZMMFI_RTMV 2018.08 [08.01.2019].XLSX
ZMMFI_RTMV 2018.10 [08.01.2019].XLSX
ZMMFI_RTMV 2018.12 [08.01.2019].xlsx
ZMMFI_RTMV 2020.12.1 [27.01.2021].xlsx
ZMMFI_RTMV 2020.12.2 [27.01.2021].xlsx
ZMMFI_RTMV 2022.02.xlsx
How can I retrieve only the date from each filename?
Thank you in advance for your help.
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi,
You could try the "Extract with regular expression" processor in a Prepare recipe.
To extract the first date:
(\s\d\d\d\d\.\d\d).*
To extract the date from within the []:
\s\[(.*)\]\.
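To check what the two patterns capture before putting them in the Prepare recipe, here is a minimal sketch in Python. It is only a stand-in: the `re` module is used in place of the regex engine in DSS, which may differ in details, and the `extract_dates` helper is a hypothetical name for illustration, not part of Dataiku. The patterns themselves are copied unchanged from above.

```python
import re

# Patterns copied unchanged from the answer above.
FIRST_DATE = re.compile(r"(\s\d\d\d\d\.\d\d).*")
BRACKET_DATE = re.compile(r"\s\[(.*)\]\.")


def extract_dates(filename):
    """Hypothetical helper: return (period, bracket_date), None where absent."""
    first = FIRST_DATE.search(filename)
    bracket = BRACKET_DATE.search(filename)
    # The capture group of the first pattern starts with \s, so the
    # captured value carries a leading space; strip it here.
    period = first.group(1).strip() if first else None
    bracket_date = bracket.group(1) if bracket else None
    return period, bracket_date


samples = [
    "ZMMFI 2021.12.P1.xlsx",
    "ZMMFI_OESV 2018.12 [08.01.2019].xlsx",
    "ZMMFI_RTMV 2020.12.2 [27.01.2021].xlsx",
]
for name in samples:
    print(name, "->", extract_dates(name))
```

Two things to note: the first pattern captures the leading space, so you may want to trim the resulting column (or move the `\s` outside the parentheses), and suffixes such as `.P1` or `.2` after the year and month are not part of the captured value.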