select files with regex
We want to select files from a cloud-based folder using a regex. We are providing the inclusion rule CN_CRPROD to select only CN_CRPROD.csv, but it doesn't seem to be working.
Best Answer
-
Can you replace your regex "CN_CRPROD" with ".*CN_CRPROD.*" (or ".*CN_CRPROD\.csv" if you want to be more specific)? Otherwise it won't match the other parts of the filename, such as the ".csv" extension.
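To illustrate why the bare pattern fails, here is a minimal Java sketch. It assumes the inclusion rule has to match the whole string rather than a substring, which is what the answer above implies; the class name, helper name and the date-based path are hypothetical and only there for illustration.

```java
import java.util.regex.Pattern;

class InclusionRuleDemo {
    // Returns true only when the regex matches the ENTIRE input string
    // (Matcher.matches), not just a part of it (Matcher.find).
    static boolean matchesWhole(String regex, String path) {
        return Pattern.compile(regex).matcher(path).matches();
    }

    public static void main(String[] args) {
        // Hypothetical path; the real folder layout is not shown in the thread.
        String path = "2021/01/05/CN_CRPROD.csv";

        // Bare name: fails, because the date prefix and ".csv" are left unmatched.
        System.out.println(matchesWhole("CN_CRPROD", path));         // false
        // ".*" on both sides absorbs whatever surrounds the name.
        System.out.println(matchesWhole(".*CN_CRPROD.*", path));     // true
        // Stricter: the string must end in a literal ".csv" (note the escaped dot).
        System.out.println(matchesWhole(".*CN_CRPROD\\.csv", path)); // true
    }
}
```

In Java source the backslash is doubled ("\\."); in a settings field you would normally type the pattern with a single backslash, as written in the answer.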
Answers
-
Hi,
Could you please give a bit more context? Are you trying to get files in a Python recipe, or in a "Files in folder" dataset?
And which regex did you use?
-
pi_485
For context, we are reading files from Azure Blob Storage, where they are stored in a directory structure based on date. Please find the regex setting in the screenshot.
-
pi_485
Thank you, it worked.
Could you point us to a guide on the regular expression syntax we need to use in Dataiku DSS?
-
For file listing (and in most other places), DSS uses Java regular expressions (java.util.regex.Pattern). You can, for example, find a reference here.
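As a rough starting point, the sketch below shows a few common java.util.regex.Pattern constructs applied to file paths. The paths, file names and class name are made up for illustration, and it assumes the rule is matched against the full string; how DSS builds the string it matches against is not stated in this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PatternSyntaxDemo {
    // Keeps only the paths that the regex matches in full.
    static List<String> select(String regex, List<String> paths) {
        Pattern p = Pattern.compile(regex);
        return paths.stream()
                .filter(s -> p.matcher(s).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical date-based layout, not taken from the original post.
        List<String> paths = Arrays.asList(
                "2021/01/05/CN_CRPROD.csv",
                "2021/01/06/CN_CRPROD.csv",
                "2021/02/01/US_CRPROD.csv",
                "2021/01/05/CN_CRPROD.csv.bak");

        // \d{2} = exactly two digits; \. = a literal dot.
        System.out.println(select("2021/01/\\d{2}/CN_CRPROD\\.csv", paths));
        // (CN|US) = alternation between two prefixes.
        System.out.println(select(".*/(CN|US)_CRPROD\\.csv", paths));
        // Anything ending in ".csv"; the ".bak" file is excluded.
        System.out.println(select(".*\\.csv", paths));
    }
}
```

The first call keeps the two January CN files, the second and third keep every ".csv" file and drop the ".bak" one.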