We want to select files from a cloud-based folder using a regex. We set the inclusion rule to CN_CRPROD to select only CN_CRPROD.csv, but it doesn't seem to be working.
Could you please give a bit more context? Are you trying to get the files in a Python recipe, or in a "Files in folder" dataset?
What about the regex you used?
For context, we are reading files from Azure Blob Storage, where they are stored in a directory structure based on date.
Please find the regex setting in the screenshot.
Can you replace your regex "CN_CRPROD" with ".*CN_CRPROD.*" (or ".*CN_CRPROD\.csv" if you want to be more specific)? The regex has to match the whole name, so on its own "CN_CRPROD" won't cover the other parts of the filename, such as the ".csv" extension.
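To illustrate the difference, here is a minimal Java sketch. It assumes the inclusion rule is evaluated as a whole-string match (the behaviour of `Matcher.matches()`), which is consistent with the fix above but is not confirmed from the DSS source. The class name, helper names, and the dated path are hypothetical and only there for the demonstration.

```java
import java.util.regex.Pattern;

public class InclusionRuleDemo {
    // Whole-string match: the pattern must cover the entire input.
    // Assumption: this mirrors how the inclusion rule is applied.
    static boolean matchesWhole(String regex, String name) {
        return Pattern.compile(regex).matcher(name).matches();
    }

    // Substring search, shown only for contrast.
    static boolean containsMatch(String regex, String name) {
        return Pattern.compile(regex).matcher(name).find();
    }

    public static void main(String[] args) {
        String name = "CN_CRPROD.csv";
        // Hypothetical path in a date-based directory layout.
        String path = "/2021/03/15/CN_CRPROD.csv";

        System.out.println(matchesWhole("CN_CRPROD", name));          // false
        System.out.println(containsMatch("CN_CRPROD", name));         // true
        System.out.println(matchesWhole(".*CN_CRPROD.*", name));      // true
        System.out.println(matchesWhole(".*CN_CRPROD\\.csv", name));  // true
        System.out.println(matchesWhole(".*CN_CRPROD\\.csv", path));  // true
    }
}
```

The leading `.*` is what lets the pattern absorb anything that precedes the file name, and `\.` matches a literal dot rather than any character. In Java source the backslash is doubled (`\\.`); in a DSS settings field a single `\.` should be enough.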
Thank you, it worked.
Could you point us to a guide on the regex syntax we need to use in Dataiku DSS?
For file listing (and in most other places), DSS uses Java regular expressions, i.e. the java.util.regex.Pattern syntax. You can, for example, find a reference here.