Enhance the Files in Folder dataset to allow filtering for the latest files

The Files in Folder dataset is an extremely useful built-in dataset that allows Dataiku users to quickly load a set of similar files all at once. While it is already very powerful, it would be even better if it could select the latest file without requiring a Python recipe. It is currently possible to choose which files the Files in Folder dataset includes using a glob expression. However, this doesn't allow users to select files by date/time. Files are sometimes named with a date (YYYYMMDD) or a time (hour/minute) in them, but since this date/time changes with every new file, no fixed glob expression can select the latest one.
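To illustrate the gap, here is a minimal sketch in plain Python (standard library only, no Dataiku API calls). The file names and the `sales_YYYYMMDD.csv` naming pattern are made up for the example, and `date_from_name` and `latest_by_name` are hypothetical helpers. The glob matches every dated file, so picking the newest one still needs extra code:

```python
import re
from datetime import datetime
from fnmatch import fnmatch

# Hypothetical file names; the sales_YYYYMMDD.csv pattern is an assumption.
FILES = [
    "sales_20220901.csv",
    "sales_20220903.csv",
    "sales_20220902.csv",
    "readme.txt",
]

# First run of eight digits in the name, assumed to be YYYYMMDD.
DATE_RE = re.compile(r"(\d{8})")


def date_from_name(name):
    """Return the date embedded as YYYYMMDD in the name, or None."""
    match = DATE_RE.search(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that are not a valid calendar date.
        return None


def latest_by_name(names, pattern="*"):
    """Return the glob-matching name with the most recent embedded date."""
    dated = []
    for name in names:
        if not fnmatch(name, pattern):
            continue
        day = date_from_name(name)
        if day is not None:
            dated.append((day, name))
    if not dated:
        return None
    return max(dated)[1]


# The glob alone keeps all three dated files.
glob_matches = [n for n in FILES if fnmatch(n, "sales_*.csv")]
# The extra date parsing is what narrows it down to one.
newest = latest_by_name(FILES, "sales_*.csv")
```

This is the kind of logic I would like to avoid writing in a recipe every time.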

It would be great if the Files in Folder dataset could conditionally select files by date/time, based either on the file creation date/time or on the date/time parsed from the file name (assuming it's available). In an ideal world, it would allow users to set the Files in Folder dataset to include files using expressions that target:

  1. The last changed file or files in a Files in Folder dataset (for example, the last 10 files)
  2. Files changed in the last 24 hours, either by file creation date/time or by converting the date/time in the file name (assuming it's available)
  3. Files that match a specific dynamic calendar date (say, today), either by file creation date/time or by converting the date/time in the file name (assuming it's available)
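To make the three rules above concrete, here is a rough sketch of the semantics I have in mind, again in plain Python rather than any actual Dataiku feature. Each file is represented as a `(name, timestamp)` pair, where the timestamp could come either from the file system or from the file name. The function names and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def last_n(files, n):
    """Rule 1: names of the n most recent files, newest first."""
    newest_first = sorted(files, key=lambda f: f[1], reverse=True)
    return [name for name, _ in newest_first[:n]]


def changed_within(files, now, window=timedelta(hours=24)):
    """Rule 2: names of files whose timestamp is within `window` before `now`."""
    return [name for name, ts in files if now - window <= ts <= now]


def on_date(files, day):
    """Rule 3: names of files whose timestamp falls on the calendar date `day`."""
    return [name for name, ts in files if ts.date() == day]


# Hypothetical sample data: (file name, timestamp) pairs.
FILES = [
    ("a.csv", datetime(2022, 9, 1, 8, 0)),
    ("b.csv", datetime(2022, 9, 2, 20, 30)),
    ("c.csv", datetime(2022, 9, 3, 9, 15)),
]
now = datetime(2022, 9, 3, 12, 0)

latest_two = last_n(FILES, 2)
last_day = changed_within(FILES, now)
today_only = on_date(FILES, now.date())
```

With this sample data, rule 1 keeps `c.csv` and `b.csv`, rule 2 keeps `b.csv` and `c.csv`, and rule 3 keeps only `c.csv`.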

Thanks!