
Enhance the Files in Folder dataset to allow filtering for the latest files

The Files in Folder dataset is an extremely useful built-in dataset that allows Dataiku users to quickly load many similar files at once. While it is very powerful, it would be even better if it could select the latest file without requiring a Python recipe. It is currently possible to select which files the Files in Folder dataset includes using a glob expression. However, this doesn't allow users to select files by date/time. Files are sometimes named with a date (YYYYMMDD) or a time (hour/minute) in them, but since this date/time changes with every new file, it's not possible to write a glob expression that selects the latest one.
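
To illustrate the limitation, here is a minimal sketch that uses Python's `fnmatch` as a stand-in for the dataset's glob matching (the real matching rules in DSS may differ). The file names and the pattern are made up for the example.

```python
from fnmatch import fnmatch

# Hypothetical file names with a YYYYMMDD date embedded in them.
files = [
    "sales_20240101.csv",
    "sales_20240102.csv",
    "sales_20240103.csv",
    "readme.txt",
]

# A glob describes the shape of a name, so it matches every dated file.
# It has no way to say "only the one with the newest date".
matched = [f for f in files if fnmatch(f, "sales_*.csv")]
print(matched)
```

Hard-coding the date in the pattern (for example `sales_20240103.csv`) would work for one day only, which is why the selection needs to be dynamic.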

So it would be great if the Files in Folder dataset could conditionally select files by date/time, based either on the file creation date/time or on a date/time parsed from the file name (assuming one is available). In an ideal world, it should allow users to set the Files in Folder dataset to include files using expressions that target:

  1. The last changed file or files in a Files in Folder dataset (for example, the last 10 files)
  2. Files changed in the last 24 hours, either by file creation date/time or by converting the date/time in the file name (assuming it's available)
  3. Files that match a specific dynamic calendar date (say, today), either by file creation date/time or by converting the date/time in the file name (assuming it's available)
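
For reference, the three selections above can be sketched in plain Python, which is roughly what a Python recipe has to do today. This is only an illustration under assumptions: the helper names (`date_from_name`, `latest_n`, `changed_since`, `on_date`) are hypothetical, the file name is assumed to contain a YYYYMMDD token, and each function takes a mapping of file name to timestamp that could be built either from file modification times or from the names. How the names and timestamps are read from a managed folder is left out.

```python
import re
from datetime import datetime, timedelta

# Assumption: the date appears in the file name as an 8-digit YYYYMMDD token.
DATE_IN_NAME = re.compile(r"(\d{8})")


def date_from_name(name):
    """Return the datetime parsed from the first YYYYMMDD token, or None."""
    match = DATE_IN_NAME.search(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d")
    except ValueError:
        # Eight digits that are not a valid calendar date.
        return None


def latest_n(files, n=1):
    """Case 1: the n most recent file names, newest first."""
    return sorted(files, key=files.get, reverse=True)[:n]


def changed_since(files, now, window=timedelta(hours=24)):
    """Case 2: file names whose timestamp falls within `window` before `now`."""
    return sorted(name for name, ts in files.items() if now - window <= ts <= now)


def on_date(files, day):
    """Case 3: file names whose timestamp falls on the calendar date `day`."""
    return sorted(name for name, ts in files.items() if ts.date() == day)


# Hypothetical usage: build the name -> timestamp mapping from the file names.
names = ["sales_20240101.csv", "sales_20240102.csv", "sales_20240103.csv", "readme.txt"]
by_name = {n: date_from_name(n) for n in names if date_from_name(n) is not None}

print(latest_n(by_name, n=2))
print(on_date(by_name, datetime.now().date()))
```

Having these options built into the dataset settings would remove the need for this kind of code in every project.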

Thanks!

1 Comment
ElieA
Dataiker

Thanks for your idea, @Turribeach 

Your idea meets the criteria for submission. We'll reach out should we require more information.

If you’re reading this and think this would be a great capability to add to DSS, be sure to kudos the original post!

Take care

Status changed to: In the Backlog
