If you load lots of files of the same type into Dataiku, you should consider using the Files in Folder dataset. It's a great built-in feature for automating the ingestion of files that share the same format.
You can create a new Files in Folder dataset by going to: Dataset => New Dataset => All dataset types => DSS => Files in folder.
But this tip is not about the Files in Folder dataset itself. It is about a "hidden" built-in feature of this dataset. When using the Files in Folder dataset, is it possible to have the filename and row ID of each record imported into Dataiku added as columns? Yes, it is! What you need to do is:
Finally, run your recipe to populate the new columns. The file path, filename and row ID of each record will now be added to your data automatically!
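For anyone who wants to picture what those extra columns contain, here is a minimal sketch in plain Python of the same idea: read every file in a folder and tag each record with where it came from. This is not Dataiku's API or how Dataiku implements the feature. The function name `read_folder_with_context` and the column names `source_filepath`, `source_filename` and `source_row_id` are made up for illustration, and the sketch assumes CSV files with a header row and a row ID that restarts at 1 for each file.

```python
import csv
from pathlib import Path


def read_folder_with_context(folder, pattern="*.csv"):
    """Read every matching CSV in `folder` and tag each row with its origin.

    Hypothetical helper for illustration only; not part of Dataiku.
    """
    rows = []
    # Sort the paths so the files are always processed in the same order.
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as handle:
            # Assumption: row IDs restart at 1 for each file and count
            # data rows only (the header line is not counted).
            for row_id, record in enumerate(csv.DictReader(handle), start=1):
                record["source_filepath"] = str(path)
                record["source_filename"] = path.name
                record["source_row_id"] = row_id
                rows.append(record)
    return rows
```

Calling `read_folder_with_context("incoming")` on a folder of CSV files (the folder name is just an example) returns one dictionary per record, each carrying the original fields plus the three origin columns.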