partitioning
Hi! In my flow I am loading data (hourly timestamps) for a whole year and then working with it. The flow runs every day, and I would like to load only the future entries and keep the past entries from the last run. I thought about using file-based partitioning, but the past is always erased. How could I solve this problem?
Best Answer
JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker
Hi @TheHobbit,
It seems that you would need to change your flow so that your past data is not removed. Load your year of data, partition it using file-based partitioning, and build only the most recent data at intervals with an automated trigger. Please see the following tutorial on file-based partitioning: https://knowledge.dataiku.com/latest/mlops-o16n/partitioning/tutorial-file-based.html
Partitioning in a scenario: https://knowledge.dataiku.com/latest/mlops-o16n/partitioning/concept-scenario.html
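As a rough illustration of "build only the most recent data", here is a minimal Python sketch that lists the day partitions from the run date to the end of the year, so that a daily run could rebuild just those and leave earlier partitions untouched. It assumes the dataset is partitioned by day with identifiers of the form YYYY-MM-DD; the function name `future_day_partitions` is hypothetical and not part of any Dataiku API, and how you pass the resulting list to your scenario depends on your setup.

```python
from datetime import date, timedelta


def future_day_partitions(run_date: date, year_end: date) -> list[str]:
    """Return day identifiers (YYYY-MM-DD) from run_date through year_end, inclusive.

    Hypothetical helper: assumes day-level partitions named by ISO date.
    Returns an empty list if run_date is after year_end.
    """
    n_days = (year_end - run_date).days
    return [(run_date + timedelta(days=i)).isoformat() for i in range(n_days + 1)]


# Example: a run on 29 Dec 2024 only needs the last three days of the year.
partitions = future_day_partitions(date(2024, 12, 29), date(2024, 12, 31))
print(",".join(partitions))  # 2024-12-29,2024-12-30,2024-12-31
```

Partitions for dates before the run date are never listed, so they would not be rebuilt and the past entries from the previous run stay as they are.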
If this does not work for you, please describe your flow in further detail.
Thanks,
Jordan