partitioning

Solved!
TheHobbit

Hi! In my flow I load a full year of data (hourly timestamps) and then work with it. The flow runs every day, and I would like to load only the future entries and keep the past entries from the previous run. I thought about using file-based partitioning, but the past is always erased. How can I solve this problem?

JordanB
Dataiker

Hi @TheHobbit,

It seems that you would need to change your flow so that your past data is not removed. Load your year of data, partition it using file-based partitioning, and then build only the most recent partitions at intervals with an automated trigger, so that partitions from earlier runs are left untouched. Please see the following tutorial on file-based partitioning: https://knowledge.dataiku.com/latest/mlops-o16n/partitioning/tutorial-file-based.html

Partitioning in a scenario: https://knowledge.dataiku.com/latest/mlops-o16n/partitioning/concept-scenario.html
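
As a rough illustration of "build only the most recent data", here is a minimal sketch in plain Python that works out which partitions a daily run would need to rebuild. It assumes the dataset is partitioned by day with identifiers of the form YYYY-MM-DD and that a span can be written as start/end; the function names future_day_partitions and partition_range are hypothetical helpers for this example, not part of any Dataiku API. The resulting identifiers are what you would pass to the build step of your scenario.

```python
from datetime import date, timedelta
from typing import List


def future_day_partitions(run_day: date, last_day: date) -> List[str]:
    """List day-level partition identifiers from run_day to last_day, inclusive.

    Assumption: partitions are identified as YYYY-MM-DD strings.
    """
    if last_day < run_day:
        return []
    n_days = (last_day - run_day).days + 1
    return [(run_day + timedelta(days=i)).isoformat() for i in range(n_days)]


def partition_range(run_day: date, last_day: date) -> str:
    """Express the same span as one 'start/end' range string (assumed syntax)."""
    return f"{run_day.isoformat()}/{last_day.isoformat()}"


if __name__ == "__main__":
    today = date.today()
    end_of_year = date(today.year, 12, 31)
    # Only these partitions would be rebuilt; earlier ones are not touched.
    print(partition_range(today, end_of_year))
    print(len(future_day_partitions(today, end_of_year)), "partitions to build")
```

Because every partition dated before the run day is left out of the build request, the data stored in those partitions from earlier runs stays as it was.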

If this does not work for you, please describe your flow in further detail. 

Thanks,

Jordan
