
Append in redispatch partitioning dataset

Jas

Hi, I need help understanding how the append option works when redispatching partitions in a dataset.

My scenario: the data is partitioned on a time dimension (monthly). The update runs every month and creates an extract covering the past 3 months. Ideally, if N is the current month, N-1 would be a new partition, and N-2 and N-3, which already exist, would be rebuilt with the new data. I also need the existing data for earlier months (N-4, N-5, ..., N-T) to remain.

My question: would the above scenario play out if I check the append option? If not, what would be the best way to achieve this?

 

AlexT
Dataiker

Hi @Jas ,

If I understand correctly, you need to build the previous month's partition and refresh the data for N-2 and N-3, while all earlier partitions remain untouched.

To build the previous month you can use a Scenario build step with PREVIOUS_MONTH as the partition. 

Screenshot 2021-10-18 at 09.54.13.png

Given that your data is partitioned by date, is there a real need to rebuild the earlier partitions, e.g. 2 months ago and 3 months ago?

If you do need to, you can create 2 scenario variables, e.g.:

inc(now(),-2,'month').toString('yyyy-MM-dd')

Screenshot 2021-10-18 at 10.05.35.png

 

Then use the variable name to build the respective partitions. 
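To sanity-check which partitions a monthly run would touch, here is a minimal Python sketch of the same date arithmetic. It is only an illustration: the helper names `month_partition` and `partitions_to_build` are hypothetical, and it assumes monthly partition identifiers look like `YYYY-MM`, which you should verify against your own dataset.

```python
from datetime import date


def month_partition(today: date, months_back: int) -> str:
    """Return the identifier of the month `months_back` months before `today`.

    Assumption: monthly partitions are identified as 'YYYY-MM'.
    """
    # Convert to a running month index, step back, then convert back.
    index = today.year * 12 + (today.month - 1) - months_back
    year, month0 = divmod(index, 12)
    return f"{year:04d}-{month0 + 1:02d}"


def partitions_to_build(today: date, window: int = 3) -> list[str]:
    """List the partitions for N-1 .. N-window, where N is the current month."""
    return [month_partition(today, k) for k in range(1, window + 1)]


if __name__ == "__main__":
    # Partitions a run on this date would build or rebuild.
    print(partitions_to_build(date(2021, 10, 18)))
```

Only the listed months would be built or rebuilt; anything older (N-4 and earlier) is not in the list and so would be left as it is.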

 

Let me know if this helps or if you have any additional questions.

 
