Append in redispatch partitioning dataset

Jaspal Registered Posts: 9 ✭✭✭✭

Hi - I need help understanding how the append option works when redispatching partitions of a dataset.

My scenario: the data is partitioned on a time dimension (monthly). The update runs every month and creates an extract for the past 3 months. Ideally, if N is the current month, N-1 would become a new partition, and N-2 and N-3, which already exist, would be rebuilt with new data. I also need the existing data for earlier months (N-4, N-5, ..., N-T) to remain.

My question: would checking the append option produce the behaviour described above? If not, what would be the best way to achieve it?


Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker

    Hi @Jas,

    If I understand correctly, you need to build the previous month's partition and refresh the data for N-2 and N-3, while all earlier partitions remain untouched.

    To build the previous month you can use a Scenario build step with PREVIOUS_MONTH as the partition.

    Screenshot 2021-10-18 at 09.54.13.png

    Given that your data is partitioned by date, is there a real need to rebuild the earlier partitions, e.g. those from 2 and 3 months ago?

    If you do need to, you can create 2 scenario variables, e.g.:

    inc(now(),-2,'month').toString('yyyy-MM-dd')

    Screenshot 2021-10-18 at 10.05.35.png

    Then use the variable name to build the respective partitions.
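
As a rough illustration of the date arithmetic behind those variables, here is a minimal Python sketch that computes the identifiers for N-1, N-2 and N-3. The helper name `month_partitions` is hypothetical, and the `yyyy-MM` identifier format is an assumption about how a monthly partition is named (the formula above uses `yyyy-MM-dd`), so adjust `fmt` to whatever your dataset actually uses. It does not call any Dataiku API.

```python
from datetime import date


def month_partitions(today, months_back=(1, 2, 3), fmt="%Y-%m"):
    """Return identifiers for the months N-1, N-2, ... relative to `today`.

    Hypothetical helper: the default format (yyyy-MM) is an assumption.
    """
    ids = []
    for back in months_back:
        # Count months since year 0 so divmod handles year boundaries.
        total = today.year * 12 + (today.month - 1) - back
        year, month_index = divmod(total, 12)
        ids.append(date(year, month_index + 1, 1).strftime(fmt))
    return ids


# Using the date of this thread as "now".
partitions = month_partitions(date(2021, 10, 18))
print(",".join(partitions))  # 2021-09,2021-08,2021-07
```

Each resulting value could then be stored in a scenario variable and referenced in the build step, as described above.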

    Let me know if this helps or if you have any additional questions.
