Scenario Custom Trigger Tips

Marlan
Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 332 Neuron

I have been working on a custom Python trigger script that enables flexible time-based scheduling, for example running a scenario at multiple times during business hours on weekdays and then on a different schedule on the weekend.

One could build some versions of such a schedule with a set of built-in time-based triggers, but in my case that would have required close to 500 triggers, so it wasn't feasible.
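To make the idea concrete, here is a minimal sketch of the kind of decision logic such a trigger could use. It is not the script discussed in this thread: the schedule values, the `should_fire` helper, and the `run_every_seconds` parameter are all hypothetical, and it assumes the trigger is evaluated at a fixed interval so that each scheduled time falls inside exactly one evaluation window.

```python
from datetime import datetime, time, timedelta

# Hypothetical schedule (illustrative values only).
WEEKDAY_TIMES = [time(8, 0), time(12, 0), time(16, 30)]
WEEKEND_TIMES = [time(10, 0)]


def should_fire(now, run_every_seconds=60):
    """Return True if a scheduled time falls in (now - interval, now].

    Assumes the trigger is evaluated every `run_every_seconds` seconds,
    so each scheduled time is matched by exactly one evaluation.
    """
    window_start = now - timedelta(seconds=run_every_seconds)
    # Monday is 0, so 5 and 6 are Saturday and Sunday.
    times = WEEKEND_TIMES if now.weekday() >= 5 else WEEKDAY_TIMES
    return any(
        window_start < datetime.combine(now.date(), t) <= now
        for t in times
    )


# Inside a DSS custom Python trigger, this would be wired up roughly as:
#   from dataiku.scenario import Trigger
#   if should_fire(datetime.now()):
#       Trigger().fire()
```

Keeping the decision in a plain function that takes `now` as an argument makes the schedule easy to check outside DSS before pasting it into the trigger.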

In the process of writing this, I learned a few things about custom Python triggers and thought I would share a few notes in the reply below.


Answers

  • LisaB
    LisaB Dataiker, Alpha Tester Posts: 208 Dataiker

    Awesome! Thanks for sharing your knowledge, Marlan!

  • rona
    rona Registered Posts: 52 ✭✭✭✭✭

    Thanks, Marlan, for this very valuable information!

    Do you know what the 'Grace Delay' is used for? If I set it to 10 seconds and set the CheckAgainAfterGraceDelay parameter to true, how does the trigger behave?

  • Marlan
    Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 332 Neuron

    Hi @rona,

    I asked about this a while back, and below is what I received from Matthieu Scordia from Dataiku. I'm still not sure I completely understand all the combinations, but I thought it was helpful nonetheless.

    Marlan

    Examples:

    With run every 900s, grace delay 120s, and recheck on, you would have:

    t-0 : trigger runs, no change detected

    t-900 : trigger runs, no change detected

    t-1257 : dataset is changed (rebuilt, or files changed by external source for external datasets)

    t-1800 : trigger runs, detects changes, prepares grace delay sequence

    t-1920 : trigger runs again (because of recheck on), no change detected, launches scenario run

    t-3000 : scenario done

    t-3630 : trigger runs, no change detected

    ....
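The timeline above can be approximated with a small simulation. This is only a sketch of the behavior as described in this reply, not DSS's actual implementation: the `simulate_launch` function and its parameters are hypothetical, changes are modeled as instantaneous events, and it assumes that with recheck off the scenario simply launches once the grace delay has elapsed.

```python
def simulate_launch(change_times, run_every, grace_delay,
                    recheck=True, horizon=100_000):
    """Return the time at which the scenario would launch, or None.

    change_times: timestamps (seconds) at which the dataset changes.
    A check at time t "detects changes" if any change happened after
    the previous check and up to t (the check at t=0 is the baseline).
    """
    def changed(since, until):
        return any(since < c <= until for c in change_times)

    last_check = 0
    t = run_every
    while t <= horizon:
        if changed(last_check, t):
            # Change detected: start the grace delay sequence.
            last_check = t
            t += grace_delay
            # With recheck on, every new change restarts the grace delay.
            while recheck and changed(last_check, t):
                last_check = t
                t += grace_delay
            return t
        last_check = t
        t += run_every
    return None


# First example: change at 1257, run every 900s, grace delay 120s.
launch = simulate_launch([1257], run_every=900, grace_delay=120)
```

With these inputs the simulated launch lands at 1920, matching the timeline. Feeding in several change events spaced closer together than the grace delay pushes the launch back until a full grace period passes with no new change.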

    The "recheck" option controls whether DSS runs the trigger again at t-1920. The goal is to wait for the dataset to "stabilize". Imagine a dataset that is updated over the course of 30s (because of many files, or big files, or slow network, or whatever...), and you have a grace delay of 10s:

    t-0 : trigger runs, no change detected

    t-900 : trigger runs, no change detected

    t-1797 : dataset starts changing

    t-1800 : trigger runs, detects changes, prepares grace delay sequence

    t-1810 : trigger runs again (because of recheck on), detects more changes, resets grace delay sequence

    t-1820 : trigger runs again (because of recheck on), detects more changes, resets grace delay sequence

    t-1827 : dataset stops changing

    t-1830 : trigger runs again (because of recheck on), no change detected, launches scenario run

    t-3000 : scenario done

    t-3720 : trigger runs, no change detected

    ....

    The grace delay is meant to aggregate triggers if they arrive in bulk, for example if you have a second trigger B on the same dataset, with grace delay 100:

    t-0 : trigger A runs, no change detected

    t-900 : trigger A runs, no change detected

    t-1257 : dataset is changed (rebuilt, or files changed by external source for external datasets)

    t-1800 : trigger A runs, detects changes, prepares grace delay sequence

    t-1870 : trigger B runs, detects changes, prepares grace delay sequence => grace delay of trigger A is dropped (because it ends before the one of trigger B)

    t-1970 : trigger B runs again (because of recheck on), no change detected, launches scenario run

    t-3000 : scenario done

    t-3720 : trigger A runs, no change detected
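The arithmetic in this last example can be written out as a short sketch. It assumes, based only on the timeline above, that when several triggers are in their grace delay at once, the one whose grace delay ends last decides the launch time, and that no further changes are detected on recheck. The `aggregated_launch` helper is hypothetical.

```python
def aggregated_launch(detections):
    """Pick the trigger whose grace delay ends last.

    detections: list of (trigger_name, detect_time, grace_delay) tuples.
    Returns (trigger_name, launch_time).
    """
    name, detect_time, grace_delay = max(
        detections, key=lambda d: d[1] + d[2]
    )
    return name, detect_time + grace_delay


# Trigger A detects at 1800 with a 120s grace delay (ends at 1920);
# trigger B detects at 1870 with a 100s grace delay (ends at 1970).
result = aggregated_launch([("A", 1800, 120), ("B", 1870, 100)])
```

Here `result` is `("B", 1970)`, which agrees with the scenario run being launched by trigger B at t-1970.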

  • importthepandas
    importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

    It would be quite swell if we could back into both the project key and scenario ID with get_trigger(), without having to loop over every scenario and find the runs where the trigger ID is used, in order to check the last status in a custom trigger routine.

  • abder_
    abder_ Dataiku DSS Core Designer, Registered Posts: 1 ✭✭✭

    Is it possible to edit the 'Repeat every' setting in a custom Python trigger for customers?
