This variable takes its value from the first of three sources that is defined: the variable "partition_to_build" (which the meta-scenario in step 2 will set), then the variable "scenarioTriggerParam_partition_to_build" (which we can set manually when running the scenario), and finally the current date as a fallback.
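The fallback order can be illustrated in plain Python. This is only a sketch of the logic, not the expression used in the scenario itself: the function name `resolve_partition` and the `variables` dictionary are hypothetical, and the "YYYY-MM-DD" format for the current-date fallback is an assumption.

```python
from datetime import date


def resolve_partition(variables, today=None):
    """Return the partition to build, using the fallback order described above.

    `variables` is a hypothetical dict standing in for the scenario variables.
    """
    today = today or date.today()
    # 1. Value set by a parent scenario (step 2 of this tutorial)
    if variables.get("partition_to_build"):
        return variables["partition_to_build"]
    # 2. Value entered manually when running the scenario
    if variables.get("scenarioTriggerParam_partition_to_build"):
        return variables["scenarioTriggerParam_partition_to_build"]
    # 3. Fallback: the current date (format assumed to be YYYY-MM-DD)
    return today.strftime("%Y-%m-%d")


# No variable defined: the current date is used
print(resolve_partition({}, today=date(2020, 3, 1)))
# Manual parameter only: it wins over the date fallback
print(resolve_partition({"scenarioTriggerParam_partition_to_build": "2020-02-01"}))
```

Checking "partition_to_build" first means that a value passed down by a parent scenario always takes precedence over a manually entered one.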
Now, use this variable in the build steps as a partition identifier:
You can try to run the scenario. It will run for the current day.
You can also run the scenario for another date by choosing "Run with custom parameters" in the top-right corner and entering a value for the parameter "partition_to_build":
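A manually entered value is easy to mistype. Assuming day-level partition identifiers of the form "YYYY-MM-DD" (the format produced by the script in step 2), a value can be checked with a small helper before it is entered. The function name `is_valid_day_partition` is hypothetical and not part of DSS.

```python
from datetime import datetime


def is_valid_day_partition(value):
    """Return True if `value` is a real calendar day written as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except (TypeError, ValueError):
        # Not a string, wrong layout, or an impossible date such as 2021-02-29
        return False
    # strptime also accepts non-padded input such as "2020-1-5";
    # require the canonical zero-padded form
    return parsed.strftime("%Y-%m-%d") == value


print(is_valid_day_partition("2020-02-29"))
print(is_valid_day_partition("2020-1-5"))
```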
Step 2: Meta-scenario that runs the first scenario for all missing partitions
Now that we have a scenario that can build the Flow for a given partition, let's create another scenario that will be able to run this scenario for all missing partitions.
First, create a "Custom Python script" scenario.
You can now add a script that:
gets all existing partitions
generates a list of partitions that should exist
finds missing partitions (the difference between the two previous lists)
executes the scenario to build the Flow for any missing partition, one by one
import dataiku
from dataiku.scenario import Scenario
from datetime import timedelta, date
# object for this scenario
scenario = Scenario()
# let's get all currently existing partitions from a dataset of the flow
dataset = dataiku.Dataset('weather_conditions_prepared')
partitions = dataset.list_partitions()
# generate all partitions that should be built (here from Jan 1st 2020 until the current day)
def dates_range(date1, date2):
    for n in range((date2 - date1).days + 1):
        yield date1 + timedelta(n)
all_dates = [dt.strftime("%Y-%m-%d") for dt in dates_range(date(2020, 1, 1), date.today())]
print("Partitions that should exist:")
print(all_dates)
# let's find the missing partitions
for partition in all_dates:
    if partition not in partitions:
        print("%s : missing partition" % partition)
        # let's set a variable (on the current scenario) with the missing partition to build
        scenario.set_scenario_variables(partition_to_build=partition)
        # let's run the scenario that builds the flow for a given partition
        # note that scenario variables are propagated to child scenarios, so the scenario
        # will be able to read the variable 'partition_to_build'
        # "BUILD_SCENARIO_ID" is a placeholder: replace it with the ID of the scenario from step 1
        scenario.run_scenario("BUILD_SCENARIO_ID")
Here is the same Python script as it appears in the scenario.
Finally, you can run the scenario and see in the list of jobs that missing partitions get built.
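The date logic of the script can also be checked outside DSS before running the meta-scenario. The following is a minimal sketch that isolates the "expected minus existing" computation in pure Python. The function names `expected_partitions` and `missing_partitions` are hypothetical, and the list of existing partitions is sample data standing in for the result of `dataset.list_partitions()`.

```python
from datetime import date, timedelta


def expected_partitions(start, end):
    """List every day partition (YYYY-MM-DD) from start to end, inclusive."""
    return [
        (start + timedelta(days=n)).strftime("%Y-%m-%d")
        for n in range((end - start).days + 1)
    ]


def missing_partitions(existing, start, end):
    """Return the expected partitions that are absent from `existing`, in date order."""
    existing_set = set(existing)  # set lookup avoids rescanning the list for each day
    return [p for p in expected_partitions(start, end) if p not in existing_set]


# Sample data standing in for dataset.list_partitions()
existing = ["2020-01-01", "2020-01-03"]
print(missing_partitions(existing, date(2020, 1, 1), date(2020, 1, 4)))
```

Keeping the result as an ordered list rather than a set means the partitions would be built in chronological order, which matches the one-by-one loop in the script.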