Run a recipe for all partitions available
When I run a recipe, how do I run it for all the partitions of one dimension?
In the photo below, I would like to run this code recipe for all partitions in RW_Index.
Answers
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,225 Dataiker
Hi,
Currently, DSS expects an explicit list of partitions to build. If you want to run a recipe on all partitions, you can use a scenario with an "Execute Python code" step. Here is one example of how you could accomplish this:
from dataiku.scenario import Scenario
import dataiku

scenario = Scenario()
dataset = dataiku.Dataset("input_dataset_name")
partitions = dataset.list_partitions()  # get all partitions from the input dataset

# To build all available partitions in all dimensions:
# partitions_str = ','.join(partitions)  # concatenate

# When some dimensions are fixed but another dimension requires ALL values:
# in your example, the partitions you want start with '2020Q4|Pricing|L4L_Monthly'
partitions_str = ','.join([item for item in partitions if item.startswith('2020Q4|Pricing|L4L_Monthly')])
scenario.build_dataset("output_good", partitions=partitions_str)
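One caveat with the prefix filter above: `startswith` would also match a dimension value that merely begins with the same text (for example a hypothetical `L4L_Monthly_v2`). As a minimal sketch, assuming partition identifiers join their dimension values with `|` as in the example, the selection can be done per dimension instead. The helper name `select_partitions` and the sample identifiers are illustrative only and not part of the Dataiku API.

```python
def select_partitions(partitions, wanted):
    """Return a comma-separated spec of the partitions matching `wanted`.

    `wanted` holds one entry per dimension; None means "any value".
    """
    selected = []
    for identifier in partitions:
        values = identifier.split("|")
        if len(values) != len(wanted):
            continue  # skip identifiers with an unexpected number of dimensions
        if all(w is None or w == v for w, v in zip(wanted, values)):
            selected.append(identifier)
    return ",".join(selected)


# Hypothetical identifiers, shaped like the ones in the example above
sample = [
    "2020Q4|Pricing|L4L_Monthly|A",
    "2020Q4|Pricing|L4L_Monthly|B",
    "2020Q4|Pricing|L4L_Monthly_v2|A",
    "2020Q3|Pricing|L4L_Monthly|A",
]
spec = select_partitions(sample, ["2020Q4", "Pricing", "L4L_Monthly", None])
print(spec)
```

The resulting string can be passed as the `partitions` argument of `scenario.build_dataset` in place of `partitions_str`.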
-
Is there no way to run the spark engine for all partitions using visual recipes?
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,225 Dataiker
Hi @TheMLEngineer,
There is no direct way to build all partitions visually; a feature request has been submitted.
When a recipe redispatches partitions, all partitions are redispatched by default.
If you have a date partition, you can set a date range to cover all partitions.
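As a rough sketch of the date-range idea, the range can be derived from the list of existing partitions. This assumes day-level identifiers in `YYYY-MM-DD` form and that the range is written as `START/END`; both are assumptions to check against your DSS version, and `date_range_spec` is a hypothetical helper name.

```python
from datetime import date


def date_range_spec(day_partitions):
    """Return 'EARLIEST/LATEST' covering every day-level partition given."""
    if not day_partitions:
        raise ValueError("no partitions to cover")
    # Parsing validates the identifiers and gives a true chronological sort
    days = sorted(date.fromisoformat(p) for p in day_partitions)
    return f"{days[0].isoformat()}/{days[-1].isoformat()}"


# Hypothetical day partitions
range_spec = date_range_spec(["2020-03-01", "2020-01-15", "2020-02-10"])
print(range_spec)
```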
If you have a discrete dimension, you can set a variable containing all of your partitions and build it that way.
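A minimal sketch of preparing such a variable, assuming project variables are a dict with `standard` and `local` sections, and that the variable is then referenced in the partition field as `${all_partitions}`. The variable name and the helper `with_partition_variable` are hypothetical, and the DSS calls in the trailing comment are shown for orientation only and are not executed here.

```python
def with_partition_variable(variables, partitions, name="all_partitions"):
    """Return a copy of a project-variables dict with `name` set to a
    comma-separated list of partition identifiers."""
    updated = dict(variables)
    standard = dict(updated.get("standard", {}))
    standard[name] = ",".join(partitions)
    updated["standard"] = standard
    return updated


# Hypothetical discrete partition values
current = {"standard": {}, "local": {}}
new_vars = with_partition_variable(current, ["A", "B", "C"])
print(new_vars["standard"]["all_partitions"])

# Inside DSS, this could be wired up roughly as follows (assumed, not run here):
#   project = dataiku.api_client().get_default_project()
#   project.set_variables(with_partition_variable(project.get_variables(), partitions))
```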
One of the benefits of partitions is not having to build all of them every time. When building them for the first time, you can indeed use the scenario or the other methods mentioned above.
Kind Regards,
-
Thanks @AlexT, this is helpful. I used a PySpark recipe to run the partitions in the code.