Running same set of recipes multiple times in parallel with different parameters

Solved!
pratikgujral-sf
Level 2

Hi Community,

In my dataset, I have a categorical column called customer_segment with 10 different possible values. 

I wish to train 10 different models, one for each customer_segment, each using only the records filtered for that segment. We have a data preparation recipe, which is just Python code. As each customer_segment is independent of the others, we want to run the data preparation, model training, and evaluation steps for each customer_segment in parallel, passing a different value of customer_segment to the recipes each time.

Furthermore, for ease of maintenance, we do not wish to create 10 copies of the same code for data preparation, training, and evaluation.

Is it possible to do so with a Flow? 

I'm attaching a sample flow for illustrative purposes to help explain my question.

 

Sample Flow added for illustrative purposes.


Operating system used: Red Hat Enterprise Linux

1 Solution
AlexT
Dataiker

Hi @pratikgujral-sf ,
If I understand your requirements correctly, a partitioned model would do essentially what you are looking for.
You would partition the input dataset on customer_segment and train a partitioned model on it, which trains one sub-model per partition: https://doc.dataiku.com/dss/latest/machine-learning/partitioned.html
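
To illustrate the idea behind this (one parameterized pipeline, run once per customer_segment value, with the runs executing concurrently), here is a minimal plain-Python sketch. It is not the Dataiku API and not how DSS implements partitioned models; the names `run_pipeline` and `run_all_segments`, the `target` field, and the mean-predictor "model" are all hypothetical placeholders standing in for your real preparation, training, and evaluation code.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def run_pipeline(segment, rows):
    """Run prep, training, and evaluation for ONE segment (placeholder logic)."""
    # Data preparation: keep only rows that have a usable target value.
    prepared = [r for r in rows if r["target"] is not None]
    # "Training": a stand-in model that always predicts the segment's mean target.
    prediction = mean(r["target"] for r in prepared)
    # Evaluation: mean absolute error of that constant prediction on the same rows.
    mae = mean(abs(r["target"] - prediction) for r in prepared)
    return segment, {"prediction": prediction, "mae": mae, "n_rows": len(prepared)}


def run_all_segments(records, segment_column="customer_segment", max_workers=4):
    """Group records by segment and run the same pipeline for each group concurrently."""
    by_segment = defaultdict(list)
    for record in records:
        by_segment[record[segment_column]].append(record)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_pipeline, segment, rows)
            for segment, rows in by_segment.items()
        ]
        # Collect (segment, metrics) pairs into a dict keyed by segment.
        return dict(future.result() for future in futures)


# Hypothetical sample data with two segments instead of ten.
records = [
    {"customer_segment": "A", "target": 10.0},
    {"customer_segment": "A", "target": 14.0},
    {"customer_segment": "B", "target": 3.0},
    {"customer_segment": "B", "target": None},
]
results = run_all_segments(records)
```

The segment value is the only thing that changes between runs, so there is a single copy of the code to maintain. Threads are used here only to keep the sketch short; for CPU-bound training, separate processes would usually be a better fit.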

If you wish to bundle the flow and make it reusable, you can also look at application-as-recipe:
https://doc.dataiku.com/dss/8.0/applications/application-as-recipe.html

Thanks
