Running the same set of recipes multiple times in parallel with different parameters

pratikgujral-sf

Hi Community,

In my dataset, I have a categorical column called customer_segment with 10 different possible values.

I wish to train 10 different models, one for each customer_segment, each using only the records filtered for that segment. We have a data preparation recipe, which is just Python code. Because each customer_segment is independent of the others, we want to run the data preparation, model training, and evaluation steps for all segments in parallel, passing a different value of customer_segment to the recipes each time.

Furthermore, for ease of maintenance, we do not wish to create 10 copies of the same code for data preparation, training, and evaluation.
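
To make the requirement concrete, here is a minimal plain-Python sketch of the behavior we're after. It only illustrates the idea of one parameterized pipeline run once per segment; it is not how a Flow would implement it. All names (`RECORDS`, `prepare`, `train`, `evaluate`, `run_pipeline`, `run_all`) are hypothetical, the data is made up, and the "model" is just a placeholder mean.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical toy records; in practice these would come from the input dataset.
RECORDS = [
    {"customer_segment": "A", "spend": 10.0},
    {"customer_segment": "A", "spend": 14.0},
    {"customer_segment": "B", "spend": 3.0},
    {"customer_segment": "B", "spend": 5.0},
]


def prepare(records, segment):
    """Data preparation: keep only the rows belonging to one segment."""
    return [r for r in records if r["customer_segment"] == segment]


def train(rows):
    """Placeholder for model training: the 'model' is the mean spend."""
    return {"mean_spend": mean(r["spend"] for r in rows)}


def evaluate(model, rows):
    """Placeholder for evaluation: mean absolute error of the constant model."""
    return mean(abs(r["spend"] - model["mean_spend"]) for r in rows)


def run_pipeline(segment, records=RECORDS):
    """One copy of the code, parameterized by the segment value."""
    rows = prepare(records, segment)
    model = train(rows)
    return segment, model, evaluate(model, rows)


def run_all(segments, max_workers=4):
    """Run the same pipeline for every segment concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return {seg: (model, err)
                for seg, model, err in pool.map(run_pipeline, segments)}


results = run_all(["A", "B"])
for segment, (model, error) in results.items():
    print(segment, model, error)
```

Threads are used here only to keep the sketch short; for CPU-heavy training, a process pool or separate jobs would probably be a better fit.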

Is it possible to do so with a Flow?

I'm attaching a sample flow for illustrative purposes to help explain my question.



Operating system used: Red Hat Enterprise Linux
