
Run a flow for every unique variable in a column

Tsurapaneni
Level 3

Hi Team,

I have a use case: a column in my dataset has 20 unique values, and I want to run the same flow once for each of them. The flow consists of a filter recipe that keeps only the rows for one unique value. The filtered dataset is then fed as input into a clustering recipe, and the output of the winning model is saved. The outputs of all the runs should then be appended together. I have seen how global and local variables are declared, but I am not sure how to apply them here. Can you please walk me through the steps to automate this? For better understanding, I am giving an example below:

Below is the original dataset (Dataset A):

Step 1: Filter Dataset A on the counts column (counts > 10) to get Dataset B.

Dataset A                Dataset B
name    counts           name    counts
wet     45         =>    wet     45
tree    30               tree    30
yard    10

Step 2: From here, the flow should run once for every row in Dataset B.

name    counts
wet     45       =>  clustering recipe  =>  deploy the winning model  =>  output 1
tree    30       =>  clustering recipe  =>  deploy the winning model  =>  output 2

Finally, all the outputs should be appended into the same output dataset.
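
To make the intended logic concrete, here is a rough sketch of the two steps in plain Python, using the example data above. It is only an illustration of the loop, not DSS code: `run_clustering` is a hypothetical placeholder standing in for the clustering recipe and the winning model, and all function names are made up for this example. In DSS the same loop would presumably live in something like a Python recipe or a scenario step, which is not shown here.

```python
# Dataset A from the example above.
dataset_a = [
    {"name": "wet", "counts": 45},
    {"name": "tree", "counts": 30},
    {"name": "yard", "counts": 10},
]


def filter_by_count(rows, threshold=10):
    """Step 1: keep rows whose count is strictly greater than the threshold."""
    return [row for row in rows if row["counts"] > threshold]


def run_clustering(value, rows):
    """Hypothetical placeholder for the clustering recipe + winning model.

    It only tags each row with the value it was processed for, so that the
    appended output shows which run produced which rows.
    """
    return [dict(row, processed_for=value) for row in rows]


def run_for_each_value(rows, column="name"):
    """Step 2: run the same processing once per unique value, then append."""
    # Group rows by unique value, keeping first-seen order.
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row)

    combined = []
    for value, subset in groups.items():
        # One "flow run" per unique value; outputs are appended together.
        combined.extend(run_clustering(value, subset))
    return combined


dataset_b = filter_by_count(dataset_a)
combined_output = run_for_each_value(dataset_b)

for row in combined_output:
    print(row)
```

The point of the sketch is the shape of the automation: filter once, loop over the unique values, and collect every run's output into one list that plays the role of the final appended dataset.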

 

Please let me know if you need more information.

 

Thanks in advance !
