Scoring partitioned dataset

Newuser01 Registered Posts: 3 ✭✭✭


I have created a partitioned dataset with 10 partitions, and I am now training three regression models on it.
What I have observed is that a different model (out of the three) gives the best result for different partitions.

For example, Partition 1 got its best RMSE from model 2, while Partition 2 got its best RMSE from model 1.

Is it possible, when scoring, to use the model that gave the best result for each specific partition? And can this be automated, instead of manually selecting the model and then selecting the partition?

Any help would be appreciated.




  • tgb417


    Welcome to the Dataiku community.

    One way might be NOT to use the built-in partitioning for building models, but instead to use a Split recipe to split the data by the values you currently use as partitions. Then, rather than creating one model, create three models on each subset. Once all are trained, pick the best model for each of the subsets that made up your original partitioned dataset.
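The "pick the best model for each subset" step can be sketched in plain Python. This is only an illustration and does not use the Dataiku API; the partition names, model names, and RMSE values are hypothetical and would in practice come from each model's validation metrics.

```python
def best_model_per_partition(scores):
    """Map each partition to the model name with the lowest RMSE.

    scores: {partition: {model_name: rmse}}
    """
    return {part: min(by_model, key=by_model.get) for part, by_model in scores.items()}


# Hypothetical validation RMSEs, for illustration only.
scores = {
    "partition_1": {"model_1": 4.2, "model_2": 3.1, "model_3": 3.9},
    "partition_2": {"model_1": 2.5, "model_2": 2.9, "model_3": 3.3},
}

best = best_model_per_partition(scores)
print(best)
```

The resulting mapping (partition to model name) is what a later scoring step would need in order to send each row to the right model.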

    The next challenge will be how you operationalize your models. When submitting data for production scoring, you will need a split routine for the incoming data that sends each new record to the right one of these separate models.
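A minimal sketch of such a split routine, in plain Python rather than Dataiku recipes. The `region` column, the `best_model` mapping, and the lambda "models" are stand-ins invented for the example; real models would be the trained regressors chosen per partition.

```python
from collections import defaultdict


def route_and_score(rows, partition_key, best_model, models):
    """Score each row with the model selected for that row's partition."""
    # Group row indices by partition value so each model is looked up once per partition.
    grouped = defaultdict(list)
    for i, row in enumerate(rows):
        grouped[row[partition_key]].append(i)

    predictions = [None] * len(rows)
    for part, indices in grouped.items():
        # Raises KeyError if a partition value was never seen during training.
        predict = models[best_model[part]]
        for i in indices:
            predictions[i] = predict(rows[i])
    return predictions


# Stand-in "models": any callable taking a row and returning a number.
models = {
    "model_1": lambda row: 2.0 * row["x"],
    "model_2": lambda row: row["x"] + 10.0,
}
best_model = {"north": "model_2", "south": "model_1"}
rows = [
    {"region": "north", "x": 1.0},
    {"region": "south", "x": 3.0},
    {"region": "north", "x": 5.0},
]

preds = route_and_score(rows, "region", best_model, models)
print(preds)
```

Predictions come back in the original row order, so they can be joined to the incoming data directly. How to handle a partition value that has no trained model (fail, or fall back to a default model) is a design decision left open here.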

    The next question I would have is how big a difference these different model types are making. If the differences are large and important, then it might be worth the extra work and compute. If the differences are very small, you will need to decide whether the extra effort is worthwhile in your context.
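One rough way to put a number on that difference is to compare the per-partition best RMSE against the RMSE of a single model used everywhere. The sketch below assumes the single model is the one with the lowest unweighted mean RMSE across partitions, which is a simplification; the scores are hypothetical.

```python
def gain_from_per_partition_choice(scores):
    """Return the single best overall model and the relative RMSE gain per partition.

    scores: {partition: {model_name: rmse}}, same model names in every partition.
    """
    model_names = list(next(iter(scores.values())).keys())
    # Unweighted mean RMSE per model across partitions (an assumption of this sketch).
    mean_rmse = {m: sum(s[m] for s in scores.values()) / len(scores) for m in model_names}
    global_model = min(mean_rmse, key=mean_rmse.get)

    gains = {}
    for part, by_model in scores.items():
        baseline = by_model[global_model]
        # Fraction by which RMSE drops when using the partition's own best model.
        gains[part] = (baseline - min(by_model.values())) / baseline
    return global_model, gains


# Hypothetical validation RMSEs, for illustration only.
scores = {
    "partition_1": {"model_1": 4.0, "model_2": 3.0},
    "partition_2": {"model_1": 2.0, "model_2": 2.5},
}

global_model, gains = gain_from_per_partition_choice(scores)
print(global_model, gains)
```

If the gains are close to zero for most partitions, a single model is probably enough; a large gain on a few partitions may justify the extra models only for those.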

    Others here may have other ideas. Please jump in with your perspectives.
