Ability to train a model on the entire dataset

Tanguy (Neuron, Posts: 112)

It is currently not possible to train a deployed model on the entire dataset, because Dataiku forces the user to specify a test set.

See this thread detailing Dataiku's current limitations when training a model: https://community.dataiku.com/t5/Using-Dataiku/when-training-a-model-with-a-visual-recipe-does-dataiku-fit-the/m-p/31365/thread-id/11656#M11665

However, it is considered best practice to retrain a model on all available data once the best combination of hyperparameters has been found. This matters most when the model has not yet reached its performance plateau (which a learning curve can reveal), since every extra row can still improve it.
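To make the workflow concrete, here is a minimal sketch in plain Python, outside of Dataiku. It does not use any Dataiku API. The data is synthetic, the model is a toy one-feature ridge regression, and the helper names `fit_ridge_slope` and `mse` are made up for illustration. The idea is only to show the order of the steps: select the hyperparameter on held-out data, measure the error on a test set, then refit on everything.

```python
import random

def fit_ridge_slope(xs, ys, lam):
    """Closed-form slope of a one-feature ridge regression without intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(slope, xs, ys):
    """Mean squared error of the predictions slope * x against ys."""
    return sum((slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data standing in for the real dataset (assumption for illustration).
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [3.0 * x + rng.gauss(0, 0.5) for x in xs]

# Step 1: split into train (60%), validation (20%) and test (20%).
n_train, n_val = int(0.6 * len(xs)), int(0.2 * len(xs))
x_train, y_train = xs[:n_train], ys[:n_train]
x_val, y_val = xs[n_train:n_train + n_val], ys[n_train:n_train + n_val]
x_test, y_test = xs[n_train + n_val:], ys[n_train + n_val:]

# Step 2: pick the regularization strength with the lowest validation error.
candidates = [0.0, 0.1, 1.0, 10.0]
best_lam = min(
    candidates,
    key=lambda lam: mse(fit_ridge_slope(x_train, y_train, lam), x_val, y_val),
)

# Step 3: estimate generalization error on the untouched test set.
holdout_slope = fit_ridge_slope(x_train, y_train, best_lam)
test_error = mse(holdout_slope, x_test, y_test)

# Step 4: keep the chosen hyperparameter and refit on ALL available rows.
final_slope = fit_ridge_slope(xs, ys, best_lam)

print(f"best lam: {best_lam}, test MSE: {test_error:.3f}")
print(f"slope on train only: {holdout_slope:.3f}, slope on all data: {final_slope:.3f}")
```

Note that the test error from step 3 describes the model trained on the training split only. The final model from step 4 has no held-out data left to score it, which is the trade-off of this approach.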

Adding an option to train on the entire dataset in the train visual recipe would be highly appreciated.

3 votes
