Output validation predictions in AutoML

Out-of-bag validation predictions (for regression, binary classification or multiclass classification) are key inputs to error analysis, ensembling, multi-tier models and, more generally, any kind of custom model performance analysis.

It would be nice if AutoML could output its predictions for the validation data. At a minimum, this would be a dataset with an id and the prediction. Ideally, it would add prediction column(s) to the validation set (which already includes the features and target(s)).
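To make the two requested output shapes concrete, here is a minimal sketch in plain Python. It is not Dataiku's API: the function `add_prediction_column`, the column names and the `predict` function are all hypothetical stand-ins for whatever the trained model and validation set actually look like.

```python
def add_prediction_column(validation_rows, predict, column="prediction"):
    """Return copies of the validation rows with one extra prediction column."""
    return [{**row, column: predict(row)} for row in validation_rows]


# Hypothetical validation set: an id, one feature and the target.
validation = [
    {"id": 101, "x": 1.0, "target": 2.1},
    {"id": 102, "x": 2.0, "target": 3.9},
]


def predict(row):
    """Placeholder for a trained model's prediction function (an assumption)."""
    return 2.0 * row["x"]


# "Ideal" output: the validation set plus a prediction column.
scored = add_prediction_column(validation, predict)

# "Minimum" output: only the id and the prediction.
minimal = [{"id": r["id"], "prediction": r["prediction"]} for r in scored]
```

Keeping the features and target next to the prediction is what makes error analysis possible without a separate join on the id.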

Furthermore, this feature would be really helpful when applied to k-fold cross-validation. In this case the output would include the entirety of the training data, with out-of-bag predictions for the whole dataset. The key is that every prediction is out-of-bag, that is, made by a model that did not see that row during training. In a 4-fold setting, for example, a model trained on folds 1, 2 and 3 would produce the predictions for fold 4, then a model trained on folds 1, 2 and 4 would produce the predictions for fold 3, and so forth.
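The k-fold case can be sketched as follows. This is only an illustration under stated assumptions, not how AutoML works internally: the fold assignment (row i goes to fold i % k), the helper names and the mean-predicting placeholder model are all hypothetical, and a real setup would plug in the actual training routine.

```python
from statistics import mean


def k_fold_indices(n_rows, k):
    """Assign row i to fold i % k; return k lists of row indices."""
    return [list(range(fold, n_rows, k)) for fold in range(k)]


def fit_mean_model(train_targets):
    """Placeholder model: always predicts the mean of its training targets."""
    m = mean(train_targets)
    return lambda row: m


def out_of_fold_predictions(rows, k=4, target="target"):
    """Return copies of rows with added 'fold' and 'prediction' columns.

    Each row's prediction comes from a model trained only on the other folds.
    """
    folds = k_fold_indices(len(rows), k)
    output = [dict(row) for row in rows]
    for fold_id, held_out in enumerate(folds):
        held_out_set = set(held_out)
        # Train on every row that is NOT in the held-out fold.
        train_targets = [
            r[target] for i, r in enumerate(rows) if i not in held_out_set
        ]
        model = fit_mean_model(train_targets)
        # Predict only for the held-out rows.
        for i in held_out:
            output[i]["fold"] = fold_id
            output[i]["prediction"] = model(rows[i])
    return output


# Hypothetical training data with an id, one feature and a target.
rows = [{"id": i, "x": i, "target": float(i)} for i in range(8)]
oof = out_of_fold_predictions(rows, k=4)
```

The result covers the whole training set: every row keeps its original columns and gains exactly one prediction, made by the one model that never saw it.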


Thanks for the suggestion, this idea is currently in our backlog.

Status changed to: In Backlog

Community Manager