
Output validation predictions in AutoML

Out-of-bag validation predictions (for regression, binary classification or multiclass classification) are key inputs to error analysis, ensembles, multi-tier models and, in general, any type of custom model performance analysis.

It would be nice if AutoML could output predictions for the validation data. At a minimum, this would be a dataset with an id and the prediction. Ideally, it would add prediction column(s) to the validation set (which already includes the features and target(s)).
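To make the two requested output shapes concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea, not anything Dataiku provides: the function names, the `id` column and the `prediction` column name are all assumptions made for the example.

```python
def minimal_output(validation_rows, predictions, id_column="id"):
    """Minimum version: one row per validation record with its id and prediction.

    `id_column` and the "prediction" column name are hypothetical.
    """
    if len(validation_rows) != len(predictions):
        raise ValueError("one prediction is required per validation row")
    return [
        {id_column: row[id_column], "prediction": pred}
        for row, pred in zip(validation_rows, predictions)
    ]


def enriched_output(validation_rows, predictions):
    """Ideal version: the validation set (features + target) plus a prediction column."""
    if len(validation_rows) != len(predictions):
        raise ValueError("one prediction is required per validation row")
    return [
        {**row, "prediction": pred}
        for row, pred in zip(validation_rows, predictions)
    ]


# Toy validation set: an id, one feature and the target (made-up values).
validation_rows = [
    {"id": 101, "feature_a": 0.2, "target": 1},
    {"id": 102, "feature_a": 0.9, "target": 0},
]
predictions = [1, 1]

id_and_prediction = minimal_output(validation_rows, predictions)
validation_with_predictions = enriched_output(validation_rows, predictions)
```

The enriched version is the more useful one for error analysis, since features, target and prediction sit on the same row.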

Furthermore, this feature would be really helpful when applied to k-fold cross-validation. In this case the output would include the entirety of the training data, with out-of-bag predictions for the whole dataset. The key here is that the predictions are out-of-bag: each row is scored by a model that did not see that row during training. In a 4-fold setting, for example, folds 1, 2 and 3 would produce out-of-bag predictions for fold 4, then folds 1, 2 and 4 would produce out-of-bag predictions for fold 3, and so forth.
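The k-fold scheme described above can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not how AutoML works internally: `fit_mean_model` is a hypothetical stand-in for a real learner (it just predicts the mean of the training targets), and the folds are simple contiguous blocks with no shuffling.

```python
from statistics import mean


def k_fold_indices(n_rows, n_folds):
    """Split row indices 0..n_rows-1 into n_folds contiguous folds."""
    base, extra = divmod(n_rows, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_mean_model(train_targets):
    """Hypothetical learner: ignores features and predicts the training mean."""
    training_mean = mean(train_targets)
    return lambda feature_row: training_mean


def out_of_fold_predictions(features, targets, n_folds=4):
    """Predict every row with a model trained only on the other folds."""
    n_rows = len(targets)
    predictions = [None] * n_rows
    fold_ids = [None] * n_rows
    for fold_id, held_out in enumerate(k_fold_indices(n_rows, n_folds)):
        held_out_set = set(held_out)
        train_idx = [i for i in range(n_rows) if i not in held_out_set]
        model = fit_mean_model([targets[i] for i in train_idx])
        for i in held_out:
            predictions[i] = model(features[i])  # this row was never seen in training
            fold_ids[i] = fold_id
    return predictions, fold_ids


# Toy data: 8 rows, 4 folds of 2 rows each (made-up values).
features = [[x] for x in range(8)]
targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
oof_predictions, fold_ids = out_of_fold_predictions(features, targets, n_folds=4)
```

Every row ends up with exactly one prediction, plus the id of the fold that held it out, which is the dataset this idea asks AutoML to expose. Outside of AutoML, scikit-learn's `cross_val_predict` produces the same kind of per-row held-out predictions for a real estimator.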

1 Comment
Krishna
Dataiker
Status changed to: In Backlog

Thanks for the suggestion, this idea is currently in our backlog.