Output validation predictions in AutoML
Out-of-bag validation predictions (for regression, binary classification or multiclass classification) are key inputs for error analysis, ensembles, multi-tier models and, in general, any type of custom model performance analysis.
It would be nice if AutoML could output its predictions for the validation data. At a minimum this would be a dataset with an id and the actual prediction. Ideally it would add prediction column(s) to the validation set (which already includes the features and target(s)).
Furthermore, this feature would be especially helpful when applied to k-fold validation. In this case the output would cover the entirety of the training data, with an out-of-bag prediction for every row. The key here is that each prediction is out-of-bag: it comes from a model that did not see that row during training. In a 4-fold setting, for example, a model trained on folds 1, 2 and 3 would produce the predictions for fold 4, then a model trained on folds 1, 2 and 4 would produce the predictions for fold 3, and so forth.
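To make the request concrete, here is a minimal sketch of the kind of output I have in mind, written with plain scikit-learn and pandas outside of DSS. It is not how AutoML works internally and it does not use any DSS API; the function name `out_of_fold_predictions`, the column names and the synthetic data are all made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold


def out_of_fold_predictions(model, X, y, n_splits=4, seed=0):
    """Return one held-out prediction per row (hypothetical helper)."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold = np.full(len(X), -1)            # which fold each row was held out in
    pred = np.empty(len(X), dtype=y.dtype)
    for k, (train_idx, held_idx) in enumerate(kf.split(X)):
        # Fit a fresh copy on the other folds only, then predict the held-out fold.
        fitted = clone(model).fit(X[train_idx], y[train_idx])
        pred[held_idx] = fitted.predict(X[held_idx])
        fold[held_idx] = k
    return pd.DataFrame(
        {"id": np.arange(len(X)), "fold": fold, "target": y, "prediction": pred}
    )


# Synthetic data, only to show the shape of the output.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
oof = out_of_fold_predictions(LogisticRegression(max_iter=1000), X, y)
print(oof.head())
```

Every row of the training data ends up with exactly one prediction, made by a model that never saw that row. For classification, the same loop could store `predict_proba` outputs instead of hard labels.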
Comments
Krishna Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Product Ideas Manager Posts: 18 Dataiker
Thanks for the suggestion. This idea is currently in our backlog.