Rebalancing training metrics
malalearning
Hi everybody,
I have a question about the performance metrics that Dataiku shows when training a model in DSS. We rebalance the training dataset to deal with class imbalance in our flow: we apply roughly 75% downsampling (approximate ratio) on the training set, with 5-fold stratified cross-validation. We are not sure how to read the metrics in the analysis graph that appears in the Results section after the model is trained. Does that graph refer to metrics computed on the rebalanced training dataset (and hence over the whole cross-validation process), or does it refer to the complete training dataset?
Additional question: is there a way to access the rebalanced dataset in Dataiku?
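To make the first question concrete, here is a minimal plain-Python sketch of one of the two behaviours I have in mind: the majority class is downsampled only inside the training part of each fold, while the validation fold keeps the original class ratio. This is only an illustration and not a claim about how DSS works internally. The function names, the toy data, and the `keep_fraction` value are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each row index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the rows of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def downsample_majority(indices, labels, majority, keep_fraction, seed=0):
    """Keep every minority row and a random fraction of the majority rows."""
    rng = random.Random(seed)
    major = [i for i in indices if labels[i] == majority]
    minor = [i for i in indices if labels[i] != majority]
    kept = rng.sample(major, int(len(major) * keep_fraction))
    return sorted(minor + kept)


# Toy data (hypothetical): 900 negatives and 100 positives.
labels = [0] * 900 + [1] * 100
folds = stratified_folds(labels, k=5)

splits = []
for i, valid_idx in enumerate(folds):
    train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
    # Rebalance only the training part; keep_fraction=0.25 is an assumed value.
    train_bal = downsample_majority(
        train_idx, labels, majority=0, keep_fraction=0.25, seed=i
    )
    splits.append((train_bal, valid_idx))

# Compare the positive rate in the first fold's training and validation parts.
train_bal, valid_idx = splits[0]
pos_rate_train = sum(labels[j] for j in train_bal) / len(train_bal)
pos_rate_valid = sum(labels[j] for j in valid_idx) / len(valid_idx)
print(round(pos_rate_train, 3), round(pos_rate_valid, 3))
```

In this sketch, metrics computed on `valid_idx` would reflect the original class distribution, whereas metrics computed on `train_bal` would reflect the rebalanced one. My question is which of these two situations the graph in the Results section corresponds to.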
Thank you all