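The documentation states the following, so the expected behavior is a plain undersampling: within the percentage you specify, rows are dropped so that both classes end up with roughly the same number of rows.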
This method does not oversample, only undersample (so some rare modalities may remain under-represented). In all cases, rebalancing is approximative.
https://doc.dataiku.com/dss/latest/explore/sampling.html#class-rebalancing-approximate-ratio
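To make the undersampling idea concrete, here is a minimal sketch in plain Python. It is not Dataiku's implementation: DSS rebalances approximately and within the configured sample size, while this sketch cuts every class down to exactly the size of the smallest one. The function name `undersample_to_balance` and the row layout are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def undersample_to_balance(rows, label_of, seed=0):
    """Randomly drop rows from larger classes so every class matches the smallest.

    Hypothetical helper for illustration only; nothing is ever duplicated,
    so a rare class stays as small as it was.
    """
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    if not by_class:
        return []

    # The smallest class sets the per-class target size.
    target = min(len(group) for group in by_class.values())

    sampled = []
    for group in by_class.values():
        sampled.extend(rng.sample(group, target))
    rng.shuffle(sampled)
    return sampled


# Toy data: 950 negatives and 50 positives.
rows = [{"id": i, "label": "neg"} for i in range(950)]
rows += [{"id": i, "label": "pos"} for i in range(950, 1000)]

balanced = undersample_to_balance(rows, lambda r: r["label"])
counts = Counter(r["label"] for r in balanced)
print(counts)
```

In this toy case 900 of the 1,000 rows are thrown away, which is why undersampling hurts most when the dataset is already small.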
For that reason, as the documentation below describes, when the training set is small the recommendation is to use the "class weights" weighting strategy instead of Class rebalancing.
Class weights can be substituted by a “Class rebalancing” sampling strategy settable in Settings: Train / Test set, which is recommended for larger datasets. For smaller datasets, i.e. when preprocessed data fits in memory, choosing the “class weights” weighting strategy is the recommended option.
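For comparison, here is a rough sketch of what class weighting does. It assumes the common "balanced" heuristic, weight = n_samples / (n_classes * class_count), which is the formula scikit-learn uses for `class_weight="balanced"`. Whether DSS computes its class weights exactly this way is an assumption, and `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return one weight per class, inversely proportional to class frequency.

    Assumed heuristic: n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Same toy distribution: 950 negatives and 50 positives.
labels = ["neg"] * 950 + ["pos"] * 50
weights = balanced_class_weights(labels)

# Every row is kept; each row just carries the weight of its class.
sample_weights = [weights[y] for y in labels]
print(weights)
```

Unlike undersampling, no rows are discarded: the rare class simply counts more per row during training, which is why this option suits small datasets.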