When I try to "split" a dataset randomly, I currently get the following options:
- Full random
- Random subset
Neither of these is what I usually use to split data into training and test sets: stratified sampling, which ensures that classes with very low presence (e.g. only a few dozen samples out of 10,000) appear in both sets. Did I overlook something, or is this not currently implemented?
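
To make clear what I mean by stratified sampling, here is a minimal sketch in plain Python. It is not this tool's API: the function name `stratified_split` and its parameters `test_fraction` and `seed` are hypothetical, and the rule that every class with at least two samples gets at least one sample on each side is my own assumption about the desired behavior.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class separately.

    Hypothetical helper for illustration only, not part of any library.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        # Assumption: a class with 2+ samples must appear in both sets.
        # A class with a single sample stays in the training set.
        if len(indices) >= 2:
            n_test = min(max(n_test, 1), len(indices) - 1)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Example: 30 "rare" samples among 10,000.
labels = ["common"] * 9970 + ["rare"] * 30
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
rare_in_test = sum(1 for i in test_idx if labels[i] == "rare")
rare_in_train = sum(1 for i in train_idx if labels[i] == "rare")
```

With a plain random split, the number of rare samples that land in the test set varies from run to run and can be very small. Splitting per class keeps the class proportions roughly equal in both sets. For comparison, scikit-learn exposes this behavior through the `stratify` argument of `train_test_split`.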