
Identifying training records in Visual Analyses

InfiniteZest

For Visual Analyses, is there a way to see which records landed in the training and testing sets when using the 'class rebalancing' option for subsampling? I want to be able to exclude certain rows that the model was trained on when scoring on another dataset. I’ve had a quick look online but was unable to find an answer.


Operating system used: Windows

1 Reply
Nicolas_Servel
Dataiker

Hello,

It is not possible to know which rows end up in the training set and which in the test set.

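If you want more control over the dataset(s) used for training, you can switch to the "Explicit extract from two datasets" policy in the "Train / Test set" tab of the training settings, where you select two datasets that you have created beforehand. Because you build both datasets yourself, you know exactly which rows the model was trained on and can exclude them when scoring.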
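
As a rough illustration of preparing those two datasets yourself, here is a minimal pandas sketch. It assumes your data has a unique identifier column (called `record_id` here, a hypothetical name) and a target column (`label`, also hypothetical). It does a simple stratified random split, which is not the same thing as the built-in 'class rebalancing' option, and the helper names are made up for the example.

```python
import pandas as pd


def split_train_test(df, target_col, test_fraction=0.2, seed=42):
    """Stratified random split: sample test rows per class, the rest is train."""
    test = df.groupby(target_col).sample(frac=test_fraction, random_state=seed)
    train = df.drop(index=test.index)
    return train, test


def exclude_training_rows(to_score, train, id_col):
    """Keep only the rows whose identifier does not appear in the training set."""
    return to_score[~to_score[id_col].isin(train[id_col])]


# Toy data standing in for the real dataset (all values are illustrative).
df = pd.DataFrame({
    "record_id": range(10),
    "feature": [0.1 * i for i in range(10)],
    "label": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})

train, test = split_train_test(df, target_col="label", test_fraction=0.2)

# Later, when scoring another dataset, drop anything the model was trained on.
scorable = exclude_training_rows(df, train, id_col="record_id")
```

The `train` and `test` frames would then be saved as the two datasets selected in the explicit extract policy. Keeping the identifier column in both is what makes it possible to filter out training rows afterwards.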

Hope that helps,

Best,

Nicolas Servel