Class imbalance and test and train accuracy scores in Dataiku

sir

Hi All,

1) How do I obtain test set accuracy in Dataiku? I want to make sure that I am not overfitting and that my train set accuracy is close to my test set accuracy. Currently, I don't know how to see the test set accuracy.

2) What are the detailed steps to address class imbalance? Can someone show them with screenshots?

Thank you so much!

S

1 Reply
JordanB
Dataiker

Hi @sir,

During the training phase, DSS holds out the test set and trains the model only on the train set. This ensures that evaluation is done on data the model has never seen before. To get test set results, you will need to run an Evaluation recipe.

An Evaluation recipe takes two inputs:

  • an evaluation dataset

  • a model

An Evaluation recipe can have up to three outputs:

  • an Evaluation Store, containing the main Model Evaluation and all associated result screens

  • an output dataset, containing the input features, prediction and correctness of prediction for each record

  • a metrics dataset, containing just the performance metrics for this evaluation (i.e. it’s a subset of the Evaluation Store)

The metrics dataset will provide results that you can compare to the train set results.
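If you would rather check the number yourself, here is a minimal sketch of computing test accuracy from the recipe's output dataset and comparing it to the train score. It assumes the rows have already been loaded as dictionaries and that the correctness flag is in a column called `prediction_correct`; that column name, the sample rows, and the train accuracy of 0.90 are all illustrative assumptions, so adjust them to what your project actually contains.

```python
def accuracy(correct_flags):
    """Return the fraction of records whose prediction was correct."""
    flags = list(correct_flags)
    if not flags:
        raise ValueError("no records to evaluate")
    return sum(1 for flag in flags if flag) / len(flags)


def overfit_gap(train_accuracy, test_accuracy):
    """Positive values mean the model scores better on train than on test."""
    return train_accuracy - test_accuracy


# Hypothetical rows from the Evaluation recipe's output dataset.
# "prediction_correct" is an assumed column name for the correctness flag.
rows = [
    {"prediction": "yes", "prediction_correct": True},
    {"prediction": "no", "prediction_correct": True},
    {"prediction": "no", "prediction_correct": False},
    {"prediction": "yes", "prediction_correct": True},
    {"prediction": "no", "prediction_correct": True},
]

test_acc = accuracy(row["prediction_correct"] for row in rows)
train_acc = 0.90  # hypothetical train accuracy, read from the training results
gap = overfit_gap(train_acc, test_acc)

print(f"test accuracy: {test_acc:.2f}, train/test gap: {gap:.2f}")
```

A large positive gap suggests overfitting. With imbalanced classes, accuracy alone can be misleading, so it is worth comparing metrics such as F1 or AUC in the same way.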

Regarding class imbalance, please refer to the following community post: https://community.dataiku.com/t5/Using-Dataiku/How-to-train-a-classification-model-on-an-imbalanced-...
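As a rough illustration of one common way to handle imbalance (not necessarily the approach the linked post describes), the sketch below computes class weights inversely proportional to class frequency, using the usual "balanced" heuristic `n_samples / (n_classes * class_count)`. The label values and counts are made up for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in counts.items()
    }


# Hypothetical target column: 90 negatives and 10 positives.
labels = ["no"] * 90 + ["yes"] * 10
weights = balanced_class_weights(labels)

for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

The rare class ends up with the larger weight, so its errors count more during training. Resampling the train set (undersampling the majority class or oversampling the minority class) is another option, provided the test set keeps its original distribution.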

Please let us know if you have any questions.

Thanks!

Jordan
