Class imbalance and test and train accuracy scores in Dataiku

sir Registered Posts: 2

Hi All,

1) How do I obtain test set accuracy in Dataiku? I want to make sure that I am not overfitting and that my train set accuracy is close to my test set accuracy. Currently, I don't know how to see the test set accuracy.

2) What are the detailed steps to address class imbalance? Can someone show them with screenshots?

Thank you so much!

S

Answers

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker

    Hi @sir,

    During the training phase, DSS holds out the test set and trains the model only on the train set. This ensures that the evaluation is done on data that the model has never seen before. To get test set results, you will need to run an Evaluation recipe.

    An Evaluation recipe takes as inputs:

    • an evaluation dataset

    • a model

    An Evaluation recipe can have up to three outputs:

    • an Evaluation Store, containing the main Model Evaluation and all associated result screens

    • an output dataset, containing the input features, prediction and correctness of prediction for each record

    • a metrics dataset, containing just the performance metrics for this evaluation (i.e. it’s a subset of the Evaluation Store)

    The metrics dataset will provide results that you can compare to the train set results.
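    To make the train/test comparison concrete, here is a minimal Python sketch of the calculation itself. It does not use the Dataiku API; it assumes you have exported rows that carry the actual label and the predicted label, and the column names `target` and `prediction` as well as the helper names are hypothetical.

```python
def accuracy(rows, label_col="target", pred_col="prediction"):
    """Fraction of rows whose predicted class equals the actual class."""
    if not rows:
        raise ValueError("no rows to score")
    correct = sum(1 for r in rows if r[label_col] == r[pred_col])
    return correct / len(rows)


def overfit_gap(train_rows, test_rows, label_col="target", pred_col="prediction"):
    """Train accuracy minus test accuracy; a large positive value hints at overfitting."""
    return (accuracy(train_rows, label_col, pred_col)
            - accuracy(test_rows, label_col, pred_col))


# Toy data standing in for scored train and test rows (made up for illustration).
train_rows = [
    {"target": 1, "prediction": 1},
    {"target": 0, "prediction": 0},
    {"target": 1, "prediction": 1},
    {"target": 0, "prediction": 0},
]
test_rows = [
    {"target": 1, "prediction": 1},
    {"target": 0, "prediction": 1},  # one misclassified record
    {"target": 1, "prediction": 1},
    {"target": 0, "prediction": 0},
]

print("train accuracy:", accuracy(train_rows))
print("test accuracy:", accuracy(test_rows))
print("gap:", overfit_gap(train_rows, test_rows))
```

    A gap close to zero suggests the model generalizes about as well as it fits; how large a gap is acceptable depends on your use case.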

    Regarding class imbalance, please refer to the following community post: https://community.dataiku.com/t5/Using-Dataiku/How-to-train-a-classification-model-on-an-imbalanced-dataset/m-p/6119
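    As a general illustration of one common way to handle imbalance (not necessarily the approach described in the linked post), the sketch below computes inverse-frequency class weights in plain Python. The formula, n_samples / (n_classes * class_count), is the same heuristic scikit-learn uses for `class_weight="balanced"`; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rare classes get proportionally larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy imbalanced target: 90 negatives and 10 positives (made up for illustration).
labels = ["no"] * 90 + ["yes"] * 10
weights = balanced_class_weights(labels)

for cls, weight in sorted(weights.items()):
    print(cls, round(weight, 3))
```

    With these weights, each class contributes the same total weight during training, so the minority class is not drowned out by the majority class.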

    Please let us know if you have any questions.

    Thanks!

    Jordan
