Output of a random forest classification

Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 18 ✭✭✭✭

Hi, I hope you are doing well.

I have a binary classification problem with two classes, 0 and 2.

When I test my model on another dataset, I get three columns in the output: the probability of being 0 (proba_0), the probability of being 2 (proba_2), and the predicted class (0 or 2). I expected the logic to be: if proba_0 > 0.5, the algorithm predicts 0, otherwise 2.

But the last row does not follow this logic, as the picture shows.

Kind regards

Answers

  • Dataiker Posts: 37 Dataiker

    Hello,

    When you use visual ML in DSS for a binary classification problem, DSS computes an optimal threshold to decide which class is predicted, and this threshold is not necessarily equal to 0.5:

    * How it is computed is set in the Design > Metrics tab; by default, it is chosen to optimize the F1 score

    * Then, in the model report, you can see how the threshold value affects the metrics in the "Confusion matrix" tab

    * When running a scoring/evaluation recipe, you can either keep the computed threshold of the model, or override it for this run in the "Threshold" section of the recipe

    If you wish to have a 0.5 threshold, you can change the settings of your recipe accordingly.

    Hope this helps,

    Best regards,
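
To illustrate why an F1-optimized threshold need not be 0.5, here is a minimal sketch in plain Python. It is not DSS's actual implementation, which the thread does not show. The function names, the candidate grid, the 0/1 labels and the toy data are all hypothetical, and predicting the positive class when the probability is greater than or equal to the threshold is an assumption.

```python
def f1_at_threshold(y_true, proba_pos, threshold):
    """F1 score when predicting class 1 whenever proba_pos >= threshold (assumed rule)."""
    tp = fp = fn = 0
    for y, p in zip(y_true, proba_pos):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, proba_pos, candidates):
    """Return the first candidate threshold that maximizes F1."""
    return max(candidates, key=lambda t: f1_at_threshold(y_true, proba_pos, t))


# Toy data (hypothetical): true labels and predicted probability of class 1.
y_true = [0, 0, 1, 1, 1, 0]
proba_pos = [0.12, 0.37, 0.31, 0.46, 0.83, 0.22]

# Hypothetical candidate grid: 0.05, 0.10, ..., 0.95.
candidates = [i / 20 for i in range(1, 20)]

best = best_f1_threshold(y_true, proba_pos, candidates)
print("best threshold:", best)
print("F1 at best threshold:", f1_at_threshold(y_true, proba_pos, best))
print("F1 at 0.5:", f1_at_threshold(y_true, proba_pos, 0.5))
```

On this toy data the F1-maximizing threshold is well below 0.5, so a row whose class 1 probability is under 0.5 can still be predicted as class 1, which matches the kind of row the question describes.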

  • Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 18 ✭✭✭✭

    Hello,

    So if I choose a threshold of 0.025, which probability decides whether the prediction is 0 or 1?

    Best regards

  • Dataiker Posts: 37 Dataiker

    Hello,

    Then the limit probability is the threshold itself, i.e. 0.025 (2.5%). Above it, the prediction is 1; below it, the prediction is 0.

    Hope this helps,

    Best regards
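
The rule in the answer above can be sketched as a small Python function. This is an illustration only: the function name, its parameters and the sample probabilities are hypothetical. The thread does not say how a probability exactly equal to the threshold is handled, so treating it as class 1 here is an assumption.

```python
def predict_with_threshold(proba_positive, threshold, negative_label=0, positive_label=1):
    """Predict positive_label when proba_positive reaches the threshold.

    Assumption: a probability exactly equal to the threshold counts as positive.
    """
    return positive_label if proba_positive >= threshold else negative_label


# Hypothetical predicted probabilities of class 1 for four rows.
probas = [0.01, 0.03, 0.40, 0.90]

# With a threshold of 0.025, almost every row is predicted as class 1.
low_threshold_preds = [predict_with_threshold(p, 0.025) for p in probas]

# With a threshold of 0.5, only the last row is predicted as class 1.
half_threshold_preds = [predict_with_threshold(p, 0.5) for p in probas]

print(low_threshold_preds)
print(half_threshold_preds)
```

The same function covers the original question's classes by passing `positive_label=2`.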
