Output of a random forest classification
Hi, I hope you are doing well.
I have a binary classification problem with two classes, 0 and 2.
When I test my model on another dataset, I get three columns in the output: the probability of being 0 (proba_0), the probability of being 2 (proba_2), and the predicted class (0 or 2). I expected that if proba_0 > 0.5 the algorithm would predict 0, and otherwise 2.
However, the last row does not follow this logic, as the picture shows.
Kind regards
Answers

Hello,
When you use visual ML in DSS for a binary classification problem, an optimal threshold for deciding which class is selected is computed, and it is not necessarily equal to 0.5:
* How it is computed is set in the Design > Metrics tab. By default, it is chosen to optimize the F1 score.
* In the model report, the "Confusion matrix" tab shows how the threshold value affects the metrics.
* When running a scoring/evaluation recipe, you can either keep the threshold computed for the model or override it for that run in the "Threshold" section of the recipe.
If you wish to have a 0.5 threshold, you can change the settings of your recipe accordingly.
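
To make the effect concrete, here is a minimal Python sketch of the idea, not DSS's actual implementation. It assumes the threshold is compared against the probability of the positive class (taken here to be class 2, i.e. proba_2), that a probability equal to the threshold counts as positive, and that candidate thresholds are scanned on a fixed grid. The function names and the validation values are made up for illustration.

```python
def predict_with_threshold(proba_positive, threshold,
                           negative_label=0, positive_label=2):
    """Predict the positive label when its probability reaches the threshold."""
    return positive_label if proba_positive >= threshold else negative_label


def f1_score(y_true, y_pred, positive_label=2):
    """F1 score for the positive label, computed from TP / FP / FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(y_true, proba_positive, candidates):
    """Return the first candidate threshold that gives the highest F1 score."""
    best_t, best_f1 = None, -1.0
    for t in candidates:
        y_pred = [predict_with_threshold(p, t) for p in proba_positive]
        score = f1_score(y_true, y_pred)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


# Made-up validation data: true classes and the model's proba_2 for each row.
y_true = [0, 0, 0, 2, 2, 2]
proba_2 = [0.10, 0.20, 0.45, 0.40, 0.70, 0.90]

# Hypothetical grid of candidate thresholds: 0.05, 0.10, ..., 0.95.
candidates = [i / 100 for i in range(5, 100, 5)]
threshold, f1 = best_f1_threshold(y_true, proba_2, candidates)

# A row with proba_0 = 0.60 and proba_2 = 0.40:
print("F1-optimal threshold:", threshold)
print("prediction at that threshold:", predict_with_threshold(0.40, threshold))
print("prediction at 0.5:", predict_with_threshold(0.40, 0.5))
```

With these toy values the search settles on a threshold below 0.5, so a row where proba_0 is 0.60 is still predicted as class 2. That is the same pattern as the last row in the question.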
Hope this helps,
Best regards,

Hello,
So if I choose a threshold of 0.025, what is the probability that decides whether the prediction is 0 or 1?
Best regards

Hello,
Then the cutoff probability is the threshold itself, i.e. 0.025 (2.5%). If the predicted probability of class 1 is above it, the prediction is 1; if it is below, the prediction is 0.
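
As a rough illustration in Python (a sketch only, with made-up probabilities and a hypothetical `classify` helper; how a value exactly equal to the threshold is handled is an assumption here):

```python
def classify(proba_1, threshold=0.025):
    # Assumption: a probability exactly equal to the threshold counts as class 1.
    return 1 if proba_1 >= threshold else 0


# Made-up probabilities of class 1 for four rows.
rows = [0.010, 0.030, 0.500, 0.990]
predictions = [classify(p) for p in rows]

for p, label in zip(rows, predictions):
    print(f"proba_1={p:.3f}  proba_0={1 - p:.3f}  prediction={label}")
```

With such a low threshold, a row is predicted 1 even when proba_0 is as high as 0.97; only rows where the probability of class 1 is below 2.5% are predicted 0.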
Hope this helps,
Best regards