Optimize threshold options in Dataiku
Dataiku provides many options for evaluating a model in the section "Optimize model hyperparameters for".
However, there are only 3 options in the section below it, "Optimize threshold for".
Why are the other metrics (like AUC) removed from the list?
Answers
-
Hi,
Threshold optimization can only use metrics that actually depend on the threshold. AUC and log loss are "global" metrics that are computed from the predicted probabilities themselves, so it would not make sense to use them to optimize the threshold (they are fully independent of it).
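To make this concrete, here is a minimal pure-Python sketch (not Dataiku code). The toy labels and scores, and the helper names `auc` and `f1_at`, are made up for illustration. AUC is computed without any threshold argument, while F1 changes as the threshold moves.

```python
# Toy data (hypothetical): true labels and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. Note that no threshold is involved.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at(y_true, scores, threshold):
    """F1 score when predicting positive for every score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


auc_value = auc(y_true, scores)
f1_by_threshold = {t: f1_at(y_true, scores, t) for t in (0.2, 0.5, 0.8)}

print("AUC:", auc_value)
for t, value in f1_by_threshold.items():
    print(f"F1 at threshold {t}: {value:.3f}")
```

On this toy data the AUC is a single number (0.75) whatever threshold you pick, whereas F1 takes a different value at each of the three thresholds, so only F1 gives the optimizer something to search over.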
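Precision and recall are also excluded because each one, taken alone, leads to a degenerate threshold: "optimizing for precision" would push the threshold toward 1 (the model flags almost nothing as positive), and "optimizing for recall" would always yield 0 (the model flags everything as positive). You need a metric that represents a balance between the two, which is why F1 or a cost matrix is almost always what you want.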
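A small threshold sweep illustrates the point. This is a pure-Python sketch on made-up data, not what Dataiku runs internally. The helper names and the convention of scoring precision as 0 when nothing is predicted positive are assumptions of the example.

```python
# Toy data (hypothetical): true labels and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]


def confusion(threshold):
    """Return (tp, fp, fn) when predicting positive for score >= threshold."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
    return tp, fp, fn


def precision(tp, fp, fn):
    # Assumption: precision is scored as 0 when there are no positive predictions.
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fp, fn):
    return tp / (tp + fn) if tp + fn else 0.0


def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Candidate thresholds 0.00, 0.05, ..., 1.00.
thresholds = [i / 20 for i in range(21)]


def best_threshold(metric):
    """First threshold in the grid that maximizes the given metric."""
    return max(thresholds, key=lambda t: metric(*confusion(t)))


best_recall_t = best_threshold(recall)
best_precision_t = best_threshold(precision)
best_f1_t = best_threshold(f1)

print("best threshold for recall:   ", best_recall_t)
print("best threshold for precision:", best_precision_t)
print("best threshold for F1:       ", best_f1_t)
print("recall at the precision-optimal threshold:",
      recall(*confusion(best_precision_t)))
```

On this toy data, recall is maximized at threshold 0, precision is maximized at a high threshold (0.85) where only one of the four positives is still caught, and F1 settles on an intermediate threshold (0.35).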