Output explanations of a subset of data
Hi,
I have a large dataset, and my model estimates a probability for each record. I would like to limit "output explanations" to the 1% of records with the highest probabilities, since the calculation is quite time- and resource-consuming. I know that I can calculate all probabilities once, then filter the dataset, and finally calculate probabilities with explanations for the filtered subset. But I was hoping there was a more elegant way to do this.
-anders-
Best Answer
Hi,
There is no option in the scoring recipe to selectively explain rows based on some predicate. The filtering you suggested would be the way to go.
The model's report does have a comparable feature, though: the Individual explanations tab can explain the predictions for the rows of the test set with the highest and/or lowest probabilities.
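For the filtering step between the two scoring passes, here is a minimal pandas sketch of selecting the top 1% of rows by probability. It assumes the first scoring pass produced a DataFrame with a probability column; the column name `proba_1` and the helper `top_fraction` are illustrative assumptions, not part of any product API.

```python
import pandas as pd


def top_fraction(scored: pd.DataFrame,
                 proba_col: str = "proba_1",  # assumed column name, adjust to your output
                 fraction: float = 0.01) -> pd.DataFrame:
    """Return the rows with the highest values in `proba_col`.

    Keeps round(len(scored) * fraction) rows, and at least one row
    when the input is not empty.
    """
    n_keep = max(1, int(round(len(scored) * fraction)))
    # nlargest sorts by probability (descending) and keeps the first n_keep rows
    return scored.nlargest(n_keep, proba_col)


# Toy data standing in for the output of the first scoring pass
scored = pd.DataFrame({
    "id": range(1000),
    "proba_1": [i / 1000 for i in range(1000)],
})
subset = top_fraction(scored)
print(len(subset))
```

The resulting subset would then be the input of the second scoring pass, this time with explanations enabled, so the expensive computation only runs on the rows of interest.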