Output explanations of a subset of data
Hi,
I have a large dataset, and my model estimates probabilities for each record. I would like to limit "output explanations" to the records with the 1% highest probabilities, since the calculation is quite time and resource consuming. I know that I can calculate all probabilities once, then filter the dataset, and finally calculate probabilities with explanations for the filtered subset. But I was hoping there is a more elegant way to do this.
anders
Best Answer

Hi,
There is no option in the scoring recipe to selectively explain rows based on some predicate. The filtering you suggested would be the way to go.
The model's report does have a similar feature, though: the Individual explanations tab can explain the predictions for the rows of the test set with the highest and/or lowest probabilities.
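For the filtering step between the two scoring runs, here is a minimal sketch in plain Python of how the top 1% could be selected. It is only an illustration: the `top_fraction` helper, the `id` field and the `proba_1` column name are assumptions, not something the scoring recipe provides, so adjust them to whatever your scored dataset actually contains.

```python
import heapq


def top_fraction(records, proba_key, fraction=0.01):
    """Return the records whose probability is in the top `fraction`.

    `records` is a list of dicts, one per scored row (hypothetical layout).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not records:
        return []
    # Keep at least one row, even for very small datasets.
    n_keep = max(1, round(len(records) * fraction))
    # nlargest avoids sorting the whole dataset when only a few rows are kept.
    return heapq.nlargest(n_keep, records, key=lambda r: r[proba_key])


# Hypothetical output of a first scoring run without explanations.
scored = [{"id": i, "proba_1": i / 1000} for i in range(1000)]

# Rows to send through the second scoring run, this time with explanations.
to_explain = top_fraction(scored, "proba_1", fraction=0.01)
print(len(to_explain))
```

The filtered rows would then be the input of a second scoring recipe with explanations enabled. If the scored data is held in a pandas DataFrame instead, `DataFrame.nlargest` does the same selection.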