
What is globalExplanationsTopImportances ? How is it calculated?

Solved!
MNOP
Level 3

I'm using the API to retrieve the following information for a regression model.

 

details.get_performance_metrics()['globalExplanationsTopImportances']

 

This returns a list of dictionaries with keys "s" and "d", as follows:

 

[{"s": "Feature1", "d": 0.25}, {"s": "Feature2", "d": 0.15}]

 

What value is given as "d"? What are the criteria for selecting the top features here (I see a varying number of features in this list)?
Is there any way to get the importances of all the variables as a dictionary?


Operating system used: Windows

1 Solution
AlexisD
Dataiker

Hello !

Those values are computed using Shapley values. You can find more details about the process here.

`globalExplanationsTopImportances` contains the 10 largest feature importance values.

I believe you are getting your `details` from `get_trained_model_snippet`. You can get the whole absolute feature importance dictionary using `get_trained_model_details("id").details["globalExplanationsAbsoluteImportance"]` instead. The top 10 values of that dictionary should match the ones from the `globalExplanationsTopImportances` snippet.
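To turn the snippet's list into the plain `{feature: importance}` dictionary you asked about, the `"s"`/`"d"` entries can be reshaped directly. A minimal sketch using the sample values from your post (`importances_to_dict` is just an illustrative helper name, not a Dataiku API):

```python
# Sample payload mirroring the "s"/"d" structure returned by
# details.get_performance_metrics()['globalExplanationsTopImportances'].
top_importances = [{"s": "Feature1", "d": 0.25}, {"s": "Feature2", "d": 0.15}]

def importances_to_dict(entries):
    """Reshape [{"s": name, "d": value}, ...] into {name: value}."""
    return {entry["s"]: entry["d"] for entry in entries}

print(importances_to_dict(top_importances))
# {'Feature1': 0.25, 'Feature2': 0.15}
```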

Note that we don't necessarily compute the feature importances on all columns as this is a compute-heavy process. In most cases we compute a surrogate model (random forest regressor) and use the feature importances of this model to select the columns we will compute absolute feature importances on.
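As an illustration of that surrogate idea (this is a sketch, not Dataiku's actual implementation; the model settings, data, and selection threshold are invented for the example), a random forest regressor's importances can be used to shortlist columns before running the expensive Shapley computation:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# The target depends strongly on columns 0 and 1 only; the rest are noise.
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Fit a cheap surrogate model on all columns.
surrogate = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Keep only the columns whose surrogate importance clears a threshold;
# these are the ones that would get full feature-importance treatment.
selected = [i for i, imp in enumerate(surrogate.feature_importances_) if imp > 0.05]
print(selected)
```

With this synthetic data the two signal-bearing columns dominate the surrogate's importances, so only they survive the cut, which is why such a list can contain a varying number of features.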

I hope that helps.

