Could you please help interpret what "Other" stands for in the Variable Importance chart shown?
Hi Team,
The chart below shows the variable importance for my prediction model. Could you please help me interpret what "SN_PN is Other" means? I don't have an "Other" class in the SN_PN variable data, where SN_PN stands for SerialNumber_ProductNumber and my target variable is "ResolutionCode". Thanks in advance.
Answers
tgb417
My guess is that your feature SN_PN is a categorical variable with very high cardinality. (I assume it is something like an alphanumeric serial number plus a part number.)
If you look at the feature handling settings for this feature in your model setup, you should see that Data Science Studio (DSS) has noted this: rather than one-hot encoding every possible value of the feature, it encodes a subset and places all the other, less frequent values in the category "Other". The "SN_PN is Other" entry in your chart is the resulting dummy column, not a value that exists in your data.
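To make the idea concrete, here is a minimal pandas sketch of that kind of encoding. It is only an illustration of the general technique, not how DSS implements it; the function name `one_hot_top_n`, the `max_categories` parameter, the sample values, and the " is " column naming are all assumptions I made up to mirror the label in your chart.

```python
import pandas as pd

def one_hot_top_n(values, max_categories):
    """One-hot encode the most frequent categories; lump the rest into 'Other'.

    Hypothetical helper for illustration only, not a DSS function.
    """
    s = pd.Series(values, name="SN_PN")
    # Keep the max_categories most frequent values as their own categories.
    top = s.value_counts().index[:max_categories]
    # Every value outside that set is replaced by the single label "Other".
    lumped = s.where(s.isin(top), other="Other")
    # One dummy column per remaining category, e.g. "SN_PN is Other".
    return pd.get_dummies(lumped, prefix="SN_PN", prefix_sep=" is ")

# Made-up sample data: two frequent values and two rare ones.
values = ["A1_P1", "A1_P1", "A1_P1", "B2_P2", "B2_P2", "C3_P3", "D4_P4"]
encoded = one_hot_top_n(values, max_categories=2)
print(list(encoded.columns))
```

In this toy data the two rare values, C3_P3 and D4_P4, both end up flagged in the single "SN_PN is Other" column, even though "Other" never appears in the raw data.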
It appears to me that your model has discovered that knowing whether a row has one of the less common values of SN_PN is helpful for predicting your target value.
Hope that helps. You can increase the number of values that will be encoded, or try one of the other encoding strategies to see if you can get a better model.
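As one example of a different strategy, frequency encoding replaces each category with how often it occurs, so a high-cardinality feature becomes a single numeric column and no "Other" bucket is needed. The sketch below is generic pandas, not DSS code; the function name `frequency_encode` and the sample values are made up, and you would need to check the feature handling settings to see which encodings your DSS version actually offers.

```python
import pandas as pd

def frequency_encode(values):
    """Replace each category with its relative frequency in the data.

    Hypothetical helper for illustration only, not a DSS function.
    """
    s = pd.Series(values, name="SN_PN")
    # Share of rows taken by each distinct value (the shares sum to 1).
    freq = s.value_counts(normalize=True)
    # Look up each row's value in the frequency table.
    return s.map(freq)

# Made-up sample data: two frequent values and two rare ones.
values = ["A1_P1", "A1_P1", "A1_P1", "B2_P2", "B2_P2", "C3_P3", "D4_P4"]
encoded = frequency_encode(values)
print(encoded.tolist())
```

Rare values still share the same small number here, so it is worth comparing model performance across strategies rather than assuming one is better.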
Have fun with this.