
Potential Bug in partial dependence Plot

Level 1
If I fit a logistic regression on a classification problem and look at the partial dependence plot of one of my features, I see a non-linear curve. If the partial dependence were measured in probability, I would be fine with that.

But the plot is said to be measured in log odds. If that is the case, then for a logistic regression model only a linear partial dependence plot should be possible (at least as far as I understand).
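
Written out, the reasoning looks like this. It is only a sketch: it assumes the partial dependence for feature j is the model's log odds averaged over the n rows of the data with feature j fixed at x, and the symbols (β, x_ik, PD_j) are introduced here for illustration and do not come from the DSS documentation.

```latex
\begin{aligned}
\operatorname{logit} p(x_i) &= \beta_0 + \sum_{k} \beta_k x_{ik} \\
\mathrm{PD}_j(x) &= \frac{1}{n} \sum_{i=1}^{n} \Bigl( \beta_0 + \beta_j x + \sum_{k \neq j} \beta_k x_{ik} \Bigr)
 = \beta_j x + \underbrace{\beta_0 + \sum_{k \neq j} \beta_k \bar{x}_k}_{\text{constant in } x}
\end{aligned}
```

Under that assumption the curve is a straight line with slope β_j. If the average were instead taken over probabilities and only then converted to log odds, the result would not in general be a line.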

Please let me know whether this is a bug or a fundamental misunderstanding on my part 😉

best regards,

Simon

3 Replies
Dataiker

Hello @SimonK ,

You are absolutely right: on a classification problem, the partial dependence plot of a logistic regression should be linear when it is measured in log odds.
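
This can be checked outside DSS with a small sketch. It uses scikit-learn on synthetic data as a stand-in, so it says nothing about how DSS computes its plot; the helper `partial_dependence_log_odds` is a hypothetical name written for this example, and it assumes the partial dependence is the average log odds with one feature held fixed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_logits = 0.5 + 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.3 * X[:, 2]
y = (rng.random(500) < 1 / (1 + np.exp(-true_logits))).astype(int)

model = LogisticRegression().fit(X, y)


def partial_dependence_log_odds(model, X, feature, grid):
    """Average the model's log odds over the data with one feature fixed."""
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v
        # decision_function returns the log odds for a binary classifier.
        values.append(model.decision_function(X_mod).mean())
    return np.array(values)


grid = np.linspace(-2, 2, 9)
pd_log_odds = partial_dependence_log_odds(model, X, 0, grid)

# On the log-odds scale every segment has the same slope: the coefficient.
slopes = np.diff(pd_log_odds) / np.diff(grid)
is_linear = np.allclose(slopes, model.coef_[0, 0])

# On the probability scale the averaged curve is not a straight line.
pd_prob = np.array([
    model.predict_proba(np.column_stack([np.full(len(X), v), X[:, 1:]]))[:, 1].mean()
    for v in grid
])
prob_slopes = np.diff(pd_prob) / np.diff(grid)
prob_is_linear = np.allclose(prob_slopes, prob_slopes[0])

print(is_linear, prob_is_linear)
```

With raw features only, the log-odds curve has a constant slope equal to the fitted coefficient, while the probability curve bends.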

The nonlinearity you are seeing could be caused by a polynomial combination of the feature on which you are computing the partial dependence with another feature. Could you check that you did not add any feature generation mechanism during the design phase of your model?
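
To illustrate how generated features can bend the curve, here is a minimal sketch, again with scikit-learn on synthetic data rather than DSS. `PolynomialFeatures` is used as an assumed stand-in for a feature generation step; it is not claimed to match what DSS does internally.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data whose log odds depend on the square of feature 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
true_logits = -1.0 + 0.5 * X[:, 0] + 1.0 * X[:, 0] ** 2 - 0.8 * X[:, 1]
y = (rng.random(1000) < 1 / (1 + np.exp(-true_logits))).astype(int)

# Logistic regression fitted on generated degree-2 terms (x0^2, x0*x1, x1^2).
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LogisticRegression(max_iter=1000),
)
model.fit(X, y)

# Partial dependence of the ORIGINAL feature 0, averaged on the log-odds scale.
grid = np.linspace(-2, 2, 9)
pd_log_odds = []
for v in grid:
    X_mod = X.copy()
    X_mod[:, 0] = v
    pd_log_odds.append(model.decision_function(X_mod).mean())
pd_log_odds = np.array(pd_log_odds)

# The slope now changes along the grid because of the x0^2 term.
slopes = np.diff(pd_log_odds) / np.diff(grid)
is_linear = np.allclose(slopes, slopes[0])
print(is_linear)
```

The model is still linear in its inputs, but those inputs now include a squared term, so the plot against the original feature is a parabola in log odds.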

Best regards,

Louis Pouillot

Level 1
Author

Hello @louisplt ,

I did not add any feature generation mechanism.

Best regards,

Simon

Dataiker

Hello @SimonK,

Could you make sure you did not tick the derived features option for any numerical feature of your model, as in the following screenshot:

[Screenshot: Capture d’écran 2020-06-08 à 15.28.46.png]

Best regards,

Louis Pouillot
