Potential bug in partial dependence plot

SimonK
SimonK Partner, Registered Posts: 3 Partner

If I fit a logistic regression on a classification problem and look at the partial dependence plot of one of my features, I see a non-linear curve. If the partial dependence were measured in probability, I would be fine with that.

But the documentation says it is measured in log-odds. If so, only a linear partial dependence plot should be possible for a logistic regression model, because the model's log-odds are a linear function of each feature (at least as I understand it).

Please let me know whether this is a bug or a fundamental misunderstanding on my part.

best regards,

Simon
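
The expectation in the question can be checked numerically. Below is a minimal sketch, not Dataiku's implementation: it assumes a hypothetical fitted model logit(p) = B0 + B1*x1 + B2*x2 with made-up coefficients and synthetic values for x2, and it computes partial dependence by forcing x1 to each grid value and averaging over the data. Averaging the log-odds gives equal steps on an evenly spaced grid (a straight line), while averaging the probabilities does not.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients of a fitted logistic regression (assumption).
B0, B1, B2 = -0.5, 1.2, -0.8

# Synthetic values of the other feature x2 (assumption, for illustration only).
random.seed(0)
x2_values = [random.gauss(0.0, 2.0) for _ in range(500)]


def pd_log_odds(x1):
    """Average the model's log-odds over the data with x1 forced to one value."""
    return sum(B0 + B1 * x1 + B2 * x2 for x2 in x2_values) / len(x2_values)


def pd_probability(x1):
    """Average the model's predicted probability over the data instead."""
    return sum(sigmoid(B0 + B1 * x1 + B2 * x2) for x2 in x2_values) / len(x2_values)


def steps(curve):
    """Differences between consecutive points of a curve."""
    return [b - a for a, b in zip(curve, curve[1:])]


grid = [-2.0, -1.0, 0.0, 1.0, 2.0]  # evenly spaced values of x1
log_odds_steps = steps([pd_log_odds(x) for x in grid])
prob_steps = steps([pd_probability(x) for x in grid])

print("log-odds steps:   ", [round(s, 4) for s in log_odds_steps])
print("probability steps:", [round(s, 4) for s in prob_steps])
```

With a grid spacing of 1, every log-odds step equals B1, whereas the probability steps vary along the grid. A curve that bends in log-odds therefore suggests that the model sees something other than the raw feature alone.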

Answers

  • louisplt
    louisplt Dataiker Posts: 21 Dataiker

    Hello @SimonK,

    You are absolutely right: for a logistic regression on a classification problem, the partial dependence plot should be linear when it is measured in log-odds.

    The non-linearity you are seeing could be caused by a polynomial combination of the feature on which you are computing the partial dependence with another feature. Could you check that you did not add any feature generation mechanism during the design phase of your model?

    Best regards,

    Louis Pouillot

  • SimonK
    SimonK Partner, Registered Posts: 3 Partner

    Hello @louisplt,

    I did not add any feature generation mechanism.

    Best regards,

    Simon

  • louisplt
    louisplt Dataiker Posts: 21 Dataiker

    Hello @SimonK,

    Could you make sure that the derived features option is not ticked for any numerical feature of your model, as in the following screenshot:

    [Screenshot: Capture d’écran 2020-06-08 à 15.28.46.png]

    Best regards,

    Louis Pouillot
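
The effect of derived features mentioned in the replies can be illustrated with a small sketch. It is not Dataiku's implementation: the coefficients and data are made up, and the derived feature is assumed to be the square of x1, which is only one possible kind of generated feature. The second differences of the log-odds partial dependence on an evenly spaced grid are zero for a straight line and non-zero once the squared term enters the model.

```python
# Hypothetical coefficients (assumption): B_SQ weights a derived feature x1**2.
B0, B1, B2, B_SQ = -0.5, 1.2, -0.8, 0.6

# Made-up values of the other feature x2, for illustration only.
x2_values = [-1.5, -0.5, 0.0, 0.5, 1.5]


def log_odds(x1, x2, use_square):
    """Log-odds of a logistic model, optionally with a derived squared term."""
    z = B0 + B1 * x1 + B2 * x2
    if use_square:
        z += B_SQ * x1 ** 2
    return z


def pd_log_odds(x1, use_square):
    """Partial dependence in log-odds: average over x2 with x1 held fixed."""
    total = sum(log_odds(x1, x2, use_square) for x2 in x2_values)
    return total / len(x2_values)


def second_differences(curve):
    """Discrete curvature; all zeros means the curve is a straight line."""
    return [c - 2 * b + a for a, b, c in zip(curve, curve[1:], curve[2:])]


grid = [-2.0, -1.0, 0.0, 1.0, 2.0]  # evenly spaced values of x1
plain = second_differences([pd_log_odds(x, False) for x in grid])
squared = second_differences([pd_log_odds(x, True) for x in grid])

print("raw feature only:  ", [round(d, 6) for d in plain])
print("with squared term: ", [round(d, 6) for d in squared])
```

Under these assumptions the raw-feature model has zero curvature in log-odds, while the model with the squared term has a constant curvature of 2 * B_SQ, so the plot bends even though the estimator is still a logistic regression.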
