Loading a Dataiku model in Python notebooks or recipes

Paul Registered Posts: 3

Hello everyone,

I would like to discuss an issue I'm facing. I trained a gradient boosting classifier using the Dataiku Lab. I would like to use SHAP explainability on it and, as a first step, try it in a Python notebook.

To do so, I am loading my model this way:

import dataiku
from dataiku import pandasutils as pdu

# Load the saved model by name, then get its predictor
model = dataiku.Model("Predict_categories")
predictor = model.get_predictor()

After this, I got a large number of warnings about a scikit-learn version mismatch:

/local/dataiku/data/code-envs/python/python39_ML/lib/python3.9/site-packages/sklearn/base.py:318: UserWarning: Trying to unpickle estimator DecisionTreeRegressor from version 0.20.4 when using version 1.2.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
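To make the mismatch explicit, here is a minimal sketch that pulls the two version numbers out of the warning text and compares them with whatever scikit-learn is installed in the current kernel. The helper names `versions_from_warning` and `installed_version` are hypothetical, chosen only for illustration, and the regular expression assumes the warning wording shown above.

```python
import re
from importlib import metadata

# Shortened copy of the warning shown above
WARNING_TEXT = (
    "Trying to unpickle estimator DecisionTreeRegressor from version 0.20.4 "
    "when using version 1.2.2."
)


def versions_from_warning(text):
    """Return (pickled_version, running_version) parsed from the warning text."""
    match = re.search(
        r"from version (\S+) when using version (\d+(?:\.\d+)*)", text
    )
    if match is None:
        raise ValueError("text does not look like the unpickle warning")
    return match.group(1), match.group(2)


def installed_version(package="scikit-learn"):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


pickled, running = versions_from_warning(WARNING_TEXT)
print("model was pickled with scikit-learn", pickled)
print("warning reports kernel version", running)
print("installed in this kernel:", installed_version())
```

If the installed version differs from the one the model was pickled with, the kernel is probably not using the same code env the model was trained in.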

After this, I am not able to recover the model in a "sklearn" format, which is what SHAP expects as input, because the variable predictor is actually an instance of type

 <class 'dataiku.core.saved_model.Predictor'>

and trying SHAP on it raises an exception of type

InvalidModelError: Model type not yet supported by TreeExplainer: <class 'dataiku.core.saved_model.Predictor'>

I dug into this object's attributes and methods, and I may have found what I need. The Predictor object or the Predictor._model object has a .clf attribute that would refer to the classifier. But accessing this attribute raises the following exception:

AttributeError: 'UnpicklableGradientBoostingClassifier' object has no attribute 'ccp_alpha'
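One possible explanation, not confirmed here, is that the estimator was pickled by a scikit-learn version that did not yet have the `ccp_alpha` parameter (as far as I know it was added in 0.22), so the restored object never gets that attribute. The toy sketch below imitates the mechanism with a hypothetical `ToyEstimator` class: default unpickling rebuilds an object from its saved attributes without calling `__init__`, so attributes introduced later are simply missing. It is not scikit-learn or Dataiku code.

```python
class ToyEstimator:
    """Hypothetical 'new version' of an estimator that gained a ccp_alpha parameter."""

    def __init__(self, max_depth=3, ccp_alpha=0.0):
        self.max_depth = max_depth
        self.ccp_alpha = ccp_alpha

    def get_params(self):
        # Newer code assumes every instance has ccp_alpha
        return {"max_depth": self.max_depth, "ccp_alpha": self.ccp_alpha}


# Attributes as an 'old version' of the class would have saved them
old_state = {"max_depth": 3}


def restore_like_pickle(cls, state):
    """Imitate default unpickling: create the object without __init__, then restore state."""
    obj = cls.__new__(cls)
    obj.__dict__.update(state)
    return obj


restored = restore_like_pickle(ToyEstimator, old_state)

try:
    restored.get_params()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

If this is what is happening, running the notebook with the same scikit-learn version the model was trained with (or retraining the model in the newer code env) would be the thing to try.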

My questions: Am I doing this the right way? Are these issues all related to the version difference between the notebook kernel and the Dataiku env?

Thank you for your time.
