Manipulation and documentation of custom model

Antal
Level 3
Hi there,

 

I built a model in Visual ML using a self-developed custom model plugin (CatBoost).

I also deployed the resulting model to the flow successfully.

 

I would love to be able to do 2 things with this model:

1. Export the model documentation using "Export model summary" on the Visual ML model summary page. This throws errors with the custom plugin, even though all the information that would end up in the document appears to be available on the model summary page. Is there any way to get this to work?

2. Pick up the model object in a Python notebook using the Python API. That way I can use the predictor for other tasks (for example, calculating permutation importance).

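Normally I'd do it like this:

import dataiku

# Retrieve the saved model from the flow by its ID
model = dataiku.Model("qC0ANLpX")
# Build the predictor for the active version of the saved model
predictor = model.get_predictor()
# _clf is a private attribute that holds the underlying estimator
clf = predictor._clf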

 

But that throws the ModuleNotFoundError shown below, I guess because the custom model is not a model type that Dataiku knows how to load. Is there another way to get the predictor or the underlying sklearn-style model object from the saved model in the flow?

 

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-c3a6a531446f> in <module>
      1 # Retrieve trained model object
      2 model = dataiku.Model("qC0ANLpX")
----> 3 predictor = model.get_predictor()
      4 clf = predictor._clf
      5 

/opt/dss/dataiku-dss-10.0.4/python/dataiku/core/saved_model.py in get_predictor(self, version_id)
    206                 model_folder = target_model_folder
    207 
--> 208             self._predictors[version_id] = build_predictor_for_saved_model(model_folder, self.get_type(), sm.get("conditionalOutputs", []))
    209         return self._predictors[version_id]
    210 

/opt/dss/dataiku-dss-10.0.4/python/dataiku/core/saved_model.py in build_predictor_for_saved_model(model_folder, model_type, conditional_outputs)
    331     from dataiku.doctor.utils.split import get_saved_model_resolved_split_desc
    332     split_desc = get_saved_model_resolved_split_desc(model_folder)
--> 333     return build_predictor(model_type, model_folder, model_folder, conditional_outputs, core_params, split_desc)
    334 
    335 

/opt/dss/dataiku-dss-10.0.4/python/dataiku/core/saved_model.py in build_predictor(model_type, model_folder, preprocessing_folder, conditional_outputs, core_params, split_desc, train_split_desc)
    396             pkl_path = osp.join(model_folder, "clf.pkl" if is_prediction else "clusterer.pkl")
    397             with open(pkl_path, "rb") as f:
--> 398                 clf = pickle.load(f)
    399                 try:
    400                     logger.info("Post-processing model")

ModuleNotFoundError: No module named 'modelcatboost'
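
Looking at the last frame, the failure happens inside pickle.load, which suggests that the notebook's Python process cannot import the module the model class was defined in. Below is a minimal, standard-library-only sketch of that mechanism. The module name demo_custom_model and the class DemoModel are hypothetical stand-ins for the plugin code, and whether putting the plugin's code on sys.path is the right fix inside DSS is an assumption that this sketch does not verify.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Hypothetical stand-in for the plugin module that defines the model class.
MODULE_NAME = "demo_custom_model"
MODULE_SOURCE = (
    "class DemoModel:\n"
    "    def __init__(self, depth):\n"
    "        self.depth = depth\n"
)

# Write the module into a temporary directory that plays the plugin folder.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, MODULE_NAME + ".py"), "w") as f:
    f.write(MODULE_SOURCE)

# "Training" side: the module is importable, so the object pickles fine.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
module = importlib.import_module(MODULE_NAME)
payload = pickle.dumps(module.DemoModel(depth=6))

# "Notebook" side: simulate a process where the module is not importable.
sys.path.remove(module_dir)
del sys.modules[MODULE_NAME]
del module
importlib.invalidate_caches()

load_error = None
try:
    pickle.loads(payload)
except ModuleNotFoundError as exc:
    # pickle stores only the module and class name, so it must import them.
    load_error = str(exc)
print("Without the module on sys.path:", load_error)

# Making the module importable again lets the same bytes load.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
restored = pickle.loads(payload)
print("With the module on sys.path:", type(restored).__name__, restored.depth)
```

If the same mechanism applies here, the pickled clf.pkl only records the name modelcatboost, so the notebook's code environment would need to be able to import that module before get_predictor() can succeed.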

 

 
