Answer: how to apply shap Python package to DSS visual model
MarkPundurs
Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 29 ✭✭✭✭
Here's the code I used for a visual KNN model:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
import shap
# Read recipe inputs
model_1 = dataiku.Model("l10JO8hN")
# get DSS Predictor from model
pred_1 = model_1.get_predictor()
feat_ds = dataiku.Dataset("myDatasetName")
# get pandas dataframe from DSS dataset
feat_df = feat_ds.get_dataframe()
# get numpy array from dataframe
feat_arr = feat_df.values
# Compute recipe outputs from inputs:
# define function that takes a numpy array as its argument and returns the output of the DSS Predictor
def shap_predict(feat_arr):
    # rebuild dataframe from array
    shap_df = pd.DataFrame.from_records(feat_arr, columns=feat_df.columns)
    # pass dataframe to DSS Predictor and return output
    return pred_1.predict(shap_df)
# pass array-to-prediction function and feature array as arguments to shap.KernelExplainer()
explainer = shap.KernelExplainer(shap_predict, feat_arr)
shap_values = explainer.shap_values(feat_arr)
shap_df = pd.DataFrame.from_records(shap_values)
# Write recipe outputs
# use a name other than "shap" so the imported shap module is not shadowed
shap_ds = dataiku.Dataset("shap_new")
shap_ds.write_with_schema(shap_df)
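One thing to watch with the array-to-prediction function above: if the dataset mixes numeric and text columns, feat_df.values has dtype=object, so the rebuilt dataframe loses its column dtypes. Below is a minimal sketch of a wrapper that restores them and returns a plain float array. StandInPredictor, make_array_predict and output_col are hypothetical names of mine, and the "prediction" column is an assumption about the predictor's output, so check it against what your own model returns.

```python
import numpy as np
import pandas as pd


class StandInPredictor:
    """Hypothetical stand-in for the DSS Predictor: DataFrame in, DataFrame out."""

    def predict(self, df):
        # Toy scoring rule, used only so the sketch runs outside DSS
        score = 2.0 * df["age"] + df["city"].map({"Riga": 1.0, "Oslo": 0.0})
        return pd.DataFrame({"prediction": score})


def make_array_predict(predictor, template_df, output_col="prediction"):
    """Build a function mapping a 2-D array to a 1-D float array of predictions."""
    columns = list(template_df.columns)
    dtypes = template_df.dtypes.to_dict()

    def array_predict(arr):
        # Rebuild the frame with the original column names, then restore dtypes,
        # because a mixed-type array round-trips as dtype=object
        df = pd.DataFrame(arr, columns=columns).astype(dtypes)
        return predictor.predict(df)[output_col].to_numpy(dtype=float)

    return array_predict


feat_df = pd.DataFrame({"age": [30, 41, 25], "city": ["Riga", "Oslo", "Riga"]})
feat_arr = feat_df.values  # dtype=object because the column types are mixed
predict_fn = make_array_predict(StandInPredictor(), feat_df)
preds = predict_fn(feat_arr)
print(preds)
```

In the recipe, the function returned by make_array_predict would take the place of shap_predict, with the real predictor passed in instead of the stand-in.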
Answers
Hi,
This approach is not fully supported for visual models. It may work for some, but it is not guaranteed to work for every kind of model or preprocessing. Thanks for sharing the code that worked for you, though.
Note that DSS 12.1 natively offers Shapley value estimation and lets you export the values to a dataset.
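For intuition about what a Shapley value estimate approximates, here is a small sketch that computes exact Shapley values for one row by enumerating every feature coalition, with absent features filled in from background rows. This only illustrates the definition. It is not how DSS or shap.KernelExplainer computes the values (KernelExplainer approximates them by sampling coalitions), and exact_shapley and toy_model are hypothetical names.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, background):
    """Exact Shapley values for one row x, averaging absent features over background rows."""
    n = len(x)

    def value(subset):
        # Mean prediction when features in `subset` come from x
        # and the remaining features come from each background row in turn
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(row):
    # Hypothetical model with one additive term and one interaction term
    return 3.0 * row[0] + 2.0 * row[1] * row[2]


background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
x = [2.0, 1.0, 3.0]
phi = exact_shapley(toy_model, x, background)
base = sum(toy_model(r) for r in background) / len(background)
print(phi, base)
```

The values sum to the difference between the prediction for x and the mean background prediction. The cost grows exponentially with the number of features, which is why estimation is used in practice.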