Extracting variable importance from model recipe through code recipe

shreyanshv6 · January 2023

Hi, I want to automate extracting variable importance from a model as a data frame and use it for further processing in a Python recipe.

However, when I use the method from https://community.dataiku.com/t5/Using-Dataiku/How-to-get-Variable-Importance-from-Model/td-p/3589 [trained_model_detail.get_raw().get('perf').get('variables_importance')], I get different importance scores than those shown in the model.

Also, dataiku.Model("hSO3BRlk").list_versions() shows only the top 10 variables by importance score, but I need more than 10.

I am running only one model, so there is no possibility of picking the wrong model.

I would like to build on the code below, which is already written in a Python recipe and starts from the deployed model:

import dataiku

model_1 = dataiku.Model('hSO3BRlk')
pred_1 = model_1.get_predictor()

Kindly help.

Sarina · January 2023

Hi @shreyanshv6,

Can you check what you get if you print out trained_model_ids in your code? For example:

[Screenshot: Screen Shot 2023-01-11 at 6.10.21 PM.png]

I want to make sure that you are referring to the correct ID, in case more than one is returned.

I also found on DSS 11 that the following code works, which I would suggest trying first:

import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# make sure you are pointing to your analysis ID here
analysis = project.get_analysis('IEyUbkB7')

# ...and to your ML task ID here
mltask = analysis.get_ml_task('S74zMovS')
trained_model_ids = mltask.get_trained_models_ids()
print(trained_model_ids)

# here, I'm pointing to my first trained_model_id, but this may vary
prediction_results = mltask.get_trained_model_details(trained_model_ids[0])
values = prediction_results.get_raw()['iperf']['rawImportance']
# print every variable next to its raw importance score
for variable, importance in zip(values['variables'], values['importances']):
    print(variable, importance)
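Since the goal is a data frame rather than printed output, here is a minimal sketch of how the rawImportance dictionary read above could be turned into a table. It assumes the dictionary has parallel 'variables' and 'importances' lists, as in the snippet above. The helper names importance_rows and importance_frame, the "share" column, and the sample values are made up for illustration and are not part of the Dataiku API. The "share" column (each score divided by the total) is only a guess at one way the scores could be rescaled when comparing against the UI.

```python
from typing import Any, Dict, List


def importance_rows(raw_importance: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each variable with its importance, sorted from most to least important."""
    variables = raw_importance["variables"]
    importances = raw_importance["importances"]
    if len(variables) != len(importances):
        raise ValueError("'variables' and 'importances' must have the same length")
    total = sum(importances)
    rows = [
        {
            "variable": name,
            "importance": score,
            # Fraction of the summed importance; 0.0 if the total is zero.
            "share": score / total if total else 0.0,
        }
        for name, score in zip(variables, importances)
    ]
    return sorted(rows, key=lambda row: row["importance"], reverse=True)


def importance_frame(raw_importance: Dict[str, Any]):
    """Build a pandas DataFrame from the rows (hypothetical helper)."""
    import pandas as pd  # imported lazily so the row logic works without pandas

    return pd.DataFrame(
        importance_rows(raw_importance),
        columns=["variable", "importance", "share"],
    )


# Made-up payload standing in for get_raw()['iperf']['rawImportance'].
sample = {"variables": ["age", "income", "tenure"], "importances": [0.2, 0.5, 0.3]}
rows = importance_rows(sample)
for row in rows:
    print(row["variable"], row["importance"])
```

In the recipe itself, importance_frame(values) would take the place of the print loop, and the resulting frame keeps every variable rather than only the top 10.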

If the results still don't look right to you, can you please attach a screenshot of your full variable importance screen in the visual analysis, including the URL, which contains the analysis ID?

Then please paste your trained_model_ids results and the results you get when printing out the importance values that are different from what you expect.

Thanks,
Sarina
