
Issue with Gradient Boosting Tree Regressor

mahmoud_shihab
Level 1

I'm currently trying to run a GradientBoostedTree model in a Python code recipe, as shown below:

 

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from sklearn.ensemble import GradientBoostingRegressor

# Read recipe inputs
field = dataiku.Dataset("HM_Analysis_FieldLevel_prepared")
field_df = field.get_dataframe()
# GBT_81
model_1 = dataiku.Model("WyoWL843")
pred_1 = model_1.get_predictor()

# GBT_79
model_2 = dataiku.Model("sXriI8AV")
pred_2 = model_2.get_predictor()

data = field_df['Total_diff_sum_sum']<=250
data.head()

# Compute recipe outputs from inputs
# TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
# NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.

best_fit_df = field_df # For this sample code, simply copy input to output

# Write recipe outputs
best_fit = dataiku.Dataset("best_fit")
best_fit.write_with_schema(best_fit_df)

 

However, when I run this, I get the error in the attached picture.

I have the following installed:

ipykernel=4.8.2=py37_0
ipython=7.29.0=py37hb070fc8_0
ipython_genutils=0.2.0=pyhd3eb1b0_1
ipywidgets=7.6.5=pyhd3eb1b0_1
jupyter=1.0.0=py37_7
jupyter_client=5.2.4=py37_0
jupyter_console=6.4.0=pyhd3eb1b0_0
jupyter_core=4.9.1=py37h06a4308_0
jupyterlab_widgets=1.0.0=pyhd3eb1b0_1
numpy=1.21.2=py37h20f2e39_0
numpy-base=1.21.2=py37h79a1101_0
pandas=1.0.5=py37h0573a6f_0
python=3.7.11=h12debd9_0
scikit-learn=1.0.1=py37h51133e4_0
xgboost=0.82=py37h1aa3f02_0

What am I doing wrong?

2 Replies
AlexT
Dataiker

Hi @mahmoud_shihab ,

The current error "ModuleNotFoundError: No module named 'sklearn.ensemble.gradient_boosting'" suggests that the import for GradientBoostingClassifier is failing.

I'm not sure why that's happening. As a test, can you try importing

import sklearn.ensemble

and let us know if you still see the same error. 
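
A slightly more detailed version of this check asks Python whether each module path can be located at all. This is only a sketch: the helper name `module_available` is hypothetical (it is not part of Dataiku or scikit-learn), and it assumes the snippet runs in the same code environment as the failing recipe.

```python
import importlib.util


def module_available(name):
    """Return True if the dotted module path `name` can be located."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, AttributeError, ValueError):
        # Raised when a parent package is missing or is not a package.
        return False


# The second path is the one named in the error message.
for name in ("sklearn.ensemble", "sklearn.ensemble.gradient_boosting"):
    print(name, "->", module_available(name))
```

If the first path is found and the second is not, the installed scikit-learn itself is probably fine, and the failure more likely comes from something that still refers to the old module path, such as a model saved with an older scikit-learn release.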

mahmoud_shihab
Level 1
Author

Hey Alex,

I have tried that as well.

It seems that, for some reason, sklearn 0.20 is what gets used by default to call the model for the visual recipes...

Not sure how to change that, but after downgrading my sklearn, it worked...

But now I have the issue of figuring out why it wasn't using the sklearn in the environment specified...
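
One way to start on that question is to print, from inside the recipe, which interpreter is running and where sklearn is actually loaded from. The sketch below is an illustration only: `describe_package` is a hypothetical helper, not a Dataiku function. A possible explanation, not confirmed in this thread, is that the saved model was trained under a different code environment than the one the recipe runs in. Since pickled models record the module path of their classes, a model saved with an older sklearn may fail to load under a newer one.

```python
import importlib
import sys


def describe_package(name):
    """Report whether `name` imports, plus its version and file location."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return {"name": name, "installed": False, "version": None, "location": None}
    return {
        "name": name,
        "installed": True,
        "version": getattr(module, "__version__", None),
        "location": getattr(module, "__file__", None),
    }


# Interpreter path shows which environment the recipe is really using.
print("interpreter:", sys.executable)
print(describe_package("sklearn"))
```

Comparing this output between the recipe and a notebook attached to the environment you expected should show whether the two are really the same.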
