Issue with Gradient Boosting Tree Regressor

mahmoud_shihab Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 42 Partner
edited July 16 in Using Dataiku

I'm currently trying to run a GradientBoostedTree model in a Python code recipe. I have the following installed:

ipykernel=4.8.2=py37_0
ipython=7.29.0=py37hb070fc8_0
ipython_genutils=0.2.0=pyhd3eb1b0_1
ipywidgets=7.6.5=pyhd3eb1b0_1
jupyter=1.0.0=py37_7
jupyter_client=5.2.4=py37_0
jupyter_console=6.4.0=pyhd3eb1b0_0
jupyter_core=4.9.1=py37h06a4308_0
jupyterlab_widgets=1.0.0=pyhd3eb1b0_1
numpy=1.21.2=py37h20f2e39_0
numpy-base=1.21.2=py37h79a1101_0
pandas=1.0.5=py37h0573a6f_0
python=3.7.11=h12debd9_0
scikit-learn=1.0.1=py37h51133e4_0
xgboost=0.82=py37h1aa3f02_0

However, when I run the recipe shown below, I get the error in the attached picture.

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from sklearn.ensemble import GradientBoostingRegressor

# Read recipe inputs
field = dataiku.Dataset("HM_Analysis_FieldLevel_prepared")
field_df = field.get_dataframe()
# GBT_81
model_1 = dataiku.Model("WyoWL843")
pred_1 = model_1.get_predictor()

# GBT_79
model_2 = dataiku.Model("sXriI8AV")
pred_2 = model_2.get_predictor()

data = field_df['Total_diff_sum_sum']<=250
data.head()

# Compute recipe outputs from inputs
# TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
# NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.

best_fit_df = field_df # For this sample code, simply copy input to output

# Write recipe outputs
best_fit = dataiku.Dataset("best_fit")
best_fit.write_with_schema(best_fit_df)

What am I doing wrong?


Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker

    Hi @mahmoud_shihab,

    The current error "ModuleNotFoundError: No module named 'sklearn.ensemble.gradient_boosting'" suggests a problem with the import for GradientBoostingClassifier.

    I'm not sure why that's happening. As a test, can you try importing

    import sklearn.ensemble

    and let us know if you still see the same error.
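A slightly broader version of this test is sketched below. It reports which interpreter is running and whether each module path can be imported, along with its version and file location. The helper name `describe_module` is hypothetical and is not part of Dataiku or scikit-learn. The third path is the one named in the error message; it is expected to be missing on recent scikit-learn releases, so a "not found" result there is not a problem in itself.

```python
import importlib
import sys


def describe_module(name):
    """Return whether a module can be imported, plus its version and file path."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so it lands here too.
        return {"name": name, "found": False, "error": str(exc)}
    return {
        "name": name,
        "found": True,
        "version": getattr(module, "__version__", "unknown"),
        "file": getattr(module, "__file__", "built-in"),
    }


# Which Python is actually running this recipe or notebook?
print("interpreter:", sys.executable)

# The last entry is the module path taken from the error message.
for name in ("sklearn", "sklearn.ensemble", "sklearn.ensemble.gradient_boosting"):
    print(describe_module(name))
```

If the interpreter path or the reported `sklearn` version does not match the code environment selected for the recipe, that would point to an environment mismatch rather than a broken install.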

  • mahmoud_shihab
    mahmoud_shihab Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 42 Partner

    Hey Alex,

    I have tried that as well.

    It seems that, for some reason, it uses sklearn 0.20 by default to call the model for the visual recipes...

    Not sure how to change that, but after downgrading my sklearn, it worked...

    But now I have the issue of figuring out why it wasn't using the sklearn in the environment specified...
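One way to make such a mismatch visible early is a small guard at the top of the recipe, sketched below. It assumes the saved model expects scikit-learn 0.20, as observed above; that expected version and the helper names `major_minor` and `check_sklearn_matches` are illustrative assumptions, not anything provided by Dataiku.

```python
import re


def major_minor(version):
    """Return the (major, minor) pair of a version string such as '0.20.4'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def check_sklearn_matches(installed, expected):
    """Raise a readable error when the installed and expected minor versions differ."""
    if major_minor(installed) != major_minor(expected):
        raise RuntimeError(
            "scikit-learn %s is installed, but the saved model appears to expect %s.x"
            % (installed, expected)
        )
    return True


# Hypothetical use inside the recipe, before calling get_predictor():
#   import sklearn
#   check_sklearn_matches(sklearn.__version__, "0.20")
```

With the environment listed in the question (scikit-learn 1.0.1), this guard would raise before the model is loaded, which is easier to read than a failed import from deep inside the unpickling step.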
