Issue with Gradient Boosting Tree Regressor
I'm currently trying to run a GradientBoostedTree model in a Python code recipe, as shown below:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from sklearn.ensemble import GradientBoostingRegressor

# Read recipe inputs
field = dataiku.Dataset("HM_Analysis_FieldLevel_prepared")
field_df = field.get_dataframe()

# GBT_81
model_1 = dataiku.Model("WyoWL843")
pred_1 = model_1.get_predictor()

# GBT_79
model_2 = dataiku.Model("sXriI8AV")
pred_2 = model_2.get_predictor()

data = field_df['Total_diff_sum_sum']<=250
data.head()

# Compute recipe outputs from inputs
# TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
# NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.

best_fit_df = field_df # For this sample code, simply copy input to output

# Write recipe outputs
best_fit = dataiku.Dataset("best_fit")
best_fit.write_with_schema(best_fit_df)

However, when I run this, I get the error in the attached picture.
I have the following installed:

ipykernel=4.8.2=py37_0
ipython=7.29.0=py37hb070fc8_0
ipython_genutils=0.2.0=pyhd3eb1b0_1
ipywidgets=7.6.5=pyhd3eb1b0_1
jupyter=1.0.0=py37_7
jupyter_client=5.2.4=py37_0
jupyter_console=6.4.0=pyhd3eb1b0_0
jupyter_core=4.9.1=py37h06a4308_0
jupyterlab_widgets=1.0.0=pyhd3eb1b0_1
numpy=1.21.2=py37h20f2e39_0
numpy-base=1.21.2=py37h79a1101_0
pandas=1.0.5=py37h0573a6f_0
python=3.7.11=h12debd9_0
scikit-learn=1.0.1=py37h51133e4_0
xgboost=0.82=py37h1aa3f02_0
What am I doing wrong?
Answers
-
Alexandru (Dataiker)
Hi @mahmoud_shihab,
The current error "ModuleNotFoundError: No module named 'sklearn.ensemble.gradient_boosting'" suggests a problem with the import for GradientBoostingClassifier.
I'm not sure why that's happening. As a test, can you try importing
import sklearn.ensemble
and let us know if you still see the same error.
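For anyone repeating this test, here is a minimal sketch that probes the import chain one level at a time, so you can see which module path fails. The helper name probe_import is hypothetical (not part of dataiku or scikit-learn). One possible explanation, stated as an assumption rather than a confirmed diagnosis: the public sklearn.ensemble.gradient_boosting module path exists in older scikit-learn releases but not in 1.0.1, so a saved model that refers to that path could fail to load even though "import sklearn.ensemble" itself works.

```python
import importlib


def probe_import(module_name):
    """Try to import module_name and return (ok, detail).

    detail is the module's file location on success, or the error text
    on failure. probe_import is a hypothetical helper for illustration.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__file__", "<built-in>")


# Probe each level of the path named in the error message.
for name in (
    "sklearn",
    "sklearn.ensemble",
    "sklearn.ensemble.gradient_boosting",
):
    ok, detail = probe_import(name)
    print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

If the first two lines report OK and only the last one fails, the installed scikit-learn is importable and the problem is more likely the version the model expects than a broken installation.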
-
mahmoud_shihab (Partner)
Hey Alex,
I have tried that as well.
It seems that, for some reason, it is using sklearn 0.20 by default to call the model for the visual recipes...
Not sure how to change that, but after downgrading my sklearn, it worked...
But now I have the issue of figuring out why it wasn't using the sklearn in the environment specified...
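One way to start narrowing this down is to print, from inside the recipe itself, which Python interpreter is running and where scikit-learn is being loaded from. The sketch below uses only the standard library; the function name describe_runtime is hypothetical, and it does not rely on any Dataiku API. It only reports what the running process sees, so it cannot by itself say which environment the model was trained in.

```python
import importlib
import sys


def describe_runtime(package_name="sklearn"):
    """Report the interpreter and the location/version of one package.

    describe_runtime is a hypothetical helper for illustration only.
    """
    info = {
        "python_executable": sys.executable,
        "python_version": sys.version.split()[0],
    }
    try:
        pkg = importlib.import_module(package_name)
    except ImportError:
        # Package is not importable from this interpreter.
        info["package_version"] = None
        info["package_location"] = None
    else:
        info["package_version"] = getattr(pkg, "__version__", "unknown")
        info["package_location"] = getattr(pkg, "__file__", "unknown")
    return info


for key, value in describe_runtime().items():
    print(f"{key}: {value}")
```

Comparing the printed interpreter path and package version with the environment you selected for the recipe should show whether the recipe is really running in that environment.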