Use Case 5: Churn Prediction - Error implementing Python code in tutorial

UserBird
Dataiker
Hi,

I am new to DSS and am attempting to advance my knowledge through the tutorials. I got to the lecture 'create a scikit-learn (python) model', but when copying and pasting in the code that is to be output to the folder 'model_scikit', I'm receiving this error:

'Job failed : Error in Python process: : min_samples_split must be at least 2 or in (0, 1], got 1'

When I change the value of min_samples_split (e.g. to [1.0, 3, 10], or even just [2]), I get a different error:

'Job failed : Error in Python process: : Dataset None cannot be used : declare it as input or output of your recipe'

Any ideas?

Thanks,

Clíona
1 Reply
Alex_Reutter
Dataiker Alumni
Hi,

You have the right idea. As the error message says, min_samples_split must be either an integer of at least 2 or a float in (0.0, 1.0], so the integer 1 is rejected. The line of code in Teachable should read:

"min_samples_split": [1.0, 3, 10],
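
To see why 1 fails while 1.0, 2, 3 and 10 pass, here is a minimal sketch of the rule quoted in the error message. The function name is_valid_min_samples_split is hypothetical and this is not scikit-learn's own validation code; it only paraphrases "must be at least 2 or in (0, 1]".

```python
import numbers


def is_valid_min_samples_split(value):
    """Hypothetical helper: mirror the rule stated in the error message."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        return False
    # Integers are read as an absolute number of samples: at least 2.
    if isinstance(value, numbers.Integral):
        return value >= 2
    # Floats are read as a fraction of the samples: in (0.0, 1.0].
    if isinstance(value, numbers.Real):
        return 0.0 < value <= 1.0
    return False


# Check the values discussed in this thread.
checks = [(v, is_valid_min_samples_split(v)) for v in (1, 1.0, 2, 3, 10)]
for value, ok in checks:
    print(repr(value), "accepted" if ok else "rejected")
```

Note that a float is treated as a fraction of the training samples, so 1.0 and 1 mean quite different things even though they compare equal in Python.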

The error that you're getting after setting min_samples_split correctly suggests that your recipe input is setting "None" as the input dataset; that is, it looks something like this:

# Recipe inputs
df = dataiku.Dataset("None").get_dataframe()

rather than with "train" as the input dataset.
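
One way to catch this earlier is to check the dataset name before reading it. The sketch below is only an illustration: load_recipe_input is a hypothetical helper, not part of the dataiku package, and it assumes the input dataset is called "train" as in the tutorial. The dataiku import sits inside the function because the package is only available inside DSS.

```python
def load_recipe_input(name):
    """Hypothetical helper: read a recipe input dataset as a DataFrame."""
    # Fail with a clearer message if the name was never filled in.
    if name is None or name == "None":
        raise ValueError(
            "Input dataset name is missing; expected something like 'train'."
        )
    # Imported here because the package only exists inside DSS.
    import dataiku

    return dataiku.Dataset(name).get_dataframe()


# Inside a DSS recipe that declares "train" as an input (assumption):
# df = load_recipe_input("train")
```

The dataset still has to be declared as an input of the recipe in DSS; the check above only makes a missing name easier to spot.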