Use Case 5: Churn Prediction - Error implementing Python code in tutorial
UserBird
Hi,
I am new to DSS and am attempting to advance my knowledge through the tutorials. I got to the lecture 'create a scikit-learn (python) model', but when copying and pasting in the code that is to be output to the folder 'model_scikit', I'm receiving this error:
'Job failed : Error in Python process: : min_samples_split must be at least 2 or in (0, 1], got 1'
When I change the value of min_samples_split (e.g. to [1.0, 3, 10], or even just [2]), I get a different error:
'Job failed : Error in Python process: : Dataset None cannot be used : declare it as input or output of your recipe'
Any ideas?
Thanks,
Clíona
Answers
Hi,
You have the right idea. min_samples_split accepts either an integer of at least 2 or a float in (0.0, 1.0], so the integer 1 is rejected while the float 1.0 is accepted. The line of code in Teachable should therefore read:
"min_samples_split": [1.0, 3, 10],
The error that you get after setting min_samples_split correctly suggests that your recipe is reading from a dataset named "None"; that is, the input section looks something like this:
# Recipe inputs
df = dataiku.Dataset("None").get_dataframe()
rather than using "train" as the input dataset.
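Going back to the min_samples_split rule, here is a minimal sketch of the check described in the error message, in case it helps to see why 1 fails but 1.0 passes. The helper is_valid_min_samples_split is a hypothetical name written for this post, not part of scikit-learn, and the original grid [1, 3, 10] is an assumption based on the "got 1" in your error.

```python
from numbers import Integral, Real


def is_valid_min_samples_split(value):
    """Hypothetical helper mirroring the rule in the error message:
    an integer of at least 2, or a float in (0, 1]."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; treat it as invalid here
        return False
    if isinstance(value, Integral):
        return value >= 2
    if isinstance(value, Real):
        return 0.0 < value <= 1.0
    return False


# Assumed original grid from the tutorial (inferred from "got 1")
original_grid = [1, 3, 10]
# Corrected grid from the answer above
corrected_grid = [1.0, 3, 10]

rejected = [v for v in original_grid if not is_valid_min_samples_split(v)]
print(rejected)  # [1]
print(all(is_valid_min_samples_split(v) for v in corrected_grid))  # True
```

Running a check like this over a parameter grid before launching the recipe can catch a bad value without waiting for the job to fail.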