Use Case 5: Churn Prediction - Error implementing Python code in tutorial

Dataiker, Alpha Tester Posts: 535 Dataiker
Hi,

I am new to DSS and am attempting to advance my knowledge through the tutorials. I got to the lecture 'create a scikit-learn (python) model', but when copying and pasting in the code that is to be output to the folder 'model_scikit', I'm receiving this error:

'Job failed : Error in Python process: : min_samples_split must be at least 2 or in (0, 1], got 1'

When I change the value of min_samples_split (e.g. to [1.0, 3, 10], or even just [2]), I get a different error:

'Job failed : Error in Python process: : Dataset None cannot be used : declare it as input or output of your recipe'

Any ideas?

Thanks,

Clíona

Answers

  • Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer Posts: 105 ✭✭✭✭✭✭✭
    Hi,

    You're on the right track. As the error message says, min_samples_split accepts either an integer of at least 2 or a float in (0.0, 1.0], so it rejects the integer 1. The line of code in Teachable should read:

    "min_samples_split": [1.0, 3, 10],

    The second error, which appears once min_samples_split is set correctly, suggests that your recipe code is passing "None" as the name of the input dataset; that is, it looks something like this:

    # Recipe inputs
    df = dataiku.Dataset("None").get_dataframe()

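    rather than with "train" as the input dataset. As the error message says, the dataset the code reads must also be declared as an input of the recipe.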
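
For anyone hitting the same first error, here is a minimal sketch of the rule the message describes. It is plain Python written for illustration, not scikit-learn's own code; the function names `check_min_samples_split` and `effective_min_samples` are made up for this example, and the rounding in the second function is my understanding of how a fractional value is turned into a sample count.

```python
import math
import numbers


def check_min_samples_split(value):
    """Return True if `value` satisfies the rule quoted in the error message:
    an integer of at least 2, or a float in (0, 1]."""
    if isinstance(value, numbers.Integral):
        # Integers are read as an absolute number of samples.
        return value >= 2
    # Anything else is read as a fraction of the training set.
    return 0.0 < value <= 1.0


def effective_min_samples(value, n_samples):
    """Assumed conversion of a setting into a sample count (illustrative only)."""
    if isinstance(value, numbers.Integral):
        return int(value)
    return max(2, math.ceil(value * n_samples))


if __name__ == "__main__":
    # The integer 1 is rejected, while the float 1.0 passes the check.
    for candidate in [1, 1.0, 3, 10]:
        print(repr(candidate), check_min_samples_split(candidate))
```

Note that, under this reading, 1.0 is treated as "100% of the samples" rather than "1 sample", so it passes validation but behaves very differently from what the integer 1 was probably meant to do. If the intent was simply the smallest allowed split size, 2 may be the closer substitute.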
