Fields in Scoring Dataset that weren't in Training Dataset

Hi there,

Two questions:

1) I'm receiving the error message below, and I'm wondering whether the case of the field names is causing it. I have 'Number of Rooms' in my training set and 'NUMBER_OF_ROOMS' in my scoring set.

- Will I need to go into my scoring set and match the case of the field names with the training set?

An invalid argument has been encountered : in act.score_Model_Score_NP: Cannot apply the model with the output of preparation on this input (Missing column: Number_of_Rooms)

2) I have some extra fields in my scoring set that were not in my training set. Is there a way for my model to ignore those additional fields when scoring?

Thank you in advance, I really appreciate it.


Operating system used: Windows




Best Answer

  • Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,270 Dataiker
    Answer ✓

    Hi @ccecil,
    You need to either match the schema of the scoring dataset to that of the training dataset, or use a preparation script in the visual analysis for the model you've trained.
    For example, add a rename column step where the column name is "Number of Rooms" and change it to "Number_of_Rooms". You may get a warning if the column doesn't exist in the training dataset.

    The same script steps are applied when running the scoring recipe, though, which should avoid the error you see.

    Thanks
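
If you would rather align the names in code before the scoring recipe (for instance in a Python recipe), the matching logic can be sketched as below. This is only an illustration under stated assumptions, not Dataiku functionality: `normalize` and `build_rename_map` are hypothetical helper names, the target name `Number_of_Rooms` is taken from the error message, and the sketch assumes no two columns collapse to the same normalized name. It compares names while ignoring case and treating spaces and underscores alike, and it reports the extra scoring columns so you can decide what to do with them.

```python
import re


def normalize(name):
    """Lower-case a column name and collapse spaces, underscores and
    punctuation into single underscores, so that 'Number of Rooms',
    'NUMBER_OF_ROOMS' and 'Number_of_Rooms' all compare equal."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def build_rename_map(scoring_columns, expected_columns):
    """Hypothetical helper: map scoring column names onto the names the
    model expects. Returns (rename_map, extra_columns).

    Assumes normalized names are unique within each list."""
    by_key = {normalize(col): col for col in scoring_columns}

    rename_map = {}
    missing = []
    for expected in expected_columns:
        source = by_key.get(normalize(expected))
        if source is None:
            missing.append(expected)
        else:
            rename_map[source] = expected

    if missing:
        # Mirrors the "Missing column" situation from the error message.
        raise ValueError("Missing columns in scoring data: %s" % missing)

    # Scoring columns that have no counterpart among the expected names.
    extra_columns = [c for c in scoring_columns if c not in rename_map]
    return rename_map, extra_columns


if __name__ == "__main__":
    rename_map, extras = build_rename_map(
        ["NUMBER_OF_ROOMS", "EXTRA_FIELD"],  # scoring dataset columns
        ["Number_of_Rooms"],                 # name from the error message
    )
    print(rename_map)
    print(extras)
```

With pandas, the resulting mapping can be passed to `DataFrame.rename(columns=rename_map)`, and the extra columns can be dropped with `DataFrame.drop(columns=extras)` if you do not want them carried along.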
