Feature Handling Step for Prediction Data
Hi Team,
Can someone please guide me on replicating the feature handling steps used while training a model on my prediction data, so that I can get prediction scores? I built the model using a visual analysis and deployed it, but it now throws an error because my prediction data does not have the same features.
Answers
-
Hi,
When using visual machine learning, the data preparation (the analysis script steps, if any) and the feature preprocessing steps are included in the model. So the model expects to score a dataset that has the same columns as the original dataset.
-
Agreed, thanks for the response. However, the question still remains unanswered: how can we generate, from the newer prediction data, the same features that were generated during training, so that the model doesn't complain when making predictions on new data? Or please suggest the Dataiku-recommended way of handling features for both training and new prediction data.
-
I'm not sure I get what you mean.
If the model was trained to predict column Target on a dataset that has columns Feature1, Feature2 and Target, then you can score a dataset containing data not seen in training, but that dataset still needs to have columns Feature1 and Feature2.
What kind of error do you have?
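To narrow down this kind of error, it can help to compare the column names of the two datasets before scoring. Below is a minimal sketch in plain Python; it does not use the Dataiku API, and the function name check_scoring_columns and the example column Comment are hypothetical, chosen only for illustration. It assumes you can obtain both column lists yourself.

```python
def check_scoring_columns(training_columns, scoring_columns, target):
    """Compare a scoring dataset's columns with those used in training.

    Returns (missing, extra):
      missing: feature columns from training that are absent from scoring data
      extra:   scoring columns that were not in the training dataset
    The target column is not required in the scoring data.
    """
    training_set = set(training_columns)
    scoring_set = set(scoring_columns)
    # Every training column except the target counts as a required input.
    required = [c for c in training_columns if c != target]
    missing = [c for c in required if c not in scoring_set]
    extra = [c for c in scoring_columns if c not in training_set]
    return missing, extra


# Example with the columns named in the reply above.
missing, extra = check_scoring_columns(
    training_columns=["Feature1", "Feature2", "Target"],
    scoring_columns=["Feature1", "Comment"],  # "Comment" is a made-up column
    target="Target",
)
print("missing:", missing)
print("extra:", extra)
```

Any name reported as missing is a column the scoring dataset lacks compared with the training dataset, which is the situation described above. The check compares names exactly, so differences in case or spacing also show up as missing.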