Welcome to the Dataiku community. We are glad to have you here with us.
In general, you are correct. You need an ID key column in the dataset you are scoring so you can connect it back to other data you might have. (This is common, and I do it often.) So you have the right idea there.
I think that you should focus on “If i try to input the table with feature table + customer_id (51 columns) into to the scoring recipe, it throws an error saying number of features mismatch.” With this description I don’t have enough information to understand exactly how this error is coming up. For the community to help you, we will need some more details.
That said, this description makes me wonder: how are you building the model? In general, I’ve found with the visual model builders that I have to rebuild and redeploy the model whenever I add, remove, or change the type of a feature in the dataset, even if I’m ignoring a column in the model, like a customer key. (Remember to exclude the customer number from the list of features, or the model is very likely to overfit.)
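One way to narrow down a "number of features mismatch" is to compare the columns the model was trained on with the columns in the dataset you are scoring. Here is a minimal sketch in plain Python. It does not use any Dataiku API, and the column names and the `diff_columns` helper are made up for illustration, so adapt them to your own schema.

```python
def diff_columns(training_columns, scoring_columns, id_column=None):
    """Return (missing, unexpected) column names, ignoring the id column if given."""
    train = [c for c in training_columns if c != id_column]
    score = [c for c in scoring_columns if c != id_column]
    # Columns the model expects but the scoring data lacks
    missing = [c for c in train if c not in score]
    # Columns in the scoring data that the model never saw
    unexpected = [c for c in score if c not in train]
    return missing, unexpected


# Hypothetical schema: 50 features at training time,
# plus customer_id added only to the scoring dataset (51 columns).
training_columns = [f"feature_{i}" for i in range(50)]
scoring_columns = ["customer_id"] + training_columns

# Without telling the check about the key, customer_id shows up as unexpected.
missing, unexpected = diff_columns(training_columns, scoring_columns)
print(missing, unexpected)

# Treating customer_id as a pass-through key, the two schemas line up.
missing_ok, unexpected_ok = diff_columns(
    training_columns, scoring_columns, id_column="customer_id"
)
print(missing_ok, unexpected_ok)
```

If the only difference is the key column, that points toward retraining with the key present in the training data but excluded from the features, so both datasets share the same schema.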
Finally, if you have a key in your dataset, in general I would just pass it through the model building and scoring phases of your flow. The only reason I’d create a new ID column is if the existing data coming into the flow did not have a unique key and I needed to make a join somewhere later in my process.
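To illustrate that last point, here is a rough sketch of "reuse the existing key if it is unique, otherwise add one". It is plain Python over a list of dicts rather than anything Dataiku-specific, and `ensure_unique_key`, `row_id`, and the sample rows are hypothetical names chosen for the example.

```python
def ensure_unique_key(rows, key, new_key="row_id"):
    """Return (rows, key_name).

    Keeps the existing key when every row has it and all values are distinct;
    otherwise returns copies of the rows with a sequential id added.
    """
    values = [row.get(key) for row in rows]
    has_key = all(v is not None for v in values)
    if has_key and len(set(values)) == len(values):
        return rows, key
    with_id = [{**row, new_key: i} for i, row in enumerate(rows)]
    return with_id, new_key


# Existing key is unique: pass it through untouched.
customers = [
    {"customer_id": 101, "spend": 20.0},
    {"customer_id": 102, "spend": 35.5},
]
rows_a, key_a = ensure_unique_key(customers, "customer_id")

# Existing key has duplicates: add a surrogate id to join on later.
visits = [
    {"customer_id": 101, "spend": 20.0},
    {"customer_id": 101, "spend": 12.0},
]
rows_b, key_b = ensure_unique_key(visits, "customer_id")
print(key_a, key_b)
```

Either way, the key travels with each row through scoring and is only used for the later join, never as a model feature.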
Hope that helps a bit. Others in the community may have further insights, particularly if you provide further details.