Why are accuracy metrics different when a model is retrained on the same data with the same set of parameters?

maryas · March 2021

Hello Community,

I'm quite new to DSS, and I'm wondering why the accuracy metrics are different when a model is retrained in the Flow on the same data, with the same design and the same model. If you retrain with the same design in the Lab, you get the same result. Why does the Flow show a different result? Can anybody help?

Thanks & Regards!

tgb417 · March 2021

@maryas

Welcome to the Dataiku Community. I've enjoyed your questions so far.

Can you share some more detail about the flow you are using?

One of the typical reasons that model results might differ between the Lab and the Flow is that you are actually running the model on a different dataset. (In the Lab this would typically be a training set.) In the Flow, you are often running against a separate validation set that the model has not yet "seen". Might that distinction help explain what you are seeing?

[Image: Typical Train & Validate Split.jpg]
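To make the training-set versus unseen-set distinction concrete, here is a minimal sketch in plain Python. It is not DSS code: the synthetic data, the 1-nearest-neighbour classifier, and every function name are assumptions chosen only for illustration. It scores one fitted model on the rows it was trained on and on rows it has never seen.

```python
import random

def make_data(n, rng):
    """Two overlapping 1-D Gaussian classes (synthetic, illustrative only)."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.5)
        rows.append((x, label))
    return rows

def predict_1nn(train, x):
    """Return the label of the training row whose value is closest to x."""
    return min(train, key=lambda row: abs(row[0] - x))[1]

def accuracy(train, rows):
    """Fraction of rows whose 1-NN prediction matches the true label."""
    hits = sum(predict_1nn(train, x) == y for x, y in rows)
    return hits / len(rows)

rng = random.Random(0)          # fixed seed so the sketch is repeatable
train = make_data(200, rng)     # rows the model is fitted on
held_out = make_data(200, rng)  # rows the model has not "seen"

train_acc = accuracy(train, train)
held_out_acc = accuracy(train, held_out)
print("accuracy on training rows:", train_acc)
print("accuracy on held-out rows:", held_out_acc)
```

A 1-nearest-neighbour model finds every training row again at distance zero, so its score on the training rows is perfect, while its score on unseen rows from overlapping classes is lower. The same model therefore reports different metrics depending on which dataset it is evaluated against.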

maryas · March 2021

Hello Tom,

Of course the accuracy will be different when the model is run on a validation set rather than on the test set in the Lab. But when you retrain the model with the same design in the Lab, you get the same result. When I retrain the model in the Flow, with no changes at all, it shows me different accuracy metrics.

Regards!

tgb417 · March 2021

Many models include randomized steps that are controlled by a random seed. (I don’t know if the seed is held constant between the Lab and the Flow.) This may make a difference in results from one run to the next.
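As a rough illustration of the seed effect, the sketch below retrains a toy model on exactly the same rows several times. It is plain Python, not DSS, and it makes no claim about how DSS handles seeds internally; the dataset, the nearest-centroid classifier, and the `train_and_score` helper are all hypothetical. The only thing that changes between runs is the seed that drives the train/test shuffle.

```python
import random

def make_data(n, seed=42):
    """Fixed synthetic dataset: two overlapping 1-D Gaussian classes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        rows.append((rng.gauss(1.0 if label else -1.0, 1.5), label))
    return rows

def train_and_score(rows, seed):
    """Shuffle with the given seed, split 80/20, fit class means, score the test part."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)   # the only source of randomness
    cut = int(0.8 * len(shuffled))
    train, test = shuffled[:cut], shuffled[cut:]

    # "Training": one mean per class (nearest-centroid classifier).
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)

    # "Scoring": predict the class whose mean is closest to each test value.
    hits = sum(min(means, key=lambda c: abs(means[c] - x)) == y for x, y in test)
    return hits / len(test)

DATA = make_data(200)  # identical rows for every run

fixed = [train_and_score(DATA, seed=7) for _ in range(3)]
varying = [train_and_score(DATA, seed=s) for s in range(20)]

print("same seed, three retrains:", fixed)
print("distinct accuracies across 20 seeds:", sorted(set(varying)))
```

With the seed held fixed, every retrain returns the same accuracy. With the seed left to vary, the split changes and the accuracy moves, even though the data and the model settings are identical. If the metrics in the Flow change from one retrain to the next, a seed that is not fixed is one plausible cause worth checking.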

You have not mentioned how different the results are between the Lab and the Flow. Some more detail on the specifics of your model and Flow, and on the amount of variability you are seeing, would help folks help you.

cc: @CoreyS
