How do I get the model to automatically retrain after the dataset is updated?

kid4869 · 11-09-2023

When the dataset is updated (including when the input schema changes), how do I get the model to adapt to the change and retrain automatically?

Currently, I use tools such as VIF to pre-filter the columns, so when the data is updated, the set of columns that get filtered out changes as well.

When the columns change, the previously trained model fails with an error, because it expects exactly the same columns it was trained on.

Is there any way to adapt the model to this change for automation purposes?

Right now, after the data is updated, I have to retrain the model in the original analysis, select the model, and then deploy it before I can use the model's

AlexT · 11-14-2023

Hi @kid4869,

If the feature columns change but the target does not, you need to re-detect the dataset's schema in the modeling task using code, e.g. mltask.wait_guess_complete().
See the full example here:
https://developer.dataiku.com/latest/concepts-and-examples/ml.html
Once that is done, you can retrain your model with the new feature columns.
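
As a rough illustration of the re-detect, retrain, and redeploy sequence, here is a minimal sketch. It assumes `mltask` is a `DSSMLTask` obtained through the Dataiku Python API, that the model is a binary classifier scored by a higher-is-better metric (`"auc"` here), and that a saved model already exists in the Flow. The function names `retrain_and_redeploy` and `run_in_dss` and the IDs `"ANALYSIS_ID"`, `"MLTASK_ID"`, and `"SAVED_MODEL_ID"` are hypothetical placeholders. Please check the call names and arguments against the documentation for your DSS version before relying on this.

```python
def retrain_and_redeploy(mltask, saved_model_id, metric="auc"):
    """Re-detect the schema, retrain, and redeploy the best new model.

    Assumptions (not from the original thread): `mltask` behaves like a
    dataikuapi DSSMLTask, and `metric` is a higher-is-better metric.
    """
    # Re-run the guess so the ML task picks up the dataset's current columns,
    # then wait until guessing has finished.
    mltask.guess()
    mltask.wait_guess_complete()

    # Train a new session; train() is expected to block until training is
    # done and return the ids of the newly trained models.
    model_ids = mltask.train()
    if not model_ids:
        raise RuntimeError("training produced no models")

    def score(model_id):
        # Look up the chosen metric for one trained model.
        details = mltask.get_trained_model_details(model_id)
        return details.get_performance_metrics().get(metric, float("-inf"))

    # Pick the best-scoring model and push it as a new active version of the
    # saved model that is already deployed in the Flow.
    best_id = max(model_ids, key=score)
    mltask.redeploy_to_flow(best_id, saved_model_id=saved_model_id, activate=True)
    return best_id


def run_in_dss():
    """Example wiring; only works inside DSS, with real IDs filled in."""
    import dataiku  # available only in a Dataiku DSS Python environment

    project = dataiku.api_client().get_default_project()
    mltask = project.get_ml_task("ANALYSIS_ID", "MLTASK_ID")  # placeholders
    return retrain_and_redeploy(mltask, "SAVED_MODEL_ID")  # placeholder id
```

Something like `run_in_dss()` could be called from a Python step in a scenario that triggers on dataset change, so the retrain happens without manual clicks. Selecting the model by a single metric is a simplification; you may prefer your own selection rule.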

Note that if the target column changes, you will need to create an entirely new visual analysis, retrain, and deploy to the Flow.

Kind Regards,
