Batch job deployment using model pkl
Is it possible to do a batch job deployment using a scikit-learn model .pkl file? I know it is possible to create an API endpoint from a .pkl, but I didn't find any resources for my use case.
Basically, I wish to create a batch job that runs on a weekly basis, taking input from an Oracle DB, processing it, and predicting output using the model .pkl, which I have generated outside DSS. I also have to replicate the same for 50 other similar models. Any suggestion is welcome, even if it is a different approach to the problem.
Answers
Alexandru (Dataiker)
Hi @varun,
As mentioned in https://community.dataiku.com/t5/Using-Dataiku/import-sklearn-model-trained-outside-of-Dataiku-into-Dataiku/td-p/3821, you can upload the models to a managed folder, then load them and score your datasets entirely in a Python recipe, as illustrated here:
http://gallery.dataiku.com/projects/DKU_ADVANCEDML/recipes/compute_test_scored_scikit/
Since you have to do this for multiple models, you can wrap the loading and scoring logic in a function and add it to a project library to make it easier to reuse.
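A minimal sketch of such a reusable helper, runnable outside DSS. The `StubModel` class, `score_with_model` function, and `model_v1.pkl` filename are hypothetical; in a real recipe you would pickle your fitted scikit-learn estimator instead of the stub, resolve the path inside a managed folder with `dataiku.Folder(...).get_path()`, and read/write data with `dataiku.Dataset(...)`:

```python
import os
import pickle
import tempfile

# Stand-in for a fitted scikit-learn estimator: any pickled object
# exposing .predict() works the same way once loaded.
class StubModel:
    def predict(self, rows):
        return [sum(r) for r in rows]

def score_with_model(model_path, rows):
    """Load a pickled model from model_path and return predictions for rows.

    Putting this in a project library lets all 50 weekly jobs reuse it,
    varying only the model filename and the input/output datasets.
    """
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    return model.predict(rows)

# Demo: pickle a model to a temporary directory, then score with the helper.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "model_v1.pkl")
    with open(path, "wb") as f:
        pickle.dump(StubModel(), f)
    preds = score_with_model(path, [[1, 2], [3, 4]])
    print(preds)  # [3, 7]
```

Note that `pickle.load` requires the same scikit-learn version (or a compatible one) in the recipe's code environment as the one used to train the models.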
If you can export your models to MLflow, you can also import them and use Visual ML in DSS: https://doc.dataiku.com/dss/latest/mlops/mlflow-models/index.html