
Batch job deployment using model pkl

varun
Level 1

Is it possible to do a batch job deployment using a scikit-learn model pkl file? I know it is possible to create an API endpoint from a pkl file, but I didn't find any resources for my use case.

Basically, I want to create a batch job that runs weekly, takes input from an Oracle database, processes it, and predicts the output using a model pkl file that I generated outside DSS. I also have to replicate this for 50 other similar models. Any suggestion is welcome, even if it is a different approach to the problem.

AlexT
Dataiker

Hi @varun ,

As mentioned in:
https://community.dataiku.com/t5/Using-Dataiku/import-sklearn-model-trained-outside-of-Dataiku-into-...

You can upload the models to a managed folder, then load them and score the datasets entirely in Python. A recipe that does this is illustrated here:
http://gallery.dataiku.com/projects/DKU_ADVANCEDML/recipes/compute_test_scored_scikit/
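For reference, here is a minimal sketch of what such a Python recipe could look like. It is not the code from the gallery recipe. The folder name `models`, the file `model.pkl`, the datasets `weekly_input` and `weekly_scored`, and the feature columns are all placeholders to replace with your own. The `dataiku` import is deferred into `run_recipe` so the two helper functions can also be tried outside DSS.

```python
import pickle


def load_model(stream):
    """Unpickle a model from a binary file-like object."""
    # Only unpickle files you trust: pickle can execute arbitrary code.
    return pickle.loads(stream.read())


def score_dataframe(model, df, feature_columns, output_column="prediction"):
    """Return a copy of df with the model's predictions in a new column."""
    scored = df.copy()
    scored[output_column] = model.predict(df[feature_columns])
    return scored


def run_recipe():
    """Body of a DSS Python recipe; every name below is a placeholder."""
    import dataiku  # only available inside DSS

    # Read the pickled model from the managed folder.
    folder = dataiku.Folder("models")
    with folder.get_download_stream("model.pkl") as stream:
        model = load_model(stream)

    # Read the input dataset, score it, and write the recipe output.
    df = dataiku.Dataset("weekly_input").get_dataframe()
    scored = score_dataframe(model, df, ["feature_a", "feature_b"])
    dataiku.Dataset("weekly_scored").write_with_schema(scored)
```

In the recipe itself you would just call `run_recipe()`. The code environment should have a scikit-learn version compatible with the one used to train the model, since pickled models are not guaranteed to load across versions.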

As you have to do this for multiple models, you can wrap this in a function and add it to a project library to make it easier to reuse.
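One way to structure that reusable function is sketched below, as a module you could place in the project library. The `ScoringJob` class, `run_jobs`, `run_jobs_in_dss`, and all folder, file, dataset, and column names are hypothetical and only illustrate the idea. The scoring loop receives its I/O as callables, so it can be exercised outside DSS, and a thin wrapper wires it to a managed folder and datasets.

```python
import pickle
from dataclasses import dataclass
from typing import List


@dataclass
class ScoringJob:
    """One model to apply: where it lives and which data it scores."""
    model_path: str            # path of the pkl file inside the managed folder
    input_dataset: str
    output_dataset: str
    feature_columns: List[str]


def run_jobs(jobs, load_model, read_input, write_output):
    """Score every job in turn; all I/O is injected through the callables."""
    for job in jobs:
        model = load_model(job.model_path)
        df = read_input(job.input_dataset)
        scored = df.copy()
        scored["prediction"] = model.predict(df[job.feature_columns])
        write_output(job.output_dataset, scored)


def run_jobs_in_dss(jobs, folder_name="models"):
    """Wire run_jobs to DSS objects; folder_name is a placeholder."""
    import dataiku  # only available inside DSS

    folder = dataiku.Folder(folder_name)

    def load_model(path):
        # Only unpickle files you trust: pickle can execute arbitrary code.
        with folder.get_download_stream(path) as stream:
            return pickle.loads(stream.read())

    run_jobs(
        jobs,
        load_model,
        lambda name: dataiku.Dataset(name).get_dataframe(),
        lambda name, df: dataiku.Dataset(name).write_with_schema(df),
    )
```

A recipe could then build a list such as `[ScoringJob("model_01.pkl", "input_01", "scored_01", ["feature_a", "feature_b"]), ...]` and pass it to `run_jobs_in_dss`. Keep in mind that a recipe can only write to datasets declared as its outputs, so with 50 models you would either declare all the output datasets on one recipe or use one recipe per model that calls the shared function.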

If you can export your models to MLflow, you can also import them and use Visual ML in DSS: https://doc.dataiku.com/dss/latest/mlops/mlflow-models/index.html
