How to create a model to deploy from a pickle

Solved!
ASten1
Level 3

Hi,

I'm creating a flow that trains a custom model written outside DSS. To do that, I use a Python recipe whose output is a pickle containing the trained model. I would like to turn this pickle into a model in Dataiku, so that I can create an API service and deploy it. Is there a way to do that? I've already looked at custom models, but they have to be written in DSS, which is not what I want. Any suggestion is welcome, even if it takes a different approach to the problem.

Thank you!!

1 Solution
fchataigner2
Dataiker

Hi

An API service can have a "custom prediction (Python)" or a "Python function" endpoint. Both endpoint types let you specify a managed folder from your flow, which is passed to the endpoint's code. So you can put the pickle in a managed folder and write an endpoint that loads the pickle and calls the model with the features passed to the scoring method.
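
For anyone landing here later, a minimal sketch of the loading and scoring part could look like the following. It is plain Python with no DSS imports; the file name `model.pkl`, the helper `load_model` and the class `PickleScorer` are made-up names for illustration, and it assumes the pickled object exposes a scikit-learn-style `predict` method. How the managed folder's path reaches your code depends on the endpoint type, so that part is left out here.

```python
import os
import pickle

MODEL_FILENAME = "model.pkl"  # hypothetical file name inside the managed folder


def load_model(folder_path, filename=MODEL_FILENAME):
    """Unpickle the trained model stored in the folder at folder_path."""
    with open(os.path.join(folder_path, filename), "rb") as f:
        return pickle.load(f)


class PickleScorer:
    """Loads the model once, then scores feature rows on each call."""

    def __init__(self, folder_path):
        # Assumption: folder_path is the local path of the managed folder
        # that was attached to the endpoint.
        self.model = load_model(folder_path)

    def score(self, rows):
        # rows: list of feature lists, one per record to score.
        # Assumption: the model has a scikit-learn-style predict() method.
        return list(self.model.predict(rows))
```

In the endpoint you would build the scorer once at initialization and call `score` from the scoring function, so the pickle is not reloaded on every request. The environment running the endpoint also needs the libraries the model was trained with, since unpickling imports the model's classes.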

5 Replies
ASten1
Level 3
Author

Thank you for your reply, you are right, it works!

RohitRanga
Level 3

Hi @ASten1, would you be able to share some code showing how to deploy a .pkl model file stored in a managed folder?

renjiinfy
Level 1

Hi,

When loading a pickle file from a managed folder, I am using the download stream method. The object I get back is of type bytes, so I cannot use it to make predictions. Is there an alternative way to load the pickle file so that I can use it to make predictions?

The loaded model should be of type `statsmodels.discrete.discrete_model.BinaryResultsWrapper`, not bytes. How can we achieve this?

fchataigner2
Dataiker

The data you retrieve via the API is the raw file's content, i.e. the actual bytes stored on disk. If the file is a pickle file and you need the object that was pickled, call `pickle.loads()` on the bytes.
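
As a small sketch of that step: the helper below reads the bytes from any binary stream and unpickles them. `unpickle_from_stream` is a hypothetical helper name, and the `io.BytesIO` object stands in for the stream you get from the managed folder, so the snippet runs without DSS.

```python
import io
import pickle


def unpickle_from_stream(stream):
    """Read all bytes from a binary stream and rebuild the pickled object."""
    raw = stream.read()        # bytes, exactly as stored on disk
    return pickle.loads(raw)   # the original Python object


# Stand-in for the stream returned by the managed folder's download method.
payload = {"coef": [0.5, -1.2]}
stream = io.BytesIO(pickle.dumps(payload))

restored = unpickle_from_stream(stream)
print(type(restored).__name__)  # dict, not bytes
```

With a statsmodels model, the restored object would be the results wrapper rather than bytes, provided statsmodels is installed in the code environment doing the unpickling. Only unpickle files you trust, since loading a pickle can execute arbitrary code.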
