Prediction on a saved model from python API endpoint

Solved!
ssuhas76
Level 3

I am trying to predict a value from a Python API endpoint, as per the code below.

I have a saved model in DSS, trained on Dataiku, and need to get a prediction by sending a single row of data to the API. Is this the right way?
import dataikuapi

client = dataikuapi.DSSClient(host, apiKey)
client._session.verify = False
project = client.get_project("xxxxxxxxx")
model = project.get_saved_model('xxxxxxxxx')

predicted_value = model.predict(single_row_dataframe)
# then save it to a dataset within DSS
3 Replies
AlexT
Dataiker

Hi,

The standard approach, if you want to do prediction, is to deploy the model as a prediction endpoint:

https://doc.dataiku.com/dss/latest/apinode/endpoint-python-prediction.html

Once you have deployed the prediction endpoint, you can send requests to it from another endpoint in the same API service, as per the example here:
https://doc.dataiku.com/dss/latest/apinode/api/endpoints-api.html

Or use the code sample available in the API deployer: 

[Screenshot: code sample shown in the API Deployer]
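As a minimal sketch of what querying a deployed prediction endpoint from Python can look like, using `dataikuapi.APINodeClient` and `predict_record` (the URL, service id, endpoint id and feature names below are all placeholders, not values from this thread):

```python
import pandas as pd

def row_to_features(df):
    """Convert a single-row DataFrame into the flat dict of features
    expected by predict_record."""
    if len(df) != 1:
        raise ValueError("expected exactly one row")
    return df.iloc[0].to_dict()

def score_record(api_node_url, service_id, endpoint_id, single_row_df, api_key=None):
    """Send one record to a deployed DSS prediction endpoint.
    Connection values are placeholders; adapt to your API node."""
    import dataikuapi  # requires the dataikuapi package on the client side
    client = dataikuapi.APINodeClient(api_node_url, service_id, api_key=api_key)
    return client.predict_record(endpoint_id, row_to_features(single_row_df))

# Example usage (placeholder values):
# prediction = score_record("https://api-node:12000", "my_service",
#                           "my_model_endpoint",
#                           pd.DataFrame({"age": [42], "city": ["Paris"]}))
```

This keeps the model loaded on the API node, so each query only sends the record, not the model.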

If you don't want to deploy an endpoint, you can use:

https://developer.dataiku.com/latest/api-reference/python/ml.html#dataiku.Model.get_predictor

import dataiku

model = dataiku.Model("name_of_model")
predictor = model.get_predictor()
df_pred = predictor.predict(df, with_input_cols=True)

But note there are limitations with this approach: the dataframe you are passing should have the same column types as your original training data. If you are scoring a record in DSS, you should use a scoring recipe instead.
Other things to consider:
- get_predictor does not work from remote clients or API nodes (i.e., from a custom code endpoint)
- get_predictor only works in containerized notebooks/recipes in recent DSS releases (12.3.0+)
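Regarding the type limitation above, one way to avoid it is to cast the scoring dataframe to the training schema before calling `predictor.predict`. A minimal sketch, assuming you can obtain the training dtypes (e.g. from the training dataset); the column names here are illustrative:

```python
import pandas as pd

def align_dtypes(scoring_df, training_dtypes):
    """Cast scoring columns to the dtypes used at training time, so that
    predictor.predict sees the same column types as the training data."""
    out = scoring_df.copy()
    for col, dtype in training_dtypes.items():
        if col in out.columns:
            out[col] = out[col].astype(dtype)
    return out

# training_dtypes could come from the training dataset inside DSS, e.g.:
# training_dtypes = dataiku.Dataset("train").get_dataframe().dtypes.to_dict()
```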

Thanks,

ssuhas76
Level 3
Author

Hi Alex, thanks for your reply.

Is there an equivalent dataikuapi method for this? I am trying to predict from the API endpoint code in Python. We have around 25-30 models and don't want to build that many prediction endpoints; instead, we'd like to access the saved models from the Python endpoint function and do the prediction there.

AlexT
Dataiker

Hi @ssuhas76 ,
There is no equivalent dataikuapi method that can be used from an API endpoint.
get_predictor is meant to be used only within DSS: it requires direct filesystem access.

You should deploy an API endpoint for each prediction model; note that a single API service can contain multiple prediction endpoints. You can use a scenario or code to create and deploy API services.

Using get_predictor directly is not desirable in the API node context, because it would mean that for each query the model is fully fetched from the design node, which goes against the concept of the API node. Response times would be very slow if this were allowed.
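The "single service with multiple prediction endpoints" idea above can be sketched as one client querying one endpoint per model. The naming convention, URL, service id and model names below are illustrative assumptions, not DSS rules:

```python
import re

def endpoint_id_for(model_name):
    """Derive an endpoint id from a model name (an illustrative naming
    convention for this sketch, not a DSS requirement)."""
    return re.sub(r"[^a-z0-9]+", "_", model_name.lower()).strip("_")

def score_all(api_node_url, service_id, model_names, features, api_key=None):
    """Query one prediction endpoint per model, all hosted in a single
    API service. Connection details are placeholders."""
    import dataikuapi  # requires the dataikuapi package
    client = dataikuapi.APINodeClient(api_node_url, service_id, api_key=api_key)
    return {name: client.predict_record(endpoint_id_for(name), features)
            for name in model_names}

# Example usage (placeholder values):
# results = score_all("https://api-node:12000", "my_service",
#                     ["Churn Model v2", "LTV Model"],
#                     {"feature_a": 1.0, "feature_b": "red"})
```

With 25-30 models this keeps deployment to a single service while still giving one endpoint per model.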

