Predicting multiple rows for a custom Python prediction endpoint in an API Service
Hello, is there a way to send multiple records for scoring when using a custom Python prediction endpoint in an API Service? The predict() function of the ClassificationPredictor indicates that the dataframe it accepts as an argument can contain one or more rows, but I can't figure out how to actually pass multiple rows to the API in the request. If I try to pass an array of objects in the query, e.g.
{
  "features": [
    { "state": "KS", "area code": 415 },
    { "state": "OH", "area code": 415 }
  ]
}
or
[
  { "features": { "state": "KS", "area code": 415 } },
  { "features": { "state": "OH", "area code": 415 } }
]
I always get the error:
Expected a com.google.gson.JsonObject but was com.google.gson.JsonArray
Is there a different format I need to use for my input to be interpreted as multiple rows?
I realise that if I had a large number of rows to batch score, I might be better served by the automation node, but I need this to be realtime, and the number of rows in the batch will be 2-3 maximum.
Best Answer
Hi,
If you use the Python client to talk to the API, you would use:
client.predict_records("myendpoint", [
    { "features": {"feat1": "va1", "feat2": "va2"} },  # Record 0
    { "features": {"feat1": "vb1", "feat2": "vb2"} }   # Record 1
])
If you prefer to use the REST API, you would POST the following body to /public/api/v1/serviceId/endpointId/predict-multi
{
  "items": [
    { "features": { "feat1": "va1", "feat2": "va2" } },
    { "features": { "feat1": "vb1", "feat2": "vb2" } }
  ]
}
Please see https://doc.dataiku.com/dss/api/7.0/apinode-user/#prediction-endpoints-perform-multiple-predictions-post and https://doc.dataiku.com/dss/latest/apinode/api/user-api.html#dataikuapi.APINodeClient.predict_records
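For the two rows from the question, the REST call could be prepared roughly as follows. This is only a sketch using the Python standard library: the helper name build_predict_multi_request, the host and port, and the service ID "myservice" are made-up placeholders, and the request is built but not sent, so substitute your own API node URL and IDs before using it.

```python
import json
import urllib.request


def build_predict_multi_request(base_url, service_id, endpoint_id, rows):
    """Build a POST request for predict-multi (hypothetical helper).

    Each row is a plain dict of feature values; it gets wrapped as
    {"features": row} and all rows go under the top-level "items" key.
    """
    url = "{}/public/api/v1/{}/{}/predict-multi".format(
        base_url.rstrip("/"), service_id, endpoint_id
    )
    payload = {"items": [{"features": row} for row in rows]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


rows = [
    {"state": "KS", "area code": 415},
    {"state": "OH", "area code": 415},
]

# Placeholder URL and IDs: replace with your own API node and service.
req = build_predict_multi_request(
    "http://localhost:12000", "myservice", "myendpoint", rows
)

# To actually send it (requires a running API node):
# with urllib.request.urlopen(req) as resp:
#     results = json.load(resp)
```

The key difference from the attempts in the question is that the array sits under "items" and each element carries its own "features" object, so the request body itself remains a JSON object.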
Answers
Hey Clément
Perfect, thanks! I had never seen that API documentation before, great resource. On a side note, I am quite interested in this "context" parameter that we can pass to the API call. It could be interesting to use for monitoring / reporting. The documentation mentions that it's logged but not directly used, which is totally fine, except that I can't find where it's being logged. It's not in the API node logs. Do I have to connect the API node to Graphite to enable the logging of this context information?
Thank you