Batch Processing for Custom API Endpoint
I’ve developed a custom Python API endpoint for regression and successfully predicted outcomes for individual records. However, when I attempt to process a batch of records, I encounter the following error:
"Failed: Could not parse a SinglePredictionQuery from request body, caused by: JsonSyntaxException: Expected a com.google.gson.JsonObject but was com.google.gson.JsonArray."
What steps can I take to enable batch processing since the predictions are currently set up for single queries?
Operating system used: ubuntu
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,215 Dataiker
For prediction endpoints, you can use predict-multi:
curl -X POST \
  http://localhost:13000/public/api/v1/first-test/avocados/predict-multi \
  --data '{
    "items": [
      {
        "features": {
          "Date": "2015-12-27",
          "AveragePrice": 1.33,
          "TotalVolume": 64236.62,
          "SmallBags": 8603.62,
          "LargeBags": 93.25,
          "XLargeBags": 0,
          "type": "conventional",
          "year": 2015,
          "region": "Albany"
        }
      },
      {
        "features": {
          "Date": "2015-12-27",
          "AveragePrice": 1.33,
          "TotalVolume": 64236.62,
          "SmallBags": 8603.62,
          "LargeBags": 93.25,
          "XLargeBags": 0,
          "type": "conventional",
          "year": 2015,
          "region": "Albany"
        }
      }
    ]
  }'
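If the records already live in Python, the same request body can be built programmatically. The snippet below is only a sketch: build_predict_multi_body is a hypothetical helper name, and it assumes nothing beyond the "items" / "features" layout shown in the curl example above. It returns a JSON string and does not send the request.

```python
import json


def build_predict_multi_body(records):
    """Wrap a list of feature dicts in the {"items": [{"features": ...}]} layout.

    `records` is a list of plain dicts, one per row to score.
    (Hypothetical helper, not part of any Dataiku library.)
    """
    items = [{"features": record} for record in records]
    return json.dumps({"items": items})


# Two example rows, reusing a few of the feature names from the curl call.
rows = [
    {"Date": "2015-12-27", "AveragePrice": 1.33, "region": "Albany"},
    {"Date": "2015-12-27", "AveragePrice": 1.33, "region": "Albany"},
]
body = build_predict_multi_body(rows)
```

The resulting string can be passed as the POST body to the predict-multi URL with whatever HTTP client you prefer.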
Custom code endpoints have no real concept of a "batch". If you want to send a batch request, you need to send it as valid JSON and handle that JSON yourself in the endpoint code, so that it processes multiple requests from a single payload.
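As a rough illustration of handling the JSON yourself, here is a minimal sketch. It assumes a Python function endpoint whose entry function is named api_py_function and receives the keys of the JSON request body as arguments. The records parameter name and the predict_one helper are hypothetical placeholders, so replace predict_one with your actual single-record regression logic.

```python
def predict_one(features):
    # Hypothetical stand-in for the existing single-record prediction code.
    return 2.0 * float(features["AveragePrice"])


def api_py_function(records):
    # `records` is an assumed parameter name: the caller would send
    # {"records": [ {...features...}, {...features...} ]} as the request body.
    if not isinstance(records, list):
        raise ValueError("'records' must be a JSON array of feature objects")
    # Score each record in turn and return one result per input, in order.
    return [{"prediction": predict_one(features)} for features in records]
```

With this shape the top level of the payload stays a JSON object and the array sits under a key. The error above complains about exactly that: it received a JsonArray where it expected a JsonObject.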