Predicting multiple rows for a custom python prediction endpoint in an API Service

marawan Partner, Registered Posts: 19 Partner
edited July 16 in Using Dataiku

Hello, is there a way to send multiple records for scoring when using a custom Python prediction endpoint in an API service? The predict() function of the ClassificationPredictor indicates that the dataframe it accepts as an argument can contain one or more rows, but I can't figure out how to actually pass multiple rows to the API in the request. If I try to pass an array of objects in the query, e.g.

{
   "features": [{
      "state": "KS",
      "area code": 415
   }, {
      "state": "OH",
      "area code": 415
   }]
}

or:

[{
   "features": {
      "state": "KS",
      "area code": 415
   }
}, 
{
   "features": {
      "state": "OH",
      "area code": 415
   }
}
]

I always get the error:

Expected a com.google.gson.JsonObject but was com.google.gson.JsonArray

Is there a different format I need to use for my input to be interpreted as multiple rows?

I realise that if I had a large number of rows to batch-score, I might be better served by the automation node, but I need this to be realtime, and the batch will contain 2-3 rows at most.
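For reference, Dataiku API nodes expose a batch route, predict-multi, alongside the single-record predict route; it expects the records wrapped in an "items" array rather than a bare JSON array (which explains the JsonObject-vs-JsonArray error above). A minimal sketch of building that payload, assuming that shape; the service and endpoint IDs are placeholders:

```python
import json

# Placeholder values -- replace with your API node URL and service/endpoint IDs.
API_NODE_URL = "http://localhost:12000"
SERVICE_ID = "my_service"
ENDPOINT_ID = "my_endpoint"


def build_batch_payload(records):
    """Wrap a list of feature dicts in the {"items": [{"features": ...}]}
    shape expected by the /predict-multi route of a Dataiku API node."""
    return {"items": [{"features": r} for r in records]}


payload = build_batch_payload([
    {"state": "KS", "area code": 415},
    {"state": "OH", "area code": 415},
])

# The actual call would then be a POST against the predict-multi route,
# e.g. with the `requests` package (requires a running API node):
# requests.post(
#     f"{API_NODE_URL}/public/api/v1/{SERVICE_ID}/{ENDPOINT_ID}/predict-multi",
#     json=payload,
# )

print(json.dumps(payload, indent=2))
```

Each element of "items" carries one record's "features" object, so the 2-3 rows mentioned above go out in a single request.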

Best Answer

Answers

  • marawan Partner, Registered Posts: 19 Partner

    Hey Clément,

    Perfect, thanks! I had never seen that API documentation before; great resource. On a side note, I'm quite interested in the "context" parameter that we can pass in the API call, which could be useful for monitoring/reporting. The documentation mentions that it's logged but not directly used, which is totally fine, except that I can't find where it's being logged; it's not in the API node logs. Do I have to connect the API node to Graphite to enable logging of this context information?

    Thank you
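For what it's worth, a minimal sketch of how that context field fits into a single-record request body, assuming the documented shape where "context" is an arbitrary JSON object sent alongside "features"; the key names inside it ("caller", "request_id") are invented examples, not a Dataiku schema:

```python
# Sketch only: attaching a free-form "context" object to a prediction call.
# Per the API docs, "context" is logged but not used for scoring; the keys
# below are arbitrary examples chosen for monitoring/reporting purposes.
def build_payload_with_context(features, context):
    """Build a single-record predict payload carrying a context object."""
    return {"features": features, "context": context}


payload = build_payload_with_context(
    {"state": "KS", "area code": 415},
    {"caller": "mobile-app", "request_id": "req-001"},
)
```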
