Force generation for Python function endpoints

marawan Partner, Registered Posts: 19 Partner
edited July 16 in Using Dataiku

Hello, I am trying to implement a canary test for new versions of an API service using multiple generations: the production version has a probability of 1 and the new version has a probability of 0. I then use the forced_generation parameter of the API call to test the new version (without affecting my production version) and make sure it responds correctly before switching the whole API to the new version. This works great for visual prediction endpoints and custom prediction endpoints, but it doesn't seem to exist for other endpoints (I am particularly interested in the Python function endpoint).
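
To illustrate the setup described above, here is a minimal sketch of weighted generation dispatch with a forced override. It is not Dataiku's implementation; `pick_generation`, the `weights` mapping, and the generation ids are hypothetical names chosen for the example.

```python
import random


def pick_generation(weights, forced_generation=None, rng=random):
    """Pick a generation id; a forced id bypasses the weighted draw.

    Hypothetical helper for illustration only, not part of any Dataiku API.
    """
    if forced_generation is not None:
        if forced_generation not in weights:
            raise KeyError(f"unknown generation: {forced_generation}")
        return forced_generation
    ids = list(weights)
    # Weighted random draw; a generation with weight 0 is never selected.
    return rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]


# Production gets all regular traffic, the new version gets none.
weights = {"production": 1.0, "newversion": 0.0}

# Regular calls only ever reach the production generation.
regular = {pick_generation(weights) for _ in range(1000)}

# A canary call forces the new generation despite its weight of 0.
canary = pick_generation(weights, forced_generation="newversion")
```

With this arrangement, regular traffic never reaches the new version, and only requests that explicitly force it are served by it.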

# For Prediction Endpoints
NEW_VERSION_ID = "newversion"
result = api_node_client.predict_record(
    endpoint_id, test_query['q']['features'],
    forced_generation=NEW_VERSION_ID, context={"canary_test": True}
)
assert result['apiContext']['serviceGeneration'] == NEW_VERSION_ID  # <-- This one succeeds

# For python function endpoints
result = api_node_client.run_function(endpoint_id, **test_query['q'])
assert result['apiContext']['serviceGeneration'] == NEW_VERSION_ID  # <--- This one fails of course because I didn't pass the forced generation id

Given that the version is a service-level property, not endpoint-level, I was expecting this to work for all endpoints. Is there a way to pass the forced generation parameter to non-prediction endpoints?

Thank you


  • Mark_Treveil
    Mark_Treveil Dataiker Alumni Posts: 30 ✭✭✭✭✭


    Did you make any progress on this problem?

    I had a look at the code, and it certainly seems that the generation mechanism is service-level. I believe the same code finds Endpoints in both Single and Multiple generation modes, and I could not see that the type of Endpoint made any difference to the way it was acquired in multi-generation mode. If this is the case, then the most likely issue would be a mismatch between the Endpoint Ids in the two Service API versions. The logs should give you some indication of this.
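
One way to check for such a mismatch, as a minimal sketch: compare the endpoint ids defined in each generation and list the ones that are missing. The `missing_endpoints` helper, the generation ids, and the endpoint ids below are all hypothetical; the real ids would have to be read from the two service versions.

```python
def missing_endpoints(endpoints_by_generation):
    """Map each generation id to the endpoint ids it lacks.

    "Lacks" is relative to the union of endpoint ids across all generations.
    Hypothetical helper for illustration, not part of any Dataiku API.
    """
    all_ids = set().union(*endpoints_by_generation.values())
    report = {}
    for generation, ids in endpoints_by_generation.items():
        missing = sorted(all_ids - set(ids))
        if missing:
            report[generation] = missing
    return report


# Hypothetical example: the function endpoint was renamed in the new version.
example = {
    "gen_a": ["predict", "custom_python"],
    "gen_b": ["predict", "custom_python_v2"],
}
mismatch = missing_endpoints(example)
```

An empty result means every generation exposes the same endpoint ids, which would rule out this explanation.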

    Hope this helps. If not I will try to reproduce your scenario.

    Thanks, Mark

  • marawan
    marawan Partner, Registered Posts: 19 Partner
    edited July 17

    Hey Mark

    Thanks for the response, and sorry for the late reply. I have tested it, and it seems that the "forcedGeneration" parameter is not taken into account for the Python function endpoint, unlike for Python prediction endpoints. I tested with both the Python SDK and the REST API on the same service (i.e. both endpoints have the same multi-generation configuration):


    # First the python prediction endpoint
    >> VERSION_ID = "v-2020-08-09-152047-api"
    >> api_node_client.predict_record(
    # notice the service generation in the response
    {u'apiContext': {u'endpointId': u'custom_python_endpoint_no_enrichments',
      u'serviceGeneration': u'v-2020-08-09-152047-api',
      u'serviceId': u'churn_service2'},
     u'result': {u'ignored': False,
      u'prediction': u'false',
      u'probas': {u'false': 0.8989954436971144, u'true': 0.10100455630288564}},
     u'timing': {u'enrich': 14,
      u'functionInternal': 39525,
      u'postProcessing': 44,
      u'preProcessing': 88,
      u'prediction': 40144,
      u'preparation': 0,
      u'wait': 512}}
    # Now the python function endpoint
    # since the python SDK doesn't support a parameter for forcedGeneration, we do the same thing that the python prediction endpoint does, which is adding a key "dispatch" to the query sent to the endpoint
    >> python_function_query_content = {"dispatch": {"forcedGeneration": VERSION_ID}}
    >> python_function_query_content.update(python_function_query["q"])
    >> api_node_client.run_function(
    # notice the service generation in the response
    {u'apiContext': {u'endpointId': u'custom_python',
      u'serviceGeneration': u'v-2020-08-09-131001-api',
      u'serviceId': u'churn_service2'},
     u'response': [[False], [[0.8989954436971144, 0.10100455630288564]]],
     u'timing': {u'execution': 34884,
      u'functionInternal': 33944,
      u'preProcessing': 0,
      u'wait': 28}}


    # First with the python prediction endpoint
    $ curl -X POST http://<my_url>/public/api/v1/churn_service2/custom_python_endpoint_no_enrichments/predict \
      --data '{ "features" : {
        "state": "KS",
        "account length": 128,
        "area code": 415
        },
        "dispatch" : { "forcedGeneration": "v-2020-08-09-152047-api"}
      }'
    # notice the value of the serviceGeneration in the response
    # Now with the python function endpoint
    $ curl -X POST http://<my_url>/public/api/v1/churn_service2/custom_python/run \
      --data '{
        "state": "KS",
        "account length": 128,
        "area code": 415,
        "dispatch" : { "forcedGeneration": "v-2020-08-09-152047-api"}
      }'
    # notice the value of the serviceGeneration in the response

    Do Python function endpoints maybe expect this "forcedGeneration" parameter somewhere other than inside the "dispatch" key? There's actually nothing in the API documentation about this "forcedGeneration" parameter, and I only found out how to set it in the curl request by looking at the code directly.
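
The check performed in the tests above can be wrapped in two small helpers, shown here as a minimal sketch. The "dispatch" / "forcedGeneration" key names and the response shape are taken from the outputs in this post; the helper names are hypothetical, and whether a function endpoint honours the key is exactly what is in question here, so the second helper only reports what the response says.

```python
import copy


def with_forced_generation(query, generation_id):
    """Return a copy of the query with a "dispatch" key added.

    Hypothetical helper; the key names mirror the curl requests in this thread.
    """
    payload = copy.deepcopy(query)
    payload["dispatch"] = {"forcedGeneration": generation_id}
    return payload


def served_generation(response):
    """Return the generation reported in the response, or None if absent."""
    return response.get("apiContext", {}).get("serviceGeneration")


VERSION_ID = "v-2020-08-09-152047-api"
payload = with_forced_generation({"state": "KS", "area code": 415}, VERSION_ID)

# Trimmed copy of the function endpoint response shown earlier in this post.
function_response = {
    "apiContext": {
        "endpointId": "custom_python",
        "serviceGeneration": "v-2020-08-09-131001-api",
        "serviceId": "churn_service2",
    }
}
# False here: the response was served by a different generation.
forced_ok = served_generation(function_response) == VERSION_ID
```

In a canary test, `payload` would be what gets sent to the endpoint, and `forced_ok` would be asserted on the actual response instead of the trimmed copy used here.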


