Unexpected prediction behavior in API Designer Test Queries with missing feature values

Hwijae
Hwijae Registered Posts: 1 ✭✭

Hello,

I am currently testing an API endpoint created with Dataiku API Designer and noticed some prediction behaviors that I do not fully understand.

Environment

  • Dataiku DSS
  • API Designer
  • LightGBM prediction model
  • Approximately 48 input features

Observation

I expected the API to return an error when required feature values were missing from the input payload. However, the API continues to return predictions even when some features are omitted or when an almost empty query is submitted.

To better understand the behavior, I performed the following tests using the Test Queries feature in API Designer.

Test Cases

  1. Submit a complete input payload (ground truth)
  2. Modify some feature values
  3. Remove one or more feature values from the payload
  4. Modify features with high importance according to the Feature Importance analysis
  5. Submit an empty query

Results

Test                                   | Result
---------------------------------------|------------------------------------------------------
1. Complete input payload              | Prediction returned
2. Modified feature values             | Same prediction as Test #1
3. Missing feature values              | Same prediction as Test #1
4. Modified high-importance features   | Different prediction
5. Empty query                         | Different prediction, but prediction is still returned

Questions

Based on these results, I would like to better understand how Dataiku handles missing features during API inference.

  • Does Dataiku automatically fill missing feature values using defaults, training statistics, or preprocessing logic?
  • Is the behavior different between API Designer test queries and actual API calls?
  • Is there a way to enforce strict validation so that predictions fail when required features are missing?
  • How should an empty query be interpreted by the prediction endpoint?

The fact that predictions are still generated even when features are removed or when an empty query is submitted makes it difficult to understand exactly what input data is being used by the model.

Any explanation of the inference-time handling of missing inputs would be greatly appreciated.

Thank you.

Dataiku version used: 14.4.3


Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,710 Neuron

    This is a known issue that I raised years ago and that was never fixed. The fact that you can't force the API to return an error when a required parameter is missing is obviously a massive issue for ML models, which can return unpredictable results when the data is not valid. The only way we know to deal with this is to use a custom Python function API endpoint, which means you end up writing all the parameter validation code yourself. Contact Dataiku Support and ask for your company to be added to the enhancement request, which is collecting dust somewhere…
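
To give an idea of what that validation code could look like, here is a minimal sketch of a strict check that rejects a query before it reaches the model. Everything named here is hypothetical and for illustration only: `REQUIRED_FEATURES` (a placeholder for your roughly 48 real feature names), `MissingFeaturesError`, `find_missing_features`, `strict_predict`, and the `predict_fn` callable that stands in for however your endpoint actually scores a record. None of these are Dataiku APIs, and how you wire this into a custom Python function endpoint depends on your setup.

```python
import math

# Placeholder feature names (assumption): replace with the model's real inputs.
REQUIRED_FEATURES = ["age", "income", "tenure_months"]


class MissingFeaturesError(ValueError):
    """Raised when a query lacks one or more required features."""


def find_missing_features(features, required=REQUIRED_FEATURES):
    """Return the required feature names that are absent or empty.

    A feature counts as missing if its key is absent, its value is None,
    it is a float NaN, or it is a blank string.
    """
    missing = []
    for name in required:
        value = features.get(name)
        if value is None:
            missing.append(name)
        elif isinstance(value, float) and math.isnan(value):
            missing.append(name)
        elif isinstance(value, str) and not value.strip():
            missing.append(name)
    return missing


def strict_predict(features, predict_fn, required=REQUIRED_FEATURES):
    """Validate the query, then delegate to predict_fn (hypothetical scorer).

    Raises MissingFeaturesError instead of letting the model score
    an incomplete or empty record.
    """
    missing = find_missing_features(features or {}, required)
    if missing:
        raise MissingFeaturesError(
            "Missing required features: " + ", ".join(missing)
        )
    return predict_fn(features)
```

With a check like this, the empty query and the "removed features" test cases would fail with an explicit error listing the missing names, rather than silently returning a prediction.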
