Unusual behaviour of SQL probes on a partitioned dataset

Tanguy

I've implemented a SQL probe on a partitioned dataset that uses AWS Athena to query an AWS S3 dataset:

Capture d’écran 2024-01-09 173552.jpg

A check has been configured on the SQL result of this probe to ensure the value is equal to 1, essentially validating the absence of duplicate keys in my data.

Capture d’écran 2024-01-09 172737.jpg
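To make the idea concrete, here is a minimal sketch of what a duplicate-key probe and its check could look like. It is not the actual probe (that is only visible in the screenshot above): the table name `my_table`, the columns `part` and `key_col`, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for Athena.

```python
import sqlite3

# Hypothetical names throughout; SQLite stands in for Athena.
# The probe returns the largest number of rows sharing one key
# within a single partition (1 means no duplicate keys).
PROBE_SQL = """
SELECT COALESCE(MAX(n), 0) AS max_rows_per_key
FROM (
    SELECT COUNT(*) AS n
    FROM my_table
    WHERE part = ?
    GROUP BY key_col
) AS t
"""


def max_rows_per_key(conn, partition):
    """Run the probe for one partition and return its single value."""
    return conn.execute(PROBE_SQL, (partition,)).fetchone()[0]


def check_no_duplicates(conn, partition):
    """Mimic the check: it passes only when the probe value equals 1."""
    return max_rows_per_key(conn, partition) == 1


# Small demo dataset: the first partition has unique keys,
# the second one contains a duplicated key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (part TEXT, key_col TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("2024-01-08", "a"),
        ("2024-01-08", "b"),
        ("2024-01-09", "a"),
        ("2024-01-09", "a"),
    ],
)
```

In this sketch the partition is filtered explicitly. Without the `WHERE part = ?` clause, the same query would count key `a` across both partitions and return a value greater than 1 even though the first partition on its own has unique keys.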

I configured Dataiku to automatically compute the SQL probe and the corresponding check for every partition when the dataset is built. If a check fails, an error is raised.

Here's the surprising part: Dataiku raises an error because it detects SQL probe values greater than 1. However, when I manually recompute the SQL probe for one of the partitions reported with a value greater than 1, the computed value is consistently 1.

Capture d’écran 2024-01-09 172635.jpg

It seems as though the SQL probe behaves incorrectly during runtime, causing the checks to erroneously raise errors in my scenario. This issue appeared when we upgraded from DSS v9 to v11.

Is there a potential issue, or am I overlooking something?


Operating system used: Linux (Red Hat)

Answers

  • Turribeach

    Is the check throwing a Warning or a Fail?

  • Tanguy

    Indeed:

    Capture d’écran 2024-01-10 113630.jpg

    but the error appears to be a false positive: manually recomputing the SQL probe (and the corresponding check) for the same partition does not reproduce it!
