-
How to access specific check results in a scenario
I have a simple dataset containing 1 row and about 20 columns. Each column holds a Boolean indicating whether a particular threshold has been breached. If any of the 20 columns is true, a message is sent to a designated MS Teams channel that a breach has occurred. I would like to supplement…
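Not the poster's code, but the "any column is true" logic can be sketched in plain Python, assuming the single record is available as a dict of column name → bool (the column names below are hypothetical, and the actual Teams delivery would go through a reporter or webhook):

```python
def breach_message(row):
    """Build an alert listing which threshold columns are True.

    `row` is assumed to be a dict mapping column name -> bool,
    i.e. the dataset's single record.
    """
    breached = [col for col, flag in row.items() if flag]
    if not breached:
        return None  # nothing breached, nothing to send
    return "Threshold breach detected in: " + ", ".join(sorted(breached))

# Hypothetical column names:
row = {"cpu_breach": True, "mem_breach": False, "disk_breach": True}
msg = breach_message(row)
print(msg)  # Threshold breach detected in: cpu_breach, disk_breach
```

The returned string could then be posted to the MS Teams channel via the scenario's messaging reporter or an incoming-webhook HTTP call.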
-
Custom Python Script
Hi All, I'm trying to write a custom Python script based on the scenario below. My dataset contains a column WeekEnd whose values are dates (some fall on weekdays and some on Saturday or Sunday). The check I'm trying to implement: if the WeekEnd value falls on a Saturday, then OK, else ERROR. I've only been able to write the below.…
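The date logic itself is a few lines of standard-library Python; a minimal sketch, assuming the dates arrive as strings in ISO format (adjust `fmt` to the column's real format). Inside a DSS custom Python check, the surrounding wrapper function would call this per value:

```python
from datetime import datetime

def weekend_check(date_str, fmt="%Y-%m-%d"):
    """Return 'OK' if the date falls on a Saturday, else 'ERROR'.

    The date format is an assumption -- change `fmt` to match the
    actual WeekEnd column (e.g. "%d/%m/%Y").
    """
    day = datetime.strptime(date_str, fmt).weekday()  # Mon=0 ... Sat=5, Sun=6
    return "OK" if day == 5 else "ERROR"

print(weekend_check("2024-06-01"))  # 2024-06-01 is a Saturday -> OK
print(weekend_check("2024-06-03"))  # a Monday -> ERROR
```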
-
Launch scenario if dataset is not empty
Hello, I would like to trigger one step of my scenario only if one of my datasets is not empty. What I did: 1- Created a check on my dataset (which I called NUMBER_RECORDS) to verify that record count > 1 (img1). 2- Created a "Run checks" step on this dataset. 3- Created the step I would like to launch based on the condition…
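One detail worth noting: "not empty" means at least one record, i.e. count > 0, whereas the check described above uses count > 1, which would also skip a dataset containing exactly one row. The decision logic is trivial but worth pinning down (the DSS wiring — check outcome feeding the next step's run condition — is outside this sketch):

```python
def should_run(record_count):
    """Decide whether the downstream step should run.

    'Not empty' means at least one record, i.e. count > 0.
    A check of count > 1 would also skip a one-row dataset.
    """
    return record_count > 0

# The "Run checks" step turns this into a check outcome, and the next
# step's run condition can then test that outcome.
print(should_run(0))  # False -> skip the step
print(should_run(1))  # True  -> run the step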
-
Why Aren't Record Counts Computable by Default?
In my DSS flows, I always activate record counts to check the volume of my datasets. However, it is cumbersome to activate them one by one, and it seems that it is not possible to enable them by default for an entire project. Why doesn't DSS allow this? Operating system used: RHEL 8
-
Unusual behaviour of SQL probes on a partitioned dataset
I've implemented a SQL probe on a partitioned dataset that leverages AWS Athena to query an AWS S3 dataset. A check has been configured on the SQL result of this probe to ensure the value equals 1, essentially validating the absence of duplicate keys in my data. I asked Dataiku to automatically compute the SQL probe…
-
How to add a custom calculated model metric into the evaluation store in Dataiku?
Hello everyone, I am currently working on a project in Dataiku and trying to log a custom model metric into the evaluation store. The model is not a visual ML model; it's a custom model logged in Dataiku as a saved model using the MLflow integration that Dataiku offers. However, I am not sure how to add a custom…
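Not an answer from the thread, just a sketch of the shape such a metric might take: the metric itself (here a hypothetical MAPE, standing in for whatever the model needs) is plain Python, and `mlflow.log_metric` is the standard MLflow call for attaching a numeric metric to a run. Whether a metric logged that way then surfaces in the DSS evaluation store depends on the setup, so treat the integration part as an assumption to verify:

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error -- a stand-in for whatever
    custom metric the model needs (hypothetical example)."""
    assert len(y_true) == len(y_pred) and y_true
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

value = mape([100.0, 200.0], [110.0, 180.0])
print(value)  # 0.1

# During the MLflow run that produces the saved model, the metric could
# be logged alongside it (assumption -- verify it then appears in your
# evaluation store):
# import mlflow
# mlflow.log_metric("mape", value)
```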
-
Metrics Dataset using DSS Python API
Hello, I am using the DSS Python API from DSS notebooks to enable metrics such as min, max, distinct and null values per column, and I would like to generate a table based on these metrics. I am able to enable and compute the metrics. I am missing the way to generate the metrics dataset using the API in order to store the…
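The table-building step can be sketched in plain Python: pivot a flat list of per-column metric values into one row per column. The input shape below is an illustrative assumption — in practice it would be assembled from the dataset's computed metrics (fetched via the metrics API), and the resulting rows written out inside a recipe:

```python
def metrics_to_rows(metric_values):
    """Pivot a flat list of (column, metric, value) triples into
    one row per column, with each metric as a field.

    `metric_values` uses a simplified, assumed shape -- in practice
    it would be built from the dataset's computed metric values.
    """
    rows = {}
    for col, metric, value in metric_values:
        rows.setdefault(col, {"column": col})[metric] = value
    return list(rows.values())

# Hypothetical values, as they might be collected from the metrics API:
flat = [
    ("age", "min", 18), ("age", "max", 90), ("age", "null_count", 3),
    ("name", "distinct", 412), ("name", "null_count", 0),
]
table = metrics_to_rows(flat)
print(table)
# [{'column': 'age', 'min': 18, 'max': 90, 'null_count': 3},
#  {'column': 'name', 'distinct': 412, 'null_count': 0}]
```

These rows could then be written to a managed dataset, for example via `dataset.write_with_schema(pandas.DataFrame(table))` in a Python recipe.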
-
Data Drift Detection Issue after DSS v12 Update
Hey Dataiku Community, I'm facing an issue with the data drift detection feature in the Evaluate recipe after updating to DSS v12. Prior to the update, the feature worked perfectly fine. However, after the update, I'm encountering an error that prevents the Evaluate recipe from running successfully. Error…
-
Returning the Metric Record Count in a Slack message
Hello, I have set up a scenario which returns a Slack message upon completion, along with some standard variables (e.g., ${scenarioName}). By following the documentation here https://doc.dataiku.com/dss/latest/scenarios/variables.html ("Retrieving the value of a metric") I have been able to get the entire output from the…
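Not from the thread: once the metric comes back as a JSON payload, extracting just the record count is a small parsing step. The payload shape below is a simplified assumption for illustration only — inspect the actual JSON the scenario variable returns and adjust the keys (and the metric ID) accordingly:

```python
import json

def extract_record_count(metrics_json):
    """Pull just the record-count value out of a metrics payload.

    The payload shape and the 'records:COUNT_RECORDS' metric ID are
    assumptions -- compare against the real output and adapt.
    """
    data = json.loads(metrics_json)
    for m in data.get("metrics", []):
        if m.get("metricId") == "records:COUNT_RECORDS":
            return m.get("value")
    return None

# Hypothetical payload:
payload = '{"metrics": [{"metricId": "records:COUNT_RECORDS", "value": "1234"}]}'
print(extract_record_count(payload))  # 1234
```

The extracted value can then be interpolated into the Slack message text in place of the full metrics output.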
-
Custom check to determine if a column's data is unique (has no duplicates)
I would like to run a check that fails if a column "col1" in my dataset has duplicate values. In the Metrics tab I am computing "Distinct value count" on col1 and "Record count" on the table. How do I write a custom Python check that verifies the "Distinct value count" on col1 equals the "Record count", to determine if…
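The comparison itself is straightforward; a minimal sketch, with the DSS-specific wiring shown only as hedged comments (the metric IDs there are assumptions — copy the exact IDs from the dataset's Metrics screen):

```python
def uniqueness_check(distinct_count, record_count):
    """Check outcome: OK when every record has a distinct col1 value,
    ERROR otherwise, with an explanatory message."""
    if distinct_count == record_count:
        return "OK", "col1 is unique"
    dupes = record_count - distinct_count
    return "ERROR", "col1 has %d duplicate value(s)" % dupes

print(uniqueness_check(100, 100))  # ('OK', 'col1 is unique')
print(uniqueness_check(98, 100))   # ('ERROR', 'col1 has 2 duplicate value(s)')

# Inside a DSS custom Python check, the two counts would come from the
# already-computed metrics, roughly like this (metric IDs are an
# assumption -- take the real ones from the Metrics screen):
# def process(last_values, dataset, partition_id):
#     distinct = int(last_values['adv_col_stats:COUNT_DISTINCT:col1'].get_value())
#     records = int(last_values['records:COUNT_RECORDS'].get_value())
#     return uniqueness_check(distinct, records)
```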