Data Quality Checks in a flow
Hi everyone,
I have a quick question:
I am importing datasets from Snowflake into a proper staged flow, i.e., source -> warehouse -> data mart.
1) I can either recreate these staging datasets in Dataiku (double work), OR is there a way to run this whole staging process in a proper sequential manner within Dataiku?
2) Across this flow I want to add a few data quality checks, such as checking for nulls, mismatched IDs from one stage to the next, and duplicates based on a few keys.
How can I enable these in Dataiku?
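For reference, the three checks described above can be sketched outside Dataiku as plain pandas logic. This is a minimal, self-contained illustration with made-up sample data; the column names (`id`, `amount`) and the two-stage comparison are assumptions for the example, not part of any Dataiku API.

```python
import pandas as pd

# Hypothetical upstream (source) and downstream (data mart) stages
source = pd.DataFrame({"id": [1, 2, 2, None], "amount": [10, 20, 20, 5]})
mart = pd.DataFrame({"id": [1, 2], "amount": [10, 40]})

# 1) Null check on a key column
null_ids = source["id"].isna().sum()

# 2) ID mismatch between stages: IDs present upstream but missing downstream
missing_downstream = set(source["id"].dropna()) - set(mart["id"])

# 3) Duplicate check based on a few keys
dup_rows = source.duplicated(subset=["id", "amount"]).sum()

print(null_ids, missing_downstream, dup_rows)
```

In Dataiku itself, the equivalent logic would typically live in metrics (e.g. a record count or custom probe) with checks that fail the scenario when a threshold is breached.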
Best Answer
Turribeach (Neuron):
Metrics and Checks is what you want to use. Have a read:
https://knowledge.dataiku.com/latest/mlops-o16n/automation/concept-metrics-checks.html
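To connect this to the original question: once metrics such as a null-ID count or a duplicate-key count are computed, a custom Python check can turn them into a pass/fail signal. The sketch below is a rough illustration only: the metric names are invented, and `last_values` is treated as a plain dict of numbers rather than Dataiku's actual metric value objects, so treat the exact signature and types as assumptions to verify against the DSS documentation.

```python
# Hedged sketch of a custom check over previously computed metrics.
# Assumptions: DSS invokes process(last_values, dataset, partition_id);
# here last_values is simplified to a dict of metric name -> number,
# and the metric names are hypothetical.
def process(last_values, dataset=None, partition_id=None):
    null_count = last_values.get("records_with_null_id", 0)
    duplicate_count = last_values.get("duplicate_key_count", 0)
    if null_count > 0 or duplicate_count > 0:
        # Fail the check so the scenario can stop or alert
        return ("ERROR", f"{null_count} null IDs, {duplicate_count} duplicate keys")
    return ("OK", "All data quality checks passed")
```

Checks like this can then gate a scenario, so the warehouse -> data mart step only runs when the upstream stage is clean.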
Answers
Yes, I used them, but I still have a doubt and would appreciate your assistance: Metrics and Checks - Dataiku Community