Data quality: monitoring dataset processing

Solved!
Romain_NIO
Level 2
Data quality: monitoring dataset processing

Hi,



I'm asking how DSS monitors issues during dataset processing. I see two kinds of potential issues:




  • Volume: inconsistent number of records in a dataset (e.g. I expect at least 1k records per day for my "webtraffic" dataset)

  • Schema / values: one or more rows have fields that don't respect the defined schema or the expected values (e.g. in the webtraffic dataset, IP addresses are not valid or a date field contains unexpected values).



Is there a way to monitor / handle these errors in DSS and be notified, by email for instance?



Thanks,



Romain.



 



 


6 Replies
jereze
Community Manager
Hi Romain,

These features are on our roadmap. You can get in touch with our Sales team if you'd like more details.

As of today, you could, for instance, write a custom recipe in Python and write your tests there.
Jeremy, Product Manager at Dataiku
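
For illustration, here is a minimal sketch of what such a test recipe could look like, based on the checks described in the question. The column name "ip", the output dataset name and the exact IPv4 pattern are assumptions to adapt to your own schema; the 1k-records threshold comes from the question.

```python
# Minimal sketch of a custom Python recipe that validates the "webtraffic" dataset.
# "ip", "webtraffic_checked" and the IPv4 pattern are hypothetical placeholders.
import dataiku

# Read the recipe's input dataset into a pandas DataFrame
df = dataiku.Dataset("webtraffic").get_dataframe()

errors = []

# Volume check: at least 1k records are expected for the day being processed
if len(df) < 1000:
    errors.append("expected at least 1000 records, got %d" % len(df))

# Value check: IP addresses should look like a simple IPv4 address
bad_ips = df[~df["ip"].astype(str).str.match(r"^(\d{1,3}\.){3}\d{1,3}$")]
if len(bad_ips) > 0:
    errors.append("%d rows have an invalid IP address" % len(bad_ips))

# Raising makes the recipe (and the job or scenario running it) fail,
# which can then be used to trigger a notification
if errors:
    raise Exception("Data quality checks failed: " + "; ".join(errors))

# Otherwise write the validated data to the recipe's output dataset
dataiku.Dataset("webtraffic_checked").write_with_schema(df)
```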
Romain_NIO
Level 2
Author
Good news!

Thanks for the quick reply 🙂
sbourgeois-k
Level 2

Hello Jeremy,

Have those monitoring features been developed since?

Or are they still on your roadmap?

Sébastien

sbourgeois-k
Level 2

Hello @Ignacio_Toledo!

Great!

Thank you so much for your message and the links.

Have a nice day!

adamnieto

Another tip that might be useful with metrics and checks: you can automate them using scenarios in your project. Here is a good resource from the Dataiku Academy that walks through this: https://academy.dataiku.com/automation-course-1?next=%2Fautomation-course-1%2F668968
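
For illustration, here is a minimal sketch of what an automated metrics-and-checks step could look like with the Python API, assuming a placeholder project key ("WEBTRAFFIC") and that metrics and checks have already been defined on the dataset in the UI; the exact content returned by run_checks() can vary between DSS versions.

```python
# Minimal sketch of a custom Python step that recomputes the metrics defined
# on a dataset and runs its checks. "WEBTRAFFIC" and "webtraffic" are
# placeholder names.
import dataiku

client = dataiku.api_client()
project = client.get_project("WEBTRAFFIC")
dataset = project.get_dataset("webtraffic")

# Recompute the metrics configured on the dataset (record count, validity, ...)
dataset.compute_metrics()

# Run the checks defined on top of those metrics; each check reports an
# outcome such as OK, WARNING or ERROR
check_results = dataset.run_checks()
print(check_results)
```

When a step like this runs inside a scenario, an email reporter configured on the scenario is one way to be notified when a check fails.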
