Provide a GUI and APIs to check for failing triggers
We do a lot of work in our instances to keep the backend logs as clean as possible. On large and busy instances this is a constant battle against user errors, which clutter the backend logs and make them too noisy. One of the most common user errors we find is constantly failing triggers. In most cases these triggers are evaluated, and fail, every 60 seconds, which generates a considerable amount of useless logging and wastes compute resources. Some of these we are able to detect programmatically using the Dataiku API:
- Active scenarios with active "Dataset changed" triggers that reference datasets that no longer exist
- Active scenarios with active triggers that follow a scenario that no longer exists
- Active scenarios with active triggers that have empty build steps
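As a rough illustration, the three checks above could be sketched along these lines. This is a minimal sketch and not our actual script: it works on plain dictionaries with a simplified, hypothetical shape (the field names `active`, `triggers`, `steps`, `datasets`, `scenario`, `items` and the type values `dataset_changed`, `follow_scenario`, `build` are assumptions for illustration, not Dataiku's real raw settings format). In practice the scenario definitions and the sets of existing datasets and scenarios would first have to be collected through the Dataiku API and mapped onto this shape.

```python
from typing import Dict, List, Set


def find_broken_triggers(
    scenarios: List[dict],
    existing_datasets: Set[str],
    existing_scenarios: Set[str],
) -> List[Dict[str, str]]:
    """Return one finding per problem found on an active scenario.

    The dictionary layout used here is hypothetical and simplified.
    """
    findings = []
    for scenario in scenarios:
        # Only active scenarios with at least one active trigger are of interest.
        if not scenario.get("active"):
            continue
        active_triggers = [t for t in scenario.get("triggers", []) if t.get("active")]
        if not active_triggers:
            continue

        for trigger in active_triggers:
            if trigger.get("type") == "dataset_changed":
                # Check 1: trigger watches datasets that no longer exist.
                missing = [d for d in trigger.get("datasets", []) if d not in existing_datasets]
                if missing:
                    findings.append({
                        "scenario": scenario["id"],
                        "problem": f"trigger '{trigger.get('name')}' watches missing datasets: {', '.join(missing)}",
                    })
            elif trigger.get("type") == "follow_scenario":
                # Check 2: trigger follows a scenario that no longer exists.
                target = trigger.get("scenario")
                if target not in existing_scenarios:
                    findings.append({
                        "scenario": scenario["id"],
                        "problem": f"trigger '{trigger.get('name')}' follows missing scenario: {target}",
                    })

        # Check 3: build steps with nothing to build.
        for step in scenario.get("steps", []):
            if step.get("type") == "build" and not step.get("items"):
                findings.append({
                    "scenario": scenario["id"],
                    "problem": f"build step '{step.get('name')}' is empty",
                })
    return findings


# Made-up sample data, for illustration only.
sample_scenarios = [
    {"id": "DAILY_LOAD", "active": True,
     "triggers": [{"name": "on orders change", "type": "dataset_changed",
                   "active": True, "datasets": ["orders", "orders_old"]}],
     "steps": [{"name": "build prepared", "type": "build", "items": ["orders_prepared"]}]},
    {"id": "REPORT", "active": True,
     "triggers": [{"name": "after load", "type": "follow_scenario",
                   "active": True, "scenario": "NIGHTLY_LOAD"}],
     "steps": [{"name": "empty build", "type": "build", "items": []}]},
    {"id": "DISABLED", "active": False,
     "triggers": [{"name": "stale", "type": "dataset_changed",
                   "active": True, "datasets": ["gone"]}],
     "steps": []},
]

findings = find_broken_triggers(
    sample_scenarios,
    existing_datasets={"orders", "orders_prepared"},
    existing_scenarios={"DAILY_LOAD", "REPORT", "DISABLED"},
)
for finding in findings:
    print(finding["scenario"], "-", finding["problem"])
```

Checks like these only catch configuration problems that are visible in the scenario definition. They say nothing about whether a trigger actually fails when it is evaluated.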
However, there are other cases where we cannot easily detect trigger errors, such as constantly failing SQL triggers or custom Python triggers. Furthermore, there is no way for users or admins to view the state of failing triggers anywhere in the GUI. Scenarios often have multiple triggers, so it's not immediately obvious that some triggers are constantly failing, since the scenario may still be firing from other, working triggers.
The idea is to enhance Dataiku to provide both a GUI and an API to check for failing triggers, so that Dataiku admins and users can see this information and take action.