As part of a project, we are running a couple of Bokeh apps to feed dashboards that show model results and performance and allow users to play around with some parameters.
However, for no apparent reason, the backends of these apps sometimes shut down. They then need to be restarted manually to make the dashboards available again. In the meantime, our business users get an internal server error on the dashboard. This is inconvenient.
As a hotfix, we have an hourly scenario that restarts all backends, whether or not they are down, but this is a scattershot approach.
Is there any way we can have Dataiku detect that a dashboard's Bokeh backend is down and restart it automatically? Maybe while displaying a nicer message like: "We are reloading the results, please wait..."
Thanks for reporting this issue. The expected behaviour is indeed the one you described: the Bokeh webapp backends should be restarted automatically, but that is not currently the case. The issue is now logged in our backlog.
That being said, it would be worth trying to understand why the backends of those Bokeh webapps stop. When the issue happens, please look at the logs, both the corresponding webapp backend logs and the main DSS instance backend log.
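
Until this is fixed, the hourly "restart everything" scenario could be narrowed so that it only restarts backends that are actually down. The following is a minimal sketch, not an official workaround. It assumes the `dataikuapi` Python client is available and that its webapp handles expose `get_state()` (with a `running` property) and `start_or_restart_backend()`; please verify these names against the client version installed on your instance. The function names `restart_dead_backends` and `restart_project_backends`, and the host, API key and project key arguments, are hypothetical.

```python
def restart_dead_backends(webapps):
    """Restart only the webapps whose backend is not running.

    `webapps` is any iterable of handles exposing get_state() (returning an
    object with a boolean `running` attribute) and start_or_restart_backend().
    Returns the ids of the webapps that were restarted.
    """
    restarted = []
    for webapp in webapps:
        # Assumption: `running` is False when the backend has died.
        if webapp.get_state().running:
            continue
        webapp.start_or_restart_backend()
        restarted.append(getattr(webapp, "webapp_id", None))
    return restarted


def restart_project_backends(host, api_key, project_key):
    """Hypothetical wiring against a DSS instance (not called here)."""
    import dataikuapi  # imported lazily so the helper above has no dependency

    client = dataikuapi.DSSClient(host, api_key)
    project = client.get_project(project_key)
    # Assumption: each listed item gives access to the webapp id under "id".
    webapps = [project.get_webapp(item["id"]) for item in project.list_webapps()]
    return restart_dead_backends(webapps)
```

Run from a scenario step every few minutes, this would shorten the outage without restarting healthy backends. It does not address the friendlier "please wait" message, and it does not replace looking at the logs to find the root cause.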