Efficient Environment and Connection Management: Tracking, Labeling, Grouping and Annotations

Our teams use all of Dataiku's nodes as intended, e.g. the design node for dev and the automation node for prod. Across these nodes, we maintain many connections to dev, UAT and prod systems. We also have project-specific environments that can be troublesome to fully identify and maintain.

As an admin or end user, it often becomes burdensome to work out whether a valid connection already exists and can be reused, or whether a new one needs to be built. We've resorted to naming conventions, but these only go so far and can become ugly to look at and maintain. A user with permission to create connections may end up creating a new one simply because she lacked the information to find an existing one.

Additionally, when deploying projects to automation, it can become confusing to manage both our own package dependencies and DSS's own dependencies (e.g. for visual ML).

For instance, we have a set of packages that list pandas as a requirement. If we deploy a project to automation with the check boxes for Jupyter and visual ML support selected, we receive "double requirement" errors for pandas. To set up an env "just perfectly" so we get the best of both worlds, we need to either order the dependencies so that both DSS's visual capabilities and our own packages work, or create DSS-compatible versions of our packages.
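To illustrate the pandas clash, here is a minimal sketch of a check we could run before deploying. It is not a DSS API: the function names and both requirement lists (including the version pins) are hypothetical, and it only compares package names across two lists of requirement lines.

```python
import re
from collections import defaultdict


def requirement_name(line):
    """Return the normalized package name of a requirement line, or None."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line or line.startswith("-"):  # skip blanks and pip options
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    if match is None:
        return None
    # Treat "-", "_" and "." as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def find_double_requirements(*requirement_lists):
    """Map each package name listed more than once to its requirement lines."""
    seen = defaultdict(list)
    for requirements in requirement_lists:
        for line in requirements:
            name = requirement_name(line)
            if name is not None:
                seen[name].append(line.strip())
    return {name: lines for name, lines in seen.items() if len(lines) > 1}


# Hypothetical inputs: our own packages vs. what the env adds for
# Jupyter / visual ML support. The pins are made up for illustration.
our_requirements = ["pandas>=1.0", "requests"]
env_requirements = ["pandas==1.0.5", "scikit-learn"]

clashes = find_double_requirements(our_requirements, env_requirements)
for name, lines in clashes.items():
    print(f"{name} is required more than once: {lines}")
```

A check like this only flags the duplicates. Deciding which pin wins still has to be done by hand, which is exactly the kind of note we would like to attach to the env itself.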

In any of these cases, it would be nice to label, categorize and generally manage connections and envs in much the same way we do other DSS objects, in order to alert users and admins to gotchas or explain how to use the objects. Additionally, it would be excellent to have this metadata available in the catalog, associated with users and their use cases if possible (not only visible to admins, as connections are in the monitoring screen, for instance). There is currently some functionality in the data catalog, but it is difficult to navigate, for instance to find a user's connections across all their projects, and environments are not available there at all.

I hope this is a productive suggestion and not just me being a poor Dataiku user. I've provided a few very primitive screens, primarily because I do not trust my own UI/UX instincts.

I am very glad to provide more information!