DSS Backstage (A New DSS Instance Type)

Background and Introduction:

Dataiku DSS has several instance types: Designer nodes, Automation nodes, the API Deployer, and so on. It feels as if one node is missing from the platform: one that can be used to deploy web apps and test plugins. It would be a place where developers can separate their development environment from the Designer node, which today is effectively "production" for plugin and web app developers. Since the goal of this instance type would be to cater to the development that happens "behind the scenes" of the DSS Designer node, "Backstage" may be an appropriate name.

A node dedicated to plugin and web app development may also help move some features off the DSS project page in Designer nodes. The top navigation bar, with all of the tools available in a project, can be overwhelming for new DSS users. A more minimal Designer node may help on the UX side as well.

Description (User Story):

As a Dataiku DSS developer, I want a new DSS instance type that caters to plugin and web app development, so that I have an easier time versioning, testing, and deploying my web apps and plugins to Dataiku DSS Designer instances. Those instances may act as my "production" environment, but today they are used as development and production all in one.

Example of Enhancements:

I'll know it is successful when the following features/enhancements are implemented for the DSS instance type:

  • The DSS Backstage instance is able to host development containers that users can SSH into from their local machine. The container may come pre-packaged with all of the Dataiku libraries for Python, R, etc., so the end user doesn't need to do anything but start coding.
  • A CLI tool (perhaps pre-installed in the container described in the bullet above) that acts like Yeoman, but for scaffolding DSS plugins and web apps.
  • A CI/CD pipeline user interface that enforces best practices for your plugins, and in some cases web apps, by requiring you to test your plugin programmatically with unit tests that are run on a dummy project's flow.
  • You are able to build web apps in a container (see the first bullet) and have terminal access to that container, so you can easily install JS frontend libraries and use any backend framework you like (perhaps Django). Imagine a Heroku-like experience, except that DSS is the one hosting the web apps.
  • The ability for an individual developer to provide an SSH key of their own and associate it with themselves via the GUI, instead of having to associate it with a security group in DSS that multiple users may belong to. (Perhaps this one would be for the Designer node.)
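
To make the testing bullet a bit more concrete, here is a minimal sketch of what a programmatic plugin test might look like. It assumes the plugin's logic is factored into a plain Python function that does not need the DSS runtime; `clean_column_names` and its behavior are hypothetical and only serve as something to test. A Backstage CI step could run tests like these before a plugin is allowed to be deployed.

```python
import re


def clean_column_names(names):
    """Hypothetical plugin logic: turn raw column names into lower-case slugs."""
    cleaned = []
    for name in names:
        # Replace every run of non-alphanumeric characters with one underscore.
        slug = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()
        # Fall back to a placeholder when nothing usable is left.
        cleaned.append(slug or "column")
    return cleaned


# pytest-style tests: plain functions with bare asserts.
def test_spaces_and_case_are_normalized():
    assert clean_column_names(["Order ID", " Total $ "]) == ["order_id", "total"]


def test_unusable_name_gets_placeholder():
    assert clean_column_names(["???"]) == ["column"]
```

Tests written this way can be collected by a runner such as pytest, or simply called as ordinary functions. Testing against a dummy project's flow would need more setup than this sketch shows.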

Priority:

Medium - High 

Note, I realize this will most likely take a while to build. 

Benefit/Impact:

The overall Dataiku DSS experience for both developers and end users will be greatly enhanced. Plugins and web apps extend the functionality of DSS, so if they are not properly developed by the DSS developer persona (that is, they have bugs), the overall image and experience of Dataiku suffers. If there is any way Dataiku can help encourage quality control and best practices, then it is a win for everyone.

As I mentioned in the background section, this feature may help declutter the DSS project page, which is sometimes overwhelming for DSS users, especially new ones. Separating out things like plugin development and web app development may help alleviate this, and also give these features room to grow and be built upon.

Enhancing this experience encourages end users to develop well-tested, robust plugins and web apps that they can share with the larger Dataiku community.

Additionally, I will highlight some of the pain points experienced today to give you a sense of why this may be something that is needed. Perhaps shedding light on these pain points may help us construct any alternative solutions if necessary.

What are some of the pain points with creating plugins and web apps in DSS?

It is hard for developers to build more sophisticated web apps that use JavaScript frontend libraries on Dataiku DSS instances shared by multiple users. This is because developers currently need server-level access to DSS to be able to work with JS frameworks, which is not always possible to get in an enterprise setting.

SSH keys for security groups, which developers need in order to push to and pull from remote git hosting services, are not easy to configure on the DSS server out of the box. This makes it hard to use services like GitHub, Bitbucket, or GitLab through DSS.

When developing plugins and web apps in the Designer nodes today, it feels as if we are following bad practices by not separating the development environment from our production environment, since there is no deployment process built into the development of plugins or web apps. You simply create it in the Designer node from beginning to end.

No part of the DSS GUI enforces or recommends testing your plugin or web app, or offers off-the-shelf standard tests (scenarios) that would speed up development through standard flow quality-control checks.

It is hard to collaborate with other developers on a web app, since there is no way to work on separate branches (I believe; maybe branches for the overall project can do this?). This slows down the development process.

Let me know if there is anything you would like me to clarify or any ideas you would like to add yourself to this. 

Note:

Attached are some of my ideas expressed visually via PDF and PowerPoint. This shows a rough draft skeleton.

 

5 Comments
AshleyW
Dataiker
 
Status changed to: In the Backlog
 

Linking images from PDF here Slide1.png

 

Slide2.png

 

Slide3.png

 

Slide4.png

 

Slide5.png

 

Slide6.png

 

Slide7.png

 

Slide8.png

 

CoreyS
Dataiker Alumni

This is really cool. Thanks for sharing @adamnieto!


@adamnieto 

This is a super well written product idea.

 

--Tom

It's an interesting idea, but I think that perhaps it goes too far into the solution. When we as users request a product enhancement, it's important that we don't "solutionize", as there are many ways to solve a requirement, and in the initial stages we should focus on understanding the problem. The User Story part is very good, I think, and clearly states the issues.

In terms of web apps and plugins, we follow the standard Dataiku pattern: we develop on the Designer node and we deploy to the Automation node for UAT or Production. Most of our web apps are self-contained within the project web app code. We do have some web apps with static files (i.e. image logos etc.), which we deploy manually to the Automation node. I do agree that at the moment there is no built-in way to handle these artifacts within Dataiku, although it should be possible to create a plugin that deploys them for you using Python. Such a plugin could let you select the web app you want to deploy and the destination Automation node, and then "release" all of that web app's artifacts based on some metadata that you create for each of your web apps. In our case we don't have that many static files, nor do we need to update them very often, so we are OK doing this manually for now. I can see, however, that if we were to move into HTML/JS web apps this could become an issue, so I do support the idea of a more streamlined deployment experience for web app artifacts, and of bringing them within the existing built-in git version control. I am not sure a new node is the best way to solve this requirement, though.

Plugin deployment to the Automation node is "manual". We change the version in plugin.json, then manually download the plugin zip file from Designer and manually deploy it to the Automation node. We keep the downloaded plugin files in case we need to roll back to a previous version. It's not clear to me why you can't follow this same pattern. At the moment you do have full version control in Designer, and if you keep your downloaded, versioned plugins you can effectively roll back to previous versions if needed. But again, I support the idea of Dataiku having a built-in capability to deploy plugins to the Automation node, and some way of tagging plugin releases, like project bundles do, to allow for easy rollback.

With regard to version control and parallel development, plugins are not much different from projects. You do have automatic git version control, but it is very limited in what you can do. Git branching is possible, but since Dataiku provides no capability for all the other things you need a git repo to do (merge, conflict resolution, rebase, etc.), it's pretty much useless. So I think the whole version control issue is worthy of another idea discussion, since there is a lot of scope for improvement, although I do understand why Dataiku hasn't implemented a full git workflow: it would be very complex to do.
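
For anyone who wants to script part of this manual routine, here is a rough sketch of the "bump the version and keep a versioned zip" step, using only the Python standard library. It is not a Dataiku API: `release_plugin`, the folder layout, and the archive naming are my own assumptions, and it assumes you have a local copy of the plugin folder whose plugin.json is plain JSON with `id` and `version` fields. Uploading the zip to the Automation node is still done by hand.

```python
import json
import zipfile
from pathlib import Path


def release_plugin(plugin_dir, new_version, archive_dir):
    """Set the version in plugin.json, then zip the plugin folder into archive_dir.

    Assumption: archive_dir lives outside plugin_dir, so old archives are
    never packed into new ones.
    """
    plugin_dir = Path(plugin_dir)
    archive_dir = Path(archive_dir)
    manifest_path = plugin_dir / "plugin.json"

    # Update the version field in place.
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    manifest["version"] = new_version
    manifest_path.write_text(json.dumps(manifest, indent=4) + "\n", encoding="utf-8")

    # One archive per version, so older releases stay available for rollback.
    archive_dir.mkdir(parents=True, exist_ok=True)
    plugin_id = manifest.get("id", plugin_dir.name)
    archive_path = archive_dir / f"{plugin_id}-{new_version}.zip"
    if archive_path.exists():
        raise FileExistsError(f"{archive_path} already exists; pick a new version")

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(plugin_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the plugin root.
                archive.write(path, path.relative_to(plugin_dir).as_posix())
    return archive_path
```

Calling `release_plugin("my-plugin", "1.2.0", "releases")` would leave a `releases/my-plugin-1.2.0.zip` next to any earlier versions (the names here are placeholders), which is essentially the rollback folder described above.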
