I have an architectural design question. My team is looking to create a web application outside of Dataiku DSS, using React for our frontend and a microservice-based backend written in Flask.
Do you think it would be a good idea to use the Dataiku API Deployer as a substitute for Flask (using Python function endpoints) to fully implement a microservice-based web application? I was just wondering what your thoughts are and whether this even makes sense to do.
Thank you for your help!
With the Dataiku API Deployer, you can manage and expose REST API endpoints for use by any application. For instance, you can call the endpoints from Flask using the requests library.
The question of microservices relates to which kind of API deployment you choose. I would recommend starting with a simple static infrastructure, with a predefined set of API nodes. Then, if you find you have heavy or varying API load, you can move to a Kubernetes-based "microservices" infrastructure, as described here.
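As a minimal sketch of that pattern, here is a Flask route that relays a request to an API node endpoint with requests. The host, service id, and endpoint id are hypothetical placeholders; substitute the URL shown for your own deployment in the API Deployer.

```python
import requests
from flask import Flask, jsonify, request

# Hypothetical API node host -- replace with your own deployment's URL.
APINODE_BASE = "https://apinode.example.com"

def endpoint_url(service_id: str, endpoint_id: str) -> str:
    # API node endpoints are exposed under /public/api/v1/<service>/<endpoint>/run
    return f"{APINODE_BASE}/public/api/v1/{service_id}/{endpoint_id}/run"

app = Flask(__name__)

@app.route("/score", methods=["POST"])
def score():
    # Forward the caller's JSON payload to the Dataiku endpoint
    # and relay the endpoint's JSON response back unchanged.
    resp = requests.post(
        endpoint_url("my_service", "my_endpoint"),  # hypothetical ids
        json=request.get_json(),
    )
    resp.raise_for_status()
    return jsonify(resp.json())
```

This keeps the model behind the API Deployer (so versioning and roll-back stay centralized there) while the Flask backend remains a thin pass-through.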
Hope it helps,
I have a follow-up architectural question, @Alex_Combessie. I have a small web application that does some CRUD operations plus one main ML query (with a custom model). Our ML code is all in Dataiku, and we plan to use the API node to serve the model's results (custom Python endpoint). My question is: would it make sense to also host the CRUD endpoints (~5 endpoints) on Dataiku API nodes, or is it better to have a dedicated web server handle the CRUD operations and delegate the ML query to Dataiku?
So basically, I am contemplating whether to do one of these two options:
Option A: Web Frontend -----CRUD-----> Web backend -------ML------> Dataiku
Option B: Web Frontend -----CRUD_and_ML-----> Dataiku
Are there any situations where I should prefer one over the other? From an organizational perspective, I would prefer to have everything in Dataiku so I don't have two backend servers to maintain. However, I don't have experience with the API node, so I don't know whether there are disadvantages or limitations to hosting non-analysis/ML endpoints on Dataiku, versus having a dedicated web server that handles the CRUD requests and passes the ML query on to the Dataiku API node.
That's an interesting question. It makes sense to centralize backend tasks on Dataiku API nodes:
- CRUD tasks can be run through SQL endpoints, assuming you have an external PostgreSQL or MySQL database. Don't forget to activate the "post-commit" option if you run update/insert statements.
- Arbitrary tasks can be run through Python function (or R) endpoints. For instance, you can write a wrapper function that checks the input request, passes it to other API endpoints (using apinode.utils), and returns a formatted response.
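To make the second bullet concrete, here is a sketch of such a wrapper as a Python function endpoint. Dataiku calls the function with the request's parameters as keyword arguments; the parameter names (record_id, action) and the validation rules are hypothetical, and the call out to a sibling endpoint is left as a comment rather than invented.

```python
def api_py_function(record_id=None, action=None):
    # 1. Check the input request (hypothetical validation rules).
    if record_id is None:
        return {"ok": False, "error": "record_id is required"}
    if action not in ("read", "update", "delete"):
        return {"ok": False, "error": "unsupported action: %s" % action}

    # 2. Delegate to another endpoint of the same API service
    #    (e.g. a SQL endpoint, via apinode.utils as mentioned above).
    #    Omitted here; see the API node documentation for the exact call.
    result = {"record_id": record_id, "action": action}

    # 3. Return a formatted, JSON-serializable response.
    return {"ok": True, "result": result}
```

The point of the wrapper is that input validation and response formatting live in one place, while the actual CRUD work is delegated to the SQL endpoints of the same service.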
The benefit of this approach is that you'll be able to manage all your tasks through one API Service. The Dataiku API Deployer will facilitate deployment, versioning (including roll-back) and scaling.
One assumption here is that your CRUD tasks are synchronous; that is the core execution model of API nodes. If you need asynchronous CRUD tasks, it makes sense to host them separately, for instance on a Dataiku Automation node.
Hope it helps,