Connecting your own compute resource to Dataiku
Is there a provision for connecting your own compute resource (an instance or a Kubernetes cluster) from Azure or AWS to Dataiku, for running specific jobs such as model training or recipes?
Best Answer
-
Alexandru (Dataiker)
You can install DSS on EC2 and possibly use Docker on the EC2 machine, but this is not recommended:
https://doc.dataiku.com/dss/latest/containers/docker.html#why-kubernetes-rather-than-docker
As for Azure App Service, this is unsupported.
We recommend using Kubernetes (K8s) to distribute computation from DSS:
https://knowledge.dataiku.com/latest/kb/data-prep/where-compute-happens.html
Answers
-
Alexandru (Dataiker)
Hi,
You can use unmanaged Kubernetes clusters (either AKS or EKS):
https://doc.dataiku.com/dss/latest/containers/aks/unmanaged.html
https://doc.dataiku.com/dss/latest/containers/eks/unmanaged.html
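As a rough sketch of what "unmanaged" means in practice: DSS attaches to a cluster you already run, via a kubeconfig on the DSS machine. Assuming you have an existing AKS or EKS cluster and the corresponding cloud CLI installed (cluster names, resource group, and region below are placeholders), fetching those credentials might look like:

```shell
# Assumed existing AKS cluster "my-aks" in resource group "my-rg";
# merges its credentials into ~/.kube/config on the DSS host.
az aks get-credentials --resource-group my-rg --name my-aks

# Equivalent for an assumed existing EKS cluster "my-eks" in us-east-1.
aws eks update-kubeconfig --region us-east-1 --name my-eks

# Sanity-check that the cluster is reachable from the DSS host
# before declaring it in DSS as an unmanaged cluster.
kubectl get nodes
```

This only makes the cluster reachable; the actual attachment and per-recipe container configuration are done inside DSS, as described in the linked documentation pages.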
Thanks,
-
What about an EC2 instance or Azure App Service?
-
Consider use cases where we only have to deal with ETL processes on around 5-6 lakh (500,000-600,000) rows. Can we connect our own compute (EC2 instances) to DSS and use it for running certain ETL jobs (code or visual recipes) only, to optimize cost? We do not want to use Kubernetes; can that be done?
Alternatively, instead of providing our own compute, is there something Dataiku itself can provide? If yes, what is the pricing?
Also, is there any documentation available that specifically covers cloud compute and infrastructure management? If so, please attach it in your reply; that would be really helpful.
Thanks a lot.