How to use a GPU from an external instance with my current Dataiku instance
Hi
I am currently running Dataiku DSS on AWS EC2, on a general-purpose instance type. Due to some projects and proposals I need to use GPUs to accelerate model training, and I have an accelerated computing EC2 instance that I want to use for these new projects, but migrating all the projects to this new instance would be very difficult. To keep everything in the original Dataiku instance, I would like to know how I can attach the instance that has a GPU to the one where DSS is running, so I can send the model training jobs to the instance where the GPU is installed.
I have seen in another post that this can be achieved by using a remote Docker daemon, but the documentation is not clear to me.
I hope you can help me solve this. Thank you
Operating system used: Ubuntu
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,088 Neuron
First of all, let me say that getting a GPU working in Dataiku is not usually a trivial task. The difficulty will vary a lot depending on what GPU you want to work with, what software stack you want to use and how you want to expose the GPU to Dataiku. I covered some of the issues in this recent answer I wrote and its linked post, which I strongly suggest you read to get an idea.
Using Docker with Dataiku is not a recommended setup. You should be looking at Kubernetes.
Using GPUs in a cloud context becomes expensive very quickly. Therefore the usual approach is to only have the GPUs active during the training phase and shut them off while you are not using them, so you stop paying for them. This elastic computation pattern is what Kubernetes is designed for. So start by giving a good read to the Elastic AI computation page of the Dataiku documentation, which covers the integration with Kubernetes in great detail. It has 3 main sections covering the 3 main cloud vendors (AWS, Azure and GCP) and their Kubernetes services (EKS, AKS and GKE). Each of the sections also explains the requirements to build Docker images with CUDA and GPU support to use inside Kubernetes. You should be warned that bringing Kubernetes into the mix will raise the complexity level even further, so this is not an easy setup. The level of complexity will depend a lot on what permissions you have in your cloud account and how you integrate with the relevant Kubernetes service. At this stage it might make sense for you to get a CUDA/GPU cloud VM already configured by your cloud vendor and install Dataiku on top to do a POC, to really find out if GPU training is what your Dataiku use cases need.
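To put a rough number on the "only pay while training" point, here is a minimal back-of-the-envelope sketch. The hourly rate and the hours of training per day are made-up placeholders, not real AWS prices, so plug in the actual on-demand price of your GPU instance type.

```python
# Rough comparison of an always-on GPU instance vs. one that only runs
# while training. All numbers are hypothetical placeholders.

def monthly_gpu_cost(hourly_rate, hours_per_day, days_per_month=30):
    """Cost of running one GPU instance for the given hours each day."""
    return hourly_rate * hours_per_day * days_per_month

HOURLY_RATE = 1.00          # assumed price per hour, NOT a real AWS price
TRAINING_HOURS_PER_DAY = 2  # assumed average daily training time

always_on = monthly_gpu_cost(HOURLY_RATE, 24)
elastic = monthly_gpu_cost(HOURLY_RATE, TRAINING_HOURS_PER_DAY)

print(f"Always on: {always_on:.2f} per month")
print(f"Elastic:   {elastic:.2f} per month")
print(f"Saved:     {always_on - elastic:.2f} per month")
```

With these placeholder values the always-on instance costs 12 times as much as the elastic one, which is why idle GPUs matter more than the hourly price alone.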
There is also no guarantee that your model training will be faster on a GPU. Often only certain types of machine learning benefit from GPU training, typically deep learning and neural networks. In addition, for you to do GPU training the ML algorithm you want to use needs to support it; it's not as simple as flipping a switch and having it run on the GPU. Dataiku does support running Keras models on GPUs, so in a sense it does allow for "switch flipping" GPU training. But I wouldn't underestimate the effort required to set up those GPUs, nor the fact that you wouldn't want them sitting idle for most of the day if you don't have workloads to train on them.
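Before spending time on the Dataiku side, it is worth confirming that the machine (or container) where the training will run can see the GPU at all. A minimal sketch, assuming an NVIDIA GPU with the driver's nvidia-smi tool on the PATH; the function name is just an illustrative helper, not part of Dataiku:

```python
import shutil
import subprocess

def list_nvidia_gpus():
    """Return one text line per GPU reported by nvidia-smi, or [] if none are visible."""
    # nvidia-smi ships with the NVIDIA driver; if it is missing, no GPU is usable here.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]

gpus = list_nvidia_gpus()
if gpus:
    for gpu in gpus:
        print("GPU found:", gpu)
else:
    print("No NVIDIA GPU visible from this environment")
```

This only tells you the driver sees the card. Your deep learning framework still needs a matching CUDA build inside the code environment or container image to actually train on it.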
-
Thank you for your reply. Just to be clear, the GPUs might be used for artificial vision models. Is it better to use k8s instead of a VM with GPUs in it?
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,088 Neuron
Yes.