Time series forecasting with GPU / CUDA 11

Hello,
I am trying to train a time series forecasting model on a GPU.
OS: Ubuntu 22.04
Installed with apt-get on OS:
- libcudnn9-cuda-11
- cuda-toolkit-11-8
- libnccl2
I then created a new Python env:
When I use that environment in the model, everything looks fine at first, since it shows me my GPU card:
But when I start training there is an error:
And the result when I try to execute an ML model:
Any idea how to resolve the issue?
Operating system used: Ubuntu 22.04
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,580 Neuron
You need to add all the additional software and drivers that NVIDIA GPUs need to work. These steps will of course depend on the OS / version / architecture / hardware / GPU that your server is running. It's not a trivial setup, given that all the software versions need to be compatible with each other and with the GPU used, and this is not always clearly stated for each of the software components.

Pretty much all the cloud vendors provide OS images with the software components pre-installed and properly configured for GPU-enabled instances, so you may be able to leverage those if you are using a cloud VM.

Over 3 years ago I wrote this post, which is a complete guide to all the setup steps needed to get GPU training working in a Dataiku instance running on RHEL v7.9. While this post is now outdated, it will give you a rough idea of all the steps involved. It will be up to you to work out the specific steps for your required environment. Feel free to post an update when and if you get it working so other people in the Community can benefit from your experience.
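As a first sanity check of the driver layer, before looking at any Python package, something like the following sketch can help. It assumes `nvidia-smi` is installed with the driver; `parse_gpu_rows` and `query_gpus` are hypothetical helper names, not part of Dataiku or any NVIDIA library.

```python
import shutil
import subprocess


def parse_gpu_rows(csv_text):
    """Parse 'name, driver_version' rows as printed by nvidia-smi's CSV output."""
    rows = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, driver = [part.strip() for part in line.rsplit(",", 1)]
        rows.append({"name": name, "driver_version": driver})
    return rows


def query_gpus():
    """Return the GPUs nvidia-smi reports, or None if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver utilities are not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None  # the tool exists but could not talk to the driver
    return parse_gpu_rows(result.stdout)
```

Calling `query_gpus()` on the DSS server returns a list with one entry per GPU, or `None` when the driver stack itself is not working, in which case no code environment will help.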
-
I will look into that, but my first impression is that Time Series Forecasting hasn't been updated for a while.
Indeed, the DL algorithms offered in the default install are based on the MXNet library, which has not been maintained since November 2023. It is also stuck at CUDA 11.7 at most.
This makes things even more complicated, since my setup is based on CUDA 12.5 for other reasons / tools.
Any idea whether it's possible to work with newer versions of the algorithms or libraries?
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,580 Neuron
DL is an advanced topic which is probably not suitable to test on the free Dataiku version, especially if you want the latest frameworks. Ultimately Dataiku has to target specific framework versions to integrate with them and make sure they work. These frameworks are a moving target, so you can't really expect Dataiku to always support the latest versions. Support for Pandas 2.x has only recently been added, for instance. Whether this will work or not is hard to say. If I had to guess, I would say CUDA 12.5 will probably not work, but it's just a guess. You will need to involve Dataiku Support or Professional Services to get a clearer answer.
-
Thank you for your answer. I will give it another try with "generic" AutoML prediction to see if I can make it work.
-
There's an option at the top of Requested Packages to "Add sets of packages".
There you will find options that meet your needs. I believe that for the update to work, your dataiku-dss instance must be able to connect to the link at the bottom of these sets:
--find-links
That being said, I was able (after much testing) to add some packages that worked with a free version (13.3.2) and allowed deep modeling to use my GPU. It assumes that the libraries are available, as I did not use the link in my Package Requests. Note that these hard requirements are based on the specifically listed libraries. The list also includes some libraries that Dataiku requires for their JavaScript to work (I assume).
Notes:
- The mxnet-cu112 package was not required for the GPU to be used.
- I had CUDA 12.7, verified with nvidia-smi. CUDA 11.8 was compatible.
typing-extensions==4.5.0 # satisfies TF 2.13.x and Torch 2.0.1
numpy==1.23.5 # keeps MXNet 1.9.1 happy
# -------- Deep-learning libs (CUDA 11.8) ----------
torch==2.0.1+cu118
torchvision==0.15.2+cu118
torchaudio==2.0.2+cu118
tensorflow[and-cuda]==2.13.1 # GPU wheel, pulls nvidia-*-cu11 helpers
# -------- ML / stats ------------------------------
scikit-learn==1.5.0
scipy>=1.11,<1.12
statsmodels
pmdarima
prophet
gluonts==0.15.1
pydantic==1.10.15 # avoids 2.x which needs newer typing-ext
mxnet==1.9.1 # CPU wheel (simpler), or mxnet-cu112==1.9.1
cloudpickle
matplotlib
opencv-python
flask
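Once an environment built from a list like the one above is in place, a quick check from a notebook using that code env shows which frameworks actually see the GPU. This is only a sketch: `gpu_report` is a hypothetical helper name, and it relies on `torch.cuda.device_count()`, `tf.config.list_physical_devices("GPU")` and `mxnet.context.num_gpus()` being available in the pinned versions.

```python
import importlib


def gpu_report():
    """Report, per framework, how many GPUs it can see (None = not installed)."""
    report = {}

    def probe(name, count_fn):
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None  # framework not installed in this env
            return
        except Exception as exc:  # e.g. a CUDA shared library fails to load
            report[name] = f"import error: {exc}"
            return
        try:
            report[name] = int(count_fn(module))
        except Exception as exc:  # installed, but the GPU query itself failed
            report[name] = f"error: {exc}"

    probe("torch", lambda m: m.cuda.device_count())
    probe("tensorflow", lambda m: len(m.config.list_physical_devices("GPU")))
    probe("mxnet", lambda m: m.context.num_gpus())
    return report
```

A value of 0 for a framework that is installed usually means the wheel is a CPU build or cannot find its CUDA libraries, which narrows down which pin to revisit.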
For Time Series Forecasting there are other requirements, so I worked to combine the Deep Modeling and Time Series requirements on the same system. Here, mxnet-cu112==1.9.1 is required.
numpy==1.23.5
typing-extensions==4.5.0 # TF 2.13 & Torch 2.0 agree
# Deep-Learning (Keras & PyTorch, CUDA 11.8)
torch==2.0.1+cu118
torchvision==0.15.2+cu118
torchaudio==2.0.2+cu118
tensorflow[and-cuda]==2.13.1 # GPU wheel, bundles cu118 libs
# Time-Series (GluonTS + MXNet, CUDA 11.2)
mxnet-cu112==1.9.1
nvidia-nccl-cu11==2.19.3 # ships libnccl.so.2
gluonts==0.15.1
# DSS “must-have” utilities
flask
Jinja2 # DSS UI check for GPU
h5py # Keras saving
pillow # image transforms
# Classical ML / stats
scikit-learn==1.5.0
scipy>=1.11,<1.12
statsmodels
pmdarima
cloudpickle
Good luck. It's a real pain and some things should just be available out of the box in my honest opinion.
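Because the combined list mixes wheels built for different CUDA versions, it can help to see at a glance which pin targets which build before changing anything. The following is a minimal sketch, not a Dataiku feature: `cuda_tags` and `CUDA_TAG` are hypothetical names, and the helper only looks for a `cuNNN` marker in the package name or version.

```python
import re

# Matches CUDA build markers such as cu118, cu112 or cu11.
CUDA_TAG = re.compile(r"cu(\d{2,3})\b")


def cuda_tags(requirements_text):
    """Map each pinned package to the CUDA tag in its name or version, if any."""
    tags = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = CUDA_TAG.search(line)
        if match:
            # Package name is everything before the first specifier character.
            name = re.split(r"[=<>!~\[ ]", line, maxsplit=1)[0]
            tags[name] = "cu" + match.group(1)
    return tags


pins = """
torch==2.0.1+cu118
mxnet-cu112==1.9.1
nvidia-nccl-cu11==2.19.3 # ships libnccl.so.2
numpy==1.23.5
"""
print(cuda_tags(pins))
```

Pasting the full list into `pins` makes the cu118 / cu112 split explicit, which is the part most likely to need adjusting on a different driver or toolkit.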