Visual Time series model training on GPU fails

I'm getting this error whilst trying to train a time series model on GPU. 

OSError: cannot open shared object file: No such file or directory

I have done the following so far:

1. Created a CUDA 10.2-enabled base image in DSS and pushed the base images

2. Created a code environment and added the additional packages for visual time series forecasting (CUDA 10.2)

I've also tried to use docker append to add cuda-nvtx-10-2 to the base image.

USER root
# Install cuda-nvtx-10-2
RUN yum-config-manager --add-repo && \
yum install -y cuda-nvtx-10-2 && \
yum clean all
# Globally enable cuda-nvtx-10-2
ENV PATH=/usr/local/cuda-10.2/bin:${PATH}
USER dataiku

The files are installed and available, but they're still not found when the code runs.

I've seen online that others resolved this by adding /usr/local/cuda/lib64 to $LD_LIBRARY_PATH, but I'm unable to do so. The ENV from the docker append doesn't seem to take effect.
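
For anyone debugging the same symptom, a small check like the one below, run from Python inside the container, shows what LD_LIBRARY_PATH the process actually sees and which of its directories contain a given library. This is only a sketch: `find_shared_object` is a hypothetical helper, and `libnvToolsExt.so.1` is an assumption about which library is missing (it is the one shipped by the cuda-nvtx package; the error message above does not name the file). The dynamic loader also consults its cache and default directories, so an empty result here is a hint, not proof.

```python
import os
from pathlib import Path


def find_shared_object(lib_name, search_path=None):
    """Return the directories on search_path that contain lib_name.

    search_path is a colon-separated string; it defaults to the
    LD_LIBRARY_PATH of the current process.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in search_path.split(":"):
        # Skip empty entries produced by leading, trailing or doubled colons.
        if entry and (Path(entry) / lib_name).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # What the running process sees, which may differ from the Dockerfile.
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
    # Assumed library name; replace with the file named in your own error.
    print("found in:", find_shared_object("libnvToolsExt.so.1"))
```

If the variable prints as unset or lacks the CUDA directories, the ENV line from the image build did not reach the process that runs the training.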

Does anyone have any suggestions?



Operating system used: centos (cloud stack)


Hi @RiaanB,

Since you also reported this in a support ticket, I'm replying here as well.

You will need to update LD_LIBRARY_PATH:

ENV LD_LIBRARY_PATH=/usr/local/cuda/compat:/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}

and rebuild the images. We are going to fix this permanently in the upcoming releases.
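
The `${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}` part of that line appends the previous value, preceded by a colon, only when the variable is already set and non-empty. The Python sketch below mimics that expansion to make the result easier to see; `extend_ld_library_path` is a hypothetical helper written for illustration, not part of DSS or Docker.

```python
CUDA_DIRS = ("/usr/local/cuda/compat", "/usr/local/cuda/lib64")


def extend_ld_library_path(current, new_dirs=CUDA_DIRS):
    """Mimic NEW${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}.

    current is the existing value of LD_LIBRARY_PATH, or None if unset.
    """
    prefix = ":".join(new_dirs)
    # ${VAR:+word} yields word only if VAR is set and non-empty,
    # so no trailing colon is produced when there is nothing to append.
    suffix = ":" + current if current else ""
    return prefix + suffix


if __name__ == "__main__":
    print(extend_ld_library_path(None))
    # /usr/local/cuda/compat:/usr/local/cuda/lib64
    print(extend_ld_library_path("/opt/lib"))
    # /usr/local/cuda/compat:/usr/local/cuda/lib64:/opt/lib
```

Avoiding the trailing colon matters because the dynamic loader treats an empty entry in LD_LIBRARY_PATH as the current directory.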
