Hello,
I'm getting the following error while trying to train a time series model on a GPU:
OSError: libnvToolsExt.so.1: cannot open shared object file: No such file or directory
I have done the following so far:
1. Created a CUDA 10.2-enabled base image on DSS and pushed the base images
2. Created a code environment and added the additional packages for visual time series forecasting (CUDA 10.2)
I've also tried using a docker append to add cuda-nvtx-10-2 to the base image:
USER root
# Install cuda-nvtx-10-2
RUN yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo && \
    yum install -y cuda-nvtx-10-2 && \
    yum clean all
# Globally enable cuda-nvtx-10-2
ENV PATH=/usr/local/cuda-10.2/bin:${PATH} \
    LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:${LD_LIBRARY_PATH}
USER dataiku
The files are installed and available, but they're still not found when the code runs.
I've seen online that others resolved this by adding the /usr/local/cuda/lib64 path to $LD_LIBRARY_PATH, but I'm unable to do so. The ENV from the docker append doesn't seem to take effect.
Does anyone have any suggestions?
Thanks
Riaan
Operating system used: centos (cloud stack)
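For anyone hitting the same error, a quick way to narrow it down is to check, from inside the container, whether any directory on LD_LIBRARY_PATH actually holds the library named in the error. The snippet below is only a rough sketch: the helper name dirs_containing is made up for illustration, and it looks at LD_LIBRARY_PATH only, not at the ldconfig cache or the loader's default directories.

```python
import os


def dirs_containing(lib_name, search_path=None):
    """Return the directories on a colon-separated search path that contain lib_name.

    Hypothetical helper for illustration. Defaults to the LD_LIBRARY_PATH of the
    current process, i.e. what the running code actually sees.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(":"):
        # Empty entries (e.g. from a leading or trailing colon) are skipped here.
        if directory and os.path.exists(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
    print("Found in:", dirs_containing("libnvToolsExt.so.1"))
```

If this prints an empty list while the file exists under /usr/local/cuda-10.2/lib64, the ENV setting is probably not reaching the process that runs the code.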
Hi @RiaanB
As you also reported this in a support ticket, I'm posting the answer here as well.
You will need to update LD_LIBRARY_PATH:
ENV LD_LIBRARY_PATH=/usr/local/cuda/compat:/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
and then rebuild the images. We are going to fix this permanently in upcoming releases.
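A note on the ENV line above: the ${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} part appends the existing value, with a separating colon, only when the variable is already set, so the path never ends in a stray colon. After rebuilding, one way to confirm the fix is to ask the dynamic loader for the library directly from the code environment's Python. This is a minimal sketch, and try_load is a hypothetical helper name, not part of DSS:

```python
import ctypes


def try_load(lib_name):
    """Try to load a shared library through the dynamic loader.

    Hypothetical helper for illustration. Returns (ok, detail), where detail
    carries the loader's error message on failure.
    """
    try:
        ctypes.CDLL(lib_name)
    except OSError as exc:
        return False, str(exc)
    return True, "loaded"


if __name__ == "__main__":
    ok, detail = try_load("libnvToolsExt.so.1")
    print("OK" if ok else "FAILED", "-", detail)
```

If this still fails with the same "cannot open shared object file" message, the rebuilt image is most likely not the one being used for the job.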