Error when using the macro - Download pre-trained model
Hello,
I installed Python 3.6 this way (thanks @sergeyd and @Andrey): https://community.dataiku.com/t5/Setup-Configuration/Debian-10-and-Python-3-6/td-p/11576
I installed cuDNN this way (thanks @sergeyd): https://community.dataiku.com/t5/Setup-Configuration/Debian-10-and-CUDNN/td-p/11595
I have installed the plugins.
But I still get an error when I try to download a pre-trained model.
Can someone help me, please?
Best Answer
-
Yes, you could try installing from the Ubuntu 16.04 repo, as there is indeed no CUDA 8 available in the Debian 10 repo.
Alternatively you could try the runfile (local) type of installation:
https://developer.nvidia.com/cuda-80-ga2-download-archive
In any case, please make sure that CUDA 8 is installed properly (according to the logs you've attached, the installation doesn't even start). The easiest way to test is by running the four commands I sent you before. When all four of them return something, it makes sense to try again from the DSS side.
Answers
-
Hi @BenGonGon,
Please make sure that CUDA 8 is installed correctly in your system.
To check that, could you send the results of the following commands:
nvcc --version
find /usr/local -name "libcublas*"
echo $LD_LIBRARY_PATH
echo $PATH
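For readers who want to run the four checks in one go, here is a minimal sketch. The function name `check_cuda_env` and its optional prefix argument are hypothetical, not part of DSS or CUDA; it only wraps the commands listed above and reports which ones come back empty.

```shell
# Sketch: run the four checks from this thread and report which ones fail.
# cuda_prefix is an assumed argument (defaults to /usr/local, as in the thread).
check_cuda_env() {
    local cuda_prefix="${1:-/usr/local}"
    local failures=0

    # 1. nvcc must be reachable through PATH.
    if command -v nvcc >/dev/null 2>&1; then
        echo "OK   nvcc found at $(command -v nvcc)"
    else
        echo "FAIL nvcc not found in PATH"
        failures=$((failures + 1))
    fi

    # 2. At least one libcublas file must exist under the prefix.
    if [ -n "$(find "$cuda_prefix" -name 'libcublas*' 2>/dev/null | head -n 1)" ]; then
        echo "OK   libcublas found under $cuda_prefix"
    else
        echo "FAIL no libcublas* under $cuda_prefix"
        failures=$((failures + 1))
    fi

    # 3. LD_LIBRARY_PATH must not be empty.
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        echo "OK   LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
    else
        echo "FAIL LD_LIBRARY_PATH is empty"
        failures=$((failures + 1))
    fi

    # 4. PATH is printed for reference only.
    echo "INFO PATH=$PATH"

    # The return code is the number of failed checks (0 means all passed).
    return "$failures"
}
```

Calling `check_cuda_env` with no argument searches /usr/local; a non-zero return code means at least one check failed, so it is probably too early to retry from the DSS side.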
Regards
-
Hi @Andrey,
Sure, the result is:
bggbecane@BGGDatakikou:/opt/datakikou/bin$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
bggbecane@BGGDatakikou:/opt/datakikou/bin$ find /usr/local -name "libcublas*"
bggbecane@BGGDatakikou:/opt/datakikou/bin$ echo $LD_LIBRARY_PATH
bggbecane@BGGDatakikou:/opt/datakikou/bin$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
bggbecane@BGGDatakikou:/opt/datakikou/bin$
-
Thanks,
The plugin that you've installed, Deep learning for images (GPU), uses a specific version of the tensorflow-gpu package: 1.4.
This version is only compatible with CUDA 8 and cuDNN 6.
So you'd have to uninstall CUDA 10.2 first.
To remove CUDA Toolkit:
$ sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*"
To remove NVIDIA Drivers:
$ sudo apt-get --purge remove "*nvidia*"
To clean up the uninstall:
$ sudo apt-get autoremove
After that you could follow either the official guide to install cuda 8 and cudnn 6:
https://docs.nvidia.com/cuda/archive/8.0/cuda-installation-guide-linux/index.html
or the one from here for example:
https://yangcha.github.io/Install-CUDA8/
Finally, you'll need to reboot the computer.
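Once the machine is back up, one way to confirm that the toolkit matches what tensorflow-gpu 1.4 expects is to pull the release number out of the `nvcc --version` output. This is only a sketch; the function names `nvcc_release` and `cuda_release_matches` are hypothetical helpers, and the expected value 8.0 comes from the compatibility note in this post.

```shell
# Sketch: extract the toolkit release from `nvcc --version` output and
# compare it with the release you expect (8.0 for this plugin, per the thread).

# Reads nvcc output on stdin and prints the release, e.g. "10.2".
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p' | head -n 1
}

# Succeeds only when the release read from stdin equals the wanted one.
cuda_release_matches() {
    local wanted="$1"
    local found
    found="$(nvcc_release)"
    [ "$found" = "$wanted" ]
}

# Example (commented out so that sourcing this file has no side effects):
# nvcc --version | cuda_release_matches 8.0 && echo "CUDA 8.0 toolkit detected"
```

With the output posted earlier in this thread (release 10.2), the check fails, which is consistent with the plugin not finding a usable CUDA.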
Regards
-
I ran your commands and followed the second link.
But I got a conflict between libc6-dev and libgcc-8-dev.
So I ran
apt-get --purge remove libgcc-8-dev
And I think I have killed my Linux install.
I am a bit discouraged, so I will continue tomorrow.
I think it is better if I reinstall Linux from scratch.
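For anyone hitting the same conflict: apt-get has a simulate mode (`-s`) that prints what a removal would do without touching the system. The sketch below filters that output down to package names. The helper name `packages_to_be_removed` is hypothetical, and it assumes the simulation marks affected packages with lines starting with "Remv" or "Purg".

```shell
# Sketch: list what apt would remove, without removing anything.
# Reads `apt-get -s ...` (simulate) output on stdin and prints the package
# names from the lines that start with "Remv" or "Purg" (assumed format).
packages_to_be_removed() {
    awk '$1 == "Remv" || $1 == "Purg" { print $2 }'
}

# Example (commented out; -s only simulates, nothing is uninstalled):
# apt-get -s --purge remove libgcc-8-dev | packages_to_be_removed
```

If the list contains core packages such as the compiler toolchain, it is usually safer to stop and look for another way to resolve the conflict.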
-
Hello @Andrey,
How are you?
I have rebuilt everything with this install list.
I used Debian 10.6.0 and installed only the SSH server in the graphical installer.
#use root account
#add at the end of file /etc/apt/sources.list
deb http://deb.debian.org/debian buster-backports main contrib non-free
apt-get update
apt-get install linux-headers-$(uname -r) -y
apt install -t buster-backports linux-headers-$(uname -r) -y
apt install -t buster-backports nvidia-driver -y
reboot
mkdir /opt/cudnn
cd /opt/cudnn
apt-get install wget -y
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-dev_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-doc_6.0.20-1+cuda8.0_amd64-deb
dpkg -i libcudnn6*
lsb_release -a
apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/ /"
add-apt-repository contrib
apt-get update
apt-get install cuda=8.0.61-1
apt-get install libcudnn6-dev
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
apt-get install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev -y
apt-get install libsqlite3-0 -y
cd /opt/
wget https://www.python.org/ftp/python/3.6.10/Python-3.6.10.tar.xz
tar xvf Python-3.6.10.tar.xz
cd Python-3.6.10
./configure --enable-optimizations --enable-loadable-sqlite-extensions
make -j 8
make altinstall
python3.6 --version
which python3.6
reboot
cd /opt/
wget https://cdn.downloads.dataiku.com/public/dss/8.0.2/dataiku-dss-8.0.2.tar.gz
tar xzf dataiku-dss-8.0.2.tar.gz
/opt/dataiku-dss-8.0.2/scripts/install/install-deps.sh
mkdir /opt/datakikou/
#use normal account
/opt/dataiku-dss-8.0.2/installer.sh -d /opt/datakikou/ -p 11000 -P python3.6
#use root account
/opt/dataiku-dss-8.0.2/scripts/install/install-boot.sh /opt/datakikou/ bggserver
#use normal account
/opt/datakikou/bin/dss start
But I still have the same problem.
Can you correct my install list if it is wrong, please?
-
Hi @BenGonGon,
Could you add these commands:
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
to the end of your ~/.bashrc file?
After that, to check if the installation was correct, could you open a new terminal (disconnect from ssh and connect again), repeat these commands and send the results?
nvcc --version
find /usr/local -name "libcublas*"
echo $LD_LIBRARY_PATH
echo $PATH
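The ~/.bashrc change can be made safe to repeat, so that re-running an install script does not stack the same export several times. Below is a minimal sketch; `append_once` is a hypothetical helper name. As a side note, the `${PATH:+:${PATH}}` part expands to `:$PATH` only when PATH is already set and non-empty, which avoids leaving a stray trailing colon.

```shell
# Sketch: append a line to a file only if that exact line is not already there.
append_once() {
    local file="$1"
    local line="$2"
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (commented out). Single quotes keep ${PATH...} literal, so the
# expansion happens when ~/.bashrc is read, not when the line is written:
# append_once ~/.bashrc 'export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}'
# append_once ~/.bashrc 'export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}'
```

Keep in mind that ~/.bashrc is per user: the account that runs DSS needs these lines too, not only root.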
-
I have added those lines at the end of /root/.bashrc.
I disconnected and reconnected over ssh. The result is:
root@bggdatakikou:~# nvcc --version
-bash: nvcc: command not found
root@bggdatakikou:~# find /usr/local -name "libcublas*"
root@bggdatakikou:~# echo $LD_LIBRARY_PATH
/usr/local/cuda-8.0/lib64
root@bggdatakikou:~# echo $PATH
/usr/local/cuda-8.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
root@bggdatakikou:~#
-
Hm, nvcc is still not available; that doesn't look right (same for the missing libcublas*).
Was CUDA installed with no issues? Can you list the contents of
/usr/local/cuda-8.0/bin
-
I did not see any issue, but I am a beginner.
That is why I sent you my install procedure.
And I have no cuda-8.0 in /usr/bin
-
ok, well if there's no /usr/local/cuda* directory then I'd say that cuda wasn't actually installed.
Can you re-run
sudo apt-get install cuda=8.0.61-1 -y 2>&1 | tee -a /tmp/cuda-install.log
and attach the log file /tmp/cuda-install.log here
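Besides the install log, it may be worth checking whether the pinned version is offered at all by the repositories that are configured. The sketch below reads the output of `apt-cache madison <package>` and looks for an exact version. The helper name `version_offered` is hypothetical, and it assumes madison prints one "package | version | source" line per candidate.

```shell
# Sketch: check whether a specific package version is offered by the
# configured apt repositories.
# Reads `apt-cache madison <pkg>` output on stdin (assumed format:
# "package | version | source") and succeeds if the wanted version is listed.
version_offered() {
    local wanted="$1"
    awk -F'|' -v wanted="$wanted" '
        {
            v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim padding around the version
            if (v == wanted) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example (commented out):
# apt-cache madison cuda | version_offered 8.0.61-1 \
#     || echo "cuda 8.0.61-1 is not offered by the configured repositories"
```

If the version is missing, `apt-get install cuda=8.0.61-1` cannot succeed no matter what else is set up, and the repository line itself is the thing to change.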
-
Here is the file you asked for; I hope it helps.
I found the missing packages at https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/
In the install procedure, should I replace this
apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/ /"
add-apt-repository contrib
apt-get update
with this
apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/ /"
add-apt-repository contrib
apt-get update
?
-
It works, thank you very much for your help.
I am posting the install procedure for others.
For Debian 10.6
#use root account
#add at the end of file /etc/apt/sources.list
deb http://deb.debian.org/debian buster-backports main contrib non-free
apt-get update
apt-get install linux-headers-$(uname -r) -y
apt install -t buster-backports linux-headers-$(uname -r) -y
apt install -t buster-backports nvidia-driver -y
reboot
mkdir /opt/cudnn
cd /opt/cudnn
apt-get install wget -y
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-dev_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-doc_6.0.20-1+cuda8.0_amd64-deb
dpkg -i libcudnn6*
lsb_release -a
apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/ /"
add-apt-repository contrib
apt-get update
apt-get install cuda=8.0.61-1
apt-get install libcudnn6-dev
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
apt-get install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev -y
apt-get install libsqlite3-0 -y
#add at the end of file /root/.bashrc
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
reboot
cd /opt/
wget https://www.python.org/ftp/python/3.6.10/Python-3.6.10.tar.xz
tar xvf Python-3.6.10.tar.xz
cd Python-3.6.10
./configure --enable-optimizations --enable-loadable-sqlite-extensions
make -j 8
make altinstall
python3.6 --version
which python3.6
reboot
cd /opt/
wget https://cdn.downloads.dataiku.com/public/dss/8.0.2/dataiku-dss-8.0.2.tar.gz
tar xzf dataiku-dss-8.0.2.tar.gz
/opt/dataiku-dss-8.0.2/scripts/install/install-deps.sh
mkdir /opt/datakikou/
#use normal account
/opt/dataiku-dss-8.0.2/installer.sh -d /opt/datakikou/ -p 11000 -P python3.6
#use root account
/opt/dataiku-dss-8.0.2/scripts/install/install-boot.sh /opt/datakikou/ bggserver
#use normal account
/opt/datakikou/bin/dss start
But is this normal, for the same project?
24 GB RAM + GTX 970 = 5 minutes
32 GB RAM + 2x Intel Xeon X5690 @ 3.47GHz = 2.5 minutes
-
Regarding the performance, it's hard to say without knowing exactly how you're using the plugin.
The simplest way to monitor how your GPU is being used is to run the following command; you can get high-level insights from its output:
watch -n0 nvidia-smi
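If the full nvidia-smi table is too noisy, its query mode can print just a few fields as CSV. Here is a small sketch built on that; the function names `format_gpu_line` and `watch_gpu` are hypothetical, and the field selection (utilization and memory) is only one possible choice.

```shell
# Sketch: print a one-line GPU summary at a fixed interval.

# Formats one CSV line "util, mem_used, mem_total" (as printed with
# --format=csv,noheader,nounits) into a readable summary.
format_gpu_line() {
    awk -F', *' '{ printf "GPU util %s%%, memory %s/%s MiB\n", $1, $2, $3 }'
}

# Polls nvidia-smi every `interval` seconds (default 2) until interrupted.
watch_gpu() {
    local interval="${1:-2}"
    while true; do
        nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total \
                   --format=csv,noheader,nounits | format_gpu_line
        sleep "$interval"
    done
}

# Example (commented out): run this in a second terminal while the job runs.
# watch_gpu 1
```

If utilization stays near zero for the whole run, the job is probably executing on the CPU, which could be one explanation for a GPU machine not being faster.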
Since the initial problem was solved, I'd suggest closing this ticket to keep it concise for future readers.
Regards,