Error when using the macro - Download pre-trained model

Solved!
BenGonGon
Level 3

Hello,
I installed Python 3.6 this way (thanks @sergeyd and @Andrey): https://community.dataiku.com/t5/Setup-Configuration/Debian-10-and-Python-3-6/td-p/11576

I installed cuDNN this way (thanks @sergeyd): https://community.dataiku.com/t5/Setup-Configuration/Debian-10-and-CUDNN/td-p/11595

I have installed these plugins. (attachment: untitled.png)

But I always get a problem when I want to download a pre-trained model. (attachment: untitled1.png)

Can someone help me, please?

Andrey
Dataiker Alumni

Hi @BenGonGon ,

 

Please make sure that CUDA 8 is installed correctly in your system.

To check that, could you send the results of the following commands:

 

nvcc --version

find /usr/local -name "libcublas*"

echo $LD_LIBRARY_PATH

echo $PATH
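
For convenience, the four checks above could be bundled into one small script. This is only a sketch: the function name `check_cuda` and its optional search-root argument are made up for illustration, and `/usr/local` is assumed as the default because that is where the commands above look.

```shell
#!/bin/sh
# Sketch: run the four checks in one go.
# check_cuda and its argument are illustrative names, not part of any tool.
check_cuda() {
    root=${1:-/usr/local}
    missing=0

    # 1. Is the CUDA compiler on PATH, and which release is it?
    if command -v nvcc >/dev/null 2>&1; then
        nvcc --version
    else
        echo "nvcc: not found on PATH"
        missing=1
    fi

    # 2. Are the cuBLAS libraries present under the search root?
    libs=$(find "$root" -name "libcublas*" 2>/dev/null)
    if [ -n "$libs" ]; then
        echo "$libs"
    else
        echo "libcublas: not found under $root"
        missing=1
    fi

    # 3. and 4. Show the two environment variables.
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<empty>}"
    echo "PATH=$PATH"

    # Non-zero when nvcc or libcublas is missing.
    return "$missing"
}
```

Calling `check_cuda` with no argument searches /usr/local. A non-zero return status means that nvcc or libcublas was not found.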

 

Regards

Andrey Avtomonov
R&D Engineer @ Dataiku
BenGonGon
Level 3
Author

Hi @Andrey,

sure, the result is :

bggbecane@BGGDatakikou:/opt/datakikou/bin$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
bggbecane@BGGDatakikou:/opt/datakikou/bin$ find /usr/local -name "libcublas*"
bggbecane@BGGDatakikou:/opt/datakikou/bin$ echo $LD_LIBRARY_PATH

bggbecane@BGGDatakikou:/opt/datakikou/bin$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
bggbecane@BGGDatakikou:/opt/datakikou/bin$
Andrey
Dataiker Alumni

Thanks,

The plugin that you've installed, Deep learning for images (GPU), uses a specific version of the tensorflow-gpu package: 1.4.

This version is only compatible with CUDA 8 and cuDNN 6.

So you'd have to uninstall CUDA 10.2 first:

To remove CUDA Toolkit:
$ sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" \
 "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*" 
To remove NVIDIA Drivers:
$ sudo apt-get --purge remove "*nvidia*"
To clean up the uninstall:
$ sudo apt-get autoremove

 

After that you could follow either the official guide to install cuda 8 and cudnn 6:

https://docs.nvidia.com/cuda/archive/8.0/cuda-installation-guide-linux/index.html

or the one from here for example:

https://yangcha.github.io/Install-CUDA8/

 

At the end, you'll need to reboot the computer.
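
After the reboot, one way to confirm that the toolkit now reports release 8.0 rather than 10.2 is to extract the release number from the `nvcc --version` output. The sketch below assumes the output keeps the "release X.Y" wording seen earlier in this thread; the function names are hypothetical.

```shell
#!/bin/sh
# Print the CUDA release number found in `nvcc --version` output (read from stdin).
# Assumes a line of the form "... release 10.2, V10.2.89".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed only if the release read from stdin equals the wanted one (default 8.0).
is_cuda_release() {
    wanted=${1:-8.0}
    [ "$(cuda_release)" = "$wanted" ]
}

# Usage (not run here):
#   nvcc --version | is_cuda_release 8.0 && echo "CUDA 8.0 found"
```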

 

Regards

Andrey Avtomonov
R&D Engineer @ Dataiku
BenGonGon
Level 3
Author

I ran your commands and followed the second link.

But I got a conflict with libc6-dev and libgcc-8-dev.

So I ran:

apt-get --purge remove libgcc-8-dev

And I think I have killed my Linux.

I am a bit discouraged, so I will continue tomorrow.

I think it is better if I reinstall my Linux from scratch.

 

BenGonGon
Level 3
Author

Hello @Andrey,
How are you?
I have rebuilt everything with this install list.
I used Debian 10.6.0 and installed only the SSH server in graphical installer mode.

#use root account
#add at the end of file /etc/apt/sources.list
deb http://deb.debian.org/debian buster-backports main contrib non-free

apt-get update
apt-get install linux-headers-$(uname -r) -y
apt install -t buster-backports linux-headers-$(uname -r) -y
apt install -t buster-backports nvidia-driver -y
reboot
mkdir /opt/cudnn
cd /opt/cudnn

apt-get install wget -y
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-dev_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-doc_6.0.20-1%2Bcuda8.0_amd64-deb
dpkg -i libcudnn6*
lsb_release -a

apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/ /"
add-apt-repository contrib
apt-get update

apt-get install cuda=8.0.61-1
apt-get install libcudnn6-dev

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

apt-get install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev -y
apt-get install libsqlite3-0 -y

cd /opt/
wget https://www.python.org/ftp/python/3.6.10/Python-3.6.10.tar.xz
tar xvf Python-3.6.10.tar.xz
cd Python-3.6.10
./configure --enable-optimizations --enable-loadable-sqlite-extensions
make -j 8
make altinstall
python3.6 --version
which python3.6
reboot

cd /opt/
wget https://cdn.downloads.dataiku.com/public/dss/8.0.2/dataiku-dss-8.0.2.tar.gz
tar xzf dataiku-dss-8.0.2.tar.gz

/opt/dataiku-dss-8.0.2/scripts/install/install-deps.sh
mkdir /opt/datakikou/
#use normal account
/opt/dataiku-dss-8.0.2/installer.sh -d /opt/datakikou/ -p 11000 -P python3.6
#use root account
/opt/dataiku-dss-8.0.2/scripts/install/install-boot.sh /opt/datakikou/ bggserver
#use normal account
/opt/datakikou/bin/dss start

But I still have this problem. (attachment: untitled.png)

Can you correct my install list if it is wrong, please?

 

 

Andrey
Dataiker Alumni

Hi @BenGonGon ,

 

Could you add these commands:

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

 

to the end of your ~/.bashrc file? 
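
If the install steps are re-run several times, the same two lines can pile up in ~/.bashrc. A possible way to append them only once is sketched below; `append_once` is a hypothetical helper name, not an existing command.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# append_once is an illustrative helper, not a standard tool.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x: match whole lines, -F: treat the line as a fixed string, -q: quiet.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >>"$file"
}

# ${VAR:+:${VAR}} expands to ":$VAR" only when VAR is set and non-empty,
# so an empty variable does not leave a trailing colon.
#
# Usage (not run here; single quotes keep the expansion for login time):
#   append_once 'export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}' ~/.bashrc
#   append_once 'export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' ~/.bashrc
```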

 

After that to check if the installation was correct could you open a new terminal (disconnect from ssh and connect again), repeat these commands and send the results?

nvcc --version

find /usr/local -name "libcublas*"

echo $LD_LIBRARY_PATH

echo $PATH
Andrey Avtomonov
R&D Engineer @ Dataiku
BenGonGon
Level 3
Author

I added these lines at the end of /root/.bashrc,
then disconnected and reconnected over ssh.

The result is:

root@bggdatakikou:~# nvcc --version
-bash: nvcc: command not found
root@bggdatakikou:~# find /usr/local -name "libcublas*"
root@bggdatakikou:~# echo $LD_LIBRARY_PATH
/usr/local/cuda-8.0/lib64
root@bggdatakikou:~# echo $PATH
/usr/local/cuda-8.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
root@bggdatakikou:~#

 

Andrey
Dataiker Alumni

Hm, nvcc is still not available; that doesn't look right (same for the missing libcublas*).

Was CUDA installed with no issues? Can you list the contents of

/usr/local/cuda-8.0/bin
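
A slightly broader variant of this check, in case the toolkit landed in a differently named directory, could look like the sketch below. It assumes toolkits live in directories named `cuda*` under a prefix (by default /usr/local, as in this thread); `list_cuda_dirs` is a hypothetical name.

```shell
#!/bin/sh
# List cuda* directories under a prefix and say whether each contains bin/nvcc.
# Returns non-zero when no such directory exists.
list_cuda_dirs() {
    prefix=${1:-/usr/local}
    found=1
    for dir in "$prefix"/cuda*; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dir" ] || continue
        found=0
        if [ -x "$dir/bin/nvcc" ]; then
            echo "$dir: nvcc present"
        else
            echo "$dir: nvcc missing"
        fi
    done
    return "$found"
}
```

No output and a non-zero status would suggest that no toolkit directory was created at all.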

 

Andrey Avtomonov
R&D Engineer @ Dataiku
BenGonGon
Level 3
Author

I did not see any issue, but I am a beginner.

That is why I sent you my install procedure.

And I do not have cuda-8.0 in /usr/bin

Andrey
Dataiker Alumni

OK, well, if there's no /usr/local/cuda* directory then I'd say that CUDA wasn't actually installed.

 

Can you re-run 

sudo apt-get install cuda=8.0.61-1 -y 2>&1 | tee -a /tmp/cuda-install.log

and attach the log file /tmp/cuda-install.log here
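
One thing to keep in mind with a `| tee` pipeline: by default the shell reports the exit status of tee, not of apt-get, so a failed install is only visible by reading the log. A possible alternative that keeps the command's own exit status is sketched below; `run_logged` is a hypothetical helper, and unlike tee it writes to the log file only, not to the terminal.

```shell
#!/bin/sh
# Run a command, append its stdout and stderr to a log file,
# and return the command's own exit status.
# run_logged is an illustrative helper, not a standard tool.
run_logged() {
    log_file=$1
    shift
    status=0
    "$@" >>"$log_file" 2>&1 || status=$?
    # Record the status so the log itself shows whether the command failed.
    echo "exit status: $status" >>"$log_file"
    return "$status"
}

# Usage (not run here):
#   run_logged /tmp/cuda-install.log sudo apt-get install cuda=8.0.61-1 -y \
#       || echo "install failed, see /tmp/cuda-install.log"
```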

Andrey Avtomonov
R&D Engineer @ Dataiku
BenGonGon
Level 3
Author

Here is the file you asked for; I hope it helps.

As for the missing parts, I found them at https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/

In the install procedure, should I replace this

apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/ /"
add-apt-repository contrib
apt-get update

with this

apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/ /"
add-apt-repository contrib
apt-get update

?

Andrey
Dataiker Alumni

Yes, you could try installing from the Ubuntu 16.04 repo, as indeed there's no CUDA 8 available in the Debian 10 repo.

Alternatively, you could try the runfile (local) type of installation:

https://developer.nvidia.com/cuda-80-ga2-download-archive

(attachment: Screenshot 2020-11-09 at 19.13.18.png)

In any case, please make sure that CUDA 8 is installed properly (according to the logs you've attached, the installation doesn't even start). The easiest way to test is by running the four commands I sent you before. When all four of them return something, then it makes sense to try it from the DSS side.
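
Before running the install again, it may also help to confirm that the configured repositories actually offer the wanted version. `apt-cache madison <package>` prints one line per available version in the form `package | version | source`. The sketch below filters that output; `has_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `apt-cache madison <package>` output on stdin and succeed if the
# wanted version appears in the second '|'-separated column.
# has_version is an illustrative helper, not a standard tool.
has_version() {
    wanted=$1
    awk -F'|' -v want="$wanted" '
        {
            v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim surrounding whitespace
            if (v == want) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Usage (not run here):
#   apt-cache madison cuda | has_version 8.0.61-1 \
#       && echo "cuda 8.0.61-1 is available" \
#       || echo "cuda 8.0.61-1 is not offered by the configured repos"
```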

Andrey Avtomonov
R&D Engineer @ Dataiku
BenGonGon
Level 3
Author

It works, thank you very much for your help.

Here is the install procedure, for others.

For Debian 10.6
#use root account
#add at the end of file /etc/apt/sources.list
deb http://deb.debian.org/debian buster-backports main contrib non-free

apt-get update
apt-get install linux-headers-$(uname -r) -y
apt install -t buster-backports linux-headers-$(uname -r) -y
apt install -t buster-backports nvidia-driver -y
reboot
mkdir /opt/cudnn
cd /opt/cudnn

apt-get install wget -y
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-dev_6.0.20-1+cuda8.0_amd64-deb
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-doc_6.0.20-1%2Bcuda8.0_amd64-deb
dpkg -i libcudnn6*
lsb_release -a

apt-get install software-properties-common -y
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/ /"
add-apt-repository contrib
apt-get update

apt-get install cuda=8.0.61-1
apt-get install libcudnn6-dev

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

apt-get install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev -y
apt-get install libsqlite3-0 -y

#add at the end of file /root/.bashrc
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

reboot

cd /opt/
wget https://www.python.org/ftp/python/3.6.10/Python-3.6.10.tar.xz
tar xvf Python-3.6.10.tar.xz
cd Python-3.6.10
./configure --enable-optimizations --enable-loadable-sqlite-extensions
make -j 8
make altinstall
python3.6 --version
which python3.6
reboot

cd /opt/
wget https://cdn.downloads.dataiku.com/public/dss/8.0.2/dataiku-dss-8.0.2.tar.gz
tar xzf dataiku-dss-8.0.2.tar.gz

/opt/dataiku-dss-8.0.2/scripts/install/install-deps.sh
mkdir /opt/datakikou/
#use normal account
/opt/dataiku-dss-8.0.2/installer.sh -d /opt/datakikou/ -p 11000 -P python3.6
#use root account
/opt/dataiku-dss-8.0.2/scripts/install/install-boot.sh /opt/datakikou/ bggserver
#use normal account
/opt/datakikou/bin/dss start

But is this normal, for the same project?
24 GB RAM + GTX 970 = 5 minutes
32 GB RAM + 2x Intel Xeon X5690 @ 3.47GHz = 2.5 minutes

Andrey
Dataiker Alumni

Regarding the performance, it's hard to say without knowing how exactly you're using the plugin.

The simplest way to monitor how your GPU is being used is by calling the following command; you can get high-level insights from there:

watch -n0 nvidia-smi
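
If a record over time is easier to compare than a live view, nvidia-smi can also print selected fields as CSV at a fixed interval. The wrapper below is only a sketch: `log_gpu_usage` and the `NVIDIA_SMI` override variable are hypothetical names, and the chosen fields are just one reasonable selection.

```shell
#!/bin/sh
# Print GPU utilization and memory use as CSV every N seconds (default 1).
# log_gpu_usage and NVIDIA_SMI are illustrative names, not part of nvidia-smi.
log_gpu_usage() {
    interval=${1:-1}
    smi=${NVIDIA_SMI:-nvidia-smi}
    if ! command -v "$smi" >/dev/null 2>&1; then
        echo "$smi not found on PATH" >&2
        return 1
    fi
    # --query-gpu selects the fields, --format=csv sets the output format,
    # -l repeats the query every <interval> seconds until interrupted.
    "$smi" --query-gpu=timestamp,utilization.gpu,memory.used,memory.total \
           --format=csv -l "$interval"
}

# Usage (not run here): log_gpu_usage 1 > /tmp/gpu-usage.csv
```

If utilization stays close to zero while the job runs, that would suggest the work is not reaching the GPU.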

 

Since the initial problem was solved, I'd suggest closing this ticket to keep it concise for future readers.

 

Regards,

Andrey Avtomonov
R&D Engineer @ Dataiku