Building a custom Docker image
I want to run Dataiku in Docker, and the host has a few GPUs which I want to use. So it seems I have to build my own Docker image.
This is my recipe (to keep it simple and make the build faster, I have omitted CUDA and R for now):
mkdir dataiku-build
cd dataiku-build
wget https://cdn.downloads.dataiku.com/public/dss/8.0.2/dataiku-dss-8.0.2.tar.gz
tar xzf dataiku-dss-8.0.2.tar.gz
sudo dataiku-dss-8.0.2/scripts/install/install-deps.sh
dataiku-dss-8.0.2/installer.sh -d DATA_DIR -p 11000
DATA_DIR/bin/dssadmin build-base-image --type container-exec --without-cuda --without-r
Let it cook for a few minutes, and then let's check:
$ docker image ls
REPOSITORY                               TAG            IMAGE ID       CREATED          SIZE
dku-exec-base-sb59tjz4ll5axakfj8fhfv6g   dss-8.0.2      91329c9f35b1   11 seconds ago   1.39GB
centos                                   7              8652b9f0cb4c   7 days ago       204MB
docker_web                               latest         fdfe96bedceb   11 months ago    6.27GB
jupyter/datascience-notebook             17aba6048f44   1bb52a350a5a   23 months ago    6.27GB
Ah yes, good, there it is on top, 11 seconds old.
Now, let's fire it up to see what happens:
$ docker run -p 10000:10000 dku-exec-base-sb59tjz4ll5axakfj8fhfv6g:dss-8.0.2
[2020-11-21 15:06:25,794] [1/MainThread] [INFO] [root] Fetching job definition
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
Installing debugging signal handler
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/opt/dataiku/python/dataiku/container/runner.py", line 483, in <module>
    execution_id = sys.argv[1]
IndexError: list index out of range
That obviously did not work.
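Going by the traceback alone, the entry point reads an execution id from `sys.argv[1]`, so a bare `docker run` with no arguments would fail exactly this way. The sketch below only reproduces that failure mode. The helper name `read_execution_id` and the error wording are hypothetical and are not taken from Dataiku's `runner.py`.

```python
def read_execution_id(argv):
    """Return the first command-line argument, treated as the execution id.

    Hypothetical helper: the real runner.py indexes sys.argv[1] directly,
    which raises IndexError when no argument was passed.
    """
    if len(argv) < 2:
        raise ValueError(
            "no execution id on the command line; the image was probably "
            "started by hand instead of with arguments"
        )
    return argv[1]


if __name__ == "__main__":
    # A bare `docker run IMAGE` hands the entry point only its own name.
    bare_argv = ["runner.py"]
    try:
        read_execution_id(bare_argv)
    except ValueError as exc:
        print("bare run:", exc)

    # With an id supplied (the value here is made up), the lookup succeeds.
    print(read_execution_id(["runner.py", "example-execution-id"]))
```

If that reading is right, the image may be fine and simply not intended to be started without arguments.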
Has anyone else had success building and running their own Dataiku Docker images?
Answers
-
Sergey (Dataiker)
Hi @mogul
Not sure I understand what you are trying to achieve with those steps.
If you are looking to use GPUs in your ML tasks, you can just start DSS (as you have already installed it), create a code environment, and choose GPU during model training:
https://doc.dataiku.com/dss/latest/machine-learning/deep-learning/runtime-gpu.html
If you are interested in containerized execution with CUDA support, you will need to set up and configure the container engine (either Kubernetes or Docker) in DSS:
https://doc.dataiku.com/dss/latest/containers/concepts.html
-
The latter. I want to run containerised (Docker) with CUDA support.
To do so I followed the guide over here: https://doc.dataiku.com/dss/latest/containers/custom-base-images.html
That results in a Docker image that does not work.
Inspecting with nvidia-smi on the Docker host reveals two GPUs:
$ nvidia-smi
Mon Nov 23 11:28:25 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.28       Driver Version: 455.28       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  TITAN RTX           On   | 00000000:01:00.0 Off |                  N/A |
| 41%   36C    P8    10W / 280W |     16MiB / 24217MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  TITAN RTX           Off  | 00000000:21:00.0 Off |                  N/A |
| 41%   34C    P8    13W / 280W |      5MiB / 24220MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1371      G   /usr/lib/xorg/Xorg                  9MiB |
|    0   N/A  N/A      1595      G   /usr/bin/gnome-shell                4MiB |
|    1   N/A  N/A      1371      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A      1595      G   /usr/bin/gnome-shell                0MiB |
+-----------------------------------------------------------------------------+
Doing the same inside the container running our Dataiku instance:
$ docker exec -it dataiku_dataiku_1 /bin/bash
[dataiku@5ca65e340f6a ~]$ nvidia-smi
bash: nvidia-smi: command not found
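The same host-versus-container comparison can be scripted. This is a minimal sketch, assuming a Python 3 interpreter is available in both places (the traceback earlier shows Python 2.7 in the exec image, so that may not hold everywhere). The helper name `tool_on_path` is made up for illustration.

```python
import shutil


def tool_on_path(name):
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # Run once on the host and once inside the container, then compare.
    print("nvidia-smi found:", tool_on_path("nvidia-smi"))
```

This only reports whether the tool is visible on `PATH`. It says nothing about whether a GPU is actually usable.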
That was why I thought I had to build a new Docker image.
Ah yes, the non-CUDA image we are currently running is the one I found on Docker Hub: https://hub.docker.com/r/dataiku/dss/
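One thing that may be worth ruling out before rebuilding anything: Docker 19.03 and later only exposes GPUs to containers started with the `--gpus` option, and that requires the NVIDIA Container Toolkit on the host. With that setup, `nvidia-smi` is normally made available inside the container at start-up, so a missing `nvidia-smi` may say more about how the container was launched than about what is in the image. Below is a minimal sketch of a smoke test under those assumptions. The `ubuntu` image is only an illustrative choice and the helper name is hypothetical.

```python
import shlex


def gpu_smoke_test_command(image="ubuntu"):
    """Build a `docker run` command that requests all GPUs and runs
    nvidia-smi in a throwaway container based on `image`."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(gpu_smoke_test_command()))
```

If the printed command shows both TITAN RTX cards when run on the host, GPU passthrough itself works and the remaining question is how the Dataiku container is started.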