Distributed Training Machine Learning
deeplearnyogi
Hi,
What I enjoy about Dataiku is the visual machine learning.
I have a 21 GB dataset to train on, and I'd like to try it in Dataiku with XGBoost; however, training will take a while on a single machine.
I have a couple of machines that are connected as an SSH cluster.
Is there any way I can create a Dask SSH cluster in Dataiku so that I can use the visual machine learning to train on the data?
In my Jupyter notebook, I create the Dask SSH cluster as follows:
from dask.distributed import Client, SSHCluster

cluster = SSHCluster(
    ["localhost", "192.168.1.119", "192.168.1.191"],
    connect_options={"known_hosts": None, "username": "vinhdiesal"},
    worker_options={"nthreads": 20, "local_directory": "/tmp/"},
    scheduler_options={"port": 0, "dashboard_address": ":8797"},
    worker_module="dask_cuda.dask_cuda_worker",
)
client = Client(cluster)
client
Thanks,
Vinh
Answers
Hi,
Thanks for the positive feedback about visual ML in DSS. Unfortunately, Dask is not integrated into visual ML in any way. The only way to proceed in DSS is to use notebooks and implement the Dask interaction yourself.
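For the notebook route, here is a rough sketch of what the Dask interaction could look like once a `client` such as the one in the question is connected. It is not a DSS feature, only an illustration. It assumes the `xgboost` and `dask` packages are installed in the notebook's code environment and on every worker, that the data is available as Parquet at a path all workers can read, and that the task is binary classification. The helper names `xgb_params` and `train_on_cluster` are made up for this example.

```python
def xgb_params(use_gpu=True):
    """Hypothetical helper: XGBoost parameters for a binary classifier."""
    # Assumption: binary classification; change the objective for other tasks.
    params = {"objective": "binary:logistic", "tree_method": "hist"}
    if use_gpu:
        # XGBoost 2.0+ selects the GPU with `device`; older releases
        # used tree_method="gpu_hist" instead.
        params["device"] = "cuda"
    return params


def train_on_cluster(client, parquet_path, label, num_boost_round=100):
    """Train XGBoost on an already connected Dask cluster; return the booster."""
    # Imported inside the function so that defining it does not
    # require the cluster libraries to be loaded yet.
    import dask.dataframe as dd
    from xgboost import dask as dxgb

    # Assumption: every worker can read this Parquet path.
    ddf = dd.read_parquet(parquet_path)
    X = ddf.drop(columns=[label])
    y = ddf[label]

    # DaskDMatrix keeps the data partitioned across the workers.
    dtrain = dxgb.DaskDMatrix(client, X, y)
    result = dxgb.train(
        client,
        xgb_params(),
        dtrain,
        num_boost_round=num_boost_round,
        evals=[(dtrain, "train")],
    )
    # train() returns a dict with the fitted booster and the eval history.
    return result["booster"]
```

You would call it as `booster = train_on_cluster(client, "/data/train.parquet", "target")`, where both the path and the column name are placeholders for your own data.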