Install multiple instances of DSS on multiple machines
Hello,
I would like to install multiple instances of DSS using Docker on multiple machines to balance the workload across the instances. Could you please advise on the best practice for achieving this?
Answers
-
Hi Bader,
DSS doesn't support load balancing across multiple backends. Technically, this is because the $DATADIR storage is not designed to be accessed by multiple instances at the same time.
However, the DSS backend itself is not supposed to perform heavy computations. Computationally intensive operations, such as running recipes, can be offloaded to the underlying systems, for example by preferring a SQL engine over the DSS backend. Other operations, like training ML models, can run in Docker containers on remote machines or even in a Kubernetes cluster.
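To illustrate the general idea of pushing work down to a SQL engine, here is a minimal sketch. It does not use any DSS API: the in-memory `sqlite3` database, the `sales` table, and both function names are assumptions made up for the example. It only contrasts letting the database aggregate with pulling every row into the client process.

```python
import sqlite3


def aggregate_in_database(conn):
    """Let the SQL engine group and sum; only the small result comes back."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()


def aggregate_in_process(conn):
    """Pull every row into the client and aggregate there (the costly path)."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0) + amount
    return sorted(totals.items())


# Hypothetical sample data standing in for a real dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 5), ("north", 7)],
)
```

Both functions return the same totals, but the first transfers one row per group while the second transfers the whole table, which is the cost that moving the computation to the database avoids.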
Please feel free to refer to this page for more information:
-
Hi Andrey, thanks for the instant reply, I really appreciate it. As far as I know, DSS cannot use containers as a processing engine. If there's a way to submit a Spark job, or a job for any other engine, as its own container, please let me know.
-
Yes, containerized execution is not a separate engine in DSS, and it does not offload every operation at the moment. The list of what can be containerized can be found here:
https://doc.dataiku.com/dss/latest/containers/concepts.html#capabilities-and-benefits
As for Spark, we support running Spark on Kubernetes, as described here:
https://doc.dataiku.com/dss/latest/spark/kubernetes/managed.html