Install multiple instances of DSS on multiple machines

Bader
Level 3

Hello,

I would like to install multiple instances of DSS using Docker on multiple machines to balance the workload across the instances. Could you please advise on best practices for achieving this?

3 Replies
Andrey
Dataiker Alumni

Hi Bader,

DSS doesn't support running multiple backends behind a load balancer. Technically, this is because the $DATADIR storage is not designed to be accessed by multiple instances at the same time.

However, the DSS backend itself is not supposed to perform heavy computations. Computationally intensive operations, such as running recipes, can be offloaded to the underlying systems, for example by preferring a SQL engine over the DSS backend. Other operations, such as training ML models, can run in Docker containers on remote machines or even in a Kubernetes cluster.
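To illustrate the push-down idea in general terms, here is a minimal sketch that is not DSS code and does not use any DSS API. It assumes Python's built-in sqlite3 module as a stand-in for the SQL database, and the table and function names (orders, totals_in_client, totals_pushed_down) are hypothetical. It compares aggregating rows in the calling process with letting the database do the aggregation.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table standing in for a dataset stored in a SQL database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    rows = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5), ("bob", 7.5), ("carol", 1.0)]
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


def totals_in_client(conn):
    """Pull every row into the calling process and aggregate there."""
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


def totals_pushed_down(conn):
    """Let the database aggregate; only one row per customer is returned."""
    query = "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    # Both approaches give the same totals; they differ in where the work happens.
    assert totals_in_client(conn) == totals_pushed_down(conn)
    print(totals_pushed_down(conn))
```

With the second function, the process that submits the query only handles the aggregated result, which is the reason for preferring a SQL engine when the data already lives in a database.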

Please feel free to refer to this page for more information:

https://doc.dataiku.com/dss/latest/containers/index.html

Andrey Avtomonov
R&D Engineer @ Dataiku
Bader
Level 3
Author

Hi Andrey, thanks for the instant reply, I really appreciate it. As far as I know, DSS cannot use containers as a processing engine. If there is a way to submit a Spark job, or a job for any other engine, as its own container, please let me know.

Andrey
Dataiker Alumni

Yes, the containerized execution option is not a separate engine in DSS, and at the moment it won't offload all operations. The list of what can be containerized can be found here:

https://doc.dataiku.com/dss/latest/containers/concepts.html#capabilities-and-benefits

As for Spark, we support running Spark on Kubernetes, as described here:

https://doc.dataiku.com/dss/latest/spark/kubernetes/managed.html

Andrey Avtomonov
R&D Engineer @ Dataiku