DSS setup of spark on AKS

jonhli
Level 2
Hi everyone,

I’m currently trying to set up my DSS instance (running on a VM) to run Spark on AKS, and I'm feeling a bit lost about where to start.

Could you please guide me on where Spark should be installed? Should it be on the AKS cluster or the VM? I realize this might be a basic question, but any assistance or pointers would be greatly appreciated.

Thank you!


Operating system used: Linux

4 Replies
AlexT
Dataiker

Hi @jonhli,
The recommended approach is to attach an AKS cluster to DSS:
https://doc.dataiku.com/dss/latest/containers/aks/managed.html#initial-setup
Please pay particular attention to the network requirements: DSS and the AKS cluster need to be able to communicate with each other on all ports.
Once you have followed the rest of the steps (build the base Spark image, define the Spark configuration, and run the Spark integration), you should be able to run Spark jobs.

https://doc.dataiku.com/dss/latest/containers/setup-k8s.html#optional-setup-spark



Thanks

jonhli
Level 2
Author

Hello @AlexT ,
Thanks for your prompt reply. I built the base Spark image and pushed it successfully. How can I test whether this works as expected in my projects?

AlexT
Dataiker

You can create a Prepare recipe and run it on the Spark engine, or create a PySpark recipe.
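
As a rough illustration of the PySpark route, here is a minimal smoke test that could be pasted into a PySpark recipe or notebook. It is only a sketch: the function names `runs_on_kubernetes` and `spark_smoke_test` are made up for this example, and it assumes that a Spark session configured for Kubernetes reports a master URL starting with `k8s://` (the standard Spark convention), which may or may not match how your Spark configuration is set up.

```python
def runs_on_kubernetes(master_url):
    """Return True if the Spark master URL uses the k8s:// scheme.

    Assumption: Spark on Kubernetes is configured with a master URL of the
    form k8s://https://<api-server-host>:<port>.
    """
    return master_url.startswith("k8s://")


def spark_smoke_test():
    """Run a tiny distributed job and report where it ran (hypothetical helper)."""
    # Imported inside the function so the helper above works without pyspark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    master = spark.sparkContext.master

    # Spread a small range over 4 partitions so executors have to start.
    total = spark.sparkContext.parallelize(range(1000), 4).sum()

    return {
        "master": master,
        "on_kubernetes": runs_on_kubernetes(master),
        "sum": total,
    }
```

Calling `spark_smoke_test()` and printing the result should show the master URL the job used. While it runs, executor pods should also appear in the AKS cluster if the job is really being submitted there.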

jonhli
Level 2
Author

I see. But is there a way to make all the recipes in my project run on Spark on K8s by default?
