I am a beginner with Spark and I am trying to set up Spark on our Kubernetes cluster. The cluster is now working and I can run Spark jobs; however, I want to access the Spark web UI to inspect how my job is being distributed. We usually port-forward port 4040, but I cannot tell which pod is the driver pod after running kubectl get pods --all-namespaces. What name does Dataiku use for the Spark driver/master pod?
TL;DR: How do I access the Spark web UI on a Kubernetes-managed cluster?
Thanks in advance.
The Spark driver is always executed on the DSS box, not in Kubernetes, so there is no driver pod to find in the kubectl output.
You should look for the process listening on port 4040 on the DSS box (or on 4041, 4042, and so on, if the lower ports are already taken).
Also keep in mind that Spark drivers are specific to the job you're executing and are not persistent: once the job completes, its driver is destroyed. The next job will start its own driver.
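As a rough illustration of the port check described above, here is a minimal Python sketch that probes 4040 and the next few ports on the machine it runs on. It is not a DSS tool: the function name find_listening_ports and the range of 10 ports are assumptions made for the example. It only tells you that something accepts connections on a port, not that the listener is a Spark driver.

```python
import socket


def find_listening_ports(host="127.0.0.1", start=4040, count=10, timeout=0.5):
    """Return the ports in [start, start + count) that accept a TCP connection.

    Hypothetical helper for illustration; the default range of 10 ports
    is an assumption, not a DSS or Spark setting.
    """
    open_ports = []
    for port in range(start, start + count):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Run this on the DSS box while the Spark job is still running.
    for port in find_listening_ports():
        print(f"Something is listening on port {port}: try http://localhost:{port}")
```

Since each driver only lives as long as its job, the check has to be run while the job is active. If the DSS box is remote, you would still need to reach the port yourself (for example through an SSH tunnel) to open the UI in a browser.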
FieldEngineer @ Dataiku