Error running Spark recipes on Kubernetes: "Initial job has not accepted any resources"
I am using DSS v13 to push execution of visual recipes to containerized execution on a Kubernetes (k8s) cluster, with Spark as the execution engine. I pushed two images to the registry: dku-exec-base and dku-spark-base. However, when I run a recipe it runs forever (creating and deleting pods in k8s), and I found this line in the job logs:
Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
I also found this in the pod log:
Environment:
    SPARK_USER: dssuser2
    SPARK_DRIVER_URL: spark://CoarseGrainedScheduler@samehdss.design-node:42207
    SPARK_EXECUTOR_CORES: 1
I have already updated env-site.sh to set DKU_BACKEND_EXT_HOST to the correct IP of the DSS machine, restarted DSS, and pushed the base images again, but it still does not work: SPARK_DRIVER_URL in the log still shows the hostname, not the IP. How can I solve this?
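For reference, the line I set in env-site.sh looks like the sketch below; the IP shown is only a placeholder for my DSS machine's actual routable address:

    export DKU_BACKEND_EXT_HOST="10.0.0.12"  # placeholder IP, not my real address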
Operating system used: Debian 11
Best Answer
Hi,
you will additionally need to set the spark.driver.host key in your Spark configuration. The value should be the IP of the DSS machine.
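In DSS itself, this key/value pair goes into the Spark configuration used by the recipe (a named configuration under Administration > Settings > Spark), not into code. As a minimal standalone PySpark sketch of what the setting does, assuming a placeholder IP of 10.0.0.12:

    # Minimal sketch, not DSS-specific: sets spark.driver.host so executors
    # contact the driver by IP instead of an unresolvable hostname.
    # 10.0.0.12 is a placeholder for the DSS machine's routable IP.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("driver-host-example")
        .config("spark.driver.host", "10.0.0.12")
        .getOrCreate()
    )

    print(spark.conf.get("spark.driver.host"))  # confirms the effective value
    spark.stop()

With this set, executors started in the Kubernetes pods advertise and connect back to the driver at the given IP rather than the DSS machine's hostname, which the pods typically cannot resolve.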
Answers
Thank you, this solved the problem.
However, I'm curious where the default hardcoded old hostname comes from. I rebuilt the images and updated the configuration files with no luck.