Spark on Kubernetes - Initial job has not accepted any resources


Hi,

We've been having a good experience using Spark and containerized execution on our DSS platform. The next step would be to run Spark on Kubernetes, but we're facing some issues.

Things that work:

  • Building (Spark) base images and code-env specific images
  • Pushing images to ECR
  • Starting an EKS cluster (with the same subnet and security group as the DSS machine)

But executing Spark jobs themselves hangs on a repeating message:

Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources.
 

The log seems to show that connectivity between DSS and the cluster works; the work is just not picked up. Have you perhaps experienced this, and would you know how to fix it?

Regards,
Rik

Dataiker

A t3.micro is probably too small a VM to run Spark executors on the nodes. You should use larger instances, or try setting both spark.kubernetes.executor.request.cores and spark.kubernetes.executor.limit.cores to 500m in the properties of the Spark config.
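
For illustration, here is a rough sketch of what those two properties would look like with the 500m value. In Kubernetes notation, "500m" means 500 millicores, i.e. half a CPU. The helper names (parse_cpu_quantity, executor_cpu_properties) are made up for this example and are not part of DSS or Spark; only the two property keys come from the Spark on Kubernetes configuration.

```python
def parse_cpu_quantity(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as "500m" or "1" into cores.

    Hypothetical helper, only here to show what the "m" suffix means.
    """
    if quantity.endswith("m"):
        # "m" is millicores: 1000m equals one full CPU core
        return int(quantity[:-1]) / 1000
    return float(quantity)


def executor_cpu_properties(cpu: str) -> dict:
    """Build the two Spark properties from the reply, both set to the same value.

    Hypothetical helper; the keys are real Spark properties.
    """
    return {
        "spark.kubernetes.executor.request.cores": cpu,
        "spark.kubernetes.executor.limit.cores": cpu,
    }


props = executor_cpu_properties("500m")

# Print the properties as key=value pairs for copying into a Spark config
for key, value in props.items():
    print(f"{key}={value}")

cores = parse_cpu_quantity("500m")
print(f"Each executor pod would request {cores} CPU cores")
```

Lowering the CPU request only helps if CPU is what keeps the executor pods from being scheduled. If memory is the limiting factor on the node, larger instances are still the way to go.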

Author

You're right: after switching from t3.micro to t3.medium, things work as expected! I'm sure there's still a lot of fun to be had optimizing the Spark settings.
