
No connection defined to upload files/jars

Solved!
azorman
Level 2

I am trying to execute a PySpark recipe on a remote AWS EMR Spark cluster and I am getting:

Your Spark settings don't define a temporary storage for yarn-cluster mode
in act.compute_prepdataset1_NP: No connection defined to upload files/jars

I am using this runtime configuration (attached screenshot: Screenshot 2024-06-01 at 01.02.59.png).

I also tried adding:

spark.yarn.stagingDir -> hdfs://ip-172-31-43-168.ec2.internal:8020/user/hadoop/.sparkStaging/

From the command line I can successfully run:

spark-submit --master yarn --deploy-mode cluster --conf spark.executor.memory=4G --conf spark.driver.memory=1G --conf spark.executor.cores=1 --conf spark.num.executors=2 --conf spark.yarn.am.memory=1024m --conf spark.yarn.am.cores=1 test_job.py

which means the communication between the client and the AWS EMR Spark cluster is working fine. I also have S3 and hdfs_root connections working fine.

Thank you! 


Operating system used: Amazon Linux 2023

1 Solution
azorman
Level 2
Author

All set! I had the wrong jars.

DSS is working perfectly with AWS EMR v6.15.0 and the very latest v7.1.0 

What an amazing product! I cannot say enough good things about the people who developed and continue to develop it.

2 Replies
azorman
Level 2
Author
"yarnClusterSettings":{
   "connectionName":"hdfs_root",
   "location":"/user/hadoop/.sparkStaging/"
}

I overcame that problem with the setting above. This one:

2024-06-03 09:15:54,513 INFO Not running pyspark-over-k8s in cluster mode, not distributing

will be more difficult to overcome. The idea was to run it over the AWS EMR cluster, which I understand is, or will be, deprecated. Not a good decision as far as I am concerned.
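
As a rough illustration, the `yarnClusterSettings` fragment shown above can be generated and sanity-checked with a few lines of Python before it is pasted into the configuration. The helper name `yarn_cluster_settings` is hypothetical and not part of any DSS API, and the absolute-path check is an assumption rather than a documented DSS rule. Where the fragment belongs in the DSS configuration is not stated in this thread.

```python
import json


def yarn_cluster_settings(connection_name: str, location: str) -> dict:
    """Build the yarnClusterSettings fragment (hypothetical helper, not a DSS API)."""
    if not connection_name:
        raise ValueError("connection_name must not be empty")
    # Assumption: the staging location is an absolute path inside the connection.
    if not location.startswith("/"):
        raise ValueError("location must be an absolute path")
    return {
        "yarnClusterSettings": {
            "connectionName": connection_name,
            "location": location,
        }
    }


# Values taken from the post above.
fragment = yarn_cluster_settings("hdfs_root", "/user/hadoop/.sparkStaging/")
print(json.dumps(fragment, indent=3))
```

The printed JSON is wrapped in an outer object, so only the inner `"yarnClusterSettings"` entry corresponds to the fragment quoted in the post.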

