I am trying to build an MLlib model with yarn-cluster set as the master, but execution fails for both Random Forest and Logistic Regression. The input data is the iris dataset on HDFS.
yarn-cluster submission works for a PySpark script, and model building with master=local also works.
I've only set the master, executor-memory and executor-instances in the Spark config.
The relevant part of the log:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
    at scala.Predef$.require(Predef.scala:221)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:501)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:499)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)