I am attempting to train an ML model in a Visual Analysis using Spark as the backend. However, the job fails with the following message:
[10:01:27] [INFO] [dku.utils] - [2018/11/29-10:01:27.734] [task-result-getter-3] [ERROR] [org.apache.spark.scheduler.TaskSetManager] - Total size of serialized results of 714 tasks (2.7 GB) is bigger than spark.driver.maxResultSize (2.0 GB)
This suggests that the job is collecting results into the driver process, but I am not sure exactly what it is collecting. Can I configure the Visual Analysis so that it does not collect results on the driver? Is there any way to resolve this other than increasing spark.driver.maxResultSize?
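For reference, the only workaround I have tried so far is raising the limit itself, e.g. in spark-defaults.conf (or the equivalent Spark configuration in the job settings). The 3g value below is just my guess, sized slightly above the 2.7 GB reported in the log, not a recommended setting:

```
# Cap on the total size of serialized task results returned to the driver.
# Default is 1g; 0 means unlimited (which risks an out-of-memory on the driver).
spark.driver.maxResultSize  3g
```

I would prefer to avoid this, since it only postpones the problem if the collected results keep growing with the data.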