PySpark python issue: Py4JJavaError: An error occurred while calling o65.sql
jdorgani
Hello everyone, I am working on a PySpark recipe with the code below and I am running into an issue. The error occurs on the last line of the code. Does anyone know what might be causing it?
Py4JJavaError: An error occurred while calling o65.sql. :
org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)
# Write dataframe in a dataset
# df: pyspark dataframe
dataiku.Dataset(datasetName).write_with_schema(df)

# Get written dataset
temp = dataiku.Dataset(datasetName)
df_new = dkuspark.get_dataframe(hiveContext, temp)

tempView = df.createOrReplaceTempView("temporal_table_name")
spark.sql("insert into table name_table select * from temporal_table_name")
Answers
Hi,
the "An error occurred while calling o65.sql" message is only a symptom, not a cause, and the true error lies somewhere above in the log. But the sql you're attempting to execute will probably not work, unless you have a "name_table" table defined in a Hive metastore, and the Spark context is setup to connect to that metastore. You're maybe instead looking for
df_result = spark.sql("select * from temporal_table_name")
followed by a regular dataset write like
dkuspark.write_with_schema(dataset_result, df_result)
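For reference, here is a minimal end-to-end sketch of that pattern using the usual Dataiku PySpark recipe boilerplate. The dataset names "input_dataset" and "output_dataset" are placeholders; substitute the names of the datasets in your own Flow.

import dataiku
from dataiku import spark as dkuspark
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

# Read the input Dataiku dataset as a Spark dataframe
# ("input_dataset" is a placeholder name)
dataset_in = dataiku.Dataset("input_dataset")
df = dkuspark.get_dataframe(sqlContext, dataset_in)

# Register a temporary view and query it with Spark SQL
df.createOrReplaceTempView("temporal_table_name")
df_result = sqlContext.sql("select * from temporal_table_name")

# Write the result back to a Dataiku dataset instead of
# inserting into a Hive table directly
dataset_result = dataiku.Dataset("output_dataset")
dkuspark.write_with_schema(dataset_result, df_result)

This lets Dataiku handle the actual storage of the output, so no Hive metastore table needs to exist beforehand.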