Hello everyone, I am working on a PySpark recipe and running into an issue with the code below. The error occurs on the last line. Does anyone know what might be causing it?
# Write dataframe to a dataset
# df: pyspark dataframe
dataiku.Dataset(datasetName).write_with_schema(df)
# Read the written dataset back
temp = dataiku.Dataset(datasetName)
df_new = dkuspark.get_dataframe(hiveContext, temp)
tempView = df.createOrReplaceTempView("temporal_table_name")
spark.sql("insert into table name_table select * from temporal_table_name")
Py4JJavaError: An error occurred while calling o65.sql.
: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)
Hi,
the "An error occurred while calling o65.sql" message is only a symptom, not the cause; the real error appears further up in the log. That said, the SQL you're attempting will probably not work unless a "name_table" table is defined in a Hive metastore and the Spark context is configured to connect to that metastore. You may instead be looking for
df_result = spark.sql("select * from temporal_table_name")
followed by a regular dataset write like
dkuspark.write_with_schema(dataset_result, df_result)