Writing a PySpark dataframe to a partitioned dataset
Gipple
Registered Posts: 5 ✭
I am trying to write the output of a PySpark recipe to a partitioned dataset, but I am receiving the following error:

```
Py4JJavaError: An error occurred while calling o261.savePyDataFrame.
: com.dataiku.common.server.APIError$APIErrorException: Illegal time partitioning value : 'null' with a MONTH partitioning
```
This is how I am trying to write it:

```python
# Write recipe outputs
CCW_Output_Windows = dataiku.Dataset("CCW_Output_Windows")
dkuspark.write_with_schema(CCW_Output_Windows, CCW_Output_df)
```
I haven't been able to find any documentation on this. Thanks!
Answers
-
JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 293
Hi @Gipple,

It appears that the recipe is finding a null where it expects a month date value. Please check your data for null or malformed values in the partition column, as well as how you are selecting the partitions to build.
Kind Regards,
Jordan
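The check described in the answer can be sketched without Spark. This is a minimal, hypothetical example (the `month` column name and sample rows are assumptions, not from the original thread); it assumes MONTH partitions are identified by `YYYY-MM` strings, and flags rows whose partition value is null or unparseable, which is the kind of data that triggers the `Illegal time partitioning value : 'null'` error:

```python
from datetime import datetime

def is_valid_month_partition(value):
    """Return True if value is a non-null 'YYYY-MM' month identifier."""
    if value is None:
        return False
    try:
        datetime.strptime(value, "%Y-%m")
        return True
    except (ValueError, TypeError):
        return False

# Hypothetical rows standing in for the dataframe; row 2 has the
# null partition value that would cause the write to fail.
rows = [
    {"id": 1, "month": "2023-01"},
    {"id": 2, "month": None},       # null partition value
    {"id": 3, "month": "2023/01"},  # malformed: wrong separator
]

# Collect the ids of rows that cannot be assigned to a MONTH partition
bad_ids = [r["id"] for r in rows if not is_valid_month_partition(r["month"])]
print(bad_ids)
```

In an actual PySpark recipe the equivalent step would be filtering the dataframe on the partition column (e.g. `df.filter(df["month"].isNull())`) and inspecting the matches before writing.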