Hi,
I am getting the error below when running a pivot recipe with the Spark engine. Can someone help me with this?
[2020/09/18-18:01:13.644] [FRT-42-FlowRunnable] [INFO] [dku.resourceusage] act.compute_my_mapping_table_by_vehicle_tag_NP - Reporting completion of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"IMPALA","jobId":"Build_my_mapping_table_by_vehicle_tag_2020-09-18T16-00-34.073","activityId":"compute_my_mapping_table_by_vehicle_tag_NP","activityType":"recipe","recipeType":"pivot","recipeName":"compute_my_mapping_table_by_vehicle_tag"},"id":"OxHCSrYOL1HENQUu","startTime":1600444836021}
[2020/09/18-18:01:13.644] [FRT-42-FlowRunnable] [INFO] [dku.usage.computeresource.jek] act.compute_my_mapping_table_by_vehicle_tag_NP - Reporting completion of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"IMPALA","jobId":"Build_my_mapping_table_by_vehicle_tag_2020-09-18T16-00-34.073","activityId":"compute_my_mapping_table_by_vehicle_tag_NP","activityType":"recipe","recipeType":"pivot","recipeName":"compute_my_mapping_table_by_vehicle_tag"},"id":"OxHCSrYOL1HENQUu","startTime":1600444836021,"endTime":1600444873644}
[2020/09/18-18:01:13.646] [FRT-42-FlowRunnable] [INFO] [dku.flow.activity] act.compute_my_mapping_table_by_vehicle_tag_NP - Run thread failed for activity compute_my_mapping_table_by_vehicle_tag_NP
com.dataiku.common.server.APIError$SerializedErrorException: Error in Spark process: 'org.apache.spark.sql.DataFrame org.apache.spark.sql.SQLContext.createDataFrame(org.apache.spark.rdd.RDD, org.apache.spark.sql.types.StructType)'
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:221)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:186)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:103)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runUsingSparkSubmit(AbstractSparkBasedRecipeRunner.java:206)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.doRunSpark(AbstractSparkBasedRecipeRunner.java:115)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:84)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeSparkExecutor$PivotRecipeSparkModalityCollectionExecutor.run(PivotRecipeSparkExecutor.java:159)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeSparkExecutor.getModalities(PivotRecipeSparkExecutor.java:118)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeExecutor.run(PivotRecipeExecutor.java:98)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeRunner.run(PivotRecipeRunner.java:98)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
[2020/09/18-18:01:13.781] [ActivityExecutor-30] [INFO] [dku.flow.activity] running compute_my_mapping_table_by_vehicle_tag_NP - activity is finished
[2020/09/18-18:01:13.781] [ActivityExecutor-30] [ERROR] [dku.flow.activity] running compute_my_mapping_table_by_vehicle_tag_NP - Activity failed
com.dataiku.common.server.APIError$SerializedErrorException: Error in Spark process: 'org.apache.spark.sql.DataFrame org.apache.spark.sql.SQLContext.createDataFrame(org.apache.spark.rdd.RDD, org.apache.spark.sql.types.StructType)'
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:221)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:186)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:103)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runUsingSparkSubmit(AbstractSparkBasedRecipeRunner.java:206)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.doRunSpark(AbstractSparkBasedRecipeRunner.java:115)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:84)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeSparkExecutor$PivotRecipeSparkModalityCollectionExecutor.run(PivotRecipeSparkExecutor.java:159)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeSparkExecutor.getModalities(PivotRecipeSparkExecutor.java:118)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeExecutor.run(PivotRecipeExecutor.java:98)
    at com.dataiku.dip.dataflow.exec.pivot.PivotRecipeRunner.run(PivotRecipeRunner.java:98)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
[2020/09/18-18:01:13.781] [ActivityExecutor-30] [INFO] [dku.flow.activity] running compute_my_mapping_table_by_vehicle_tag_NP - Executing default post-activity lifecycle hook
[2020/09/18-18:01:13.784] [ActivityExecutor-30] [DEBUG] [dku.datasets.hdfs] running compute_my_mapping_table_by_vehicle_tag_NP - HDFS dataset handler dataset=IMPALA.my_mapping_table_by_vehicle_tag connection=hdfs_managed cpr=/user/dss2-admin/dss_managed_datasets resolvedPath=/IMPALA/my_mapping_table_by_vehicle_tag connRootSA=nullconnRootWithinSA=/user/dss2-admin/dss_managed_datasets configuredRootPathWithinSA=/user/dss2-admin/dss_managed_datasets/IMPALA/my_mapping_table_by_vehicle_tag effectiveRootPathWithinSA=/user/dss2-admin/dss_managed_datasets/IMPALA/my_mapping_table_by_vehicle_tag
[2020/09/18-18:01:13.785] [ActivityExecutor-30] [DEBUG] [dku.fsproviders.hdfs] running compute_my_mapping_table_by_vehicle_tag_NP - Build HDFSProvider conn=hdfs_managed cpr=/user/dss2-admin/dss_managed_datasets
Hi,
This error indicates that your DSS/Spark setup is not functioning correctly. Could you please provide details about how you installed DSS and Spark?
If you are working with a Dataiku customer or evaluation license, please reach out to Dataiku Support (https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html#editor-support-for-dataiku...).