Spark process with Kubernetes not running in DSS

Jorge Carlos
edited July 16 in Setup & Configuration

I am trying to configure Kubernetes in DSS, specifically to attach an Azure Kubernetes cluster to my DSS instance. I have followed all the steps in the documentation (Initial setup — Dataiku DSS 12 documentation): I pushed the base image and opened ports 1024 to 65535 to the Kubernetes cluster IP (see Unable to connect to DSS from container - Dataiku Community). The connection test succeeds, but when I try to run a Spark job in Dataiku, it fails with a connection timeout error. Is there a solution to this?

This is the log from the main activity:

Connect to *********:****** [/**********] failed: Connection timed out (Connection timed out), caused by: ConnectException: Connection timed out (Connection timed out)

[20:51:05] [INFO] [dku.flow.activity] - Run thread failed for activity compute_orders_prepared_NP
com.dataiku.common.server.APIError$SerializedErrorException: Connect to ******:**** [/******] failed: Connection timed out (Connection timed out), caused by: ConnectException: Connection timed out (Connection timed out)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner$3.throwFromErrorFileOrLogs(AbstractSparkBasedRecipeRunner.java:333)
    at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:26)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runUsingSparkSubmit(AbstractSparkBasedRecipeRunner.java:348)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.doRunSpark(AbstractSparkBasedRecipeRunner.java:147)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:116)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:101)
    at com.dataiku.dip.recipes.shaker.ShakerSparkRecipeRunner.run(ShakerSparkRecipeRunner.java:50)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:378)
[20:51:05] [INFO] [dku.flow.activity] running compute_orders_prepared_NP - activity is finished
[20:51:05] [ERROR] [dku.flow.activity] running compute_orders_prepared_NP - Activity failed
com.dataiku.common.server.APIError$SerializedErrorException: Connect to *****:***** [/*******] failed: Connection timed out (Connection timed out), caused by: ConnectException: Connection timed out (Connection timed out)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner$3.throwFromErrorFileOrLogs(AbstractSparkBasedRecipeRunner.java:333)
    at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:26)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runUsingSparkSubmit(AbstractSparkBasedRecipeRunner.java:348)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.doRunSpark(AbstractSparkBasedRecipeRunner.java:147)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:116)
    at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:101)
    at com.dataiku.dip.recipes.shaker.ShakerSparkRecipeRunner.run(ShakerSparkRecipeRunner.java:50)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:378)
[20:51:05] [INFO] [dku.flow.activity] running compute_orders_prepared_NP - Executing default post-activity lifecycle hook
[20:51:05] [INFO] [dku.flow.activity] running compute_orders_prepared_NP - Done post-activity tasks

It is also worth mentioning that I am using the default Spark configuration described in the documentation above.
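
To narrow down which direction is blocked, a plain TCP probe along the following lines could be run from each side (for example from a pod on the cluster towards the DSS host, or from the DSS machine towards the address masked in the log). This is only a minimal sketch using the Python standard library: the function name can_connect is hypothetical, and the host and port shown in the usage comment are placeholders that would need to be replaced with the values hidden in the log above. It does not use any DSS or Spark API.

```python
import socket
from typing import Tuple


def can_connect(host: str, port: int, timeout: float = 5.0) -> Tuple[bool, str]:
    """Try to open a TCP connection to host:port and report the outcome.

    Returns (True, "connected") on success, otherwise (False, <error detail>).
    """
    try:
        # create_connection resolves the host name and applies the timeout
        # to the connection attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # OSError covers timeouts, refused connections and DNS failures.
        return False, f"{type(exc).__name__}: {exc}"


# Example usage (placeholder host and port, not taken from the log):
# ok, detail = can_connect("dss-host.example.internal", 10001, timeout=5.0)
# print(ok, detail)
```

A timeout here, as opposed to a refused connection, would suggest that packets are being dropped by a firewall or network security rule rather than rejected by the target host.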


Operating system used: Ubuntu 20.04.1 LTS
