General / Rule of Thumb Spark Configuration Settings

importthepandas
importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

We are using managed spark over kubernetes in EKS. We have about 80 active users on our design node, about 1/2 of them use spark regularly. We've tried to make things easy by creating simple spark configurations but are finding that we continuously are changing configurations. With multiple spark applications, has anyone usedspark.dynamicAllocation.enabled? Has anyone established good "boilerplate" configs for different spark workloads in their deployments?


Operating system used: ubuntu 18

Tagged:

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
    Answer ✓

    Hi @importthepandas
    ,


    We typically suggest going with a few boiler plate configs and then modifying as needed based on your need and sometimes even on recipe basis rather then instance level config:

    Common config could look like:

    spark.sql.broadcastTimeout = 3600
    spark.port.maxRetries = 200
    spark.executor.extraJavaOptions=-Duser.timezone=GMT
    spark.driver.extraJavaOptions=-Duser.timezone=GMT
    spark.driver.host = 10.192.71.7
    spark.network.timeout = 500s

    Default:

    spark.executor.memory = 4g
    spark.kubernetes.memoryOverheadFactor = 0.4
    spark.driver.memory = 2g
    spark.executor.instances = 4
    spark.executor.cores = 2
    spark.sql.shuffle.partitions = 40

    larger config :

    spark.executor.memory = 4g
    spark.kubernetes.memoryOverheadFactor = 0.4
    spark.driver.memory = 2g
    spark.executor.instances = 8
    spark.executor.cores = 2
    spark.sql.shuffle.partitions = 80


    High( for jobs that failed with previous config)

    spark.executor.memory = 6g
    spark.kubernetes.memoryOverheadFactor = 0.6
    spark.driver.memory = 3g
    spark.executor.instances = 8
    spark.executor.cores = 2
    spark.sql.shuffle.partitions = 80

    On a per-job basis, you may want to touch only, as needed:
    * spark.executor.instances
    * spark.sql.shuffle.partitions
    * spark.executor.memory / spark.kubernetes.memoryOverheadFactor where you have OOM errors

    For other options like spark.dynamicAllocation.enabled is typically best left as the default false. From what I can see, this is still being worked on
    https://spark.apache.org/docs/latest/running-on-kubernetes.html#future-work


Answers

Setup Info
    Tags
      Help me…