Azure AKS deployment timeout

daniel_adornes
daniel_adornes Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 30 ✭✭✭✭✭

Hi everyone!

The API Service deployment to my Kubernetes cluster often fails with the following error:

Additional technical details: HTTP code: 500, Code: ERR_API_DEPLOYER_K8S_DEPLOYMENT_KUBECTL_FAILED, type: com.dataiku.dip.exceptions.ProcessDiedException

It takes too many minutes to finish (there seems to be a 10-minute limit) and then fails. Stack trace:

Waiting for deployment "...." rollout to finish: 1 old replicas are pending termination...
error: deployment "..." exceeded its progress deadline.

Is there anywhere in the DSS configuration where I can increase this timeout limit?

Thanks!

Best Answers

  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker
    Answer ✓

    Hi,

    10 minutes is the default timeout for the `kubectl rollout...` command, but DSS doesn't offer control over it, either through the `--timeout` flag or by setting `progressDeadlineSeconds`. You can still edit the file deployment.yaml.template in the DSS installation dir if you need to tweak it (but that is of course unsupported).
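
    For reference, here is roughly where `progressDeadlineSeconds` sits in a standard Kubernetes Deployment manifest. This is a generic sketch, not the contents of the DSS template: the name `my-api`, the image, and the value 1200 are placeholders chosen for illustration.

    ```yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-api                          # hypothetical name
    spec:
      # Seconds the rollout may go without progress before Kubernetes
      # marks it as failed. The Kubernetes default is 600 (10 minutes).
      progressDeadlineSeconds: 1200         # illustrative value
      replicas: 1
      selector:
        matchLabels:
          app: my-api
      template:
        metadata:
          labels:
            app: my-api
        spec:
          containers:
            - name: my-api
              image: example.com/my-api:latest   # placeholder image
    ```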

  • daniel_adornes
    daniel_adornes Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 30 ✭✭✭✭✭
    Answer ✓

    Thanks @fchataigner2!!

    I found the file under /dataiku/dataiku-dss-9.0.3/resources/api-deployer/kubernetes/deployment.yaml.template

    There was an attribute `initialDelaySeconds`, which I changed from 600 to 1200 (the same timeout configuration that we have in our infra).
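
    For anyone making the same edit: `initialDelaySeconds` is a standard Kubernetes field on a container probe, and it sets how long to wait after the container starts before the first probe runs. Below is a generic sketch of such a stanza. The probe type, path, port, and container name are assumptions for illustration only, and the actual DSS template may look different.

    ```yaml
    containers:
      - name: my-api                        # hypothetical container name
        image: example.com/my-api:latest    # placeholder image
        readinessProbe:                     # assumed probe type
          httpGet:
            path: /healthz                  # hypothetical endpoint
            port: 8080                      # hypothetical port
          # Wait this many seconds after container start before probing.
          initialDelaySeconds: 1200         # changed from 600
          periodSeconds: 10                 # illustrative value
    ```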

    All good!

    Thank you!
