Hi,
We have a Python job (source: a PostgreSQL database) that fails when run on EKS with the following error:
Waiting for logs, time elapsed: 837, status changed to: Error from server (BadRequest): container "c" in pod "dataiku-exec-python-nimlxju-sr7qc" is waiting to start: ContainerCreating
From the CLI, we can see that the container is being created: "dataiku-exec-python-nimlxju-sr7qc 0/1 ContainerCreating".
Update: We were able to run the job on EKS and it reached the running state. After about 1.5 minutes it failed with this error message:
Raw error is{"errorType":"SubProcessFailed","message":"Containerized process execution failed, return code 119","stackTrace":[]}
How can we resolve it?
We have a PySpark job with the same source, and it runs successfully on EKS.
Note: Number of records is ~25M
Thanks,
Hi,
The container being in "ContainerCreating" means Kubernetes is still setting up the container (fetching the image, setting up networking and volumes, and so on), so it is not yet actually running the recipe. You should attach a diagnostic of the failed job, and also check via kubectl what happened (or is happening) with the pod:
kubectl logs dataiku-exec-python-nimlxju-sr7qc
and
kubectl describe pod dataiku-exec-python-nimlxju-sr7qc
Hi,
"return code 119" means that your container ran out of memory and was killed by Kubernetes.
You need to increase your "memory request" and/or "memory limit" settings. Note that if you don't have a memory limit, you may also need to use larger nodes on your Kubernetes cluster.
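For reference, the memory request and limit end up in the container's "resources" field in the pod spec. The sketch below is not DSS code; it just builds that fragment and checks that the request does not exceed the limit, which Kubernetes requires. The function names and the 4Gi / 8Gi values are placeholders for illustration, and the right numbers depend on how much of the ~25M rows the Python recipe holds in memory at once.

```python
import re

# Multipliers for the plain-number, decimal, and binary suffixes
# handled by this sketch.
_UNITS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '2Gi' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|k|M|G|T)?", quantity.strip())
    if not match:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    number, suffix = match.groups()
    return int(float(number) * _UNITS[suffix or ""])


def memory_resources(request: str, limit: str) -> dict:
    """Build the 'resources' fragment of a container spec (memory only)."""
    if parse_memory(request) > parse_memory(limit):
        raise ValueError("memory request must not exceed memory limit")
    return {"requests": {"memory": request}, "limits": {"memory": limit}}


# Placeholder values, not a recommendation.
print(memory_resources("4Gi", "8Gi"))
```

The request is what the scheduler reserves on a node, and the limit is the point at which the container gets killed, so the nodes need enough allocatable memory to fit the request in the first place.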