What ports need to be open for Elastic AI jobs in Kubernetes?
Assuming the DSS base port is 9000, I guess I need to allow incoming connections to ports 9000-9010 from the EKS CIDR.
But when I run the test under DSS > Administration > Settings > Containerized execution > TEST, I see that it also tries to connect to port 33249:
[2023-08-16 12:54:19,130] [1/MainThread] [INFO] [root] Try to ping backend
[2023-08-16 12:54:19,132] [1/MainThread] [DEBUG] [urllib3.connectionpool] Starting new HTTP connection (1):xxxx.yyy..compute.internal:33249
[2023-08-16 12:56:31,116] [1/MainThread] [ERROR] [root] Could not reach Future Kernel: HTTPConnectionPool(host='i-xxxx.yyyyy.compute.internal', port=33249): Max retries exceeded with url: /future/container-test-ping?testId=Vksv25L (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1bc4542cd0>: Failed to establish a new connection: [Errno 110] Connection timed out'))
Container done
So my question is: is there a complete list of the ports that I need to open?
Operating system used: Amazon Linux 2
Best Answer
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @ecerulm,
As mentioned in https://doc.dataiku.com/dss/latest/containers/setup-k8s.html#docker-and-kubectl-setup, the containers running on the cluster must be able to open TCP connections to the DSS host on any port.
Opening the dynamic/ephemeral port range 1024 to 65535 in both directions between DSS and the Kubernetes pods should be enough. This would cover spark-on-k8s as well.
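To confirm the rule took effect, one option is a quick TCP reachability check run from a pod on the cluster against the DSS host. The snippet below is only a minimal sketch using the Python standard library; the function names `can_connect` and `check_ports` and the host name in the usage comment are illustrative assumptions, not part of DSS.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


def check_ports(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each port in `ports` to whether it was reachable on `host`."""
    return {port: can_connect(host, port, timeout) for port in ports}


# Example usage from inside a pod (hypothetical host name):
# print(check_ports("dss-host.example.internal", [9000, 33249]))
```

Note that a `False` result only means the connection could not be established at that moment: the port may be blocked, or nothing may be listening on it. Ports such as 33249 in the log above appear to be picked dynamically, so a check like this is most useful while the TEST is running or against the fixed base port.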