EC2 Dataiku connection to Azure Synapse pool - Need help
Hey all, I need some help in figuring out where the source of an issue lies with a particular connection to a Synapse SQL pool.
Setup:
Dataiku hosted on AWS EC2, built using Ansible + Terraform
Synapse SQL Pool located on Azure
Everything is (supposed to be) on the same internal network
I'm getting the following Error message:
The TCP/IP connection to the host randomdatabase.database.windows.net, port 1433 has failed. Error: "connect timed out. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
Additional technical details
Error type: com.microsoft.sqlserver.jdbc.SQLServerException
I've verified that the AWS security group is pointing in the right direction, that the host is correct, that the DB credentials are correct, the DB and Schema names are correct, and that the correct JDBC driver is installed per Dataiku recommendations.
We've also confirmed that the randomdatabase DB is accepting connections.
Any thoughts or suggestions to check on the network or firewall side of things? Thanks in advance
Operating system used: Ubuntu
Answers
-
Turribeach
Well, you clearly have a network firewall issue, and there isn't much we can do to help you; you really need to speak with your network team. "Everything is (supposed to be) on the same internal network" => That is not physically possible, since you are in a multi-cloud environment.
You can do a little bit of troubleshooting on your side before you contact your network team. Ping is the number one tool most people use ("ping randomdatabase.database.windows.net"), but its relevance is quickly diminishing in multi-cloud environments, which tend to implement zero trust and block ping (ICMP) packets. Besides, whether ping works or not says nothing about whether you can connect to the remote host on a particular port, so it's best to leave ping for home lab setups. Old-school people would have used telnet to check that they can connect to the remote host on the desired port, but telnet has long been deprecated as an insecure predecessor of SSH, so most systems don't even have it installed these days.
The more modern way to check that you can establish a TCP connection to a remote system on a specific port is to use nmap:
nmap -sT -p 1433 randomdatabase.database.windows.net
You may need to install nmap, but it's available for pretty much all *nix distributions. This command tests whether a TCP connect() to the remote host on the specified port succeeds. The result should be something like this:
PORT     STATE SERVICE
1433/tcp open  ms-sql-s
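If installing nmap on the DSS host is not an option, roughly the same TCP connect check can be sketched with Python's standard library. This is only an illustration: the function name `can_connect` and the 5-second default timeout are my own choices, not anything from Dataiku or nmap.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; the name and the default
    timeout are assumptions, not part of any Dataiku tooling.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False
```

Running `can_connect("randomdatabase.database.windows.net", 1433)` from the EC2 instance should return True once the path is open. A False result only tells you the connection could not be made, not which device is blocking it.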
If nmap doesn't report the port as open, then that's as far as you can go. The connectivity is being blocked by something between you and the remote host, and you need to talk to your network team. When you send the case to them, they will want to know the source IP, destination IP, protocol (TCP) and destination port, all of which you can easily obtain (use "nslookup randomdatabase.database.windows.net" to see the destination IP address). Finally, it might also help your network team if you provide a traceroute:
traceroute randomdatabase.database.windows.net
Again, you may need to install traceroute, but it's also available for pretty much all *nix distributions. This will attempt to trace the traffic and list all the hops it goes through until it's blocked. The results depend heavily on where ping (ICMP) is allowed in your network; where it is blocked, the traceroute output will be pretty much useless.
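To collect the details the network team will ask for (source IP, destination IP, protocol, port), something like the following Python sketch could work. The helper names `resolve_ipv4` and `firewall_request` are made up for this example, and the source IP is something you pass in yourself (for instance the private IP of the EC2 instance).

```python
import socket


def resolve_ipv4(host: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses that host resolves to."""
    infos = socket.getaddrinfo(
        host, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def firewall_request(source_ip: str, host: str, port: int) -> str:
    """Format the connection details for a firewall ticket (hypothetical layout)."""
    dest = ", ".join(resolve_ipv4(host))
    return f"source={source_ip} destination={dest} ({host}) protocol=TCP port={port}"
```

For example, `firewall_request("<your EC2 private IP>", "randomdatabase.database.windows.net", 1433)` produces a one-line summary you can paste into the ticket. Note that Azure hostnames may resolve to different addresses over time, so it's worth including the hostname as well as the IP.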
-
Appreciate the response! I'll certainly give it a go before reaching out to the networking team but I agree with your initial assessment that it's very likely a firewall issue first and foremost that would need to be resolved in some manner.
Thanks!