Integration with SQL Databases (Postgres) fails authentication
 
            Hi Dataiku community,
I'm currently re-doing the https://academy.dataiku.com/integration-with-sql-databases-1 course and I'm trying to setup a Connection to Postgres.
In the past I had completed this tutorial succesfully and have the Postgres installation in my laptop but the PostgresSQL connection I had setup at the time is not working (cannot authenticate user) so I started it from scratch ( I wonder if I need to remove Postgres and reinstall it just in case).
Step "Create and Configure your PostGres Database" step:
I've setup all the required tables in Postgres. Here's the verification from the SQL Shell:
postgres=# \c dku
You are now connected to database "dku" as user "postgres".
dku=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-----------------------------+-----------------------------+-----------------------
dku | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
postgres | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
template0 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
(4 rows)
dku=# \du
List of roles
Role name | Attributes | Member of
------------------+------------------------------------------------------------+-----------
angelo | | {}
dku_tshirt_admin | | {}
matthieu | | {}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
dku=# \dn
List of schemas
Name | Owner
------------+------------------
dku_churn | angelo
dku_tshirt | dku_tshirt_admin
public | postgres
(3 rows)
dku=#
I can also see these in the UI tool of Postgres, PgAdmin.
Concept Summary: Connections to SQL Databases
Check 1
Also, I have checked the instructions under https://doc.dataiku.com/dss/latest/operations/datadir.html under my Data Dir and I can find the jdbc drivers under /home/dataiku/dataiku-dss-10.0.2/lib/ivy/jdbc-postgress and contains the postgresql-42.2.18.jar.
Check 2
I have also checked the instructions from here: https://doc.dataiku.com/dss/latest/connecting/sql/postgresql.html and I had two questions:
- At the top it says "The PostgreSQL driver is pre-installed in DSS. You don’t need any further installation" which is in line with my Check 1 above
- Then it shows how to install it - I assume I don't have to do any of these steps right?
My setup information
Finally, just to provide some information about my machine and setup:
- I'm running DSS in VirtualBox, and have installed DSS 10.0.2
- My host OS is Windows 10, so when I open up the DSS server it's on Windows Chrome browser using the following address: 127.0.0.1:10000
- I run Postgres 14.0 in Windows on the same laptop I use to kick launch the DSS server using Chrome
Any help would be appreciated.
Thank you,
Angelo
Operating system used: Windows 10
Best Answer
- 
             tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,630 Neuron tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,630 NeuronJust a thought. It appears that you are trying to connect two operating systems Linux on a vbox virtual computer running the Dataiku DSS application. And your local windows running PostgreSQL. Is that the case? 
 If so, it is my understanding that the default configuration of PostgreSQL is not to allow “network” traffic. Even thought these operating systems are running in the same computer hardware. Each of the operating systems think that they need to communicate out over their network connection. Even though this is a virtual network all within your computer. The magic of virtualization…So Dss has to talk out over a “network” leaving the Linux OS, and PostgreSQL needs to talk out over a “network” leaving your windows OS. To get this to work for testing it might be easier if you ran PostgreSQL on the Vbox at least to test and learn. That said if there are reasons to run them separately have you looked at the PostgreSQL setting to allow it to talk on a network. here is an old description. 
 https://www.thegeekstuff.com/2014/02/enable-remote-postgresql-connection/there are lots of other posts about this issue. If you have already investigated this set of issues. Please ignore this post. Otherwise, do check this out. 
Answers
- 
             Sergey Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 365 Dataiker Sergey Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 365 DataikerHi @Angelo Can you please clarify two points: - What error message are you getting? A screenshot would be great.
- What do you have in pg_hba.conf file and what user is used for connection in DSS UI?
 
- 
            Hi @sergeyd 
 ,Thank you for coming back to me. 1. See attached error: it's a authentication error for the users setup in https://academy-content.dataiku.com/latest/courses/sql-integration/tech-prerequisites/configure-database.html#create-and-configure-your-postgresql-database 2. The user I'm using to access DSS-UI is "admin" 3. Here's the conf file ( I added the line with "trust" in the end but commented it out as it didn't work) # TYPE DATABASE USER ADDRESS METHOD # "local" is for Unix domain socket connections only 
 local all all scram-sha-256
 #local all all trust
 # IPv4 local connections:
 host all all 127.0.0.1/32 scram-sha-256
 # IPv6 local connections:
 host all all ::1/128 scram-sha-256
 # Allow replication connections from localhost, by a user with the
 # replication privilege.
 local replication all scram-sha-256
 host replication all 127.0.0.1/32 scram-sha-256
 host replication all ::1/128 scram-sha-256
- 
             Sergey Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 365 Dataiker Sergey Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 365 DataikerThanks. Is there a reason why you have used scram-sha-256 for auth? Can you please change it to md5 and restart Postgres? 
- 
            No reason, I don't recall changing any settings previously. I'll update it to md5 anyway and try again. 
- 
            Steps taken: - Updated the config as per your suggestion
- Closed down PSQL terminal
- Closed down DSS and started it again
- Started PSQL terminal and loggged on using the credentials created previously (dku_tshirt_admin and password) and logged on succesuflly
- Created a new connection in DSS using the same credentials
- DSS failed authenticating dku_tshirt_admin as per screenshot.
 
- 
             Sergey Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 365 Dataiker Sergey Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 365 DataikerThanks for the tests. There is no need to restart DSS though. Please note that psql has native auth so testing the connection with psql is not the same as the connection from DSS. Were you able to connect to this Postgres instance with an external tool using JDBC drivers like DBVisualizer for example? DSS doesn't modify or alter JDBC drivers. PostgreSQL one is indeed bundled OOTB so you do not need to install it. 
- 
            @sergeyd 
 just to clarify, I wasn't testing the connection through psql: I used psql to setup the DB (dku), the schemas (dku_tshirt, dku_churn), the users (Matthiew, Angelo, dku_tshirt_admin) and then to assign privileges.The reason I mentioned psql was to show the content of these that they are indeed there and the fact that I've used this user to authenticate, so they seem to be ok. I can see these both in psql shell or the PgAdmin UI of Postgres. 
- 
            Also, see attached screenshot from PgAdmin UI in zip file that follows in next comment. 
- 
            @sergeyd 
 did a quick test with DBVisualiser and seem to be able to connect.- Pinged server and was working
- Screenshot before connection (doesn't show all tabs and DB info/data types)
- Screenshot after connection where all above are visible
- Zoom in to dku_tshirt schema to see owner
 Essentially I can see the same info that I can see via psql and I'm able to authenticate using the user credentials setup for dku_tshirt_admin. See screenshots attached. Any other thoughts? 
- 
            Also: tried another dss instance I had installed in VBox, which was clean:I tried to create a brank new connection using dku/dku_tshirt_admoin credentials and this time I got the same error message in French. Going back to my current dss instance, I tried the connection again and grabbed the logs from the system: [2022/04/12-06:10:00.829] [qtp923341586-15] [INFO] [dku.connections.sql.provider] - [ct: 55] Connecting to jdbc:postgresql://localhost:5432/dku with props: {"user":"dku_tshirt_admin","password":"***"} conn=PostgreSQL_tshirt-S8qMBaD
 [2022/04/12-06:10:00.842] [qtp923341586-15] [DEBUG] [dku.connections.sql.driver] - Driver version 42.2
 [2022/04/12-06:10:00.891] [qtp923341586-15] [INFO] [dku] - SQL D:null
 [2022/04/12-06:10:00.891] [qtp923341586-15] [ERROR] [dku.datasets.sql] - Connection error
 org.postgresql.util.PSQLException: FATAL: password authentication failed for user "dku_tshirt_admin"
 at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:613)
 at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:161)
 at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:213)
 at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
 at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:225)
 at org.postgresql.Driver.makeConnection(Driver.java:465)
 at org.postgresql.Driver.connect(Driver.java:264)
 at com.dataiku.dip.connections.SQLDriverLoader$DriverWrapper.connect(SQLDriverLoader.java:209)
 at com.dataiku.dip.connections.SQLConnectionProvider$GenericSQLConnectionData.problemsAwareConnect(SQLConnectionProvider.java:484)
 at com.dataiku.dip.connections.SQLConnectionProvider$GenericSQLConnectionData.buildConnection(SQLConnectionProvider.java:558)
 at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:917)
 at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:907)
 at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:903)
 at com.dataiku.dip.server.services.ConnectionsTestService.testSQL_NOTRANSACTION(ConnectionsTestService.java:350)
 at com.dataiku.dip.server.services.ConnectionsTestService.testPostgreSQL(ConnectionsTestService.java:383)
 at com.dataiku.dip.server.controllers.ConnectionsController.adminTestPostgreSQL(ConnectionsController.java:519)
 at com.dataiku.dip.server.controllers.ConnectionsController$$FastClassBySpringCGLIB$$149f32e4.invoke(<generated>)
 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:701)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
 at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
 at com.dataiku.dip.server.controllers.CallTracingAspect.doCall(CallTracingAspect.java:78)
 at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
 at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
 at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
 at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
 at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:633)
 at com.dataiku.dip.server.controllers.ConnectionsController$$EnhancerBySpringCGLIB$$dacecbcf.adminTestPostgreSQL(<generated>)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
 at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
 at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
 at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:743)
 at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:672)
 at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:82)
 at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:933)
 at com.dataiku.dip.server.controllers.DKUDispatcherServlet.doDispatch(DKUDispatcherServlet.java:50)
 at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:867)
 at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:951)
 at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:853)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:827)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
 at org.eclipse.jetty.servlet.ServletHolder$NotAsyncServlet.service(ServletHolder.java:1411)
 at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:763)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1651)
 at com.dataiku.dip.shaker.server.ResourceFilter.doFilter(ResourceFilter.java:33)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1630)
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:567)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1377)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:507)
 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1292)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
 at org.eclipse.jetty.server.Server.handle(Server.java:501)
 at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
 at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556)
 at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
 at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
 at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
 at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
 at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
 at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
 at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
 at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
 at java.lang.Thread.run(Thread.java:748)
- 
            Thanks @tgb417 
 makes sense. I tried a few things and it worked!Here's how: - I had to find out the IP of the Virtual Box adapter which you can do using the "ipconfig" command in windows
- I also had to allow in the VB configuration a Network adapter connection of that type.
 This was suggested by this post's step 1 in StackOverflow (https://stackoverflow.com/questions/18121666/connect-to-postgresql-database-in-linux-virtualbox-from-win7). The difference being was that for Step 2 my host was not in the virtual box rather than Windows (in this article they're trying to access PostgreSQL in VBox from Windows, while I was trying practically the opposite: to authenticate from VBox/Dataiku to PostgreSQL in Windows). So for the 2nd step in the post to edit the pg_hba.conf file I had to do that in my windows conf. What else I tried: In VBox, I connected to the PostgreSQL setup in Dataiku, the local one i.e. in native Linux environment. There I explored the same files (pg_hba.conf and postgres.conf) and actually tried to test the connection from psql terminal in VBox (linux) and the local database called "dataiku". I was able to connect to the native postgres connection called "dataiku" (which you can also see in the Dataiku UI as when you go to Connections and tested it and connected OK). That "dataiku" postgres connection in the UI was always testing Ok and was setup using "localhost" server i.e. from Vbox linux, DSS server was using "localhost" to access the local "dataiku" postgres db so connection was working. So this is where it's clicked for me: I had to find the IP of my Postgres Server (in windows) and put it there. And this is how it's worked. What else I learned: Dataiku uses the default "dataiku" postgres database setup as the fallback for all the failed connections. e.g. when Initially I used the Sync Recipe or the Prepare recipe and you select your output dataset ("Store into" section) even if you select a PostgreSQL connection that is saved but is not authenticating (as I did initially for my Windows PostgreSQL), Dataiku will by default route it to the fallback "dataiku" postgres database setup in the backend. So when I used psql to connect to the "dataiku" PostgreSQL in VBox, I could see all the tables created in the DSS UI. You're right. Natively in Linux, this would have been so easy, but was great learning after all. 
