Discover this year's submissions to the Dataiku Frontrunner Awards and give kudos to your favorite use cases and success stories!READ MORE

Integration with SQL Databases (Postgres) fails authentication

Solved!
Angelo
Level 3
Integration with SQL Databases (Postgres) fails authentication

Hi Dataiku community,

I'm currently re-doing the https://academy.dataiku.com/integration-with-sql-databases-1 course and I'm trying to setup a Connection to Postgres.

In the past I had completed this tutorial succesfully and have the Postgres installation in my laptop but the PostgresSQL connection I had setup at the time is not working (cannot authenticate user) so I started it from scratch ( I wonder if I need to remove Postgres and reinstall it just in case).

Step "Create and Configure your PostGres Database" step:

I've setup all the required tables in Postgres. Here's the verification from the SQL Shell:

postgres=# \c dku
You are now connected to database "dku" as user "postgres".
dku=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-----------------------------+-----------------------------+-----------------------
dku | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
postgres | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
template0 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
(4 rows)


dku=# \du
List of roles
Role name | Attributes | Member of
------------------+------------------------------------------------------------+-----------
angelo | | {}
dku_tshirt_admin | | {}
matthieu | | {}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}


dku=# \dn
List of schemas
Name | Owner
------------+------------------
dku_churn | angelo
dku_tshirt | dku_tshirt_admin
public | postgres
(3 rows)

dku=#

I can also see these in the UI tool of Postgres, PgAdmin.

 

Concept Summary: Connections to SQL Databases

Check 1

Also, I have checked the instructions under https://doc.dataiku.com/dss/latest/operations/datadir.html under my Data Dir and I can find the jdbc drivers under  /home/dataiku/dataiku-dss-10.0.2/lib/ivy/jdbc-postgress and contains the postgresql-42.2.18.jar.

Check 2

I have also checked the instructions from here: https://doc.dataiku.com/dss/latest/connecting/sql/postgresql.html and I had two questions:

  1. At the top it says "The PostgreSQL driver is pre-installed in DSS. You don’t need any further installation" which is in line with my Check 1 above
  2. Then it shows how to install it - I assume I don't have to do any of these steps right?

 

My setup information

Finally, just to provide some information about my machine and setup:

  • I'm running DSS in VirtualBox, and have installed DSS 10.0.2
  • My host OS is Windows 10, so when I open up the DSS server it's on Windows Chrome browser using the following address: 127.0.0.1:10000
  • I run Postgres 14.0 in Windows on the same laptop I use to kick launch the DSS server using Chrome

 

Any help would be appreciated.

 

Thank you,

Angelo


Operating system used: Windows 10

0 Kudos
1 Solution
tgb417
Neuron
Neuron

Just a thought.

It appears that you are trying to connect two operating systems Linux on a vbox virtual computer running the Dataiku DSS application.  And your local windows running PostgreSQL. Is that the case?  

If so, it is my understanding that the default configuration of PostgreSQL is not to allow “network” traffic.  Even thought these operating systems are running in the same computer hardware.  Each of the operating systems think that they need to communicate out over their network connection.  Even though this is a virtual network all within your computer.  The magic of virtualization…

So Dss has to talk out over a “network” leaving the Linux OS,  and PostgreSQL needs to talk out over a “network” leaving your windows OS.  

To get this to work for testing it might be easier if you ran PostgreSQL on the Vbox at least to test and learn.  

That said if there are reasons to run them separately have you looked at the PostgreSQL setting to allow it to talk on a network.  

here is an old description. 
https://www.thegeekstuff.com/2014/02/enable-remote-postgresql-connection/

there are lots of other posts about this issue.

If you have already investigated this set of issues.  Please ignore this post.  Otherwise, do check this out.  

--Tom

View solution in original post

13 Replies
sergeyd
Dataiker
Dataiker

Hi @Angelo 

Can you please clarify two points:

  • What error message are you getting? A screenshot would be great. 
  • What do you have in pg_hba.conf file and what user is used for connection in DSS UI? 
Angelo
Level 3
Author

Hi @sergeyd,

Thank you for coming back to me. 

1. See attached error: it's a authentication error for the users setup in https://academy-content.dataiku.com/latest/courses/sql-integration/tech-prerequisites/configure-data... 

2. The user I'm using to access DSS-UI is "admin"

3. Here's the conf file ( I added the line with "trust" in the end but commented it out as it didn't work)

# TYPE DATABASE USER ADDRESS METHOD

# "local" is for Unix domain socket connections only
local all all scram-sha-256
#local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 scram-sha-256
# IPv6 local connections:
host all all ::1/128 scram-sha-256
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all scram-sha-256
host replication all 127.0.0.1/32 scram-sha-256
host replication all ::1/128 scram-sha-256

0 Kudos
sergeyd
Dataiker
Dataiker

Thanks. Is there a reason why you have used scram-sha-256 for auth?

Can you please change it to md5 and restart Postgres? 

Angelo
Level 3
Author

No reason, I don't recall changing any settings previously. I'll update it to md5 anyway and try again.  

0 Kudos
Angelo
Level 3
Author

Steps taken:

  1. Updated the config as per your suggestion
  2. Closed down PSQL terminal
  3. Closed down DSS and started it again
  4. Started PSQL terminal and loggged on using the credentials created previously (dku_tshirt_admin and password) and logged on succesuflly
  5. Created a new connection in DSS using the same credentials
  6. DSS failed authenticating dku_tshirt_admin as per screenshot.
0 Kudos
sergeyd
Dataiker
Dataiker

Thanks for the tests. There is no need to restart DSS though. 

Please note that psql has native auth so testing the connection with psql is not the same as the connection from DSS. Were you able to connect to this Postgres instance with an external tool using JDBC drivers like DBVisualizer for example?

DSS doesn't modify or alter JDBC drivers. PostgreSQL one is indeed bundled OOTB so you do not need to install it.  

Angelo
Level 3
Author

@sergeyd just to clarify, I wasn't testing the connection through psql: I used psql to setup the DB (dku), the schemas (dku_tshirt, dku_churn), the users (Matthiew, Angelo, dku_tshirt_admin) and then to assign privileges.

The reason I mentioned psql was to show the content of these that they are indeed there and the fact that I've used this user to authenticate, so they seem to be ok. I can see these both in psql shell or the PgAdmin UI of Postgres. 

0 Kudos
Angelo
Level 3
Author

Also, see attached screenshot from PgAdmin UI in zip file that follows in next comment.

0 Kudos
Angelo
Level 3
Author

@sergeyd did a quick test with DBVisualiser and seem to be able to connect. 

  • Pinged server and was working
  • Screenshot before connection (doesn't show all tabs and DB info/data types)
  • Screenshot after connection where all above are visible
  • Zoom in to dku_tshirt schema to see owner

Essentially I can see the same info that I can see via psql and I'm able to authenticate using the user credentials setup for dku_tshirt_admin.

See screenshots attached.

Any other thoughts? 

0 Kudos
Angelo
Level 3
Author

Also: tried another dss instance I had installed in VBox,  which was clean:I tried to create a brank new connection using dku/dku_tshirt_admoin credentials and this time I got the same error message in French.

Going back to my current dss instance, I tried the connection again and grabbed the logs from the system:

[2022/04/12-06:10:00.829] [qtp923341586-15] [INFO] [dku.connections.sql.provider] - [ct: 55] Connecting to jdbc:postgresql://localhost:5432/dku with props: {"user":"dku_tshirt_admin","password":"***"} conn=PostgreSQL_tshirt-S8qMBaD
[2022/04/12-06:10:00.842] [qtp923341586-15] [DEBUG] [dku.connections.sql.driver] - Driver version 42.2
[2022/04/12-06:10:00.891] [qtp923341586-15] [INFO] [dku] - SQL D:null
[2022/04/12-06:10:00.891] [qtp923341586-15] [ERROR] [dku.datasets.sql] - Connection error
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "dku_tshirt_admin"
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:613)
at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:161)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:213)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:225)
at org.postgresql.Driver.makeConnection(Driver.java:465)
at org.postgresql.Driver.connect(Driver.java:264)
at com.dataiku.dip.connections.SQLDriverLoader$DriverWrapper.connect(SQLDriverLoader.java:209)
at com.dataiku.dip.connections.SQLConnectionProvider$GenericSQLConnectionData.problemsAwareConnect(SQLConnectionProvider.java:484)
at com.dataiku.dip.connections.SQLConnectionProvider$GenericSQLConnectionData.buildConnection(SQLConnectionProvider.java:558)
at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:917)
at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:907)
at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:903)
at com.dataiku.dip.server.services.ConnectionsTestService.testSQL_NOTRANSACTION(ConnectionsTestService.java:350)
at com.dataiku.dip.server.services.ConnectionsTestService.testPostgreSQL(ConnectionsTestService.java:383)
at com.dataiku.dip.server.controllers.ConnectionsController.adminTestPostgreSQL(ConnectionsController.java:519)
at com.dataiku.dip.server.controllers.ConnectionsController$$FastClassBySpringCGLIB$$149f32e4.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:701)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
at com.dataiku.dip.server.controllers.CallTracingAspect.doCall(CallTracingAspect.java:78)
at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:633)
at com.dataiku.dip.server.controllers.ConnectionsController$$EnhancerBySpringCGLIB$$dacecbcf.adminTestPostgreSQL(<generated>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:743)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:672)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:82)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:933)
at com.dataiku.dip.server.controllers.DKUDispatcherServlet.doDispatch(DKUDispatcherServlet.java:50)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:867)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:951)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:853)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:827)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder$NotAsyncServlet.service(ServletHolder.java:1411)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:763)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1651)
at com.dataiku.dip.shaker.server.ResourceFilter.doFilter(ResourceFilter.java:33)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1630)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:567)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1377)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:507)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1292)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:501)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
at java.lang.Thread.run(Thread.java:748)

 

0 Kudos
tgb417
Neuron
Neuron

Just a thought.

It appears that you are trying to connect two operating systems Linux on a vbox virtual computer running the Dataiku DSS application.  And your local windows running PostgreSQL. Is that the case?  

If so, it is my understanding that the default configuration of PostgreSQL is not to allow “network” traffic.  Even thought these operating systems are running in the same computer hardware.  Each of the operating systems think that they need to communicate out over their network connection.  Even though this is a virtual network all within your computer.  The magic of virtualization…

So Dss has to talk out over a “network” leaving the Linux OS,  and PostgreSQL needs to talk out over a “network” leaving your windows OS.  

To get this to work for testing it might be easier if you ran PostgreSQL on the Vbox at least to test and learn.  

That said if there are reasons to run them separately have you looked at the PostgreSQL setting to allow it to talk on a network.  

here is an old description. 
https://www.thegeekstuff.com/2014/02/enable-remote-postgresql-connection/

there are lots of other posts about this issue.

If you have already investigated this set of issues.  Please ignore this post.  Otherwise, do check this out.  

--Tom
Angelo
Level 3
Author

Thanks @tgb417 makes sense. I tried a few things and it worked!

Here's how:

  • I had to find out the IP of the Virtual Box adapter which you can do using the "ipconfig" command in windows
  • I also had to allow in the VB configuration a Network adapter connection of that type. 

This was suggested by this post's step 1 in StackOverflow (https://stackoverflow.com/questions/18121666/connect-to-postgresql-database-in-linux-virtualbox-from...

The difference being was that for Step 2 my host was not in the virtual box rather than Windows (in this article they're trying to access PostgreSQL in VBox from Windows, while I was trying practically the opposite: to authenticate from VBox/Dataiku to PostgreSQL in Windows). So for the 2nd step in the post to edit the pg_hba.conf file I had to do that in my windows conf.

What else I tried:

In VBox, I connected to the PostgreSQL setup in Dataiku, the local one i.e. in native Linux environment. There I explored the same files (pg_hba.conf and postgres.conf) and actually tried to test the connection from psql terminal in VBox (linux) and the local database called "dataiku".

I was able to connect to the native postgres connection called "dataiku" (which you can also see in the Dataiku UI as when you go to Connections and tested it and connected OK). That "dataiku" postgres connection in the UI was always testing Ok and was setup using "localhost" server i.e. from Vbox linux, DSS server was using "localhost" to access the local "dataiku" postgres db so connection was working. 

So this is where it's clicked for me: I had to find the IP of my Postgres Server (in windows) and put it there. And this is how it's worked. 

What else I learned:

Dataiku uses the default "dataiku" postgres database setup as the fallback for all the failed connections. e.g. when Initially I used the Sync Recipe or the Prepare recipe and you select your output dataset ("Store into" section) even if you select a PostgreSQL connection that is saved but is not authenticating (as I did initially for my Windows PostgreSQL), Dataiku will by default route it to the fallback "dataiku" postgres database setup in the backend. So when I used psql to connect to the "dataiku" PostgreSQL in VBox, I could see all the tables created in the DSS UI.

You're right. Natively in Linux, this would have been so easy, but was great learning after all.

CoreyS
Community Manager
Community Manager

Thank you for sharing your outcome @Angelo

Looking for more resources to help you use Dataiku effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base

A reply answered your question? Mark as ‘Accepted Solution’ to help others like you!

Labels

?
Labels (2)
A banner prompting to get Dataiku