Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Added on April 11, 2022 5:23AM
Likes: 0
Replies: 13
Hi Dataiku community,
I'm currently re-doing the https://academy.dataiku.com/integration-with-sql-databases-1 course and I'm trying to setup a Connection to Postgres.
In the past I had completed this tutorial succesfully and have the Postgres installation in my laptop but the PostgresSQL connection I had setup at the time is not working (cannot authenticate user) so I started it from scratch ( I wonder if I need to remove Postgres and reinstall it just in case).
I've setup all the required tables in Postgres. Here's the verification from the SQL Shell:
postgres=# \c dku
You are now connected to database "dku" as user "postgres".
dku=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-----------------------------+-----------------------------+-----------------------
dku | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
postgres | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
template0 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
(4 rows)
dku=# \du
List of roles
Role name | Attributes | Member of
------------------+------------------------------------------------------------+-----------
angelo | | {}
dku_tshirt_admin | | {}
matthieu | | {}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
dku=# \dn
List of schemas
Name | Owner
------------+------------------
dku_churn | angelo
dku_tshirt | dku_tshirt_admin
public | postgres
(3 rows)
dku=#
I can also see these in the UI tool of Postgres, PgAdmin.
Also, I have checked the instructions under https://doc.dataiku.com/dss/latest/operations/datadir.html under my Data Dir and I can find the jdbc drivers under /home/dataiku/dataiku-dss-10.0.2/lib/ivy/jdbc-postgress and contains the postgresql-42.2.18.jar.
I have also checked the instructions from here: https://doc.dataiku.com/dss/latest/connecting/sql/postgresql.html and I had two questions:
Finally, just to provide some information about my machine and setup:
Any help would be appreciated.
Thank you,
Angelo
Operating system used: Windows 10
Just a thought.
It appears that you are trying to connect two operating systems Linux on a vbox virtual computer running the Dataiku DSS application. And your local windows running PostgreSQL. Is that the case?
If so, it is my understanding that the default configuration of PostgreSQL is not to allow “network” traffic. Even thought these operating systems are running in the same computer hardware. Each of the operating systems think that they need to communicate out over their network connection. Even though this is a virtual network all within your computer. The magic of virtualization…
So Dss has to talk out over a “network” leaving the Linux OS, and PostgreSQL needs to talk out over a “network” leaving your windows OS.
To get this to work for testing it might be easier if you ran PostgreSQL on the Vbox at least to test and learn.
That said if there are reasons to run them separately have you looked at the PostgreSQL setting to allow it to talk on a network.
here is an old description.
https://www.thegeekstuff.com/2014/02/enable-remote-postgresql-connection/
there are lots of other posts about this issue.
If you have already investigated this set of issues. Please ignore this post. Otherwise, do check this out.
Hi @Angelo
Can you please clarify two points:
Hi @sergeyd
,
Thank you for coming back to me.
1. See attached error: it's a authentication error for the users setup in https://academy-content.dataiku.com/latest/courses/sql-integration/tech-prerequisites/configure-database.html#create-and-configure-your-postgresql-database
2. The user I'm using to access DSS-UI is "admin"
3. Here's the conf file ( I added the line with "trust" in the end but commented it out as it didn't work)
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all scram-sha-256
#local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 scram-sha-256
# IPv6 local connections:
host all all ::1/128 scram-sha-256
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all scram-sha-256
host replication all 127.0.0.1/32 scram-sha-256
host replication all ::1/128 scram-sha-256
Thanks. Is there a reason why you have used scram-sha-256 for auth?
Can you please change it to md5 and restart Postgres?
No reason, I don't recall changing any settings previously. I'll update it to md5 anyway and try again.
Steps taken:
Thanks for the tests. There is no need to restart DSS though.
Please note that psql has native auth so testing the connection with psql is not the same as the connection from DSS. Were you able to connect to this Postgres instance with an external tool using JDBC drivers like DBVisualizer for example?
DSS doesn't modify or alter JDBC drivers. PostgreSQL one is indeed bundled OOTB so you do not need to install it.
@sergeyd
just to clarify, I wasn't testing the connection through psql: I used psql to setup the DB (dku), the schemas (dku_tshirt, dku_churn), the users (Matthiew, Angelo, dku_tshirt_admin) and then to assign privileges.
The reason I mentioned psql was to show the content of these that they are indeed there and the fact that I've used this user to authenticate, so they seem to be ok. I can see these both in psql shell or the PgAdmin UI of Postgres.
Also, see attached screenshot from PgAdmin UI in zip file that follows in next comment.
@sergeyd
did a quick test with DBVisualiser and seem to be able to connect.
Essentially I can see the same info that I can see via psql and I'm able to authenticate using the user credentials setup for dku_tshirt_admin.
See screenshots attached.
Any other thoughts?
Also: tried another dss instance I had installed in VBox, which was clean:I tried to create a brank new connection using dku/dku_tshirt_admoin credentials and this time I got the same error message in French.
Going back to my current dss instance, I tried the connection again and grabbed the logs from the system:
[2022/04/12-06:10:00.829] [qtp923341586-15] [INFO] [dku.connections.sql.provider] - [ct: 55] Connecting to jdbc:postgresql://localhost:5432/dku with props: {"user":"dku_tshirt_admin","password":"***"} conn=PostgreSQL_tshirt-S8qMBaD
[2022/04/12-06:10:00.842] [qtp923341586-15] [DEBUG] [dku.connections.sql.driver] - Driver version 42.2
[2022/04/12-06:10:00.891] [qtp923341586-15] [INFO] [dku] - SQL D:null
[2022/04/12-06:10:00.891] [qtp923341586-15] [ERROR] [dku.datasets.sql] - Connection error
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "dku_tshirt_admin"
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:613)
at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:161)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:213)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:225)
at org.postgresql.Driver.makeConnection(Driver.java:465)
at org.postgresql.Driver.connect(Driver.java:264)
at com.dataiku.dip.connections.SQLDriverLoader$DriverWrapper.connect(SQLDriverLoader.java:209)
at com.dataiku.dip.connections.SQLConnectionProvider$GenericSQLConnectionData.problemsAwareConnect(SQLConnectionProvider.java:484)
at com.dataiku.dip.connections.SQLConnectionProvider$GenericSQLConnectionData.buildConnection(SQLConnectionProvider.java:558)
at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:917)
at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:907)
at com.dataiku.dip.connections.SQLConnectionProvider.newConnection(SQLConnectionProvider.java:903)
at com.dataiku.dip.server.services.ConnectionsTestService.testSQL_NOTRANSACTION(ConnectionsTestService.java:350)
at com.dataiku.dip.server.services.ConnectionsTestService.testPostgreSQL(ConnectionsTestService.java:383)
at com.dataiku.dip.server.controllers.ConnectionsController.adminTestPostgreSQL(ConnectionsController.java:519)
at com.dataiku.dip.server.controllers.ConnectionsController$$FastClassBySpringCGLIB$$149f32e4.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:701)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
at com.dataiku.dip.server.controllers.CallTracingAspect.doCall(CallTracingAspect.java:78)
at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:633)
at com.dataiku.dip.server.controllers.ConnectionsController$$EnhancerBySpringCGLIB$$dacecbcf.adminTestPostgreSQL(<generated>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:743)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:672)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:82)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:933)
at com.dataiku.dip.server.controllers.DKUDispatcherServlet.doDispatch(DKUDispatcherServlet.java:50)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:867)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:951)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:853)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:827)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder$NotAsyncServlet.service(ServletHolder.java:1411)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:763)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1651)
at com.dataiku.dip.shaker.server.ResourceFilter.doFilter(ResourceFilter.java:33)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1630)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:567)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1377)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:507)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1292)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:501)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
at java.lang.Thread.run(Thread.java:748)
Thanks @tgb417
makes sense. I tried a few things and it worked!
Here's how:
This was suggested by this post's step 1 in StackOverflow (https://stackoverflow.com/questions/18121666/connect-to-postgresql-database-in-linux-virtualbox-from-win7).
The difference being was that for Step 2 my host was not in the virtual box rather than Windows (in this article they're trying to access PostgreSQL in VBox from Windows, while I was trying practically the opposite: to authenticate from VBox/Dataiku to PostgreSQL in Windows). So for the 2nd step in the post to edit the pg_hba.conf file I had to do that in my windows conf.
What else I tried:
In VBox, I connected to the PostgreSQL setup in Dataiku, the local one i.e. in native Linux environment. There I explored the same files (pg_hba.conf and postgres.conf) and actually tried to test the connection from psql terminal in VBox (linux) and the local database called "dataiku".
I was able to connect to the native postgres connection called "dataiku" (which you can also see in the Dataiku UI as when you go to Connections and tested it and connected OK). That "dataiku" postgres connection in the UI was always testing Ok and was setup using "localhost" server i.e. from Vbox linux, DSS server was using "localhost" to access the local "dataiku" postgres db so connection was working.
So this is where it's clicked for me: I had to find the IP of my Postgres Server (in windows) and put it there. And this is how it's worked.
What else I learned:
Dataiku uses the default "dataiku" postgres database setup as the fallback for all the failed connections. e.g. when Initially I used the Sync Recipe or the Prepare recipe and you select your output dataset ("Store into" section) even if you select a PostgreSQL connection that is saved but is not authenticating (as I did initially for my Windows PostgreSQL), Dataiku will by default route it to the fallback "dataiku" postgres database setup in the backend. So when I used psql to connect to the "dataiku" PostgreSQL in VBox, I could see all the tables created in the DSS UI.
You're right. Natively in Linux, this would have been so easy, but was great learning after all.