Error when joining datasets within Vertica (SQL Engine)

Solved!
romcha
Level 1
Error when joining datasets within Vertica (SQL Engine)

Hello,

I am not sure about where the issue comes from, but I experience some trouble using DSS 8.0 and Vertica CE (10.0.1), using corresponding JDBC driver (on Centos 8).

I have successfully imported datasets toward my Vertica DB using DSS, but when it comes to join two of them, as soon as the process gets started it kills the Vertica (single) node. More precisely, the Vertica node raise a SIGILL then the DSS job is aborted with connectivity error.

To be honest I guess it is a Vertica issue.

But it appears that when I copy paste the SQL query from the "show query" UI of DSS and run it as is with another client, let say dbeaver, it works like a charm.

My very first alternative is to work with a R recipe within DSS that does exactly the same job... and it works (but not using the SQL engine).

So I checked the logs of Vertica. One difference spotted is when using the Visual recipe to join the datasets, Vertica logs a "EXPLAIN SELECT [...]" query, where in case of a R recipe Vertica logs only "SELECT [...]" classical queries.

I am not familiar at all with Vertica, but do you feel there is a way to parametrise the Vertica connection to not use EXPLAIN SELECT ?

Thank you !

2020-10-26 11:17:57.393 Init Session:0x7fcb63fff700-a0000000003b05 [Txn] <INFO> Begin Txn: a0000000003b05 'EXPLAIN SELECT [...long but functional query...]'
2020-10-26 11:17:57.398 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <INFO> Handling signal: 4
2020-10-26 11:17:58.009 MetadataPoolMonitor:0x7fcb617fa700 [ResourceManager] <INFO> Update metadata resource pool memory with delta:  Memory(KB): 6
2020-10-26 11:17:58.009 MetadataPoolMonitor:0x7fcb617fa700 <LOG> @v_midev_node0001: 00000/7794: Updated metadata pool:  Memory(KB): 27753

2020-10-26 11:18:02.742 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <ALL> Core dumped to /my-path-to-catalog/v_midev_node0001_catalog
2020-10-26 11:18:04.738 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <PANIC> Received fatal signal SIGILL.
2020-10-26 11:18:04.738 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <PANIC> Info: si_code: 2, si_pid: 53194993, si_uid: 0, si_addr: 0x32bb0f1
0 Kudos
1 Solution
Andrey
Dataiker Alumni

Hi Romain,

After having a look at the diag you've sent I couldn't find a reason why 

dku.recipes.visual.alwaysComputeSQLExecutionPlan parameter wasn't applied.
 
I tried to reproduce it locally with 
- Vertica 10 : 
docker run -p 5433:5433 bryanherger/vertica:10.0.0-0_centos-7
 - JDBC driver:
https://www.vertica.com/client_drivers/10.0.x/10.0.1-0/vertica-jdbc-10.0.1-0.jar

 

But it seemed to work in both cases: with execution plan enabled and disabled. 

 

If you'd like I could propose a troubleshooting session over Zoom whenever you're available to be able to find the problem faster.

 

In the meantime you could:

- Try running the join recipe from the flow by clicking on the recipe and then on "Run" in the right panel. In that case DSS doesn't try to fetch the execution plan at all.

- Try to setup another Vertica 10 instance (possibly using a docker from a snippet above) and see if the issue persists

 

 

 

 

 

Andrey Avtomonov
R&D Engineer @ Dataiku

View solution in original post

0 Kudos
8 Replies
Andrey
Dataiker Alumni

Hello,

Could you try disabling execution plan computation and rerunning the recipe?

To do that you'd need to add 

dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

to dip.properties file located in DATADIR/config 

Don't forget to restart DSS for the change to be taken into account

Andrey Avtomonov
R&D Engineer @ Dataiku
0 Kudos
romcha
Level 1
Author

Hi,

Thanks for the quick reply. I filled the file as recommended 

~$ cat /opt/dss/8.0.1/config/dip.properties 
# Internal DSS properties
logging.limit.s3.ignoredPath=100
logging.limit.s3.ignoredFile=100
logging.limit.filePartitioner.noMatch=100
dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

then restarted the DSS (8.0.1) instance, but it still uses the "Explain select" thing. The execution plan does not look as deactivated to me. Did I miss something?

[2020/10/26-17:24:11.630] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Compute execution plan
[2020/10/26-17:24:11.631] [FRT-34-FlowRunnable] [WARN] [dku.queries.executionplan] - Computing execution plan inside a transaction
[2020/10/26-17:24:11.646] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Execution plan query: 
EXPLAIN SELECT "[...]"
[2020/10/26-17:24:16.485] [FRT-34-FlowRunnable] [WARN] [dku.flow.sql] - Could not print execution plan
java.sql.SQLRecoverableException: [Vertica][VJDBC](100024) IOException while communicating with server: java.io.EOFException.[...]

 Is there any way to check if the properties is well taken into account (in some log or UI?).

Thanks!

R.

0 Kudos
romcha
Level 1
Author

Hi, 

-- appears my previous reply was not uploaded so here is another go --

Thanks for the quick reply!

I edited the file as follows

$ cat /opt/dss/8.0.1/config/dip.properties 
# Internal DSS properties
logging.limit.s3.ignoredPath=100
logging.limit.s3.ignoredFile=100
logging.limit.filePartitioner.noMatch=100
dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

then restarted the DSS instance  (8.0.1), and rebuild the dataset. 

Unfortunately I still have the issue of killing the vertica node. However I am not sure that the param is used since from the (dss) job log I still have the execution plan and "Explain SQL [...]"... more precisely:

[2020/10/26-17:24:11.602] [FRT-34-FlowRunnable] [INFO] [dku.dataset.sql] - Statement done
[2020/10/26-17:24:11.603] [FRT-34-FlowRunnable] [DEBUG] [dku.connections.sql.provider] - Commit conn=VERTICACON-PtsQPBV
[2020/10/26-17:24:11.630] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Compute execution plan
[2020/10/26-17:24:11.631] [FRT-34-FlowRunnable] [WARN] [dku.queries.executionplan] - Computing execution plan inside a transaction
[2020/10/26-17:24:11.646] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Execution plan query: 
EXPLAIN SELECT [...]
[2020/10/26-17:24:16.485] [FRT-34-FlowRunnable] [WARN] [dku.flow.sql] - Could not print execution plan
java.sql.SQLRecoverableException: [Vertica][VJDBC](100024) IOException while communicating with server: java.io.EOFException.
	at com.vertica.io.ProtocolStream.logAndConvertToNetworkException(Unknown Source)

 Can I check elsewhere if the param is taken into account at reboot? (like a log or UI).
Thanks!

R

0 Kudos
romcha
Level 1
Author

Hi Andrey,

Thanks for the quick reply!

I did modify the properties file as follows

$> cat /opt/dss/8.0.1/config/dip.properties 
# Internal DSS properties
logging.limit.s3.ignoredPath=100
logging.limit.s3.ignoredFile=100
logging.limit.filePartitioner.noMatch=100
dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

and then restarted the instance. However the execution plan still seems to be run.

Screenshot 2020-10-27 at 15.21.17.png

 And the fact that there is stil an EXPLAIN SELECT might cause the issue... 

Thanks 
R

0 Kudos
Andrey
Dataiker Alumni

Romain, 

To collect more information could you please send a diagnostic of the DSS instance (Administration > Maintenance > Diagnostic tool) and send us the resulting file? Note that you need to be administrator of the DSS instance - else you'll need to ask your admin.
 
You can send it to andrey.avtomonov@dataiku.com If the resulting file is too large for mail (> 15 MB), you can use https://dl.dataiku.com to send it to us. Please don't forget to send the link that is generated when you upload the file.
 
Andrey Avtomonov
R&D Engineer @ Dataiku
0 Kudos
Andrey
Dataiker Alumni

Hi Romain,

After having a look at the diag you've sent I couldn't find a reason why 

dku.recipes.visual.alwaysComputeSQLExecutionPlan parameter wasn't applied.
 
I tried to reproduce it locally with 
- Vertica 10 : 
docker run -p 5433:5433 bryanherger/vertica:10.0.0-0_centos-7
 - JDBC driver:
https://www.vertica.com/client_drivers/10.0.x/10.0.1-0/vertica-jdbc-10.0.1-0.jar

 

But it seemed to work in both cases: with execution plan enabled and disabled. 

 

If you'd like I could propose a troubleshooting session over Zoom whenever you're available to be able to find the problem faster.

 

In the meantime you could:

- Try running the join recipe from the flow by clicking on the recipe and then on "Run" in the right panel. In that case DSS doesn't try to fetch the execution plan at all.

- Try to setup another Vertica 10 instance (possibly using a docker from a snippet above) and see if the issue persists

 

 

 

 

 

Andrey Avtomonov
R&D Engineer @ Dataiku
0 Kudos
romcha
Level 1
Author

Hi Andrey,

I tried using another fresh install of Vertica (using the docker container specified in your reply) and unfortunately the trouble still persist.

If possible, yes, a Zoom call might be the best option to investigate. 

I mp you to schedule the meeting.

Thank you!!

0 Kudos
romcha
Level 1
Author

Hi,

After calling Andrey, we managed to confirm the issue is more related to "the hardware I am using" +  "Vertica 10.0" than from DSS, as far as we managed to use my DSS instance with another Vertica instance for specifically the same job.

Thanks for the support of Andrey!

R.

0 Kudos