Error when joining datasets within Vertica (SQL Engine)

romcha
romcha Registered Posts: 6 ✭✭✭
edited July 16 in Using Dataiku

Hello,

I am not sure about where the issue comes from, but I experience some trouble using DSS 8.0 and Vertica CE (10.0.1), using corresponding JDBC driver (on Centos 8).

I have successfully imported datasets toward my Vertica DB using DSS, but when it comes to join two of them, as soon as the process gets started it kills the Vertica (single) node. More precisely, the Vertica node raise a SIGILL then the DSS job is aborted with connectivity error.

To be honest I guess it is a Vertica issue.

But it appears that when I copy paste the SQL query from the "show query" UI of DSS and run it as is with another client, let say dbeaver, it works like a charm.

My very first alternative is to work with a R recipe within DSS that does exactly the same job... and it works (but not using the SQL engine).

So I checked the logs of Vertica. One difference spotted is when using the Visual recipe to join the datasets, Vertica logs a "EXPLAIN SELECT [...]" query, where in case of a R recipe Vertica logs only "SELECT [...]" classical queries.

I am not familiar at all with Vertica, but do you feel there is a way to parametrise the Vertica connection to not use EXPLAIN SELECT ?

Thank you !

2020-10-26 11:17:57.393 Init Session:0x7fcb63fff700-a0000000003b05 [Txn] <INFO> Begin Txn: a0000000003b05 'EXPLAIN SELECT [...long but functional query...]'
2020-10-26 11:17:57.398 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <INFO> Handling signal: 4
2020-10-26 11:17:58.009 MetadataPoolMonitor:0x7fcb617fa700 [ResourceManager] <INFO> Update metadata resource pool memory with delta:  Memory(KB): 6
2020-10-26 11:17:58.009 MetadataPoolMonitor:0x7fcb617fa700 <LOG> @v_midev_node0001: 00000/7794: Updated metadata pool:  Memory(KB): 27753

2020-10-26 11:18:02.742 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <ALL> Core dumped to /my-path-to-catalog/v_midev_node0001_catalog
2020-10-26 11:18:04.738 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <PANIC> Received fatal signal SIGILL.
2020-10-26 11:18:04.738 Init Session:0x7fcb63fff700-a0000000003b05 [Main] <PANIC> Info: si_code: 2, si_pid: 53194993, si_uid: 0, si_addr: 0x32bb0f1

Best Answer

  • Andrey
    Andrey Dataiker Alumni Posts: 119 ✭✭✭✭✭✭✭
    edited July 17 Answer ✓

    Hi Romain,

    After having a look at the diag you've sent I couldn't find a reason why

    dku.recipes.visual.alwaysComputeSQLExecutionPlan parameter wasn't applied.
    I tried to reproduce it locally with
    - Vertica 10 :
    docker run -p 5433:5433 bryanherger/vertica:10.0.0-0_centos-7
    
    - JDBC driver:
    https://www.vertica.com/client_drivers/10.0.x/10.0.1-0/vertica-jdbc-10.0.1-0.jar

    But it seemed to work in both cases: with execution plan enabled and disabled.

    If you'd like I could propose a troubleshooting session over Zoom whenever you're available to be able to find the problem faster.

    In the meantime you could:

    - Try running the join recipe from the flow by clicking on the recipe and then on "Run" in the right panel. In that case DSS doesn't try to fetch the execution plan at all.

    - Try to setup another Vertica 10 instance (possibly using a docker from a snippet above) and see if the issue persists

Answers

  • Andrey
    Andrey Dataiker Alumni Posts: 119 ✭✭✭✭✭✭✭
    edited July 17

    Hello,

    Could you try disabling execution plan computation and rerunning the recipe?

    To do that you'd need to add

    dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

    to dip.properties file located in DATADIR/config

    Don't forget to restart DSS for the change to be taken into account

  • romcha
    romcha Registered Posts: 6 ✭✭✭
    edited July 17

    Hi,

    Thanks for the quick reply. I filled the file as recommended

    ~$ cat /opt/dss/8.0.1/config/dip.properties 
    # Internal DSS properties
    logging.limit.s3.ignoredPath=100
    logging.limit.s3.ignoredFile=100
    logging.limit.filePartitioner.noMatch=100
    dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

    then restarted the DSS (8.0.1) instance, but it still uses the "Explain select" thing. The execution plan does not look as deactivated to me. Did I miss something?

    [2020/10/26-17:24:11.630] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Compute execution plan
    [2020/10/26-17:24:11.631] [FRT-34-FlowRunnable] [WARN] [dku.queries.executionplan] - Computing execution plan inside a transaction
    [2020/10/26-17:24:11.646] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Execution plan query: 
    EXPLAIN SELECT "[...]"
    [2020/10/26-17:24:16.485] [FRT-34-FlowRunnable] [WARN] [dku.flow.sql] - Could not print execution plan
    java.sql.SQLRecoverableException: [Vertica][VJDBC](100024) IOException while communicating with server: java.io.EOFException.[...]

    Is there any way to check if the properties is well taken into account (in some log or UI?).

    Thanks!

    R.

  • romcha
    romcha Registered Posts: 6 ✭✭✭
    edited July 17

    Hi,

    -- appears my previous reply was not uploaded so here is another go --

    Thanks for the quick reply!

    I edited the file as follows

    $ cat /opt/dss/8.0.1/config/dip.properties 
    # Internal DSS properties
    logging.limit.s3.ignoredPath=100
    logging.limit.s3.ignoredFile=100
    logging.limit.filePartitioner.noMatch=100
    dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

    then restarted the DSS instance (8.0.1), and rebuild the dataset.

    Unfortunately I still have the issue of killing the vertica node. However I am not sure that the param is used since from the (dss) job log I still have the execution plan and "Explain SQL [...]"... more precisely:

    [2020/10/26-17:24:11.602] [FRT-34-FlowRunnable] [INFO] [dku.dataset.sql] - Statement done
    [2020/10/26-17:24:11.603] [FRT-34-FlowRunnable] [DEBUG] [dku.connections.sql.provider] - Commit conn=VERTICACON-PtsQPBV
    [2020/10/26-17:24:11.630] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Compute execution plan
    [2020/10/26-17:24:11.631] [FRT-34-FlowRunnable] [WARN] [dku.queries.executionplan] - Computing execution plan inside a transaction
    [2020/10/26-17:24:11.646] [FRT-34-FlowRunnable] [INFO] [dku.queries.executionplan] - Execution plan query: 
    EXPLAIN SELECT [...]
    [2020/10/26-17:24:16.485] [FRT-34-FlowRunnable] [WARN] [dku.flow.sql] - Could not print execution plan
    java.sql.SQLRecoverableException: [Vertica][VJDBC](100024) IOException while communicating with server: java.io.EOFException.
        at com.vertica.io.ProtocolStream.logAndConvertToNetworkException(Unknown Source)

    Can I check elsewhere if the param is taken into account at reboot? (like a log or UI).
    Thanks!

    R

  • romcha
    romcha Registered Posts: 6 ✭✭✭
    edited July 17

    Hi Andrey,

    Thanks for the quick reply!

    I did modify the properties file as follows

    $> cat /opt/dss/8.0.1/config/dip.properties 
    # Internal DSS properties
    logging.limit.s3.ignoredPath=100
    logging.limit.s3.ignoredFile=100
    logging.limit.filePartitioner.noMatch=100
    dku.recipes.visual.alwaysComputeSQLExecutionPlan=false

    and then restarted the instance. However the execution plan still seems to be run.

    Screenshot 2020-10-27 at 15.21.17.png

    And the fact that there is stil an EXPLAIN SELECT might cause the issue...

    Thanks
    R

  • Andrey
    Andrey Dataiker Alumni Posts: 119 ✭✭✭✭✭✭✭

    Romain,

    To collect more information could you please send a diagnostic of the DSS instance (Administration > Maintenance > Diagnostic tool) and send us the resulting file? Note that you need to be administrator of the DSS instance - else you'll need to ask your admin.
    You can send it to andrey.avtomonov@dataiku.com If the resulting file is too large for mail (> 15 MB), you can use https://dl.dataiku.com to send it to us. Please don't forget to send the link that is generated when you upload the file.
  • romcha
    romcha Registered Posts: 6 ✭✭✭

    Hi Andrey,

    I tried using another fresh install of Vertica (using the docker container specified in your reply) and unfortunately the trouble still persist.

    If possible, yes, a Zoom call might be the best option to investigate.

    I mp you to schedule the meeting.

    Thank you!!

  • romcha
    romcha Registered Posts: 6 ✭✭✭

    Hi,

    After calling Andrey, we managed to confirm the issue is more related to "the hardware I am using" + "Vertica 10.0" than from DSS, as far as we managed to use my DSS instance with another Vertica instance for specifically the same job.

    Thanks for the support of Andrey!

    R.

Setup Info
    Tags
      Help me…