SQL queries using AWS Athena - why an error when adding comments?

Tanguy
Tanguy Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2023 Posts: 112 Neuron

There seems to be a bug when using SQL notebooks or SQL probes with AWS Athena: a script fails when it has comments at both the beginning and the end. This happens quite often, for example when a comment explains the script at the top and a WHERE clause is commented out at the end of the query.

For example, those two queries work fine in an SQL notebook:

  1. comment at the beginning of a script (1.jpg)
  2. comment at the end of a script (2.jpg)

but this query returns an error:

  1. comment at both the beginning and the end of a script

3.jpg
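
Since the screenshots are not reproduced here, the three cases can be sketched as below. This is only an illustration: the queries, `my_table` and `col` are hypothetical, and the `comment_placement` helper is an assumption that merely classifies where `--` comments sit. It does not contact Athena or reproduce the error.

```python
# Hypothetical queries: `my_table` and `col` are made-up names, not from the thread.
LEADING = "-- explain the script\nSELECT col FROM my_table"
TRAILING = "SELECT col FROM my_table\n-- WHERE col > 0"
BOTH = "-- explain the script\nSELECT col FROM my_table\n-- WHERE col > 0"


def comment_placement(sql: str) -> str:
    """Report whether a script starts and/or ends with a `--` line comment."""
    # Ignore blank lines so leading/trailing whitespace does not hide a comment.
    lines = [ln.strip() for ln in sql.splitlines() if ln.strip()]
    if not lines:
        return "none"
    starts = lines[0].startswith("--")
    ends = lines[-1].startswith("--")
    if starts and ends and len(lines) > 1:
        return "both"  # the pattern reported as failing on Athena
    if starts:
        return "leading"
    if ends:
        return "trailing"
    return "none"


for name, query in [("LEADING", LEADING), ("TRAILING", TRAILING), ("BOTH", BOTH)]:
    print(name, "->", comment_placement(query))
```

Only the `both` case corresponds to the error described in this post.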

The same problem occurs in SQL probes when defining new metrics (where comments are added by default at both the beginning and the end of the SQL probe!).

4.jpg

Any ideas on how to fix this problem?


Operating system used: Windows 10

Answers

  • ismayiltahirov
    ismayiltahirov Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 4 Dataiker

    Hi, it should work with multi-line comments, even several of them, if using multi-line comments is fine for you.
    Screen Shot 2022-11-07 at 9.22.12 PM.png

  • Tanguy

    Thank you for your answer, @ismayiltahirov.

    Actually, we have abandoned multi-line comments, because a multi-line comment containing an odd number of quotes can prevent a Spark SQL recipe from running (and, working as a French team, we unfortunately use quotes often).

    Thus, Dataiku's support team suggested that we use '--' single-line comments.

    All in all, we encounter problems with both types of comments...

  • ismayiltahirov

    Hello, can you check whether switching the notebook's statement execution mode to regular statement solves the issue?

    regst.png

  • Tanguy

    This did indeed work for SQL notebooks. However, this setting seems to be unavailable for SQL probes:

    comment_SQL_probe.png

  • ismayiltahirov

    Hi,

    Since SQL probes don't use Spark, I am curious whether using multi-line comments would be OK for you. It should work even if you add an empty one at the end of the script, as in the screenshot below.

    probe.png

  • Tanguy

    Yes, multiline comments work outside of Spark (in our case when using AWS Athena as the SQL engine).

    However, do you think it would be possible for Dataiku to provide a uniform comment style that works regardless of the engine and of the comment's content (e.g. an odd number of quotes inside the comment)?

  • Tanguy

    Just wanted to mention a new bug encountered in a Spark SQL recipe: the column I tried to add on line #35 is not added to the output table, even though Dataiku explicitly informs me that it was added (it updates the schema of the output table when I validate the recipe). After the build, the column is missing as long as those comments are in the recipe; I had to remove the comments to add the column successfully.

    comment_Spark_SQL.png
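
The multi-line comment workaround suggested earlier in the thread for SQL probes on Athena can be sketched as a small conversion step. This is a minimal sketch under stated assumptions, not a Dataiku feature: `line_comments_to_block` is a hypothetical helper, it only rewrites lines that consist entirely of a `--` comment, and it does not parse SQL (a `--` line inside a multi-line string literal or an existing block comment would be rewritten too). It is not meant for Spark SQL recipes, where block comments containing an odd number of quotes are reported above to fail.

```python
def line_comments_to_block(sql: str) -> str:
    """Rewrite whole-line `--` comments as `/* ... */` block comments.

    Inline comments that follow code on the same line are left untouched.
    """
    out = []
    for line in sql.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("--"):
            indent = line[: len(line) - len(stripped)]
            # Break up any `*/` in the comment text so it cannot close the block early.
            text = stripped[2:].strip().replace("*/", "* /")
            out.append(f"{indent}/* {text} */")
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical probe query; `my_table` and `col` are made-up names.
probe = "-- explain the probe\nSELECT COUNT(*) FROM my_table\n-- WHERE col > 0"
print(line_comments_to_block(probe))
```

The converted text would still have to be pasted into the probe by hand, since the thread does not describe any hook for preprocessing probe queries.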
