Update SQL table

Options
davidmakovoz
davidmakovoz Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Neuron 2023 Posts: 67 Neuron

I created a recipe that has the input and output datasets in Redshift.

After I ran the recipe and populated the output dataset, I modified it externally, outside of Dataiku by dropping some rows.

And the modification broke something real bad.

First, when I queried the dataset from a SQL notebook I would see the dropped rows.

Then, when I tried to rebuild the dataset, the job stalled. The last line in the log file was:

Drop [table_name]

Now, I can't do anything with this table, can't view it, can't rebuild it.

What did I do wrong? Why can't I modify the table created by Dataiku?

Thank you,

Tagged:

Best Answer

  • ATsao
    ATsao Dataiker Alumni, Registered Posts: 139 ✭✭✭✭✭✭✭✭
    Answer ✓
    Options

    Hi David,

    You might need to add a Commit after your drop statement when dropping the table outside of DSS to ensure that the table is actually being dropped properly.

    Thanks,

    Andrew

Answers

  • ATsao
    ATsao Dataiker Alumni, Registered Posts: 139 ✭✭✭✭✭✭✭✭
    Options

    Hi David,

    Can you clarify exactly how you modified this dataset externally and how you went about dropping some rows from this table? When you run a recipe in overwrite mode (default setting), DSS will attempt to drop the target table/dataset before recreating it. Can you check if you are able to drop this table directly? If so, can you try rebuilding this dataset again afterwards?

    Thanks,

    Andrew

  • davidmakovoz
    davidmakovoz Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Neuron 2023 Posts: 67 Neuron
    edited July 17
    Options

    Hi Andrew.

    query = """
    delete from [schema_name].[table_name] where [column_name] = value;
    """
    pandas.io.sql.execute(query, cnxn)

    Yes I dropped the table directly. But even after than Dataiku was showing the table as present until I dropped in Dataiku.

    I.e. I executed

    drop table [schema_name].[table_name] 

    outside of DSS and then in the SQL notebook in DSS, and only then the table disappeared from the view in DSS

  • davidmakovoz
    davidmakovoz Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Neuron 2023 Posts: 67 Neuron
    Options

    Hi Andrew,

    Thank you. I'm using pyodbc library and it turned out that by default the autocommit property is set to False, I changed it True and everything works.

Setup Info
    Tags
      Help me…