Update SQL table
I created a recipe that has the input and output datasets in Redshift.
After I ran the recipe and populated the output dataset, I modified the output table externally, outside of Dataiku, by deleting some rows.
And that modification broke things badly.
First, when I queried the dataset from a SQL notebook, the deleted rows still showed up.
Then, when I tried to rebuild the dataset, the job stalled. The last line in the log file was:
Drop [table_name]
Now, I can't do anything with this table, can't view it, can't rebuild it.
What did I do wrong? Why can't I modify the table created by Dataiku?
Thank you,
Best Answer
-
Hi David,
You might need to add a COMMIT after your DROP (or DELETE) statement when modifying the table outside of DSS, to ensure that the change is actually committed to the database.
Thanks,
Andrew
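To illustrate why the missing commit matters, here is a runnable sketch using the stdlib sqlite3 module as a stand-in for the Redshift connection (the table name `t` and the data are made up for the demo). The rule is the same with pyodbc, whose connections also default to autocommit off: until the writing session commits, other sessions, including DSS, still see the old rows.

```python
# Demonstration of why an uncommitted DELETE is invisible to other sessions.
# sqlite3 (stdlib) stands in for the Redshift connection here; the same
# commit rule applies with pyodbc, which defaults to autocommit=False.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path)
writer.execute("create table t (id integer)")
writer.executemany("insert into t values (?)", [(i,) for i in range(5)])
writer.commit()

reader = sqlite3.connect(path)   # a second session, like a DSS notebook

writer.execute("delete from t where id < 3")   # DELETE, not yet committed
cur = reader.execute("select count(*) from t")
n_before = cur.fetchall()[0][0]                # other session still sees all 5 rows
cur.close()

writer.commit()                                # now the delete is durable
cur = reader.execute("select count(*) from t")
n_after = cur.fetchall()[0][0]                 # other session now sees 2 rows
cur.close()
```

Until the commit, the second connection reads the unmodified data, which matches the behavior David describes below.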
Answers
-
Hi David,
Can you clarify exactly how you modified this dataset externally and how you went about dropping some rows from this table? When you run a recipe in overwrite mode (default setting), DSS will attempt to drop the target table/dataset before recreating it. Can you check if you are able to drop this table directly? If so, can you try rebuilding this dataset again afterwards?
Thanks,
Andrew
-
davidmakovoz
Hi Andrew.
query = """delete from [schema_name].[table_name] where [column_name] = value;"""
pandas.io.sql.execute(query, cnxn)
Yes, I dropped the table directly. But even after that, Dataiku was still showing the table as present until I also dropped it in Dataiku.
I.e. I executed
drop table [schema_name].[table_name]
outside of DSS and then again in a SQL notebook inside DSS, and only then did the table disappear from the view in DSS.
-
davidmakovoz
Hi Andrew,
Thank you. I'm using the pyodbc library, and it turned out that by default its autocommit property is set to False. I changed it to True and everything works.