Added on October 14, 2020 9:53PM
I created a recipe that has the input and output datasets in Redshift.
After I ran the recipe and populated the output dataset, I modified the output externally, outside of Dataiku, by dropping some rows.
And the modification broke something badly.
First, when I queried the dataset from a SQL notebook, I would still see the dropped rows.
Then, when I tried to rebuild the dataset, the job stalled. The last line in the log file was:
Drop [table_name]
Now, I can't do anything with this table, can't view it, can't rebuild it.
What did I do wrong? Why can't I modify the table created by Dataiku?
Thank you,
David
Hi David,
You might need to issue a COMMIT after your DROP statement when dropping the table outside of DSS, to ensure that the table is actually dropped.
Thanks,
Andrew
Hi David,
Can you clarify exactly how you modified this dataset externally and how you went about dropping some rows from this table? When you run a recipe in overwrite mode (default setting), DSS will attempt to drop the target table/dataset before recreating it. Can you check if you are able to drop this table directly? If so, can you try rebuilding this dataset again afterwards?
Thanks,
Andrew
Hi Andrew.
query = """
delete from [schema_name].[table_name]
where [column_name] = value;
"""
pandas.io.sql.execute(query, cnxn)
Yes, I dropped the table directly. But even after that, Dataiku was still showing the table as present until I dropped it in Dataiku.
That is, I executed
drop table [schema_name].[table_name]
outside of DSS, then again in a SQL notebook inside DSS, and only then did the table disappear from the view in DSS.
Hi Andrew,
Thank you. I'm using the pyodbc library, and it turned out that by default the autocommit property is set to False. I changed it to True and everything works.