I created a recipe that has the input and output datasets in Redshift.
After I ran the recipe and populated the output dataset, I modified the output externally, outside of Dataiku, by deleting some rows.
That modification broke things badly.
First, when I queried the dataset from a SQL notebook, I would still see the deleted rows.
Then, when I tried to rebuild the dataset, the job stalled. The last line in the log file was:
Now I can't do anything with this table: I can't view it and I can't rebuild it.
What did I do wrong? Why can't I modify the table created by Dataiku?
Can you clarify exactly how you modified this dataset externally, and how you went about dropping some rows from this table? When you run a recipe in overwrite mode (the default setting), DSS will attempt to drop the target table/dataset before recreating it. Can you check whether you are able to drop this table directly? If so, can you try rebuilding the dataset again afterwards?
query = """
    delete from [schema_name].[table_name]
    where [column_name] = value;
"""
pandas.io.sql.execute(query, cnxn)
Yes, I dropped the table directly. But even after that, Dataiku was still showing the table as present, until I also dropped it from within Dataiku.
That is, I executed
drop table [schema_name].[table_name]
first outside of DSS and then again in a SQL notebook inside DSS, and only then did the table disappear from the view in DSS.
You might need to add a commit after your drop statement when dropping the table outside of DSS, to ensure that the table is actually dropped. By default, DB-API connections like the one in your snippet do not autocommit, so your delete and drop statements only become visible to other connections, including the ones DSS uses, once you commit the transaction.
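The commit behavior can be sketched with a small, self-contained example. The snippet below uses SQLite from the Python standard library as a stand-in for Redshift (purely illustrative; the file name demo.db and the helper count_rows are made up), since both expose the same DB-API transaction semantics: changes made on one connection stay invisible to other connections until they are committed.

```python
import os
import sqlite3
import tempfile

def count_rows(conn):
    """Count rows in t, fully consuming the cursor so no read lock lingers."""
    cur = conn.execute("select count(*) from t")
    n = cur.fetchall()[0][0]
    cur.close()
    return n

# A file-backed database so two independent connections can share it.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path)  # your external script
reader = sqlite3.connect(path)  # stands in for DSS / the SQL notebook

writer.execute("create table t (id integer)")
writer.execute("insert into t values (1), (2), (3)")
writer.commit()

# Delete a row but do NOT commit: the change sits in an open transaction.
writer.execute("delete from t where id = 2")
before = count_rows(reader)   # the other connection still sees all 3 rows

# Only after an explicit commit does the deletion become visible elsewhere.
writer.commit()
after = count_rows(reader)    # now 2 rows

print(before, after)
```

With the Redshift connection from the earlier snippet, the equivalent fix is simply calling cnxn.commit() after pandas.io.sql.execute(query, cnxn), and likewise after the drop statement.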