Update SQL table

davidmakovoz

I created a recipe that has the input and output datasets in Redshift. 

After I ran the recipe and populated the output dataset, I modified it externally, outside of Dataiku by dropping some rows. 

And the modification broke things badly.

First, when I queried the dataset from a SQL notebook, I could still see the rows I had dropped.

Then, when I tried to rebuild the dataset, the job stalled. The last line in the log file was:

Drop [table_name]

Now I can't do anything with this table: I can't view it and I can't rebuild it.

What did I do wrong? Why can't I modify the table created by Dataiku?

Thank you,

ATsao
Dataiker

Hi David, 

Can you clarify exactly how you modified this dataset externally and how you went about dropping some rows from this table? When you run a recipe in overwrite mode (default setting), DSS will attempt to drop the target table/dataset before recreating it. Can you check if you are able to drop this table directly? If so, can you try rebuilding this dataset again afterwards? 

Thanks,

Andrew

davidmakovoz
Author

Hi Andrew,

Here is how I dropped the rows:

import pandas

# cnxn is a pyodbc connection to the Redshift cluster
query = """
delete from [schema_name].[table_name] where [column_name] = value;
"""
pandas.io.sql.execute(query, cnxn)

 

Yes, I dropped the table directly. But even after that, Dataiku was still showing the table as present until I also dropped it from within Dataiku.

That is, I executed

drop table [schema_name].[table_name]

outside of DSS and then again in a SQL notebook inside DSS, and only then did the table disappear from the view in DSS.

ATsao
Dataiker

Hi David,

You might need to add a Commit after your drop statement when dropping the table outside of DSS to ensure that the table is actually being dropped properly. 
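To illustrate the visibility problem, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a Redshift connection (the table `t` and its contents are purely illustrative): a second session does not see an uncommitted DELETE until the first session commits.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file stand in for the
# "outside DSS" session and the DSS session.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE t (id INTEGER)")
writer.execute("INSERT INTO t VALUES (1), (2), (3)")
writer.commit()

reader = sqlite3.connect(path)

# Delete a row in the writer session but do NOT commit yet.
writer.execute("DELETE FROM t WHERE id = 1")

# The reader session still sees all three rows: the uncommitted
# change is invisible to other sessions.
print(reader.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 3

writer.commit()

# Only after the commit does the reader see the deletion.
print(reader.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2
```

The same applies to a DROP TABLE: until the session that issued it commits, other sessions (including DSS) may still see the table, and DSS's own DROP during a rebuild can then block waiting on the open transaction's lock.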

Thanks,

Andrew

davidmakovoz
Author

Hi Andrew,

Thank you. I'm using the pyodbc library, and it turned out that by default its autocommit property is set to False. I changed it to True and everything works.
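For reference, here is a minimal sketch of what autocommit changes, again using sqlite3 as a stand-in for the Redshift connection: passing isolation_level=None is sqlite3's autocommit mode, analogous to passing autocommit=True to pyodbc.connect or setting cnxn.autocommit = True on an existing pyodbc connection.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts sqlite3 in autocommit mode, so every
# statement is committed as soon as it runs; no explicit commit()
# is needed for other sessions to see the change.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE t (id INTEGER)")
writer.execute("INSERT INTO t VALUES (1), (2), (3)")
writer.execute("DELETE FROM t WHERE id = 1")  # visible immediately

# A separate session sees the deletion right away.
reader = sqlite3.connect(path)
print(reader.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2
```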