Dataiku scenario used to build a table every day
Need Help Urgently
I am using Dataiku with Snowflake. I have flows whose source tables refresh daily, so I am running a scenario with build mode "Force-rebuild dataset and dependencies", and I have found that every day the table gets dropped and a new table gets created, which is not what I want.
We want the table's rows to be overwritten automatically without dropping and recreating the table.
Because the table is recreated each time, I am not able to use Snowflake's Time Travel feature.
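For example, a Time Travel query like the one below (the table name is just an example) stops being useful when the table is dropped and recreated, because the history belongs to the old table object:

-- Illustrative only: read the table as it was 30 minutes ago (OFFSET is in seconds)
SELECT *
FROM my_table AT(OFFSET => -60*30);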
Please let me know if there is any way to do this.
Thanks in advance.
Best Answer
-
Alexandru (Dataiker)
Hi,
This option ("truncate to clear data", mentioned below) is available at the connection level under Administration - Connections - your SQL connection name.
If you don't want this behavior for all other datasets on this connection, simply ask your admin to create a duplicate connection with this option enabled and use that connection here.
Answers
-
Miguel Angel (Dataiker)
Hi,
It depends on what 'overwrite rows' means in this context. I am guessing you want to do updates.
You can do updates using a SQL code recipe: https://doc.dataiku.com/dss/latest/code_recipes/sql.html
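For example, an update in a SQL recipe could look something like this (table and column names are placeholders, not from your flow):

-- Illustrative sketch: update the target table's rows from a source table
UPDATE my_target
SET amount = s.amount,
    updated_at = s.updated_at
FROM my_source s
WHERE my_target.id = s.id;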
-
PK36313 (Partner)
By "overwrite" I mean that every day when the scenario runs, the table's previous data should be truncated rather than the whole table being dropped and a new table being created.
I have checked in Snowflake with the query SHOW TABLES HISTORY IN <schema_name> that every day when the scenario runs, my table gets dropped and a new table gets created. What I need is for the table's data to be refreshed/overwritten each time the scenario runs, not for the whole table to be dropped and recreated.
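For reference, this is the check I ran (the schema name is a placeholder):

-- Lists current and dropped table versions; a new entry appears each time the table is recreated
SHOW TABLES HISTORY IN SCHEMA my_db.my_schema;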
I am using "Force-rebuild dataset and dependencies", and only SQL recipes in the flow.
Is there any way I can avoid the table being dropped? Please let me know.
Thanks in advance.
-
Alexandru (Dataiker)
Hi,
You can use the option "truncate to clear data" instead, at the connection level. This avoids dropping the table, except when the schema of the table changes; in that case it would still be dropped and re-created.
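Roughly speaking, and as an illustration only (these are not the exact statements DSS generates), the rebuild then follows the second pattern below instead of the first, which is what keeps the table object and its Time Travel history intact (table and column names are placeholders):

-- Drop and recreate: the table object is replaced each run
DROP TABLE IF EXISTS my_output;
CREATE TABLE my_output AS
SELECT id, amount FROM my_source;

-- Truncate to clear data: the same table object is kept across runs
TRUNCATE TABLE my_output;
INSERT INTO my_output
SELECT id, amount FROM my_source;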
-
PK36313 (Partner)
Could you please tell me where I can find this option? It would be a great help. I am not seeing it in my target table's advanced options.
-
Marlan (Neuron)
Hi @PK36313,
You could also use a SQL script recipe (rather than a SQL query recipe) to update the output table. In SQL script recipes, the output table is dropped only if you drop it in your script. So you would simply not drop it, and instead insert (or delete and then insert) the desired rows into the output table. We do this all the time.
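A minimal sketch of what such a script could look like (table and column names are placeholders; adapt to your own flow):

-- SQL script recipe: refresh the output table without dropping it
DELETE FROM my_output;   -- or use a WHERE clause to remove only the rows being reloaded

INSERT INTO my_output (id, amount, load_date)
SELECT id, amount, CURRENT_DATE
FROM my_source;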
Marlan