Dataiku Scenario used for build of table every day

Options
PK36313
PK36313 Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 10 Partner

Need Help Urgently

I am using Dataiku with snowflake , I have flows in which source tables refresh daily , So I using scenario with Build mode : Force Rebuild dataset and Dependency and I have found every day table get dropped and new table get created which was not supposed to be done .
We want on table rows get automatically overwrite without dropping the table and recreating it .

Beacuse of recreation of table each time I am not able to use time travel feature of snowflake.

Please let me know is there any way ?

Thanks in advance.

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Answer ✓
    Options

    Hi,

    This option is available at the connection level under Administrations - Connections - you SQL Connection Name

    If you don't want this behavior for all other datasets on this connection, then simply ask your admin to create a duplicated connection with this enabled to use here.

    Screenshot 2022-08-10 at 12.27.32.png

Answers

  • Miguel Angel
    Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker
    Options

    Hi,

    Depends on what 'overwrite rows' means in this context. I am guessing you want to do updates.

    You can do updates using a SQL code recipe: https://doc.dataiku.com/dss/latest/code_recipes/sql.html

  • PK36313
    PK36313 Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 10 Partner
    Options

    By Overwrite I mean , Every day when Scenario run Table previous data should get truncated instead of dropping the whole table and creating the new table .

    I have check in snowflake by this query show tables history in <schema_name> , that every day when the scenario runs my table get dropped and new table get created .

    But My need is everyday when I run the Scenario the table data get refreshed /overwrite , Not like whole table get dropped and then new date get created .

    I am using Force -rebuild datasets and Dependency , And only SQL Receipes in the flow .

    Is there any way with which I can avoid Drop of table ,Please let me know

    Thanks in advance

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Options

    Hi,

    You can use the option "truncate to clear data" instead at the connection level. This would avoid dropping the table with the exception of cases where the schema of the table changes in that case it would still be dropped and re-created.

    Screenshot 2022-08-10 at 12.09.52.png

  • PK36313
    PK36313 Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 10 Partner
    Options

    Could you pleas tell me where can I find this option it would be a great help? , In my target table advance option i am not getting it .

  • Marlan
    Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 317 Neuron
    Options

    Hi @PK36313
    ,

    You could also use a SQL Script recipe (rather than a SQL query recipe) to update the output table. In SQL Script recipes, the output table is dropped only if you drop in your script. So you'd simply not drop it and instead insert (or delete and then insert) the desired rows into the output table. We do this all of the time.

    Marlan

Setup Info
    Tags
      Help me…