Best practice for historization SQL Recipe

Registered Posts: 8 ✭✭✭

Hi,

I am currently working on a historization SQL Recipe in which I am trying to insert data from one table to another.

The source table has a different connection that the destination table, but both connections have the same database.

dataiku issue.PNG

My question: If I want to create the historization table, to have the source table in another database(ODS) than the destination (DWH), to insert only the new rows, what is the best practice?

Is a temporary table and use of the same database, then a sync recipe to get data from ODS database to DWH database the best way?

(Topic title edited by moderator to be more descriptive. Original title "SQL Recipe")


Operating system used: Virtual Machine

Welcome!

It looks like you're new here. Sign in or register to get started.

Answers

  • Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,315 Neuron

    It really depends on many factors:

    1. whether your ODS and DWH databases are in the same DB server instance in which case you should a single connnection
    2. how much data you have in each side
    3. whether you have a layer (like Cloud storage) where you can move all your data to consolidate it

  • Registered Posts: 8 ✭✭✭

    Sorry for the confusion!

    I need to create the historization table,I have the source table in another server(ODS) than the destination server (DWH), I need to insert only the new rows.

    I am processing data in ODS from a SQL table, but I am trying to get to DWH, and never delete data in DWH if errors as it is used for reporting in Power BI.


    Source: ODS
    Destination: DWH
    In the attached picture I have the process, so I will now need to have 2 tables in ODS, and will have to use a Prepare recipe to get data from ODS to DWH.



    I would like to know what is the best practice for this process!

  • Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,315 Neuron

    ODS and DWH are abstract concepts which tell me nothing of your environment. Where is your ODS? What databse? What technology? etc

  • Registered Posts: 8 ✭✭✭

    Hi Turribeach,

    Thank you for the assistance!

    ODS represents a database in server 1 in Microsoft SQL Server

    DWH represent a database in server 2 in Microsoft SQL Server

    The Historization is made by SQL Recipe script, which inserts if not exists, and update cases.

    My plan is to use a Prepare Recipe to get ODS data to a temporary table in DWH, with no modifications, then create historization (SQL Recipe script) from the temp table(DWH) to destination DWH(final table, also used for reporting)

    Notes:

    - DWH destination should not have data removed

    - ODS table will have new rows to insert, received every day by an excel


    Let me know if more information should be provided!


  • Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,315 Neuron

    Why can't you not use Linked Servers in SQL Server and do all the ETL in SQL Server? See this post:

    https://stackoverflow.com/questions/1144051/selecting-data-from-two-different-servers-in-sql-server

    Doesn't seem like using Dataiku for this ETL is the best idea. Maybe if your DWH was on a Cloud DWH engine like GCP BigQuery or Snowflake it might make sense to run it via Dataiku.

Welcome!

It looks like you're new here. Sign in or register to get started.

Welcome!

It looks like you're new here. Sign in or register to get started.