Best practice for historization SQL Recipe

Kyest00
Kyest00 Registered Posts: 8 ✭✭✭

Hi,

I am currently working on a historization SQL Recipe in which I am trying to insert data from one table to another.

The source table has a different connection that the destination table, but both connections have the same database.

dataiku issue.PNG

My question: If I want to create the historization table, to have the source table in another database(ODS) than the destination (DWH), to insert only the new rows, what is the best practice?

Is a temporary table and use of the same database, then a sync recipe to get data from ODS database to DWH database the best way?

(Topic title edited by moderator to be more descriptive. Original title "SQL Recipe")


Operating system used: Virtual Machine

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,090 Neuron

    It really depends on many factors:

    1. whether your ODS and DWH databases are in the same DB server instance in which case you should a single connnection
    2. how much data you have in each side
    3. whether you have a layer (like Cloud storage) where you can move all your data to consolidate it

  • Kyest00
    Kyest00 Registered Posts: 8 ✭✭✭

    Sorry for the confusion!

    I need to create the historization table,I have the source table in another server(ODS) than the destination server (DWH), I need to insert only the new rows.

    I am processing data in ODS from a SQL table, but I am trying to get to DWH, and never delete data in DWH if errors as it is used for reporting in Power BI.


    Source: ODS
    Destination: DWH
    In the attached picture I have the process, so I will now need to have 2 tables in ODS, and will have to use a Prepare recipe to get data from ODS to DWH.



    I would like to know what is the best practice for this process!

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,090 Neuron

    ODS and DWH are abstract concepts which tell me nothing of your environment. Where is your ODS? What databse? What technology? etc

  • Kyest00
    Kyest00 Registered Posts: 8 ✭✭✭

    Hi Turribeach,

    Thank you for the assistance!

    ODS represents a database in server 1 in Microsoft SQL Server

    DWH represent a database in server 2 in Microsoft SQL Server

    The Historization is made by SQL Recipe script, which inserts if not exists, and update cases.

    My plan is to use a Prepare Recipe to get ODS data to a temporary table in DWH, with no modifications, then create historization (SQL Recipe script) from the temp table(DWH) to destination DWH(final table, also used for reporting)

    Notes:

    - DWH destination should not have data removed

    - ODS table will have new rows to insert, received every day by an excel


    Let me know if more information should be provided!


  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,090 Neuron

    Why can't you not use Linked Servers in SQL Server and do all the ETL in SQL Server? See this post:

    https://stackoverflow.com/questions/1144051/selecting-data-from-two-different-servers-in-sql-server

    Doesn't seem like using Dataiku for this ETL is the best idea. Maybe if your DWH was on a Cloud DWH engine like GCP BigQuery or Snowflake it might make sense to run it via Dataiku.

Setup Info
    Tags
      Help me…