You now have until September 15th to submit your use case or success story to the 2022 Dataiku Frontrunner Awards!ENTER YOUR SUBMISSION

Best practice for historization SQL Recipe

Kyest00
Level 2
Best practice for historization SQL Recipe

Hi,

I am currently working on a historization SQL Recipe in which I am trying to insert data from one table to another.

The source table has a different connection that the destination table, but both connections have the same database.

dataiku issue.PNG

My question: If I want to create the historization table, to have the source table in another database(ODS) than the destination (DWH), to insert only the new rows, what is the best practice?

Is a temporary table and use of the same database, then a sync recipe to get data from ODS database to DWH database the best way?

 

(Topic title edited by moderator to be more descriptive. Original title "SQL Recipe")


Operating system used: Virtual Machine

0 Kudos
5 Replies
Turribeach
Level 5

It really depends on many factors:

  1. whether your ODS and DWH databases are in the same DB server instance in which case you should a single connnection
  2. how much data you have in each side 
  3. whether you have a layer (like Cloud storage) where you can move all your data to consolidate it 

 

 

0 Kudos
Kyest00
Level 2
Author

Sorry for the confusion!

 I need to create the historization table,I have the source table in another server(ODS) than the destination server (DWH), I need to insert only the new rows.

I am processing data in ODS from a SQL table, but I am trying to get to DWH, and never delete data in DWH if errors as it is used for reporting in Power BI.


Source: ODS 
Destination: DWH 
In the attached picture I have the process, so I will now need to have 2 tables in ODS, and will have to use a Prepare recipe to get data from ODS to DWH.



I would like to know what is the best practice for this process! 

0 Kudos
Turribeach
Level 5

ODS and DWH are abstract concepts which tell me nothing of your environment. Where is your ODS? What databse? What technology? etc

0 Kudos
Kyest00
Level 2
Author

Hi Turribeach,

 

Thank you for the assistance!

ODS represents a database in server 1 in Microsoft SQL Server

DWH represent a database in server 2 in Microsoft SQL Server

 

The Historization is made by SQL Recipe script, which inserts if not exists, and update cases.

My plan is to use a Prepare Recipe to get ODS data to a temporary table in DWH, with no modifications, then create  historization (SQL Recipe script) from the temp table(DWH) to destination DWH(final table, also used for reporting)

Notes:

- DWH destination should not have data removed

- ODS table will have new rows to insert, received every day by an excel


Let me know if more information should be provided!


0 Kudos
Turribeach
Level 5

Why can't you not use Linked Servers in SQL Server and do all the ETL in SQL Server? See this post:

https://stackoverflow.com/questions/1144051/selecting-data-from-two-different-servers-in-sql-server

Doesn't seem like using Dataiku for this ETL is the best idea. Maybe if your DWH was on a Cloud DWH engine like GCP BigQuery or Snowflake it might make sense to run it via Dataiku. 

0 Kudos

Labels

?
Labels (2)
A banner prompting to get Dataiku