Optimizing data upload to Teradata
cronos003
Can someone provide a high-level overview of how DSS uploads data to Teradata? Does it just submit batches of inserts, or does it use Teradata utilities such as FastLoad or MultiLoad?
I need to move ~70M rows (about 8 CHAR columns of ~2-20 width and a couple of INT columns) from SQL Server to Teradata, where the bulk of my data resides, so that I can continue processing in-database. The upload to Teradata is rather slow, though: for my dataset it took approximately 7 hours.
I'm looking for any suggestions on optimizing the process. Thanks!
Answers
Hi,
When syncing a dataset from SQL Server to Teradata, DSS submits a series of batch inserts. As of the latest release today, it does not use the Teradata-specific utilities you mention (FastLoad or MultiLoad).
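To make "a series of batch inserts" concrete, here is a minimal sketch of the general pattern using Python's DB-API. This is not DSS's actual implementation: the helper names `batched` and `batch_insert`, the table `target`, and the batch size of 1000 are all hypothetical, and an in-memory SQLite database stands in for Teradata so the snippet runs anywhere.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def batch_insert(conn, insert_sql, rows, batch_size=1000):
    """Insert rows in batches through a DB-API connection; return the row count."""
    cur = conn.cursor()
    total = 0
    for chunk in batched(rows, batch_size):
        # One executemany call per batch instead of one statement per row.
        cur.executemany(insert_sql, chunk)
        conn.commit()  # commit per batch to keep each transaction small
        total += len(chunk)
    return total


# Stand-in target database (SQLite in memory, NOT Teradata).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (code CHAR(20), qty INT)")

# Hypothetical source rows; in practice these would stream from SQL Server.
rows = ((f"c{i}", i) for i in range(2500))
inserted = batch_insert(conn, "INSERT INTO target VALUES (?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

With this pattern, throughput depends largely on the batch size and on the per-batch round trip to the database. For scale, the figures in the question (~70M rows in ~7 hours) work out to roughly 2,800 rows per second.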
In the specific case of syncing between Teradata and HDFS, we do use the fast TDCH method: https://doc.dataiku.com/dss/latest/connecting/sql/teradata.html#fast-sync-using-tdch
To check whether any optimization is possible, could you please add a screenshot of the output dataset's "Settings > Advanced" tab to your question?
Best regards,
Alex