Teradata TPT Support

Support for Teradata Parallel Transporter (TPT) would enable rapid insertion, extraction, and update for Teradata databases. In some cases, TPT is hundreds of times faster than connections using the regular Teradata driver. This would make data pipelines that interface with very large Teradata datasets substantially faster, especially when large datasets need to be processed outside the database or moved between databases. For enterprises using Teradata for data warehousing, this would be a killer feature for pipeline performance. Several of my colleagues stick to local data processing on their laptops or on standalone VMs only because they need TPT to move their data within their time requirements.

9 Comments

Just to follow up: over the last two days, I've done about 22 hours of loading into Teradata via Dataiku recipes that would have taken about 10% of that time with TPT. If it's possible to enable this capability, it would be an enormous time saver and allow projects to iterate much faster.

CoreyS
Dataiker Alumni

Thank you for your submission. While we appreciate you taking the time to propose this idea, we wanted to let you know that there has already been a similar idea submitted and logged internally. So we are going to mark this one as a duplicate.

Looking for more resources to help you use Dataiku effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base

A reply answered your question? Mark as ‘Accepted Solution’ to help others like you!
Status changed to: Duplicate

Thanks Corey, I'm glad to hear others are interested in this as well. Is there a place the community can go to track the status of this idea or add upvotes?

Hi @CoreyS ,

Just to follow up on this one: since it's been proposed internally but hasn't been implemented yet, could we change the status to In Backlog so we can track community support for prioritization? This would be a big accelerator for my team since we rely heavily on Teradata and frequently need to move large tables between our production and development environments.

CoreyS
Dataiker Alumni

Hey @natejgardner, we marked this as a duplicate because we felt it overlapped with another one of your ideas: Allow users to create Teradata connections

If you wouldn't mind, could you provide some additional context on the difference between the two ideas? Thanks in advance!

Ah, thanks for the clarification.

TPT is Teradata Parallel Transporter, a specialized pipe designed for moving large volumes of data between Teradata environments much faster than is possible with normal database drivers. It'd be a Teradata-specific feature for use when moving data across connections, and it would require a special implementation.

"Allow users to create Teradata connections" is about the error that occurs when users try to create their own Teradata connection. Currently, only admins have permission to create a new Teradata connection, whereas users can create any other type of connection. When users try to create a Teradata connection without admin privileges, they just get the error message "you should not call this."

TPT should only be enabled for high-throughput transfers across environments. It's not optimized for standard queries, so it should only be used as the transport where DSS is the engine and the user specifically enables it. It'd be a feature only for use where large tables need to be copied quickly between connections, and it usually only makes sense when moving data between two Teradata connections or to another high-throughput queue.
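
To make that rule concrete, here is a minimal sketch in Python of how such a check might look. Everything in it is hypothetical and for illustration only: `CopyJob`, `should_use_tpt`, and the `LARGE_TABLE_ROWS` cutoff are names and values I made up, not anything that exists in DSS or in the Teradata tools, and requiring at least one Teradata side is my own simplification.

```python
from dataclasses import dataclass


@dataclass
class CopyJob:
    """Hypothetical description of a table copy between two connections."""
    source_type: str   # connection type of the source, e.g. "Teradata"
    target_type: str   # connection type of the target, e.g. "Teradata"
    engine: str        # engine running the recipe, e.g. "DSS" or "SQL"
    row_count: int     # estimated number of rows to move
    tpt_enabled: bool  # True only if the user explicitly opted in


# Assumed cutoff for a "large" table; a real value would need tuning.
LARGE_TABLE_ROWS = 1_000_000


def should_use_tpt(job: CopyJob) -> bool:
    """Return True only when all of the conditions described above hold."""
    # Simplifying assumption: at least one side must be Teradata.
    touches_teradata = "Teradata" in (job.source_type, job.target_type)
    return (
        job.tpt_enabled
        and job.engine == "DSS"
        and touches_teradata
        and job.row_count >= LARGE_TABLE_ROWS
    )


# Example: a large opted-in copy qualifies, a small one does not.
big_copy = CopyJob("Teradata", "Teradata", "DSS", 50_000_000, True)
small_copy = CopyJob("Teradata", "Teradata", "DSS", 10_000, True)
print(should_use_tpt(big_copy), should_use_tpt(small_copy))
```

The point of keeping the opt-in flag as a hard requirement is that TPT would never be picked silently for ordinary queries, where the regular driver is the better fit.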

CoreyS
Dataiker Alumni
 
Status changed to: In Backlog
 

Thanks!

MichaelG
Community Manager
 
I hope I helped! Did you know that if I was useful to you or did something outstanding, you can show your appreciation by giving me a kudos?

Status changed to: In the Backlog