Teradata TPT Support

Nate
Nate Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron

Support for Teradata Parallel Transporter (TPT) would enable rapid insertion, extraction, and update for Teradata databases. In some cases, TPT is hundreds of times faster than connections using the regular Teradata driver. This would make data pipelines that interface with very large Teradata datasets substantially faster, especially when large datasets need to be processed outside the database or moved between databases. For enterprises using Teradata for data warehousing, this would be a killer feature for pipeline performance. Several of my colleagues stick to local data processing on their laptops or on standalone VMs only because they need TPT to move their data within their time requirements.

3 votes

In the Backlog

Comments

  • Nate
    Nate Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron

    Just to follow up: over the last two days, I've done about 22 hours of loading into Teradata via Dataiku recipes that would have taken about 10% of that time using TPT. If it's possible to enable this capability, it will be an enormous time saver and allow projects to iterate much faster.

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Thank you for your submission. While we appreciate you taking the time to propose this idea, we wanted to let you know that there has already been a similar idea submitted and logged internally. So we are going to mark this one as a duplicate.

  • Nate
    Nate Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron

    Thanks Corey, I'm glad to hear others are interested in this as well. Is there a place the community can go to track the status of this idea or add upvotes?

  • Nate
    Nate Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron

    Hi @CoreyS,

    Just to follow up on this one: since it's been proposed internally but hasn't been implemented yet, could we change the status to "In the Backlog" so we can track community support for prioritization? This would be a big accelerator for my team, since we rely heavily on Teradata and frequently need to move large tables between our production and development environments.

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Hey @natejgardner, we marked this as a duplicate because we felt it overlapped with another one of your ideas:

    If you wouldn't mind, could you provide some additional context on the difference between the two ideas? Thanks in advance!

  • Nate
    Nate Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron

    Ah, thanks for the clarification.

    TPT is Teradata Parallel Transporter, a specialized pipe designed for moving large volumes of data between Teradata environments much faster than is possible with normal database drivers. It would be a Teradata-specific feature for use when moving data across connections, and it would require a special implementation.

    The other idea is about the error that occurs when users try to create their own Teradata connection. Currently, only admins have permission to create a new Teradata connection, whereas users can create any other type of connection. When users try to create a Teradata connection without admin privileges, they just get the error message "you should not call this."

    TPT should only be enabled for high-throughput queries across environments. It's not optimized for standard queries, so it should only be used as the transport where DSS is the engine and the user specifically enables it. It would be a feature only for cases where large tables need to be copied quickly between connections, and it usually only makes sense when moving data between two Teradata connections or to another high-throughput queue.
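
The conditions above can be sketched as a small check. This is only an illustration under stated assumptions: `CopyJob`, its fields, and `should_use_tpt` are hypothetical names, not part of Dataiku DSS or Teradata's tooling, and the sketch covers only the Teradata-to-Teradata case (the "other high-throughput queue" target is left out).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyJob:
    """Hypothetical description of a table copy; not a Dataiku DSS object."""
    source_type: str           # connection type of the source, e.g. "Teradata"
    target_type: str           # connection type of the target, e.g. "Teradata"
    engine: str                # engine running the recipe, e.g. "DSS" or "SQL"
    same_connection: bool      # True when source and target share one connection
    tpt_enabled_by_user: bool  # explicit opt-in by the user


def should_use_tpt(job: CopyJob) -> bool:
    """Return True only for the narrow case described in the comment above."""
    if not job.tpt_enabled_by_user:
        return False  # never switch transports without an explicit opt-in
    if job.engine != "DSS":
        return False  # only applies where DSS is the engine moving the data
    if job.same_connection:
        return False  # only applies when copying across connections
    # Assumption: restrict to Teradata on both ends for this sketch.
    return job.source_type == "Teradata" and job.target_type == "Teradata"
```

With these assumptions, a user-enabled copy between two different Teradata connections run by the DSS engine would qualify, while anything else would fall back to the regular driver.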
