Delete / Rename files after Sync

rmoore
rmoore Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Participant, Neuron 2023 Posts: 33 Neuron

I'm setting up a scheduled process to sync .csv files from Azure Blob Storage into Snowflake and am looking for a best practice to rename or delete the .csv files after they've been synchronized with Snowflake so they aren't processed multiple times.

Is there a recommended way to accomplish this in DSS?

The files are generated from multiple locations every 10 minutes, so I don't think hourly partitioning is a viable option.

Thanks!

Best Answer

  • rmoore

    As a follow-up: we accomplished this by creating a custom macro that renames the blob storage files after they are synced. If anyone would like to discuss the details, feel free to message me.
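For anyone who lands here later, below is a minimal sketch of the rename-after-sync idea. It is not the actual macro: the `processed/` prefix, the function names, and the in-memory store are all hypothetical, and the `rename` callable stands in for whatever storage call you use. On Azure Blob Storage without a hierarchical namespace, a rename is typically a copy followed by a delete.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical archive location; the sync must be configured to ignore it.
PROCESSED_PREFIX = "processed/"


def plan_renames(blob_names: Iterable[str],
                 processed_prefix: str = PROCESSED_PREFIX) -> List[Tuple[str, str]]:
    """Return (old_name, new_name) pairs for .csv blobs not yet archived."""
    plan = []
    for name in sorted(blob_names):
        if not name.lower().endswith(".csv"):
            continue  # leave non-CSV files alone
        if name.startswith(processed_prefix):
            continue  # already archived by an earlier run
        plan.append((name, processed_prefix + name))
    return plan


def archive_synced_files(blob_names: Iterable[str],
                         rename: Callable[[str, str], None],
                         processed_prefix: str = PROCESSED_PREFIX) -> List[str]:
    """Rename every pending .csv blob and return the new names."""
    moved = []
    for old, new in plan_renames(blob_names, processed_prefix):
        rename(old, new)  # assumption: raises on failure, so the run stops here
        moved.append(new)
    return moved


# In-memory stand-in for the blob container, for demonstration only.
store = {
    "incoming/a.csv": b"1",
    "incoming/notes.txt": b"x",
    "processed/incoming/old.csv": b"0",
}


def rename_in_memory(old: str, new: str) -> None:
    store[new] = store.pop(old)


# Pass a snapshot of the names that were actually synced.
moved = archive_synced_files(list(store), rename_in_memory)
print(moved)
```

One design point worth considering, since new files arrive every 10 minutes: take the list of file names before the sync starts and rename only those names afterwards. Otherwise a file that lands while the sync is running could be renamed without ever having been loaded into Snowflake.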
