Estimate progress from explain plan

natejgardner
natejgardner Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron
edited July 16 in Product Ideas

Currently, recipes have no progress indicator unless they are partitioned. Many SQL databases return an estimated number of result rows for a query in their explain plans. If Dataiku extracted this estimate, it could show a progress bar by comparing the number of rows processed so far against the estimated total. This would be helpful for understanding how far along long-running flows are. The logs do a good job of telling me how many rows have been inserted, but it's difficult to mentally translate that count into a percent complete or a time estimate.

[20:30:09] [INFO] [dku.datasets.sql] - Read 1596680000 records from DB
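To make the idea concrete, here is a minimal sketch of the calculation, assuming a PostgreSQL source, where `EXPLAIN (FORMAT JSON)` reports the planner's estimate under the `Plan Rows` key. The helper names and the sample plan are hypothetical, and this is not how Dataiku works today. Planner estimates can be far off, so the result is only a rough guide.

```python
import json


def estimated_rows(explain_json: str) -> int:
    """Return the planner's row estimate for the top-level plan node."""
    plan = json.loads(explain_json)
    # PostgreSQL wraps the result in a one-element list holding a "Plan" object.
    return int(plan[0]["Plan"]["Plan Rows"])


def percent_complete(rows_processed: int, rows_estimated: int) -> float:
    """Progress as a percentage, capped at 100 because estimates can be too low."""
    if rows_estimated <= 0:
        return 0.0
    return min(100.0, 100.0 * rows_processed / rows_estimated)


# Illustrative plan output (an assumption, not taken from a real query).
sample_plan = (
    '[{"Plan": {"Node Type": "Seq Scan", '
    '"Relation Name": "events", "Plan Rows": 2000000000}}]'
)

total = estimated_rows(sample_plan)
# 1596680000 is the record count from the log line above.
print(f"{percent_complete(1596680000, total):.1f}% complete")
```

Other databases expose their estimates differently, so the parsing step would need a variant per SQL dialect.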

If Dataiku could provide a progress indicator and/or ETA for recipe operations, it'd help a lot with knowing what to expect when planning data pipelines against large datasets.
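An ETA could be derived the same way. The sketch below parses the record count and timestamp from a log line in the format shown earlier, then extrapolates from the average throughput so far. The start time, the estimated total, and the function names are all assumptions for illustration, and the timestamps are assumed to fall on the same day.

```python
import re
from datetime import datetime
from typing import Tuple

# Matches lines such as:
# [20:30:09] [INFO] [dku.datasets.sql] - Read 1596680000 records from DB
LOG_PATTERN = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] .* Read (\d+) records from DB$")


def parse_progress(line: str) -> Tuple[datetime, int]:
    """Extract the timestamp and cumulative record count from a log line."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    stamp = datetime.strptime(match.group(1), "%H:%M:%S")
    return stamp, int(match.group(2))


def eta_seconds(start: datetime, now: datetime,
                rows_done: int, rows_estimated: int) -> float:
    """Seconds remaining, assuming throughput stays at its average so far."""
    elapsed = (now - start).total_seconds()
    if elapsed <= 0 or rows_done <= 0:
        raise ValueError("need positive elapsed time and row count")
    rate = rows_done / elapsed  # rows per second
    # Clamp at zero in case the estimate was lower than the real row count.
    return max(0.0, (rows_estimated - rows_done) / rate)


line = "[20:30:09] [INFO] [dku.datasets.sql] - Read 1596680000 records from DB"
now, rows_done = parse_progress(line)

# Hypothetical values: job start time and the row estimate from the explain plan.
start = datetime.strptime("20:00:09", "%H:%M:%S")
rows_estimated = 2_000_000_000

remaining = eta_seconds(start, now, rows_done, rows_estimated)
print(f"About {remaining / 60:.1f} minutes remaining")
```

An average-rate extrapolation is the simplest choice; a moving window over recent log lines would react faster when throughput changes mid-run.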

Related: it'd be great to know the current queue depth for a recipe to understand whether the bottleneck is on the input or output.

1 vote

In the Backlog

Comments

  • Ashley
    Ashley Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 162 Dataiker

    Thanks for your idea, @natejgardner!

    Your idea meets the criteria for submission. We'll reach out should we require more information.

    If you’re reading this and think this would be a great capability to add to DSS, be sure to kudos the original post or leave a comment!

    Take care,

    Ashley
