
Estimate progress from explain plan

Currently, there's no progress indicator for recipes (unless those recipes are partitioned). Many SQL databases return the estimated number of result rows for a query via their explain plans. If Dataiku were to extract this estimate, it could show a progress bar based on the number of rows processed so far. This would be helpful for tracking long-running flows. The logs do a good job of telling me how many rows have been read, but it's difficult to mentally translate this into a percent complete or a time estimate.

[20:30:09] [INFO] [dku.datasets.sql] - Read 1596680000 records from DB
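
To make the idea concrete, here is a minimal sketch in Python of how such an estimate could be computed. It is not how DSS works internally. It assumes the row estimate comes from PostgreSQL's `EXPLAIN (FORMAT JSON)` output (the `Plan Rows` field of the top-level plan node); other databases expose their estimates differently. The function names and the regex are hypothetical, and the regex is based only on the single log line shown above.

```python
import json
import re
from datetime import datetime, timedelta

# Hypothetical pattern, inferred from the one log line quoted in this post.
LOG_PATTERN = re.compile(r"\[(\d{2}:\d{2}:\d{2})\].*Read (\d+) records from DB")


def rows_from_postgres_explain(explain_json):
    """Return the planner's row estimate from EXPLAIN (FORMAT JSON) output.

    Assumes PostgreSQL's JSON layout: a one-element list whose "Plan"
    object carries a "Plan Rows" field for the top-level node.
    """
    plan = json.loads(explain_json)
    return int(plan[0]["Plan"]["Plan Rows"])


def parse_progress_line(line):
    """Extract (timestamp, rows_read) from a log line, or None if no match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%H:%M:%S")
    return timestamp, int(match.group(2))


def estimate_progress(rows_read, estimated_rows, elapsed_seconds):
    """Return (fraction_done, eta) where eta is a timedelta or None.

    The fraction is capped at 1.0 because planner estimates can be too low.
    The ETA assumes the average throughput so far stays constant.
    """
    if estimated_rows <= 0:
        return None, None
    fraction = min(rows_read / estimated_rows, 1.0)
    if rows_read <= 0 or elapsed_seconds <= 0:
        return fraction, None
    rate = rows_read / elapsed_seconds  # rows per second so far
    remaining = max(estimated_rows - rows_read, 0)
    return fraction, timedelta(seconds=remaining / rate)
```

Planner estimates can be off by a large factor when table statistics are stale, so a percentage derived this way is only a rough guide, and the ETA inherits the same uncertainty.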

If Dataiku could provide a progress indicator and/or ETA for recipe operations, it'd help a lot with knowing what to expect when planning data pipelines against large datasets.

Related: it'd be great to know the current queue depth for a recipe to understand whether the bottleneck is on the input or output.

1 Comment
Status changed to: Acknowledged

Thanks for your idea, @natejgardner 

Your idea meets the criteria for submission. We'll reach out should we require more information.

If you’re reading this and think this would be a great capability to add to DSS, be sure to kudos the original post or leave a comment!

Take care,