Currently, there's no progress indicator for recipes (unless those recipes are partitioned). Many SQL databases report an estimated number of result rows for a query in their explain plans. If Dataiku extracted this estimate, it could show a progress bar based on the number of rows processed so far. This would be helpful for understanding where long-running flows stand. The logs do a good job of telling me how many rows have been inserted, but it's difficult to mentally translate that into a percent complete or a time estimate. For example:
[20:30:09] [INFO] [dku.datasets.sql] - Read 1596680000 records from DB
If Dataiku could provide a progress indicator and/or ETA for recipe operations, it'd help a lot with knowing what to expect when planning data pipelines against large datasets.
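To make the request concrete, here's a rough sketch of the arithmetic I have in mind. It is not Dataiku's API: the function names `parse_rows_read` and `estimate_progress` are made up for illustration, and the estimated total is assumed to come from the database's explain plan by some means not shown here. Planner estimates can be well off, so the sketch caps progress at 100% and treats the ETA as approximate.

```python
import re
from typing import Optional, Tuple

# Matches the record count in log lines shaped like the one quoted above.
READ_RE = re.compile(r"Read (\d+) records from DB")


def parse_rows_read(log_line: str) -> Optional[int]:
    """Hypothetical helper: pull the row count out of a 'Read N records' log line."""
    match = READ_RE.search(log_line)
    return int(match.group(1)) if match else None


def estimate_progress(
    rows_done: int, estimated_total: int, elapsed_seconds: float
) -> Tuple[Optional[float], Optional[float]]:
    """Hypothetical helper: return (fraction_complete, eta_seconds).

    estimated_total is assumed to be the planner's row estimate, so both
    values are approximations. Either value is None when it can't be computed.
    """
    if estimated_total <= 0:
        return None, None  # no usable estimate from the planner
    # Cap at 1.0 because the planner may have underestimated the row count.
    fraction = min(rows_done / estimated_total, 1.0)
    if rows_done <= 0 or elapsed_seconds <= 0:
        return fraction, None  # no throughput measured yet
    rate = rows_done / elapsed_seconds  # rows per second so far
    remaining = max(estimated_total - rows_done, 0)
    return fraction, remaining / rate


line = "[20:30:09] [INFO] [dku.datasets.sql] - Read 1596680000 records from DB"
rows = parse_rows_read(line)
# The total (2,000,000,000) and elapsed time (3600 s) are invented inputs.
fraction, eta = estimate_progress(rows, 2_000_000_000, 3600.0)
print(f"{fraction:.1%} complete, roughly {eta / 60:.0f} min remaining")
```

The ETA here assumes a constant rows-per-second rate over the whole run, which is the simplest possible choice; a rate averaged over a recent window would probably track real jobs better.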
Related: it'd be great to know the current queue depth for a recipe to understand whether the bottleneck is on the input or output.