
Cache the flow

Loading the flow of a large project often takes a long time. For example, on my project with 202 datasets, 133 recipes, and 16 zones, loading the flow takes 28 seconds. About 16 of those seconds are spent just waiting for the API response that returns the flow graph data.

To work around this, I've adopted the habit of keeping two browser tabs open, with one always on the flow, so that whenever I need to load the flow I can just start loading it and then switch to my other tab.

Since the flow is my main point of navigation, it would be great if the computed result at the end of that loading period could be cached in my browser, so that whenever I return to the flow, it loads instantly. This would speed up navigation in large projects and let me maintain my mental flow while I work.
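
To make the request concrete, here is a minimal sketch of the kind of client-side caching I have in mind. Everything in it is hypothetical: `FlowGraph`, `FlowCache`, and the two fetcher functions are illustrative names rather than part of the DSS API, and it assumes the server could expose a cheap version token for a project's flow so that a cached graph is reused only while nobody has changed the flow.

```typescript
// Hypothetical shape of the flow graph payload; NOT the real DSS API response.
interface FlowGraph {
  datasets: number;
  recipes: number;
  zones: number;
}

interface CacheEntry {
  version: string; // assumed cheap change token for the project's flow
  graph: FlowGraph;
}

// In-memory cache keyed by project; a browser could persist it instead.
class FlowCache {
  private entries: Map<string, CacheEntry>;
  private fetchVersion: (projectKey: string) => string;
  private fetchGraph: (projectKey: string) => FlowGraph;

  constructor(
    fetchVersion: (projectKey: string) => string,
    fetchGraph: (projectKey: string) => FlowGraph,
  ) {
    this.entries = new Map<string, CacheEntry>();
    this.fetchVersion = fetchVersion;
    this.fetchGraph = fetchGraph;
  }

  load(projectKey: string): { graph: FlowGraph; fromCache: boolean } {
    // Cheap call: ask only whether the flow has changed.
    const version = this.fetchVersion(projectKey);
    const cached = this.entries.get(projectKey);
    if (cached !== undefined && cached.version === version) {
      return { graph: cached.graph, fromCache: true };
    }
    // Expensive call: only made on first load or after a change.
    const graph = this.fetchGraph(projectKey);
    this.entries.set(projectKey, { version, graph });
    return { graph, fromCache: false };
  }
}

// Usage with stand-in fetchers that simulate the server.
let currentVersion = "v1";
let slowCalls = 0;
const cache = new FlowCache(
  () => currentVersion,
  () => {
    slowCalls += 1;
    return { datasets: 202, recipes: 133, zones: 16 };
  },
);

const first = cache.load("MYPROJECT");  // slow path, fills the cache
const second = cache.load("MYPROJECT"); // served from the cache
currentVersion = "v2";                  // a collaborator edits the flow
const third = cache.load("MYPROJECT");  // cache is stale, slow path again
```

The version check is what would keep this safe when several people edit the same project: the expensive graph request is skipped only when the token is unchanged.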

1 Comment
fsergot
Dataiker
Status changed to: In Backlog

Hello,

Improving performance is a continuous but arduous effort 🙂 Caching is one option, but it might raise issues in a collaborative solution like DSS.

In the case of the flow, we have found a very promising optimization, especially for cases like yours that combine many datasets and recipes with flow zones. This is in our short-term backlog, so you can keep an eye on the release notes (otherwise I'll try to update this idea once we have something released).

Despite all our efforts, it is hard to say whether this will resolve your issue the way you expect. If not, you may want to revisit the topic with your CSM to get a more specific diagnosis.
