Move objects and Zones on the Flow

I think it would be very useful to be able to move objects and flow zones around in the Flow display. It appears that Dataiku determines where each recipe, dataset, etc. goes in the Flow, and I cannot edit that. I have used Alteryx in the past, and it had that ability, which I liked. It would allow me to organize the Flow however I see fit and make it more visually appealing and more understandable for someone looking at it.

12 Comments
ktgross15
Dataiker

Hi @Ben_Rutan ,

Thanks for the feedback 🙂

The solution to managing large flows is to use flow zones. With flow zones, you can split your flow into small, named and colored zones, which makes it much easier to orient yourself and focus only where you need to, especially when working with a very large flow.

Have you worked with flow zones or found them helpful in managing large flows? Happy to collect any other feedback here, and I will log your request, as I know that flow zones weren't exactly what you were looking for, but they are a solution that many users have found helpful 🙂

Katie

CoreyS
Dataiker Alumni
 
Status changed to: In Backlog
 
Ben_Rutan
Level 2

Hi @ktgross15 ,

I have used flow zones and they are somewhat helpful, but I think it would be more beneficial if I could rearrange each recipe, dataset, etc however I see fit instead of how Dataiku decides to display them. Just something to add to the wish list 🙂

Ben

In my experience, Dataiku does a great job of drawing the flow in a reasonable way. There are indeed cases where things look a bit awkward (see screenshot below), but even in this situation I don't think I would spend the time it would take to reorganize the flow. Any manual flow reorganization would also be subject to manual maintenance, meaning new flow objects would need to be placed in sensible places, so I don't think a lot of people would be willing to invest their time in this. Just look at the mess that is most people's Windows desktop or iPhone home screen.

Screenshot 2022-07-06 at 7.56.04 pm.png

Agreed! Moving and nesting flow zones would help organize components more effectively into subsections within a flow zone.

AshleyW
Dataiker

Hi @Ben_Rutan ,

Thanks for this suggestion! This idea is not currently planned, but we’ll circle back to this thread if we’re looking for more info in the future.

The Flow is the visual representation of how datasets, recipes, and models work together to move data through an analytical pipeline. Projects can quickly become complex with hundreds of objects on the Flow. We sometimes call these ‘spaghetti’ flows 🙂

As a Flow grows in complexity, manually positioning objects and zones adds an unnecessary burden of maintaining the arrangement over time, as @Turribeach mentioned. We’ve also noticed that manual positioning is very personal: while it may facilitate one person’s understanding of a project, it often does the opposite for others.

To help with organizing objects in a complex project, I’ll second @ktgross15 ’s suggestion to use zones. Since Dataiku projects are built around the notion of collaboration, we find that a layout structured around flow zones is more neutral than moving things around.

At Dataiku, we use zones with great success to arrange datasets/recipes/models into logical groupings, making large flows much easier to manage and collaborate on. They’re automatically laid out for me, and I never worry about them colliding or needing to resize them. With a description in the R pane, it’s easy for me (and others) to understand what a zone contributes to the project and see how all of the pieces of the pipeline fit together. (Yes, it drives me nuts when objects move; I've found zones help with that too)

As a bonus, check out the ‘Flow Zones’ flow view of a project. Clicking the ‘hide zones’ button can be a surprising reminder of how much zones, and anchoring datasets, can do for you.

Before + After view of a zoned flow

Cheers,
Ashley W.

Status changed to: Rejected

+1 for this idea regarding flow zones (can't add a thumbs up as it has been deactivated in this thread).

Though Dataiku handles positioning very well, it can sometimes be improved, and it would be nice if users could customize the layout when they explicitly ask to.

I am especially thinking of flow zones. Here is an example where Dataiku's positioning is not intuitive:

2.png

The building order follows the numbers. So in this situation the 3rd flow zone appears to the left of the 2nd flow zone, which doesn't make much sense.

+1 as well

In large projects, I really wish I could snap the layout to my own grid and at least lock flow zones in place. They're always bouncing around whenever I add new flow zones or make connections between flow zones, which makes navigating large projects rather difficult.

wvde
Level 1

Hi,

+1

Wanted to resurface this chain. I’ve been an active Dataiku user for two years and can speak for both myself and my team of ~15 users when I say this is the number one feature we want Dataiku to release.

Every large project I have ever made in Dataiku has had the drawback of having tools and flows randomly scatter across the galaxy when a new connection is made. Input containers ahead of analysis containers, spaghetti connecting lines, asymmetry, and above all, fluid positioning (when new tools are connected) are just some of the many frustrations users currently deal with in large workflows. I’m not exaggerating when I say that not being able to command the location of tools and flows in very large workflows is the most productivity-hindering aspect of Dataiku.

Please consider fixing this in your new release.

Thanks!

TEChopra1000
Level 1

+1

I find that sometimes flow zones don't appear to be logically arranged, making their relationship difficult to ascertain. For example, in my project, I have two distinct pipelines with connected flow zones within each pipeline, but no connections between the two pipelines. Dataiku seems to be stacking flow zones from the two pipelines next to each other, even though they have no connections. This makes it difficult to see the two distinct pipelines, and connections within each one. 

 

It would be very helpful if I could manually specify the arrangement of the zones.

It would also be very helpful to have nested zones.
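
To make the two-pipeline case concrete, here is a minimal sketch of how zones could be grouped into independent pipelines before being laid out. This is not Dataiku's layout code and does not use the Dataiku API: the zone names, the link list, and the `group_pipelines` helper are all hypothetical, and the grouping is a plain connected-components search over the zone-to-zone links.

```python
from collections import defaultdict


def group_pipelines(zones, links):
    """Group zones into independent pipelines (connected components).

    `zones` is a list of zone names and `links` is a list of
    (source_zone, target_zone) pairs. Both are hypothetical inputs,
    not something read from a real Dataiku project.
    """
    # Treat links as undirected: two zones belong to the same
    # pipeline if any chain of links joins them.
    neighbors = defaultdict(set)
    for source, target in links:
        neighbors[source].add(target)
        neighbors[target].add(source)

    seen = set()
    pipelines = []
    for zone in zones:
        if zone in seen:
            continue
        # Depth-first walk collecting every zone reachable from `zone`.
        stack = [zone]
        seen.add(zone)
        component = []
        while stack:
            current = stack.pop()
            component.append(current)
            for nxt in sorted(neighbors[current]):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        pipelines.append(sorted(component))
    return pipelines


# Two pipelines with links inside each one, but none between them.
zones = ["A1", "A2", "B1", "B2"]
links = [("A1", "A2"), ("B1", "B2")]
pipelines = group_pipelines(zones, links)
```

A layout that kept each group returned here in its own row or block would keep the two pipelines visually separate, which is the behavior being asked for.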
