
Copying a sub flow to another project

Neuron

Hi everyone,

I have, for the first time, copied a sub flow into a new project. I have a couple of questions around what this actually means, and I can't see them mentioned in the docs.

In my new project, where I see a copy of my flow, the source dataset is now black - what does this mean?

Capture.PNG

Do I need to take any action here? Also, with regard to the names of all the other datasets, I assume I don't need to rename them as they are now in a different project.

When I look back at my old flow, my original dataset now has a little tick/arrow on it: 

ben_p_0-1581331759393.png

What does this mean? Is some relationship being maintained between these projects now, because I copied this flow over?

Thanks for your help,

Ben

Dataiker

Hi Ben, 

The black icon and the arrow mean the dataset is shared across projects.

You can see the dependencies of projects using the graph mode (DSS homepage --> See all projects --> switch mosaic to graph).

If you want to get rid of the dependencies, sync the shared dataset in your new project and unshare it. 

Best,

Damien Jacquemart, Lead Data Scientist @Dataiku
Neuron
Author

Thanks @DamienJ - I don't want the dependency. What are the steps required to sync and unshare the dataset?

Dataiker

1. Create a sync recipe from the shared dataset and run it. Let's assume the output dataset is named "dataset_copy".

2. Open the recipe (double click) that came directly after the shared dataset before you added the sync
--> go to input/output
--> change the input to the dataset dataset_copy
--> save

3. Delete the sync recipe

4. Click on the dataset you want to unshare --> actions --> unshare

You also might want to rebuild the subflow you copied: "Build All" from "Flow action" at the bottom left.
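
To make the rewiring easier to picture, here is a minimal sketch of the steps above on a toy flow. It is plain Python, not the DSS API: the dict layout, the `rewire_input` helper, and the names "PROJECT_A.source", "prepare", and "prepared" are all hypothetical and only stand in for what you would do by clicking in the flow.

```python
# Toy model of a flow: each recipe has a list of input datasets and one output.
# This is NOT the DSS API; every name below is hypothetical.

def rewire_input(recipes, old_input, new_input, skip=()):
    """Return a copy of `recipes` in which every recipe that read `old_input`
    reads `new_input` instead, except the recipes named in `skip`."""
    rewired = {}
    for name, recipe in recipes.items():
        inputs = list(recipe["inputs"])
        if name not in skip:
            inputs = [new_input if i == old_input else i for i in inputs]
        rewired[name] = {"inputs": inputs, "output": recipe["output"]}
    return rewired

# Project B right after copying the subflow: "prepare" reads the dataset
# that is shared from project A.
flow = {"prepare": {"inputs": ["PROJECT_A.source"], "output": "prepared"}}

# Create the sync recipe: it writes a local copy of the shared dataset.
flow["sync"] = {"inputs": ["PROJECT_A.source"], "output": "dataset_copy"}

# Change the input of the downstream recipe to the local copy.
flow = rewire_input(flow, "PROJECT_A.source", "dataset_copy", skip=("sync",))

# Delete the sync recipe; dataset_copy remains as an ordinary local dataset.
del flow["sync"]

# The shared dataset can be unshared once no recipe reads it any more.
still_used = any("PROJECT_A.source" in r["inputs"] for r in flow.values())
print(flow)
print("safe to unshare:", not still_used)
```

The point of the order is that the downstream recipe is repointed before the sync recipe is removed, so at no time does a recipe lose its input.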

Best,

Damien Jacquemart, Lead Data Scientist @Dataiku
Neuron
Author
Thank you for the explanation, that works nicely! 🙂
Dataiker
Awesome!
Damien Jacquemart, Lead Data Scientist @Dataiku
Dataiker

Hello,

Your observation is accurate! 

1/ When the source dataset is black in "project B", it means the dataset is shared into project B from another project. The dataset is still computed in that other project - let's call it "project A".

2/ Correct, you don't need to rename them if you have copied your sub-flow to another project, because your datasets' full names are now PROJECT_KEY.dataset_name, and the project key is different for PROJECT_A and PROJECT_B. If you had copied them inside the same project, then you would have needed to rename them to avoid conflicts in your database.

3/ The little arrow on your dataset in project A means that this dataset is shared with one or several other projects. If you click on "Share" in the "Action" panel on the right, you will see the list of projects where your dataset has been shared. You only need to make sure this source dataset is kept up to date in project A if you need to use it in project B. Otherwise, recreate the dataset directly in project B.

A small tip: you can unshare it either from project A (Action > Share > delete the project from the list > Save) or from project B (Action > Other Actions > Stop sharing to this project).
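
To illustrate point 2/ above, here is a small sketch of why a copy into another project does not clash while a copy inside the same project would. It is plain Python and only models the PROJECT_KEY.dataset_name convention; the helper names `full_name` and `find_conflicts` and the dataset name "customers" are assumptions made for the example, not part of DSS.

```python
from collections import Counter

def full_name(project_key, dataset_name):
    """Build the full name in the PROJECT_KEY.dataset_name form."""
    return f"{project_key}.{dataset_name}"

def find_conflicts(datasets):
    """Return the full names that occur more than once.

    `datasets` is an iterable of (project_key, dataset_name) pairs.
    """
    counts = Counter(full_name(p, d) for p, d in datasets)
    return sorted(name for name, n in counts.items() if n > 1)

# Copy into a different project: same short name, different project key.
across_projects = [("PROJECT_A", "customers"), ("PROJECT_B", "customers")]

# Copy inside the same project without renaming: the full names collide.
same_project = [("PROJECT_A", "customers"), ("PROJECT_A", "customers")]

print(find_conflicts(across_projects))
print(find_conflicts(same_project))
```

The first call finds no duplicate full names, while the second reports PROJECT_A.customers, which is the case where a rename would be needed.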

Good luck with your project,

Estelle