Copying a sub flow to another project

ben_p
Level 5
Hi everyone,

I have, for the first time, copied a sub flow into a new project. I have a couple of questions around what this actually means, and I can't see them mentioned in the docs.

In my new project, where I see a copy of my flow, the source dataset is now black - what does this mean?

[Screenshot: Capture.PNG]

Do I need to take any action here? Also, regarding the names of all the other datasets, I assume I don't need to rename them, as they are now in a different project.

When I look back at my old flow, my original dataset now has a little tick/arrow on it: 

[Screenshot: ben_p_0-1581331759393.png]

What does this mean? Is some relationship being maintained between these projects now, because I copied this flow over?

Thanks for your help,

Ben

1 Solution
DamienJ
Dataiker

Hi Ben, 

The black icon and the arrow both mean that the dataset is shared across projects.

You can see the dependencies between projects using the graph mode (DSS homepage --> See all projects --> switch mosaic to graph).

If you want to get rid of the dependency, sync the shared dataset into your new project and then unshare it.

Best,

Damien Jacquemart, Lead Data Scientist @Dataiku


12 Replies
ben_p
Level 5
Author

Thanks @DamienJ - I don't want the dependency, what are the steps required to sync and unshare the dataset?

DamienJ
Dataiker

1. Create a sync recipe from the shared dataset and run it. Let's assume the output dataset is named "dataset_copy".

2. Open the recipe (double-click) that came right after the shared dataset before you added the sync
--> go to input/output
--> change the input to the dataset dataset_copy
--> save

3. Delete the sync recipe

4. Click on the dataset you want to unshare --> actions --> unshare

You also might want to rebuild the subflow you copied: "Build All" from "Flow action" at the bottom left.
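
The order of those steps matters, and a small model can show why. The following is only an illustrative sketch in plain Python, not the Dataiku API: the `Recipe` class, the `repoint_inputs` helper, and the dataset and recipe names are all hypothetical. It models the flow as a list of recipes, rewires the downstream recipe onto the local copy, and then checks that deleting the sync recipe leaves nothing reading the shared dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recipe:
    """Hypothetical stand-in for a recipe: a name plus input and output dataset names."""
    name: str
    inputs: List[str]
    outputs: List[str]


def repoint_inputs(recipes, old_input, new_input):
    """Swap old_input for new_input in every recipe except the one that
    produces new_input (the sync recipe). Returns the names of changed recipes."""
    changed = []
    for recipe in recipes:
        if new_input in recipe.outputs:
            continue  # the sync recipe must keep reading the shared dataset
        if old_input in recipe.inputs:
            recipe.inputs = [new_input if n == old_input else n for n in recipe.inputs]
            changed.append(recipe.name)
    return changed


shared = "PROJECT_A.orders"  # hypothetical dataset shared from the old project

# Flow in the new project after the sync recipe has been created and run.
flow = [
    Recipe("sync_orders", inputs=[shared], outputs=["dataset_copy"]),
    Recipe("prepare_orders", inputs=[shared], outputs=["orders_prepared"]),
    Recipe("join_customers", inputs=["orders_prepared", "customers"], outputs=["orders_joined"]),
]

# Change the input of the recipe that used to read the shared dataset.
changed = repoint_inputs(flow, shared, "dataset_copy")

# Only the sync recipe still depends on the shared dataset at this point.
readers_before_delete = [r.name for r in flow if shared in r.inputs]

# Delete the sync recipe; dataset_copy stays as a plain local dataset.
flow = [r for r in flow if r.name != "sync_orders"]
readers_after_delete = [r.name for r in flow if shared in r.inputs]

print(changed)                # ['prepare_orders']
print(readers_before_delete)  # ['sync_orders']
print(readers_after_delete)   # []
```

Once no recipe reads the shared dataset any more, unsharing it cannot break anything downstream in the new project.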

Best,

Damien Jacquemart, Lead Data Scientist @Dataiku
ben_p
Level 5
Author
Thank you for the explanation, that works nicely! 🙂
DamienJ
Dataiker
Awesome !
Damien Jacquemart, Lead Data Scientist @Dataiku
Sajid_Khan
Level 3

Thanks @DamienJ ,

It helped me, but now I am having trouble reverting the share step in version 9.0.4.

 

Sajid_Khan
Level 3

I have found the option.

Thank You

CoreyS
Dataiker Alumni

Hi @Sajid_Khan, if you wouldn't mind, could you share where you found it (like a link to it) or what that option is?

Sajid_Khan
Level 3

Sure @CoreyS ,

 

There are two methods, as @EstelleB has explained:

1) From the project the dataset is shared from: select the shared dataset (the one shown in black or with the arrow mark). Then, at the top right, go to Actions > Share, scroll to the bottom, and delete the project from the list. That list shows every project the selected dataset has been shared with; delete the one you want and save the changes. This is the method that worked for me, because I had some access limitations on the other (shared-to) project. You can try both and see which works for you.

2) From the project the dataset is shared to: select the shared dataset (the one shown in black or with the arrow mark). Then, at the top right, go to Actions > Share, scroll to the bottom, and choose Stop sharing to this project.
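
Both methods end in the same state: the target project is no longer in the dataset's list of shares. Here is a minimal sketch of that idea in plain Python. It does not call the Dataiku API; the `shares` mapping and the `stop_sharing` helper are hypothetical and only model the list you see in the Share dialog.

```python
def stop_sharing(shares, dataset, target_project):
    """Return a copy of `shares` with `target_project` removed from the list
    of projects that `dataset` is shared with."""
    updated = {name: list(targets) for name, targets in shares.items()}
    targets = updated.get(dataset, [])
    if target_project in targets:
        targets.remove(target_project)
    if dataset in updated and not updated[dataset]:
        # No target project left: the dataset is not shared at all any more.
        del updated[dataset]
    return updated


# Hypothetical state: "orders" in the source project is shared with two projects.
shares = {"orders": ["PROJECT_B", "PROJECT_C"]}

after_first = stop_sharing(shares, "orders", "PROJECT_B")
after_second = stop_sharing(after_first, "orders", "PROJECT_C")

print(after_first)   # {'orders': ['PROJECT_C']}
print(after_second)  # {}
```

In this model the only difference between the two methods is which project you are in when you make the change, which is why access limits on one side can make the other method the practical choice.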

Please let me know if any more details are required.

------------------------------------------------------------------------------

Thanks @EstelleB for this. It really helped.

 

 

EstelleB
Dataiker

Hello,

Your observation is accurate! 

1/ When the source dataset is black in "project B", it means that the dataset is shared into project B from another project. The dataset is still computed in that other project; let's call it "project A".

2/ Correct, you don't need to rename them when you copy your sub-flow into another project, because each dataset's full name is PROJECT_KEY.dataset_name, and the project key is different for PROJECT_A and PROJECT_B. If you had copied them within the same project, you would have needed to rename them to avoid conflicts in your database.

3/ The little arrow on your dataset in project A means that this dataset is shared with one or more other projects. If you click on "Share" in the "Action" panel on the right, you will see the list of projects where your dataset has been shared. If you need to use the dataset in project B, you only need to make sure this source dataset is kept up to date in project A. Otherwise, recreate your dataset directly in project B.

A small tip: you can unshare it either from project A (Action > Share > Delete the project in the list > Save) or from project B (Action > Other Actions > Stop sharing to this project).
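
To make the naming point in 2/ concrete, here is a small sketch in plain Python. The `full_name` helper and the dataset names are hypothetical; only the PROJECT_KEY.dataset_name pattern comes from the explanation above.

```python
def full_name(project_key, dataset_name):
    """Build a dataset's full name following the PROJECT_KEY.dataset_name pattern."""
    return f"{project_key}.{dataset_name}"


# Hypothetical datasets that were part of the copied sub-flow.
copied = ["orders", "orders_prepared"]

names_in_a = {full_name("PROJECT_A", name) for name in copied}
names_in_b = {full_name("PROJECT_B", name) for name in copied}

# Copy into another project: the project keys differ, so no full name collides.
cross_project_conflicts = names_in_a & names_in_b

# Copy inside the same project: every unrenamed dataset would collide.
same_project_conflicts = sorted(names_in_a & names_in_a)

print(cross_project_conflicts)  # set()
print(same_project_conflicts)   # ['PROJECT_A.orders', 'PROJECT_A.orders_prepared']
```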

Good luck with your project,

Estelle

Sajid_Khan
Level 3

Hey @EstelleB ,

Could you please help me find these options?

Shared From: Action > Share > Delete the project in the list > Save, OR

Shared To: Action > Other Actions > Stop sharing to this project.

I am unable to find them; I am using version 9.0.4.

 

Thanks,

Sajid

Sajid_Khan
Level 3

I have found the option.

Thanks