Copy the zone recipes into another zone with new datasets

ESoto
Level 3
Hello,

I have a zone that has recipes I want to have copied into a new zone. This zone has the same format but a different dataset (the month of February vs the original zone was for the month of January). 

I attempted to copy the zone, but all it did was copy the zone into the same zone, with the same dataset in that zone. I would prefer not to redo all the recipes, since they are going to be the same ones and copying them would save me time over manually checking each one. I did then do this in the new zone.

Also I am looking to see how I can get rid of the copy zone without deleting the original (it is attached to the original zone data and recipes, and I can not lose that but I don't want the copy in the zone anymore)

Any help with this would be greatly appreciated, thank you! 

11 Replies
Turribeach

I have a zone that has recipes I want to have copied into a new zone. This zone has the same format but a different dataset (the month of February vs the original zone was for the month of January). 

I used the Zone Copy option in the right pane and it created a new zone for me. Also note how Dataiku deals with input datasets below. Intermediate datasets and recipes will be new copies and exist only in the new zone. It's not clear to me what exactly you are trying to do. Do you want to use new input datasets in the new zone?


Also I am looking to see how I can get rid of the copy zone without deleting the original (it is attached to the original zone data and recipes, and I can not lose that but I don't want the copy in the zone anymore)

The new zone is attached to the original zone because the same input dataset can only exist once in the flow, irrespective of where it is used. The line that you see therefore indicates that the input dataset in the second zone is a reference to the one in the first. Referenced datasets also show in a different colour (light blue) so you know they are references, not the original dataset. If you select the referenced dataset in the new zone you will see the source dataset highlighted in the original zone. To remove this link you can delete all your non-input datasets in the original zone and then select each input dataset and "move" it to the new zone so it only exists in the new zone.


I think that perhaps this copy zone action seems confusing to you because you haven't understood what's going on behind the scenes. When you do a zone copy Dataiku can easily duplicate any intermediate and ending recipes and datasets, but what should it do with input datasets? As explained before, a dataset with the same name can only exist once in the flow. So Dataiku assumes you want the same input datasets and creates the references. If you want to use different input datasets it's up to you to modify the new zone and change them. You could argue that Dataiku should duplicate the input datasets with a new name. However, this wouldn't work when the input datasets of the source zone are not true input datasets but references to output datasets in a different upstream zone. And even if the input datasets of the source zone are true input datasets, why would you want to create a copy of them if they reference the same physical table/view? Clearly the way Dataiku behaves is the most logical. You can think of the zone copy command as two actions:

  1. Duplicate all non-input datasets of the source zone
  2. Reference (i.e. branch) all input datasets of the source zone
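
To make those two actions concrete, here is a minimal toy model in plain Python. It does not call Dataiku at all: the `Zone` class, the `copy_zone` function, the dataset names and the `_copy` suffix are all hypothetical, made up only to illustrate the duplicate-versus-reference behaviour described above. The real naming of duplicated datasets in Dataiku may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Toy stand-in for a flow zone (hypothetical, not a Dataiku class)."""
    name: str
    inputs: set = field(default_factory=set)      # datasets the zone reads and owns
    produced: set = field(default_factory=set)    # intermediate and final datasets
    references: set = field(default_factory=set)  # datasets shared from another zone


def copy_zone(source: Zone, new_name: str, suffix: str = "_copy") -> Zone:
    """Model a zone copy as the two actions listed above."""
    # 1. Duplicate every non-input dataset under a new name (suffix is an assumption).
    duplicated = {f"{name}{suffix}" for name in source.produced}
    # 2. Reference every input dataset instead of duplicating it, because a
    #    dataset with a given name can exist only once in the flow.
    referenced = set(source.inputs | source.references)
    return Zone(name=new_name, produced=duplicated, references=referenced)


january = Zone(
    name="January",
    inputs={"sales_jan"},
    produced={"sales_sorted", "sales_summary"},
)
february = copy_zone(january, "February")
print(sorted(february.references))  # the copy still points at the January input
print(sorted(february.produced))    # recipes' outputs are fresh copies
```

In this model the copied zone owns no inputs of its own, which is why the new zone stays linked to the original until its input is swapped for a different dataset.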
ESoto
Level 3
Author

Thank you so much for responding, I am newer to using Dataiku so I am making sure to understand how everything works.

So to clarify, I don't want a copy of the actual zone itself; I just want a copy of all the recipes that are in that zone. That is it. I need those recipes for the new zone, but the dataset has newer dates compared to the old one.

Eventually, I would like to have an API actually update everything, but I am constructing the design before moving on to those steps.

Thank you again, and I apologize if my question was confusing. 

Turribeach

Ok no worries. Is your question solved now? 

ESoto
Level 3
Author

Unfortunately, I am still attempting to figure out how to get this to work.

I just copied the recipes and put them in a test zone. However, the copy still took the old datasets, although I thought I had changed this.

Again, I just want the recipes themselves copied to this zone to work with the new dataset, not to copy from the old data which is what it is doing as of now.

Turribeach

How exactly are you doing the copy?

So to clarify, I don't want a copy of the actual zone itself, I just want a copy of all the recipes that are in that zone, that is it because I need those recipes for the new zone,

So you DO want to make a copy then! All of the recipes in the zone is the whole zone.

but the dataset has newer dates compared to the old one

Please explain what this means. What dataset? A new dataset? Do you want to copy a zone and change the input datasets?

ESoto
Level 3
Author

Yes, I want to copy the zone (recipes), but for the recipes to be used with the new dataset. That is it (copy and paste).

Zone A has recipes that I want for Zone B. Zone B is not the same as Zone A; it simply has the same format as Zone A (all rows and columns have the same names). It is updated data (new dates compared to the old dataset).

Again, I apologize if I am not making this clear. I can NOT use the same dataset as Zone A because the dates are different. I changed, or thought I changed, the dataset to make sure the new zone only uses the same recipes, NOT the same dataset from Zone A, but it is still copying all the data from Zone A rather than simply using the same recipes for the new dataset in Zone B. Basically, like hitting copy and paste, I just want to copy and paste the recipe itself so the new dataset can do the same thing as Zone A.

Worst case I will just look at each recipe in Zone A and do it manually for now, as I am on a deadline. However, if you could provide a screenshot of how to do this, that would be helpful. I appreciate your time.

Turribeach

Copy the zone to a new zone. Then go to the recipe that follows each of your input datasets (the ones that start a flow branch) and edit the recipe. In the Input/Output tab, change the input dataset to the new dataset you want to use. Then run the new flow zone and the new data will populate the new datasets.
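
Since the original poster mentions eventually driving this through the API, the input swap described above could be sketched roughly as follows. This is only a sketch, under the assumption that the Dataiku public Python API exposes `project.get_recipe(...)`, `recipe.get_settings()`, `settings.replace_input(old, new)` and `settings.save()` as used here; please check the names against the API documentation for your DSS version. The helper name `repoint_recipe_inputs` and all recipe and dataset names are hypothetical.

```python
def repoint_recipe_inputs(project, recipe_names, old_ref, new_ref):
    """Swap old_ref for new_ref as an input of each named recipe.

    `project` is expected to behave like a Dataiku project handle
    (an assumption; any object with the same methods will do).
    Returns the list of recipe names that were processed.
    """
    updated = []
    for recipe_name in recipe_names:
        settings = project.get_recipe(recipe_name).get_settings()
        # Assumed to be a no-op when old_ref is not an input of this recipe.
        settings.replace_input(old_ref, new_ref)
        settings.save()
        updated.append(recipe_name)
    return updated


# Hypothetical usage against a real instance (host, key and names are placeholders):
#   import dataikuapi
#   client = dataikuapi.DSSClient("https://dss.example.com", "my-api-key")
#   project = client.get_project("MYPROJECT")
#   repoint_recipe_inputs(project, ["compute_sales_sorted_copy"], "sales_jan", "sales_feb")
```

Only the recipes that read directly from an input dataset need to be listed; the recipes further downstream in the copied zone already read from the duplicated intermediate datasets.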

ESoto
Level 3
Author

This option is not completely working. I replaced the data with what I actually want, and it is now in fact using the data I need. However, it is not giving the option to build upstream and is simply moving the data up from the original stream. So the sort recipe, as an example, that was in the stream has now been moved up and taken out.

This is getting too complicated, and I will just resort to doing it manually for now. I would like to request a new feature: a simpler copy and paste of recipes, that is it, without having to continuously swap the datasets out just to achieve this goal. Thanks again for your help.

Turribeach

@ESoto wrote:

This option is not completely working. I replaced the data with what I actually want and it now in fact is using the data I need. However, it is not giving the option to build upstream and is just simply now moving the data up from the original stream. So now the sorted recipe as an example that was in the stream is now moved up and taken out. Thanks again for your help.


I am afraid I can't really understand what you describe. If you wish to continue this thread you will need to post screenshots at every step of the way, clearly describing what the issue is. Thanks

Turribeach

Rather than explaining what you are doing can you please explain what exactly you are trying to achieve? 
