Transfer a flow

Solved!
ebbingcasa
Level 3

Hi there,

I'm still new to Dataiku, so I'd like to know how to plan ahead. Say I want to reuse a flow for another country's data: what do I need to keep in mind when planning the flow? Is there an option, for example, to copy and paste everything and then find and replace a country's name across all created datasets?

Thanks!

emate
Level 5

Hi @ebbingcasa 

I think it depends heavily on the project and on what you do with the data in the end. For example, if you are working on a dashboard, and the calculations/logic of the flow and the data source are the same, I think you can basically use filters to add or remove countries. If not, you can indeed copy the flow, or copy it recipe by recipe, but I would suggest thinking of a way to keep everything in one flow if possible, especially if there is one data source (such as a SQL database) for all the countries.

 

Thanks

Mateusz

ebbingcasa
Level 3
Author

Hi Mateusz,

thanks, makes perfect sense.

In my case it all starts with several API calls. So what you're suggesting is that I should start with the API Python recipe for all countries, then do all the data wrangling, and filter by country as late as possible so that I don't have to copy steps for each country, correct?

Best,
Peter
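
To make the "fetch everything, wrangle once, filter late" idea concrete, here is a minimal plain-Python sketch. It is not Dataiku-specific: `fetch_country`, the country codes, the row fields and the tax rule are all hypothetical stand-ins for the real API calls and wrangling steps.

```python
from typing import Callable

COUNTRIES = ["DE", "FR", "ES"]  # hypothetical country codes


def fetch_country(country: str) -> list[dict]:
    """Stand-in for the real API call; returns made-up rows."""
    return [{"order_id": f"{country}-{i}", "amount": 10 * (i + 1)} for i in range(2)]


def fetch_all(countries: list[str],
              fetch: Callable[[str], list[dict]] = fetch_country) -> list[dict]:
    """Call the API once per country and tag every row with its country."""
    rows = []
    for country in countries:
        for row in fetch(country):
            rows.append({**row, "country": country})
    return rows


all_rows = fetch_all(COUNTRIES)

# The shared wrangling runs once on the combined rows (made-up rule) ...
for row in all_rows:
    row["amount_with_tax"] = round(row["amount"] * 1.2, 2)

# ... and the per-country filter is applied only at the very end.
germany = [row for row in all_rows if row["country"] == "DE"]
```

The `country` column added at fetch time is what makes the late filter possible without duplicating any of the steps in between.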

emate
Level 5

I don't want to suggest a "best" solution, as I don't know the project 🙂

But if you know that the data wrangling is the same for all countries (or a large part of it is), I would even filter in the first recipes, to improve refresh/flow performance, and then modify only the specific recipes where the data wrangling differs between countries 🙂
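
A rough plain-Python sketch of that approach, filtering first and then keeping one shared path with small per-country exceptions. The function names, the sample rows and the French-only rule are invented for illustration; in Dataiku these would be recipes rather than functions.

```python
def keep_countries(rows, wanted):
    """Early filter: drop every row for a country we do not need."""
    return [row for row in rows if row["country"] in wanted]


def shared_cleanup(row):
    """Wrangling step that is identical for all countries."""
    return {**row, "city": row["city"].strip().title()}


def fr_only(row):
    """Hypothetical step that only applies to French data."""
    return {**row, "city": row["city"].replace("St ", "Saint ")}


# Only the countries that differ need an entry here.
COUNTRY_STEPS = {"FR": [fr_only]}


def wrangle(rows, wanted):
    out = []
    for row in keep_countries(rows, wanted):
        row = shared_cleanup(row)
        for step in COUNTRY_STEPS.get(row["country"], []):
            row = step(row)
        out.append(row)
    return out


raw = [
    {"country": "FR", "city": " st malo "},
    {"country": "DE", "city": "berlin"},
    {"country": "US", "city": "boston"},
]
result = wrangle(raw, {"FR", "DE"})
```

Here the US row is dropped before any wrangling happens, and only the French row goes through the extra step.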

ebbingcasa
Level 3
Author

OK. I'm more used to having all these steps organised in scripts, but I'd like to do it more visually with Dataiku so that it's easier for everybody else to understand:

Is there a way to apply the same flow multiple times to different input datasets to get different output datasets at the end of a flow? Something like:

{input dataset 1st country | input dataset 2nd country | input dataset 3rd country} => apply same wrangling to each via e.g. Prepare recipe(s) => {output dataset 1st country | output dataset 2nd country | output dataset 3rd country}

I'm thinking that concatenating would worsen performance, whereas splitting/filtering leads to having to copy the same wrangling steps, if I'm not mistaken?
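
In script terms, the pattern described here is one wrangling function mapped over several inputs. A minimal sketch, with made-up dataset names, fields and cleaning rules:

```python
def wrangle(rows):
    """Shared steps: drop rows without revenue, normalise the product name."""
    cleaned = []
    for row in rows:
        if row.get("revenue") is None:
            continue
        cleaned.append({
            "product": row["product"].strip().lower(),
            "revenue": float(row["revenue"]),
        })
    return cleaned


# One input per country (hypothetical names and rows).
inputs = {
    "country_1": [
        {"product": " Widget ", "revenue": "10.5"},
        {"product": "Gadget", "revenue": None},
    ],
    "country_2": [{"product": "WIDGET", "revenue": 7}],
}

# Same logic, separate output per input: nothing is concatenated or copied.
outputs = {name: wrangle(rows) for name, rows in inputs.items()}
```

The wrangling logic exists once, and each country still gets its own output.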

Manuel
Dataiker Alumni

Hi,

You should consider using Dataiku Applications, which let you reuse the same flow with different input parameters or data.

    • Video introduction:

I hope this helps.
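
As an analogy only (this is plain Python, not the Dataiku Applications API), the idea of one flow driven by different parameters can be sketched like this. The dataset names, the `country` variable and the in-memory `storage` dict are all assumptions made for the example.

```python
from string import Template

# Dataset names are derived from a variable instead of being hard-coded.
INPUT_NAME = Template("sales_${country}")
OUTPUT_NAME = Template("sales_${country}_prepared")


def run_flow(variables, storage):
    """Run the same steps against whichever datasets the variables point to."""
    source = INPUT_NAME.substitute(variables)
    target = OUTPUT_NAME.substitute(variables)
    storage[target] = [
        {**row, "country": variables["country"]} for row in storage[source]
    ]
    return target


# In-memory stand-in for the datasets of a project.
storage = {"sales_fr": [{"amount": 5}], "sales_de": [{"amount": 8}]}

# Each run uses the same flow definition with its own parameters.
for country in ("fr", "de"):
    run_flow({"country": country}, storage)
```

Nothing in `run_flow` mentions a specific country, which is what removes the need to copy the flow and find/replace names.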

ebbingcasa
Level 3
Author

That connected the dots to modules in Python etc., thanks a lot!
