
Flow Zone Reuse? Can one flow zone be reused from multiple datasets

tgb417
Neuron

I've been working on a number of file systems to process data about the files on various disks.  The flow zone I've created looks like this.

[Image: a flow zone with 5 datasets, 2 Shell recipes, 2 visual Prepare recipes, and 1 Join recipe]

 

There are two simple shell scripts that gather data about the same file volume, two preparation recipes that clean up that data, and one Join recipe that brings both sets of data about that file system together.

For each file system I'm processing, I'm creating a new flow zone with the same steps. I have to repeat this for a number of volumes attached to my host, and these volumes will change over time. The only real difference between the zones is the path the files come from.
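For illustration, the two-branch flow described above could be sketched in Python with the volume path as the only parameter. This is a standard-library sketch, not the shell scripts used in the actual flow; the function names `file_stats`, `file_md5s`, and `join_on_path` are hypothetical, and it assumes every file under the root is readable.

```python
import hashlib
import os
from pathlib import Path


def file_stats(root):
    """Fast pass: size in bytes and modification time for each file under root."""
    rows = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.stat()
            rows[str(path)] = {"size": info.st_size, "mtime": info.st_mtime}
    return rows


def file_md5s(root, chunk_size=1 << 20):
    """Slow pass: MD5 digest of each file under root, read in 1 MiB chunks."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            md5 = hashlib.md5()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    md5.update(chunk)
            digests[str(path)] = md5.hexdigest()
    return digests


def join_on_path(stats, digests):
    """Inner join of the two passes, keyed on file path."""
    return [
        {"path": p, **stats[p], "md5": digests[p]}
        for p in sorted(stats.keys() & digests.keys())
    ]
```

Because the two passes are separate functions, they can be run on different schedules and joined afterwards, which mirrors the two branches of the zone.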

Part of the reason to break the flow zone into two paths at the beginning is that I want to be able to control when each of these recipes runs. The MD5 shell scripts take about 3/4 of an hour for every 100 GB that I need to process, whereas pulling the stats about the files takes 1/10 of that time.
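To make the difference between the two branches concrete, here is a rough estimate based on the figures quoted above. It assumes run time scales linearly with volume size, which is an assumption and not something measured; the function name `estimate_minutes` and the 500 GB example volume are hypothetical.

```python
# Figures quoted in the post: MD5 takes about 3/4 of an hour per 100 GB,
# and the file stats take about 1/10 of that time.
MD5_MINUTES_PER_100_GB = 45.0
STATS_DIVISOR = 10.0


def estimate_minutes(volume_gb):
    """Rough run time in minutes for each branch, assuming linear scaling."""
    md5 = volume_gb / 100.0 * MD5_MINUTES_PER_100_GB
    stats = md5 / STATS_DIVISOR
    return {"md5": md5, "stats": stats}


print(estimate_minutes(500))  # hypothetical 500 GB volume
```

Under these assumptions a 500 GB volume needs about 225 minutes for the MD5 branch and about 22.5 minutes for the stats branch, which is why scheduling them separately is useful.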

My question for the community: has anyone worked out a way to reuse a single flow zone for multiple datasets? This would save a lot of time copying and managing multiple flow zones, and it would increase reliability because all datasets would get the same processing.

Thoughts?  


Operating system used: Mac OS 10.15.7

--Tom
7 Replies
KeijiY
Dataiker

Hello Tom,

Thank you so much for the post on Community.

DSS has the "Application-as-recipe" feature, which packages a project's flow into a reusable recipe for other datasets. By defining a flow in a project and converting it into an Application-as-recipe, you can reuse the flow for other datasets. This feature might be useful for your use case. Please see this DSS document https://doc.dataiku.com/dss/latest/applications/application-as-recipe.html for the details of this feature.

I hope this helps.

Sincerely,
Keiji, Dataiku Technical Support

tgb417
Neuron
Author

@KeijiY 

Application-as-recipe looks promising. However, I feel like I need a few more details about how to use this feature set.  Is there any training material or more detailed examples on the use of this feature?

--Tom

Oh, I think I may have found some additional information that might be useful to me to get started. https://knowledge.dataiku.com/latest/courses/o16n/dataiku-applications/create-app-as-recipe.html

--Tom
KeijiY
Dataiker

@tgb417 Yes, you can refer to the knowledge base article you mentioned for the details of the feature. Please let us know if you have any further questions regarding the feature.

Sincerely,
Keiji, Dataiku Technical Support

tgb417
Neuron
Author

@KeijiY 

How do code updates work in the Application-as-recipe scenario?

I create the application-as-recipe.  I put it into production. Now I find a bug in the application or a case that did not work as expected.  How do I update that application-as-recipe?

--Tom

KeijiY
Dataiker

Hello @tgb417,

When you run an Application-as-recipe, the latest version of the project's flow is automatically copied and used [DSS doc]. So it should be enough to update the flow in the project behind the Application-as-recipe.

tgb417
Neuron
Author

@KeijiY ,

I’ve worked my way through the knowledge base article on Application-as-recipe, and I have been able to make it work as written. However, it is a very specific multi-step recipe for turning a section of the Haiku T-Shirt example into a reusable component.

I’m having a hard time generalizing this specific set of instructions into a better understanding of Dataiku applications, and more specifically Application-as-recipe, so that I can create my own to do the kind of thing I’d like to do. Is there a more general set of instructional material on this subject that would explain the purpose of each of the steps listed, with an eye to being able to create other such recipes?

cc: @CoreyS 

--Tom
KeijiY
Dataiker

Hello @tgb417,

Thank you so much for the feedback. I have shared the feedback with our Educational Services team.

We really appreciate your feedback.

Sincerely,
Keiji, Dataiku Technical Support