## Can I have a dataset both as input and as output of a recipe (a kind of “Recursive recipe”)?

UserBird Dataiker
###### Can I have a dataset both as input and as output of a recipe (a kind of “Recursive recipe”)?
I'd like to write a recipe where one of the inputs is also an output. The loop is intentional: the dataset should get enriched every time the recipe runs, and should eventually converge.

Accepted Solutions
jrouquie Dataiker
###### Re: Can I have a dataset both as input and as output of a recipe (a kind of “Recursive recipe”)?

It is not possible to have a dataset both as input and as output of a recipe: this would require the user to specify a convergence criterion, as well as a way to write to a dataset while reading from it.

But there is hope!

• If the goal is to enrich a dataset, use two datasets: foo (the input) and foo_enriched (the output).

• If the goal is to iterate until convergence, do the iteration inside a single recipe (for instance a Python recipe) and have the recipe output the dataset after convergence.

• If the goal is to update a dataset on a regular basis (e.g. daily), then partitioning might be the solution.
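
To illustrate the second option, here is a minimal sketch of iterating until convergence inside one Python recipe. The column names (`id`, `parent`, `region`), the enrichment rule, and the helper names `enrich_once` and `iterate_until_converged` are all hypothetical and only there to make the loop concrete. In a real recipe the rows would typically be read once from the input dataset (e.g. `dataiku.Dataset("foo").get_dataframe()`) and written once to the output (e.g. `dataiku.Dataset("foo_enriched").write_with_schema(df)`); plain lists of dicts are used here so the sketch runs anywhere.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]


def enrich_once(rows: List[Row]) -> List[Row]:
    """One enrichment pass (hypothetical rule): a row with no region
    inherits the region of its parent row, if the parent has one."""
    region_by_id = {r["id"]: r["region"] for r in rows}
    enriched = []
    for r in rows:
        new_row = dict(r)  # copy, so the input rows are never mutated
        if new_row["region"] is None and new_row["parent"] is not None:
            new_row["region"] = region_by_id.get(new_row["parent"])
        enriched.append(new_row)
    return enriched


def iterate_until_converged(
    rows: List[Row],
    step: Callable[[List[Row]], List[Row]],
    max_iterations: int = 100,
) -> List[Row]:
    """Apply `step` repeatedly until a pass changes nothing.

    The convergence criterion here is simply "output equals input";
    `max_iterations` guards against a step that never stabilises.
    """
    current = rows
    for _ in range(max_iterations):
        nxt = step(current)
        if nxt == current:
            return nxt
        current = nxt
    raise RuntimeError("no convergence after %d iterations" % max_iterations)


if __name__ == "__main__":
    foo = [
        {"id": "a", "parent": None, "region": "EU"},
        {"id": "b", "parent": "a", "region": None},
        {"id": "c", "parent": "b", "region": None},
    ]
    # In a recipe, this result is what would be written to foo_enriched.
    foo_enriched = iterate_until_converged(foo, enrich_once)
    print(foo_enriched)
```

The whole loop lives in memory inside the recipe, so the flow still sees a single input (foo) and a single output (foo_enriched), with no cycle.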

2 Replies
jereze Dataiker
###### Re: Can I have a dataset both as input and as output of a recipe (a kind of “Recursive recipe”)?

The answer given by jrouquie is correct, but there are also some (unofficial) workarounds:

• Notebooks

• Writing to files (I personally do this to cache API calls)

• SQL
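
As a rough sketch of the file-based workaround for caching API calls: the recipe keeps a small JSON file on the side, reads it at the start of each run, only calls the API for keys it has not seen, and writes the file back at the end. The file path, the `fetch` callable, and the helper names below are hypothetical; how the file is stored (local disk, a managed folder, etc.) is left as an assumption.

```python
import json
import os
from typing import Callable, Dict, Iterable


def load_cache(path: str) -> Dict[str, str]:
    """Return the cache stored at `path`, or an empty cache on first run."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_cache(path: str, cache: Dict[str, str]) -> None:
    """Write to a temporary file, then rename it over the old cache,
    so an interrupted run does not leave a half-written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(cache, f)
    os.replace(tmp_path, path)


def cached_lookup(
    keys: Iterable[str],
    fetch: Callable[[str], str],
    path: str,
) -> Dict[str, str]:
    """Look up every key, calling `fetch` only for keys missing from the cache."""
    keys = list(keys)
    cache = load_cache(path)
    for key in keys:
        if key not in cache:
            cache[key] = fetch(key)  # the expensive API call
    save_cache(path, cache)
    return {key: cache[key] for key in keys}
```

Because the cache file is not declared as a dataset in the flow, the recipe can read and write it in the same run, which is why this counts as an unofficial workaround rather than a supported pattern.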

Jeremy, Product Manager at Dataiku