Changing older projects
I have some older projects that were built when I had a weaker understanding of what you could do in Dataiku, and I have identified some redundancies.
How simple is it to reroute the starting dataset for a flow? How would you set it up so that the old processes keep working but, going forward, the flow uses the new data source?
Answers
-
Hi,
You can change the input(s) of a recipe to point to another dataset that you have already imported into the Flow. To do so, open the "Input/Output" tab in your recipe screen.
Keep in mind that the downstream processing will only work if the new "starter" dataset doesn't introduce breaking changes, especially in the schema (column names and types).
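To make the schema point concrete, here is a minimal sketch in plain Python of the kind of check you could run before swapping the input. It does not call the Dataiku API. The function name `find_breaking_changes`, the column names, and the list-of-dicts schema layout are all assumptions made up for illustration, and it assumes that extra columns in the new dataset are harmless while missing or retyped columns are not.

```python
def find_breaking_changes(old_columns, new_columns):
    """Return readable problems that would likely break downstream recipes.

    Both arguments are lists of {"name": ..., "type": ...} dicts
    (an assumed layout, not read from a real Dataiku instance).
    """
    new_types = {col["name"]: col["type"] for col in new_columns}
    problems = []
    for col in old_columns:
        name, old_type = col["name"], col["type"]
        if name not in new_types:
            # A column the downstream recipes may rely on has disappeared.
            problems.append(f"missing column: {name}")
        elif new_types[name] != old_type:
            # Same column name, but its storage type differs.
            problems.append(
                f"type changed for {name}: {old_type} -> {new_types[name]}"
            )
    # Columns that exist only in the new dataset are not reported.
    return problems


# Hypothetical schemas, invented for this example.
old_schema = [
    {"name": "order_id", "type": "bigint"},
    {"name": "amount", "type": "double"},
    {"name": "order_date", "type": "date"},
]
new_schema = [
    {"name": "order_id", "type": "bigint"},
    {"name": "amount", "type": "string"},
    {"name": "region", "type": "string"},
]

for problem in find_breaking_changes(old_schema, new_schema):
    print(problem)
```

If the list comes back empty, the swap is less likely to break anything downstream, although the data itself can still differ in ways a schema comparison will not catch.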
Best,
Harizo
-
Is there a way to keep the old dataset included as well (an either/or)?
Maybe with a join?
-
Hi,
If you want to "vertically" concatenate records from older data with more recent ones, assuming that the column names and types do not change, you can use a "Stack" recipe. This differs from the "Join" recipe, which "horizontally" combines columns from multiple datasets by matching records on a common dimension called the "join key".
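The difference between the two can be sketched in plain Python, with each dataset as a list of row dictionaries. This only illustrates the idea and is not how the recipes are implemented. The helpers `stack` and `inner_join` and the sample data are hypothetical, and the join is simplified to an inner join.

```python
def stack(older, newer):
    """Vertical concatenation: same columns, rows of `newer` after `older`."""
    rows = list(older) + list(newer)
    if rows:
        expected = set(rows[0])
        for row in rows:
            # Stacking only makes sense when every row has the same columns.
            if set(row) != expected:
                raise ValueError(
                    f"column mismatch: {sorted(row)} vs {sorted(expected)}"
                )
    return rows


def inner_join(left, right, key):
    """Horizontal combination: match rows on `key` and merge their columns."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        # Left rows with no match on the key are dropped (inner join).
        for right_row in right_by_key.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# Hypothetical data, invented for this example.
orders_old = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 25.5}]
orders_new = [{"order_id": 3, "amount": 7.25}]
customers = [
    {"order_id": 1, "customer": "Ada"},
    {"order_id": 3, "customer": "Grace"},
]

all_orders = stack(orders_old, orders_new)               # more rows, same columns
enriched = inner_join(all_orders, customers, "order_id")  # more columns per row

print(len(all_orders), "stacked rows")
print(enriched)
```

Stacking grows the dataset in rows and leaves the columns alone, which is what keeping old and new records side by side needs. Joining adds columns to matching rows and would not append the older records.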
Best,
Harizo