Can a Recipe in a Flow be used to help 'backup' the data to assist with reruns?
I'm wondering if there is a way to use something like a Sync or Export recipe to provide a kind of roll back / undo facility for an output dataset. Although this applies to the data at the end of a flow, it would need to run first rather than last, i.e.:
1. Step 1 - take a copy of the output data from the last time the flow ran
2. Step 2 - run the flow to add / replace the output data
3. Optional step 3 - manually copy the step 1 data back into place if needed to rerun the job
I suppose this is more of a database feature but I was trying to see if this could be achieved in the Flow somehow (or in another Flow?).
One option is to use Sync recipes: one at the start to copy the input, and another at the end to copy the output. You can also use a Sync recipe for step 2 (Sync can be set to replace or append). When you rerun, you can decide what to build (one option is Force Build, which executes the whole flow).
Another option is to use scenarios. With them you can set your three steps:
1. A Python code step that duplicates the output dataset (you can also do this manually by copying the dataset in the user interface).
2. A Build/Train step that builds the flow (you can also use Python code here if you want to do some complex checks, or use a normal Python code recipe in the flow itself).
3. Use the Export option to "copy and paste" the backup dataset (generated in step 1) to another dataset (again, you can do this with Python to make it automatic).
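To make the backup/build/rollback pattern concrete, here is a minimal, platform-agnostic sketch of the logic in plain Python. Inside Dataiku you would read and write actual datasets (e.g. via the `dataiku.Dataset` API in a scenario's Python step), but here plain files stand in for them; the directory and file names are purely hypothetical.

```python
# Sketch of the backup / rebuild / rollback pattern, using files as
# stand-ins for datasets. All paths and names are hypothetical.
import shutil
from pathlib import Path

def backup_output(output: Path, backup: Path) -> None:
    """Step 1: keep a copy of the previous run's output."""
    if output.exists():
        shutil.copyfile(output, backup)

def build_flow(output: Path, new_rows: list) -> None:
    """Step 2: the flow rebuilds (here: overwrites) the output."""
    output.write_text("\n".join(new_rows))

def rollback(output: Path, backup: Path) -> None:
    """Optional step 3: restore the previous output from the backup."""
    shutil.copyfile(backup, output)

# Usage: simulate one run, then roll it back.
work = Path("work_dir_example")  # hypothetical working directory
work.mkdir(exist_ok=True)
output = work / "output.csv"
backup = work / "output_backup.csv"

output.write_text("old_row_1\nold_row_2")  # result of the previous run
backup_output(output, backup)              # step 1: back it up
build_flow(output, ["new_row_1"])          # step 2: rebuild the output
rollback(output, backup)                   # step 3: undo the rebuild
print(output.read_text())                  # the old data is back in place
```

The same three functions map one-to-one onto the scenario steps above: the key design point is that the backup step always runs before the build step, so a failed or unwanted rebuild can be reversed.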
If you need more information about something let me know 🙂
Thanks for this, I've created my first scenario with two steps: the first creates the "backup" of the old output, the second builds the new output. I can "roll back" if required by exporting the backup to overwrite the output. This does what I need - simple!