Allow custom copy to output in application-as-recipe

Currently, an application-as-recipe copies its result to the output table in what is essentially a sync step, as noted in this post. There is a good rationale for this approach, as it accommodates the full range of data connection types. However, it prevents an application-as-recipe from running entirely in-database and thus precludes use cases in which only in-database execution is practical.

The idea is to provide a setting for an application-as-recipe that leaves the copy step up to the application developer. The application developer would then specify a final Execute Python step in the application scenario that handles the copy to the output dataset. Being able to specify this final Python step would let the application developer make the copy to the output run in-database. This approach extends the one described in this post.

The application developer would be responsible for either ensuring that the application-as-recipe is used in expected ways (e.g., all steps run on the same database platform) or otherwise handling the variations that might come up.
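
To make the proposal concrete, here is a rough sketch of what such a final Execute Python step might look like. It is only an illustration under several assumptions, not an existing Dataiku feature: the helper names (quote_ident, qualified, build_copy_sql, copy_in_database) are hypothetical, the output table is assumed to already exist with matching columns, double-quoted identifiers are assumed to work on the target database, and the way the statement is pushed through SQLExecutor2 (as a pre-query of a trivial SELECT) is one pattern that may need adjusting per database.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified(schema, table):
    """Return "schema"."table", or just "table" when no schema is set."""
    if schema:
        return quote_ident(schema) + "." + quote_ident(table)
    return quote_ident(table)


def build_copy_sql(src_schema, src_table, dst_schema, dst_table, columns):
    """Build an INSERT ... SELECT so the copy never leaves the database."""
    cols = ", ".join(quote_ident(c) for c in columns)
    return "INSERT INTO {dst} ({cols}) SELECT {cols} FROM {src}".format(
        dst=qualified(dst_schema, dst_table),
        cols=cols,
        src=qualified(src_schema, src_table),
    )


def copy_in_database(source_name, output_name):
    """Hypothetical final step: copy source dataset to output in-database.

    Assumes both datasets are SQL tables and that the output table exists.
    """
    # Imported here so the helpers above can be tested outside Dataiku.
    import dataiku
    from dataiku import SQLExecutor2

    source = dataiku.Dataset(source_name)
    output = dataiku.Dataset(output_name)
    src_info = source.get_location_info()["info"]
    dst_info = output.get_location_info()["info"]

    # The developer's responsibility: refuse to run across connections.
    if src_info.get("connectionName") != dst_info.get("connectionName"):
        raise ValueError("Source and output are on different connections")

    columns = [col["name"] for col in source.read_schema()]
    sql = build_copy_sql(
        src_info.get("schema"), src_info["table"],
        dst_info.get("schema"), dst_info["table"],
        columns,
    )

    # Assumption: run the DML as a pre-query of a trivial SELECT and commit.
    executor = SQLExecutor2(connection=dst_info["connectionName"])
    executor.query_to_df("SELECT 1", pre_queries=[sql], post_queries=["COMMIT"])
```

The connection check is where the developer would either enforce the expected usage or branch to a fallback (for example, a regular sync) when the input and output live on different platforms.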

cc @pvannies @akshaykatre @fchataigner2

1 Comment

Fully support this idea. We have discussed with Dataiku that the remapping of the connections could also happen automatically (based on the input dataset). But leaving the final step up to the application developer does solve the problem for the last copy/sync step to the output dataset.
