My understanding is that API services can be used to expose models (input: feature values, output: prediction). What about exposing a full flow? E.g. I’d like to expose, through a single API request, a flow that:
- Takes multiple CSVs as input. The CSVs can't be referenced in the flow, as they are generated outside of DSS just before I need to consume the flow.
- Executes the flow using these CSVs as input to clean and join them.
- Returns the flow's last output (the joined dataset) as a CSV.
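For context, the clean-and-join step the flow would perform can be sketched in plain Python with the standard library. This is only an illustration of the logic, not DSS code; the join key `id` and the cleaning rules are hypothetical:

```python
import csv
import io

def clean_rows(rows):
    """Strip whitespace from headers/values and drop rows missing the join key."""
    for row in rows:
        cleaned = {k.strip(): (v or "").strip() for k, v in row.items()}
        if cleaned.get("id"):
            yield cleaned

def join_csvs(left_text, right_text, key="id"):
    """Inner-join two CSVs (passed as text) on `key`; return the joined CSV text."""
    left = list(clean_rows(csv.DictReader(io.StringIO(left_text))))
    right = {r[key]: r for r in clean_rows(csv.DictReader(io.StringIO(right_text)))}
    out = io.StringIO()
    writer = None
    for row in left:
        match = right.get(row[key])
        if match is None:
            continue  # inner join: keep only ids present in both files
        merged = {**row, **match}
        if writer is None:
            writer = csv.DictWriter(out, fieldnames=list(merged))
            writer.writeheader()
        writer.writerow(merged)
    return out.getvalue()
```

In DSS itself these steps would live in prepare and join recipes rather than hand-written code; the point is only that the whole pipeline is a function from input CSVs to one output CSV.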
This sounds like a use case for Dataiku Applications. The intuition behind this feature is to let users create templated, parametrized projects where they can define:
- which input(s) to use and how to provide them (e.g. by uploading a CSV or fetching a SQL table)
- which output(s) are desired: these can be the final Dataset of a Flow exported as a CSV, a dashboard, etc.
- what actions are to be performed via a scenario (e.g. build the whole Flow, compute metrics/checks, etc.)
Once the template project (which is called a Dataiku Application) is defined, users can create instances out of it, very much like how you would instantiate objects from a class in object-oriented programming.
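Once an instance exists, the whole cycle (upload the externally generated CSVs, rebuild the Flow via a scenario, retrieve the joined output) can be driven programmatically with the `dataikuapi` public API client. The sketch below assumes the instance exposes a managed folder for uploads and a scenario that rebuilds the Flow; every identifier (`folder_id`, `scenario_id`, `output_dataset`, etc.) is a placeholder for your own project's objects, not a fixed name:

```python
import csv
import io

def run_flow_with_csvs(host, api_key, project_key, folder_id,
                       scenario_id, output_dataset, csv_paths):
    """Upload CSVs into a managed folder, run the build scenario,
    and return the joined output dataset as CSV text.

    All identifiers here are hypothetical placeholders.
    """
    import dataikuapi  # imported inside so the sketch loads without the package

    client = dataikuapi.DSSClient(host, api_key)
    project = client.get_project(project_key)

    # 1. Push the externally generated CSVs into the instance's managed folder
    folder = project.get_managed_folder(folder_id)
    for path in csv_paths:
        with open(path, "rb") as f:
            folder.put_file(path.rsplit("/", 1)[-1], f)

    # 2. Rebuild the Flow through a scenario and wait for it to finish
    project.get_scenario(scenario_id).run_and_wait()

    # 3. Stream the joined dataset back out as CSV
    dataset = project.get_dataset(output_dataset)
    columns = [c["name"] for c in dataset.get_schema()["columns"]]
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(columns)
    for row in dataset.iter_rows():
        writer.writerow(row)
    return out.getvalue()
```

If you need this behind a single HTTP endpoint, you could wrap a function like this in a small web service of your own, since DSS API services themselves are geared toward per-record model scoring rather than whole-Flow execution.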