Triggering a Dataiku scenario from AWS Lambda (Node.js) on EC2-hosted DSS: best practice and auth approach
Hi Dataiku Community,
We are working on integrating Dataiku DSS (hosted on EC2) with AWS services and would appreciate guidance or documentation references on the recommended approach and authentication practices.
From what we know:
- currently Dataiku DSS is deployed on an EC2 instance (private network)
- DSS is in a different AWS account from the caller
From AWS side:
- AWS Lambda needs to trigger a Dataiku scenario asynchronously, passing a JSON payload message.
- Result/status should be sent back later via a callback
What we are trying to do:
- Call Dataiku from a Node.js Lambda function and trigger a Dataiku scenario (we assume it's written in Python)
- Send a JSON payload with the call: either share the entire JSON or pass parameters (e.g. correlationId, S3 object reference)
- Let the scenario run and optionally send a callback when complete
We are inclined to use REST API calls. It would be helpful if we could get reference documentation or guidance to start an initial implementation of the above use case, covering authentication (using network credentials), connectivity options between AWS Lambda and Dataiku, request/response data, etc.
Thanks in advance!
Answers
-
Turribeach:
You can use the REST API or the Python API, your choice. The Python API is much richer.
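For the Python API route, a minimal sketch could look like the following. It assumes the external client package (dataikuapi) is installed wherever the caller runs, and that the host, API key, project key and scenario ID are placeholders you fill in. The helper name trigger_scenario is hypothetical. The run() call requests a run and returns without waiting for the scenario to finish.

```python
def trigger_scenario(client, project_key, scenario_id, params):
    """Request a scenario run with trigger parameters; does not wait for completion.

    `client` is expected to behave like dataikuapi.DSSClient.
    Returns whatever handle the client gives back for the fired trigger.
    """
    scenario = client.get_project(project_key).get_scenario(scenario_id)
    return scenario.run(params=params)


# Usage sketch (placeholders are assumptions, not real values):
#   import dataikuapi
#   client = dataikuapi.DSSClient("https://<DSS_HOST>", "<API_KEY>")
#   trigger_scenario(client, "<PROJECT_KEY>", "<SCENARIO_ID>",
#                    {"param1": "12345", "s3Path": "s3://my-bucket/input/file.json"})
```

Since your Lambda is Node.js, this would only apply if part of the calling side can run Python; otherwise the plain REST call is the simpler fit.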
-
You can probably try to trigger the scenario with curl and pass parameters in the JSON body, for example:
curl -X POST "https://<DSS_HOST>/public/api/projects/<PROJECT_KEY>/scenarios/<SCENARIO_ID>/run" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{ "param1": "12345", "s3Path": "s3://my-bucket/input/file.json" }'
Inside the Python code of the scenario, you can read them with:
import dataiku.scenario
scenario = dataiku.scenario.Scenario()
params = scenario.get_trigger_params()
correlation_id = params.get("param1")
s3_path = params.get("s3Path")
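For the callback part of the question, one option is a final Python step in the scenario that posts the outcome back to the caller. The sketch below is only an illustration under assumptions: the callbackUrl trigger parameter, the payload fields, and both helper names are hypothetical, and the DSS instance must be able to reach the callback endpoint over the network.

```python
import json
import urllib.request


def build_callback_request(callback_url, correlation_id, status, details=None):
    """Build (but do not send) a JSON POST request reporting the scenario outcome."""
    body = {"correlationId": correlation_id, "status": status}
    if details is not None:
        body["details"] = details
    return urllib.request.Request(
        callback_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_callback(request, timeout=10):
    """Send the request and return the HTTP status code of the callback endpoint."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Inside the last Python step of the scenario (needs the DSS runtime):
#   from dataiku.scenario import Scenario
#   params = Scenario().get_trigger_params()
#   req = build_callback_request(params["callbackUrl"], params.get("param1"), "DONE")
#   send_callback(req)
```

Building the request separately from sending it keeps the payload easy to check without a network call.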