Running a scenario with different parameters from API in parallel
We have deployed an API using a Python Function.
It calls a scenario with a custom parameter.
This parameter is used to run the scenario steps for one partition of the data.
The scenario computes additional features and gets predictions from a deployed model.
Since the parameter drives different partitions, can we run multiple API calls in parallel?
In other words, can the same scenario be run in parallel for different partitions?
Please advise.
Operating system used: Linux
Best Answer
-
Hi,
The following example endpoint will run a SQL query using SQLExecutor2.
import dataiku

# Establish the connection to DSS, and set a default project
dataiku.set_remote_dss("http://HOST", "API_SECRET")
dataiku.set_default_project_key("YOUR_PROJECT")

def api_py_function():
    executor = dataiku.SQLExecutor2(connection="YOUR_SQL_CONNECTION")
    df = executor.query_to_df('SELECT * FROM "YOUR_TABLE" LIMIT 10')
    # Return the result as an array
    return df.values.tolist()
Is this what you were looking for?
More information about performing SQL queries: https://doc.dataiku.com/dss/latest/python-api/sql.html
More information about connecting to DSS from an API endpoint: https://doc.dataiku.com/dss/latest/python-api/outside-usage.html#setting-up-the-connection-with-dss
Thanks,
Zach
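On the question of parallel calls, the following is a minimal client-side sketch, not something confirmed in this thread. It fans out one call per partition using Python's standard thread pool. The names `call_partitions_in_parallel`, `call_endpoint`, and `fake_endpoint` are hypothetical; in practice `call_endpoint` would wrap whatever request your client sends to the deployed endpoint. Whether the requests are actually processed concurrently on the server depends on how the endpoint is deployed, which this sketch does not address.

```python
from concurrent.futures import ThreadPoolExecutor


def call_partitions_in_parallel(call_endpoint, partitions, max_workers=4):
    """Call `call_endpoint(partition)` once per partition using a thread pool.

    Returns a dict mapping each partition to its result, or to the
    exception that the call raised.
    """
    partitions = list(partitions)

    def safe_call(partition):
        # Capture failures so one bad partition does not hide the others
        try:
            return call_endpoint(partition)
        except Exception as exc:
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input order of the partitions
        results = list(pool.map(safe_call, partitions))
    return dict(zip(partitions, results))


def fake_endpoint(partition):
    # Hypothetical stand-in for a real request to the deployed endpoint
    return {"partition": partition, "prediction": len(partition)}


results = call_partitions_in_parallel(fake_endpoint, ["2023-01", "2023-02"])
```

Threads suit this case because each call mostly waits on the network; `max_workers` caps how many requests are in flight at once.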
Answers
-
Hi @sujayramaiah,
Unfortunately, it isn't possible to run the same scenario multiple times in parallel.
If you want to build multiple partitions at once, I recommend redesigning your scenario so that it can build multiple partitions in a single run.
Thanks,
Zach
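One way the single-run suggestion above might look from the calling side is sketched here. It assumes the scenario has been redesigned to read a custom parameter (hypothetically named `partitions`) and that its build steps accept a comma-separated list of partition identifiers; neither is confirmed in this thread. The helper names are also hypothetical. `scenario` is expected to be a `dataikuapi` scenario handle, for example from `client.get_project(project_key).get_scenario(scenario_id)`.

```python
def build_partition_spec(partitions):
    """Join partition identifiers into one comma-separated string."""
    return ",".join(str(p) for p in partitions)


def run_scenario_for_partitions(scenario, partitions, param_name="partitions"):
    """Trigger a single scenario run that covers several partitions.

    `scenario` must expose run_and_wait(params=...), as a dataikuapi
    scenario handle does. `param_name` is a hypothetical custom parameter
    that the redesigned scenario would have to read.
    """
    spec = build_partition_spec(partitions)
    # One run for all partitions, instead of one run per partition
    return scenario.run_and_wait(params={param_name: spec})
```

This keeps a single scenario run in flight at a time, which matches the constraint described above.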
-
tgb417
I'm working on something similar right now. Can you say a bit more about what you have in mind?
-
sujayramaiah
Thanks for getting back, ZachM.
Since we are trying to get realtime predictions from our API deployment, grouping a bunch of partitions so they can be processed by one scenario is not an option at this time for us.
As an alternative, we are trying to run custom Python code from the Code Library in an API endpoint, to avoid scenarios completely. This code will not update any datasets. Can this function be run in parallel?
Using dataikuapi, we are able to execute a function from the library. But when we need to access a SQL connection object from the project to execute a SQL query, how can we execute it within the project?
We are able to get a handle to the project by specifying the host and api_key as shown below.
client = dataikuapi.DSSClient(host, api_key)
project = client.get_project(project_key)
How can we execute a custom Python function? Please advise.
-
sujayramaiah
Thanks a lot @ZachM!!! That worked!