
Getting Scenario Macro Steps parameters

hromo95
Level 1

Hello

I would like to use Python to collect the parameters of all my scenario steps into a list. I am new to this, so I am looking for help.

Ignacio_Toledo

Hi @hromo95,

Welcome to the community. If you are just starting, I'd recommend going through these excellent links:

After that, the Python API for scenarios can be found here: https://doc.dataiku.com/dss/latest/python-api/scenarios.html

One way to get to your step definitions (and, I suppose, their parameters) would be something like this, assuming the project ID is 'project01' and the scenario ID is 'scenario01':

 

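import dataiku

# Connect to the DSS instance (works from a notebook or recipe inside DSS)
client = dataiku.api_client()
p1 = client.get_project('project01')
sce1 = p1.get_scenario('scenario01')
sce1_settings = sce1.get_settings()
# For a step-based scenario, raw_steps holds the list of step definitions
sce1_settings.raw_steps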
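If you want the parameters as a flat list, a minimal sketch could look like the one below. It assumes each entry of raw_steps is a dict with 'name', 'type' and 'params' keys; that layout is an assumption, so check it against what your own instance returns. The helper name flatten_step_params and the sample data are made up for illustration.

```python
def flatten_step_params(raw_steps):
    """Return one row per (step, parameter) pair from a list of step dicts."""
    rows = []
    for index, step in enumerate(raw_steps):
        # Assumed layout: the step's parameters live under the 'params' key
        params = step.get("params") or {}
        for key, value in params.items():
            rows.append({
                "step_index": index,
                "step_name": step.get("name"),
                "step_type": step.get("type"),
                "param": key,
                "value": value,
            })
    return rows


# Made-up sample standing in for sce1_settings.raw_steps
sample_steps = [
    {"name": "build", "type": "build_flowitem",
     "params": {"jobType": "RECURSIVE_BUILD"}},
    {"name": "macro", "type": "runnable",
     "params": {"runnableType": "example", "config": {"a": 1}}},
]
rows = flatten_step_params(sample_steps)
for row in rows:
    print(row)
```

On a real instance you would pass sce1_settings.raw_steps instead of sample_steps, and the resulting list of dicts can be handed to pd.DataFrame if a table is more convenient.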

 

Hope this helps you to start!

hromo95
Level 1
Author

Thank you for your response. I have now obtained all the parameters for every step of the 1835 scenarios I have in Dataiku. What I am looking for next is just the parameters of the steps that were part of each scenario's last run. How can I do this?

Ignacio_Toledo

I'm not sure I understand what you are actually trying to do, so I might be giving you the wrong answer, but if I wanted a list of the times when the scenarios in a DSS instance were last run, I'd do this:

 

import dataiku
import pandas as pd

client = dataiku.api_client()
# all_users=True needs admin rights; remove the option if you are not an admin
scenarios = client.list_running_scenarios(all_users=True)
data_for_df = []
for s in scenarios:
    # Each entry describes its target scenario in the payload
    targets = s['payload']['targets'][0]
    project_id = targets['projectKey']
    scenario_id = targets['objectId']
    # startTime is an epoch timestamp in milliseconds
    startTime = pd.Timestamp(s['startTime'], unit='ms')
    data_for_df.append([startTime, project_id, scenario_id])

# Most recent runs first
scenarios_df = pd.DataFrame(data_for_df, columns=['startTime', 'projectId', 'scenarioId']).sort_values('startTime', ascending=False)

 

This will give you a list of the scenarios that are currently running or that ran within the last few hours. I can't find in the documentation how far into the past it looks, but from empirical evidence it seems to cover about the last 6 hours.
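If what you need is the step parameters of each scenario's last run, something along these lines might work. It is only a sketch: it assumes the scenario object offers get_last_runs(), that each run offers get_details(), and that the details are dict-like with a 'stepRuns' list whose items carry the step definition under 'step'. Please verify those key names against the scenarios API documentation for your DSS version. The helper name last_run_step_params is made up.

```python
def last_run_step_params(scenario):
    """Return (step name, step type, params) for each step of the latest run.

    Assumption: `scenario` behaves like a dataikuapi scenario handle, and the
    run details are dict-like with a 'stepRuns' list whose entries hold the
    step definition under the 'step' key.
    """
    runs = scenario.get_last_runs(limit=1)
    if not runs:
        return []  # the scenario has never run
    details = runs[0].get_details()
    result = []
    for step_run in details.get("stepRuns", []):
        step = step_run.get("step", {})
        result.append((step.get("name"), step.get("type"), step.get("params", {})))
    return result


# Usage on a DSS instance (not executed here):
# import dataiku
# client = dataiku.api_client()
# sce1 = client.get_project('project01').get_scenario('scenario01')
# print(last_run_step_params(sce1))
```

To cover every scenario, you could loop over your projects and scenarios and call the helper on each one, collecting the results in a list as in the snippet above.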
