
API to get the Spark config of a recipe


Hi Team,

I'm trying to fetch the Spark configuration of a recipe using the Dataiku Python API, but I can only extract the config for the Spark native engine. I have tried recipe.status.get_selected_engine_details(), but it does not tell me which Spark configuration (default, yarn-large, yarn-extra-large) the engine uses.

Sample code snippet is below.

import dataiku

client = dataiku.api_client()
project = client.get_project(project_name)
recipes = project.list_recipes()

for recipe in recipes:
    # This path only yields a value for recipes on the Spark native engine
    spark_config = recipe['params']['engineParams']['spark']['sparkConfig']['inheritConf']


Any leads are appreciated. Thank you!

3 Replies

Hi nmahdu20,

In DSS, each recipe running on top of Spark points to a Spark configuration that is accessible through the instance settings. If you want to retrieve the name and details of this configuration, here is a simple way of doing so via the API:

import dataiku

client = dataiku.api_client()
project = client.get_project(YOUR_PROJECT_KEY)

# Get the name of the Spark configuration used by your recipe
rcp_spark_conf = project.get_recipe(YOUR_RECIPE_ID) \
    .get_settings() \
    .raw_params \
    .get("sparkConfig")
print("The Spark configuration for recipe {} is called '{}'".format(YOUR_RECIPE_ID, rcp_spark_conf))

# Retrieve all existing Spark settings at the instance level
instance_spark_confs = client.get_general_settings() \
    .get_raw() \
    .get("sparkSettings")

# Look up the config used by your recipe
target_spark_conf = next(filter(lambda x: x["name"] == rcp_spark_conf, instance_spark_confs))

# Print the key-value pairs of your Spark execution configuration
target_spark_exec_conf = {x["key"]: x["value"] for x in target_spark_conf["conf"]}
print(target_spark_exec_conf)
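
The lookup above uses next(filter(...)), which raises StopIteration when no configuration matches the recipe's name (for example, when the recipe has no sparkConfig value at all). A minimal sketch of a more forgiving lookup follows. It assumes each configuration is a dict with "name" and "conf" entries, as the snippet above implies; the helper name find_spark_conf and the sample data are hypothetical and only illustrate the shape.

```python
def find_spark_conf(instance_confs, conf_name):
    """Return {key: value} for the named configuration, or None if it is not found."""
    # next() with a default avoids StopIteration when nothing matches
    match = next((c for c in instance_confs if c.get("name") == conf_name), None)
    if match is None:
        return None
    return {entry["key"]: entry["value"] for entry in match.get("conf", [])}


# Hypothetical sample data mirroring the structure assumed above
sample_confs = [
    {"name": "default", "conf": [{"key": "spark.executor.memory", "value": "2g"}]},
    {"name": "yarn-large", "conf": [{"key": "spark.executor.memory", "value": "8g"}]},
]

print(find_spark_conf(sample_confs, "yarn-large"))
print(find_spark_conf(sample_confs, None))
```

Returning None instead of raising lets the caller decide how to report recipes that do not point to any named configuration.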


Hope this helps.




Note that you can even get better results visually, by selecting the "Spark configurations" view at the bottom left of your Flow screen. DSS will colorize the Spark-based recipes according to the configuration they are using, and you can easily look up the execution settings in the Administration > Settings > Spark section of your instance. 




Hi HarizoR,

Thank you for your reply.

I tried the first code sample but ran into two issues:

  1. Not all recipes have a sparkConfig key in their JSON dict, so the get() line throws an error.
  2. I don't have admin privileges to execute the line instance_spark_confs = client.get_general_settings().

Additionally, the visual solution would be easier, but we are building code that needs to store the Spark configs of all recipes in a final dataset. So far we can extract Spark config details only for the spark_native engine (Spark optimized).
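
For the missing-key problem, one possible approach is to walk the nested dict defensively and record None for recipes that have no Spark settings, so every recipe still produces a row for the final dataset. The sketch below is plain Python and does not call the Dataiku API: the key path comes from the snippet in the original question, while the helper names (get_nested, spark_conf_rows), the "name" and "type" keys, and the sample recipes are assumptions made for illustration. It does not address the admin-privilege issue.

```python
# Key path taken from the snippet in the original question
SPARK_CONF_PATH = ["params", "engineParams", "spark", "sparkConfig", "inheritConf"]


def get_nested(data, path, default=None):
    """Follow a list of keys through nested dicts, returning default on any miss."""
    current = data
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def spark_conf_rows(recipes):
    """Build one row per recipe; spark_config is None when the path is absent."""
    return [
        {
            "recipe": recipe.get("name"),   # assumed key
            "type": recipe.get("type"),     # assumed key
            "spark_config": get_nested(recipe, SPARK_CONF_PATH),
        }
        for recipe in recipes
    ]


# Hypothetical recipe dicts: one with Spark settings, one without
sample_recipes = [
    {
        "name": "compute_a",
        "type": "grouping",
        "params": {"engineParams": {"spark": {"sparkConfig": {"inheritConf": "yarn-large"}}}},
    },
    {"name": "compute_b", "type": "python", "params": {}},
]

for row in spark_conf_rows(sample_recipes):
    print(row)
```

The resulting list of dicts can be turned into a table with whatever tool you already use to write the final dataset.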
