
How to copy recipe using Python API

Marlan
Neuron

Hello all,

Does anyone have an example of how to copy a recipe using the Python API that you can share with me? I haven't tried to do this myself yet (beyond reading the doc), but I'm hoping I don't have to start from scratch.

I'm considering building a recipe plugin that would execute an existing recipe, replacing the inputs and outputs of the existing recipe with those of the calling recipe. I'm also thinking of making it possible to specify new values for defined project variables.
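The project-variable part of this idea could be sketched roughly as follows. This is only a minimal sketch, assuming `project_object` is a `dataikuapi` `DSSProject` handle whose `get_variables()` returns a dict with `standard` and `local` keys and whose `set_variables()` accepts the same structure. The helper name `temporary_project_variables` is hypothetical and not part of the Dataiku API.

```python
import copy
from contextlib import contextmanager


@contextmanager
def temporary_project_variables(project_object, overrides):
    """Temporarily override standard project variables, restoring them on exit.

    Hypothetical helper: project_object is assumed to behave like a
    dataikuapi DSSProject (get_variables / set_variables).
    """
    original = project_object.get_variables()

    # Work on a copy so the original values can be restored unchanged
    updated = copy.deepcopy(original)
    updated.setdefault("standard", {}).update(overrides)
    project_object.set_variables(updated)

    try:
        yield updated
    finally:
        # Restore the variables even if the wrapped code raised an error
        project_object.set_variables(original)
```

Anything run inside the `with` block, such as building the output of a copied recipe, would see the overridden values. The original values are put back afterwards.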

I've long been unhappy with having to copy and maintain very similar, but not exactly the same, sets of recipes for the training and scoring processes in machine learning projects. Something like what I described might make it possible to have just one set of recipes, say for training, and then just call those from the scoring process (or vice versa).

Thanks!

Marlan

Marlan
Neuron
Author

Thought I would share what I came up with for copying a recipe through the API.

def duplicate_recipe(project_object, exist_name, new_name):
    """
    Duplicate an existing recipe. Duplicated recipe has all of the same settings except that inputs and outputs
    are left unset as we don't want the duplicated recipe and the existing recipe to have the same inputs and
    outputs. New inputs and outputs will be set later by the caller using the add_input and add_output methods.
    Alternatively, the inputs and outputs could be added here via method calls on the builder object.
    """

    # Check if recipe named 'new_name' already exists
    exist_names = [recipe_list_item.name for recipe_list_item in project_object.list_recipes()]
    if new_name in exist_names:
        raise RuntimeError('Recipe {} already exists.'.format(new_name))

    # Get existing recipe
    exist_object = project_object.get_recipe(exist_name)
    exist_settings = exist_object.get_settings()

    # Create skeleton of new recipe (not setting script payload here as it didn't work without an output)
    builder_object = project_object.new_recipe(exist_settings.type, new_name)
    new_object = builder_object.create()

    new_settings = new_object.get_settings()

    # Set script payload if existing recipe had one
    if exist_settings.get_payload() is not None:
        new_settings.set_payload(exist_settings.get_payload())

    # Set other settings to existing recipe values
    # Using API get and set methods would work for some of the settings but not all so instead using the
    # object variable 'recipe_settings' to set all of them.
    skip_keys = [u'name', u'type', u'projectKey', u'inputs', u'outputs', u'creationTag', u'versionTag']
    for key, value in exist_settings.recipe_settings.items():
        if key not in skip_keys:
            new_settings.recipe_settings[key] = value

    new_settings.save()

    return new_object
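
As the docstring notes, the caller still has to attach inputs and outputs to the copy. A minimal sketch of that step follows, assuming the returned object behaves like a `dataikuapi` `DSSRecipe` whose settings expose `add_input(role, ref)` and `add_output(role, ref)`, and that the recipe type uses the `main` role. The helper name `wire_recipe_copy`, the `run` flag, and the dataset names in the usage comment are hypothetical. Running the copy assumes the installed `dataikuapi` version provides `DSSRecipe.run()`.

```python
def wire_recipe_copy(new_recipe, input_refs, output_refs, role="main", run=False):
    """Attach inputs and outputs to a duplicated recipe, optionally running it.

    Hypothetical helper: new_recipe is assumed to behave like a dataikuapi
    DSSRecipe, for example the object returned by duplicate_recipe().
    """
    settings = new_recipe.get_settings()

    # Add each dataset reference under the given role ("main" for most recipes)
    for ref in input_refs:
        settings.add_input(role, ref)
    for ref in output_refs:
        settings.add_output(role, ref)
    settings.save()

    if run:
        # Assumption: DSSRecipe.run() exists in the installed dataikuapi version
        return new_recipe.run()
    return None


# Possible usage inside DSS (recipe and dataset names are placeholders):
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   new_recipe = duplicate_recipe(project, "compute_train_features", "compute_score_features")
#   wire_recipe_copy(new_recipe, ["score_input"], ["score_features"])
```

Keeping this step separate from `duplicate_recipe` means the same copy logic can be reused with whatever inputs and outputs the calling recipe supplies.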


3 Replies
Manuel
Dataiker

Hi Marlan,

Your objective sounds very similar to what "applications as recipes" already achieve. Please have a look at the following resources:

I hope this helps.

Best regards

Dataiku Applications are a kind of DSS customization that allows you to reuse projects. Applications-as-Recipes allow you to package part of a Flow into a r...
Marlan
Neuron
Author

Hi @Manuel,

Thanks for the reply. It's good to be reminded of the application-as-recipe feature. It has fallen off my radar a bit because it can't be used to run SQL recipes entirely in-database, as noted in this post. Almost all of our data is in SQL databases, and given the amount of data, it's only practical to run many of our processes in-database (vs. streaming the data through the DSS backend). But yes, otherwise it would be a good fit for my use case.

I did submit a suggestion in Product Ideas for a way to run applications as recipes entirely in-database.

I've been using recipe plugins as an alternative to applications as recipes because I can write them so that all of the processing runs in-database.

Which leads back to the question about copying recipes. Any suggestions for how to do that? I assume I can figure it out with some experimentation, but I was hoping not to have to.

Thanks!

Marlan

 
