How to copy recipe using Python API

Marlan
Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 321 Neuron

Hello all,

Does anyone have an example of how to copy a recipe using the Python API that you can share with me? I haven't tried to do this myself yet (beyond reading the doc) but hoping I don't have to start from scratch.

I'm considering building a recipe plugin that would execute an existing recipe replacing inputs and outputs of the existing recipe with those of the calling recipe. I'm thinking of also being able to specify new values for defined project variables.
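For the project-variable part of this idea, one possible approach is to override the variables only for the duration of the run and then restore them. This is a minimal sketch, assuming the project object exposes `get_variables()` (returning a dict with `standard` and `local` keys) and `set_variables()`; the helper name `overridden_variables` is hypothetical.

```python
import copy
from contextlib import contextmanager


@contextmanager
def overridden_variables(project_object, overrides):
    """
    Temporarily set standard project variables, restoring the original values on exit.

    Assumes project_object.get_variables() returns a dict shaped like
    {"standard": {...}, "local": {...}} and that set_variables() accepts the same shape.
    """
    original = project_object.get_variables()

    # Work on a copy so the saved original stays untouched for the restore step
    updated = copy.deepcopy(original)
    updated.setdefault("standard", {}).update(overrides)
    project_object.set_variables(updated)

    try:
        yield updated
    finally:
        # Restore even if the wrapped recipe run raises
        project_object.set_variables(original)
```

The calling recipe would wrap the run of the existing recipe in `with overridden_variables(project_object, {...}):`. Note that project variables are shared, so anything else running in the project at the same time would presumably see the overridden values too.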

I've long been unhappy with having to copy and maintain very similar but not identical sets of recipes for the training and scoring processes in machine learning projects. Something like what I described might let me keep just one set of recipes, say for training, and then call those from the scoring process (or vice versa).

Thanks!

Marlan

Best Answer

  • Marlan
    edited July 17 Answer ✓

    Thought I would share what I came up with for copying a recipe through the API.

    def duplicate_recipe(project_object, exist_name, new_name):
        """
        Duplicate an existing recipe. Duplicated recipe has all of the same settings except that inputs and outputs
        are left unset as we don't want the duplicated recipe and the existing recipe to have the same inputs and
        outputs. New inputs and outputs will be set later by the caller using the add_input and add_output methods.
        Alternatively, the inputs and outputs could be added here via method calls on the builder object.
        """
    
        # Check if recipe named 'new_name' already exists
        exist_names = [recipe_list_item.name for recipe_list_item in project_object.list_recipes()]
        if new_name in exist_names:
            raise RuntimeError('Recipe {} already exists.'.format(new_name))
    
        # Get existing recipe
        exist_object = project_object.get_recipe(exist_name)
        exist_settings = exist_object.get_settings()
    
        # Create skeleton of new recipe (not setting script payload here as it didn't work without an output)
        builder_object = project_object.new_recipe(exist_settings.type, new_name)
        new_object = builder_object.create()
    
        new_settings = new_object.get_settings()
    
        # Set script payload if existing recipe had one
        if exist_settings.get_payload() is not None:
            new_settings.set_payload(exist_settings.get_payload())
    
        # Set other settings to existing recipe values
        # Using API get and set methods would work for some of the settings but not all so instead using the
        # object variable 'recipe_settings' to set all of them.
        skip_keys = [u'name', u'type', u'projectKey', u'inputs',  u'outputs', u'creationTag', u'versionTag']
        for key, value in exist_settings.recipe_settings.items():
            if key not in skip_keys:
                new_settings.recipe_settings[key] = value
    
        new_settings.save()
    
        return new_object
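
The docstring leaves setting inputs and outputs to the caller. A minimal sketch of that follow-up step is below, using the `add_input` and `add_output` methods on the recipe settings object. The helper name `wire_recipe` and the `"main"` role default are assumptions for illustration; some recipe types may expect other role names.

```python
def wire_recipe(recipe_object, input_refs, output_refs, role="main"):
    """
    Attach inputs and outputs to a recipe that was created without any.

    recipe_object is expected to behave like the object returned by duplicate_recipe:
    get_settings() returns a settings object with add_input, add_output and save.
    The "main" role is an assumption, not something every recipe type is known to use.
    """
    settings = recipe_object.get_settings()

    for ref in input_refs:
        settings.add_input(role, ref)

    for ref in output_refs:
        settings.add_output(role, ref)

    settings.save()
    return settings
```

Usage might look like `new_recipe = duplicate_recipe(project_object, 'train_features', 'score_features')` followed by `wire_recipe(new_recipe, ['score_input'], ['score_output'])`. The recipe and dataset names here are hypothetical, and the referenced datasets are assumed to exist already.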

Answers

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi Marlan,

    Your objective sounds very similar to what the "application as recipe" feature already achieves. Please have a look at the following resources:

    I hope this helps.

    Best regards

  • Marlan

    Hi @Manuel,

    Thanks for the reply. It's good to be reminded of the application as recipe feature. It has fallen off my radar a bit because it can't be used to run SQL recipes all in database as noted in this post. Almost all of our data is in SQL databases and for many of our processes given the amount of data it's only practical to run these in database (vs. streaming the data through the DSS backend). But yes otherwise it would be a good fit for my use case.

    I did submit a suggestion in product ideas that would provide a way to run application as recipes all in database.

    I've been using recipe plugins as an alternative to application as recipes because I am able to write these so all of the processing is run in database.

    Which leads to the question about copying recipes. Any suggestions for how to do that? I assume I can figure it out with some experimentation but was hoping to not have to do that.

    Thanks!

    Marlan
