Dataiku Python API

KBS
KBS Registered Posts: 2 ✭✭✭

Hello,

I'm trying to use the Dataiku API to:

1. Create a zone.

2. Share some existing datasets into the new zone, then create recipes that take the newly shared datasets as input.

The problem is that when I share the datasets and create the recipes with the API, the recipes do not appear in the new zone in the flow. They appear in the source zone that the datasets were shared from.

Is there any way to use the API to build a whole flow in a new zone from shared datasets?

Thanks for your help,

BR


Operating system used: Windows

Answers

  • AlexandreL
    AlexandreL Dataiker Posts: 36 Dataiker
    edited July 17

    Hi, can you give me a bit more information on how you are approaching this (a code snippet and your DSS version would be ideal)? I tested the following code on my side, and both the new recipe and its output dataset are created in the new zone:

    import dataiku
    
    # Set up necessary handles
    client = dataiku.api_client()
    project = client.get_project("PROJECT_ID")
    flow = project.get_flow()
    
    # Create zone and add dataset
    new_zone = flow.create_zone("ZONE_NAME")
    dataset = project.get_dataset("INPUT_DATASET_NAME")
    new_zone.add_item(dataset)
    
    # Create recipe (sync in the example)
    builder = project.new_recipe("sync")
    builder = builder.with_input("INPUT_DATASET_NAME")
    builder = builder.with_new_output("OUTPUT_DATASET_NAME", "CONNECTION_NAME", format_option_id="FORMAT_ID")
    
    recipe = builder.create()
    job = recipe.run()

    The snippets I used for this test come from these parts of the documentation:

    https://doc.dataiku.com/dss/latest/python-api/flow.html#working-with-flow-zones

    https://doc.dataiku.com/dss/latest/python-api/flow.html#creating-a-sync-recipe

  • KBS
    KBS Registered Posts: 2 ✭✭✭
    edited July 17

    Hi AlexandreL,

    Thanks for your reply.

    I tried this, but the problem is that the dataset is moved from its initial zone to the new one. I want to keep the source dataset in its original zone and have a sort of 'copy' of that dataset in the new zone.
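
The difference between moving and sharing can be sketched as follows. This is a minimal sketch, not a confirmed solution: it assumes the zone handle exposes `add_shared` alongside `add_item`, and the helper name `share_datasets_to_new_zone` and its parameters are made up for illustration. Check the method names against the API reference for your DSS version.

```python
def share_datasets_to_new_zone(project, zone_name, dataset_names):
    """Create a zone and share (not move) existing datasets into it.

    `project` is a project handle, e.g. obtained with
    dataiku.api_client().get_project("PROJECT_ID").
    Hypothetical helper, written for illustration only.
    """
    flow = project.get_flow()
    zone = flow.create_zone(zone_name)
    for name in dataset_names:
        dataset = project.get_dataset(name)
        # add_item(dataset) would move the dataset out of its current zone.
        # Assumption: add_shared(dataset) leaves it where it is and only
        # makes it visible in the new zone.
        zone.add_shared(dataset)
    return zone
```

Called as `share_datasets_to_new_zone(project, "ZONE_NAME", ["INPUT_DATASET_NAME"])`, this should leave the source dataset in its original zone.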

    I have tried the add_shared method from the API, and the datasets are shared. But once they are shared, if I create a recipe for each shared dataset, the recipe appears in the initial zone and not in the new one.

    for item in new_zone.shared:
        builder = project.new_recipe("sync")
        builder = builder.with_input(item.name)
        builder = builder.with_new_output("OUTPUT_%s" % item.name, "CONNECTION_NAME", format_option_id="FORMAT_ID")
        recipe = builder.create()
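
One possible workaround, offered as an untested sketch, is to move each recipe into the target zone explicitly right after creating it. It assumes that recipe and dataset handles expose a `move_to_zone(zone)` method in your version of the API, which should be verified in the API reference. The helper name `sync_shared_datasets_into_zone` and its parameters are hypothetical.

```python
def sync_shared_datasets_into_zone(project, zone, connection, format_option_id,
                                   prefix="OUTPUT_"):
    """For each dataset shared to `zone`, create a sync recipe, then move the
    recipe and its output dataset into `zone`. Returns the recipe handles.

    Hypothetical helper, written for illustration only.
    """
    recipes = []
    for item in zone.shared:
        output_name = prefix + item.name
        builder = project.new_recipe("sync")
        builder = builder.with_input(item.name)
        builder = builder.with_new_output(
            output_name, connection, format_option_id=format_option_id
        )
        recipe = builder.create()
        # Assumption: move_to_zone() exists on recipe and dataset handles.
        recipe.move_to_zone(zone)
        # Moving the output explicitly may be redundant if it already
        # follows its recipe; it is kept here to make the intent visible.
        project.get_dataset(output_name).move_to_zone(zone)
        recipes.append(recipe)
    return recipes
```

With the handles from the snippets above, the call would be `sync_shared_datasets_into_zone(project, new_zone, "CONNECTION_NAME", "FORMAT_ID")`.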
