Dataiku Python API
Hello,
I'm trying to use the Dataiku API to:
1. create a zone
2. share some existing datasets into the new zone, then create recipes on the newly shared datasets.
The problem is that when I share the datasets and create the recipes with the API, the recipes do not appear in the new zone in the flow; they appear in the source zone the datasets were shared from.
Is there any way to use the API to build a whole flow from shared datasets in a new zone?
Thanks for your help,
BR
Operating system used: Windows
Answers
-
Hi, can you give me a bit more information on how you are trying to do this? (A code snippet and your DSS version would be perfect.) I tested on my side with the following code, and both the new recipe and the output dataset are created in the new zone:
import dataiku

# Setup necessary handles
client = dataiku.api_client()
project = client.get_project("PROJECT_ID")
flow = project.get_flow()

# Create zone and add dataset (add_item moves the dataset into the zone)
new_zone = flow.create_zone("ZONE_NAME")
dataset = project.get_dataset("INPUT_DATASET_NAME")
new_zone.add_item(dataset)

# Create recipe (sync in the example)
builder = project.new_recipe("sync")
builder = builder.with_input("INPUT_DATASET_NAME")
builder = builder.with_new_output("OUTPUT_DATASET_NAME", "CONNECTION_NAME", format_option_id="FORMAT_ID")
recipe = builder.create()

# Run the recipe to build the output dataset
job = recipe.run()
The code snippets used to test come from these parts of the doc:
https://doc.dataiku.com/dss/latest/python-api/flow.html#working-with-flow-zones
https://doc.dataiku.com/dss/latest/python-api/flow.html#creating-a-sync-recipe
-
Hi AlexandreL,
Thanks for your reply.
I have tried this, but the problem is that the dataset is moved from the initial zone to the new one. I want to keep the source dataset in the previous zone and have a sort of 'copy' of that dataset in the new zone.
I have tried the add_shared method from the API and the datasets are shared, but once they are shared, if I create a recipe for each shared dataset, the recipe appears in the initial zone and not in the new one.
for item in new_zone.shared:
    builder = project.new_recipe("sync")
    builder = builder.with_input(item.name)
    builder = builder.with_new_output("OUTPUT_%s" % item.name, "CONNECTION_NAME", format_option_id="FORMAT_ID")
    recipe = builder.create()
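One possible workaround, sketched here and not verified against any particular DSS version: share each input with add_shared, create the recipe as above, then move the recipe's output dataset into the new zone with add_item. The assumption is that moving an output dataset also brings the recipe that generates it into the zone, which should be checked on your instance. The helper name sync_shared_datasets_into_zone and its parameters are hypothetical; project and zone are the handles returned by client.get_project(...) and flow.create_zone(...).

```python
def sync_shared_datasets_into_zone(project, zone, dataset_names,
                                   connection, format_option_id=None):
    """Share datasets into `zone` and create one sync recipe per dataset.

    Hypothetical helper. Assumes that moving an output dataset into a zone
    also moves the recipe that produces it; verify this on your DSS version.
    """
    output_names = []
    for name in dataset_names:
        # Share (not move) the input, so it also stays in its original zone.
        zone.add_shared(project.get_dataset(name))

        # Create the sync recipe exactly as in the snippets above.
        output_name = "OUTPUT_%s" % name
        builder = project.new_recipe("sync")
        builder = builder.with_input(name)
        builder = builder.with_new_output(
            output_name, connection, format_option_id=format_option_id)
        builder.create()

        # Move the freshly created output dataset into the new zone;
        # the recipe generating it is expected to follow (assumption).
        zone.add_item(project.get_dataset(output_name))
        output_names.append(output_name)
    return output_names
```

It could be called as sync_shared_datasets_into_zone(project, new_zone, ["INPUT_DATASET_NAME"], "CONNECTION_NAME", "FORMAT_ID"), where the placeholder names are the same ones used earlier in the thread.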