Question on Dataset ignore_flow in a recipe
Hi All,
I am on DSS 8.0.1.
I am trying to create a custom recipe, and one of the steps of the recipe is to read a dataset from another project (Project B).
I would like to know whether I can use the dataset from Project B in the recipe without exposing the dataset to Project A.
(The dataset will be called directly in the Python code and is not an input role.)
I use code like the following:
df = dataiku.Dataset('datasetname', project_key='projectkey', ignore_flow=True).get_dataframe(infer_with_pandas=True)
For some reason this code works in a notebook in Project A.
But when it is run as a recipe in Project A, it fails with the error:
com.dataiku.common.server.APIError$SerializedErrorException: Error in python process: At line 99: <class 'Exception'>: Unable to fetch schema for projectkey.datasetname: b'dataset does not exist: projectkey.datasetname'
Could someone suggest whether this is feasible?
Best Answer
Hi @NN,
The reason this works in a notebook and not in a recipe is that a recipe requires its input datasets to be explicitly defined through the UI, not just in the code. So, if your code references a dataset that is not set as an input, it will fail (hence the idea of the flow: any inputs to a recipe should be clear visually).
I would recommend first sharing the dataset from Project B to Project A, and then setting it as an input to the Python recipe. Or, if you want to do this through the Python API, the copy_to method of the Dataset class should help.
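To illustrate the first suggestion, here is a minimal sketch. It assumes the dataset has already been shared from Project B to Project A and added as an input of the recipe; a dataset from another project is then normally referenced by its full name, PROJECTKEY.datasetname. The helper names full_dataset_name and read_shared_dataset and the dataset_factory parameter are hypothetical, introduced only so the snippet can be tried outside DSS, and 'PROJECTB' and 'datasetname' are placeholders.

```python
def full_dataset_name(project_key, dataset_name):
    """Build the 'PROJECTKEY.datasetname' form used for a dataset from another project."""
    return "{}.{}".format(project_key, dataset_name)


def read_shared_dataset(project_key, dataset_name, dataset_factory=None):
    """Read a dataset shared from another project as a pandas DataFrame.

    Assumption: the dataset has been shared to the current project and is
    declared as an input of the recipe that runs this code.
    """
    if dataset_factory is None:
        # The dataiku package is only importable inside DSS, so import it lazily.
        import dataiku
        dataset_factory = dataiku.Dataset
    handle = dataset_factory(full_dataset_name(project_key, dataset_name))
    return handle.get_dataframe()
```

Inside the recipe this would be called as df = read_shared_dataset('PROJECTB', 'datasetname'), with no need for ignore_flow.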
Best,
Katie
Answers
Thanks, Katie. That makes sense. Sharing the dataset definitely works fine.
I misunderstood the ignore_flow option.