Question on Dataset ignore_flow in a recipe

NN
Hi All,

I am on DSS 8.0.1.

I am trying to create a custom recipe in project A, and one of its steps reads a dataset from another project (project B).
I would like to know whether I can use the dataset from project B in the recipe without exposing it to project A.
(The dataset is called directly in the Python code and is not an input role.)

I use code like the following:

import dataiku
df = dataiku.Dataset('datasetname', project_key='projectkey', ignore_flow=True).get_dataframe(infer_with_pandas=True)


For some reason this code works in a notebook in project A.
But when it runs as a recipe in project A, it fails with this error:

com.dataiku.common.server.APIError$SerializedErrorException: Error in python process: At line 99: <class 'Exception'>: Unable to fetch schema for projectkey.datasetname: b'dataset does not exist: projectkey.datasetname'

Can someone tell me whether this is feasible?

ktgross15
Dataiker

Hi @NN ,

This works in a notebook but not in a recipe because a recipe requires its input datasets to be declared explicitly through the UI, not just referenced in the code. If your code references a dataset that is not set as an input, it will fail (hence the idea of the Flow: any inputs to a recipe should be clear visually).

I would recommend first sharing the dataset from project B to project A, and then setting it as an input to the Python recipe. Or, if you want to do this through the Python API, the copy_to method of the Dataset class should help.
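
A minimal sketch of the first option, assuming the dataset has already been shared from project B to project A and added as an input of the recipe. A dataset that comes from another project is normally referenced by its full PROJECTKEY.dataset_name name. The helper names qualified_name and read_shared_input are hypothetical, and "PROJECTB" and "datasetname" are placeholders.

```python
def qualified_name(project_key, dataset_name):
    """Build the 'PROJECTKEY.dataset_name' reference for a dataset from another project."""
    return "{}.{}".format(project_key, dataset_name)


def read_shared_input(project_key, dataset_name):
    """Read a shared dataset that is declared as an input of the current recipe."""
    # Imported lazily because the dataiku package only exists inside DSS.
    import dataiku

    # Assumption: the dataset was shared to this project and set as a recipe input,
    # so neither project_key= nor ignore_flow=True is passed here.
    dataset = dataiku.Dataset(qualified_name(project_key, dataset_name))
    return dataset.get_dataframe(infer_with_pandas=True)
```

Inside the recipe this would be called as df = read_shared_input("PROJECTB", "datasetname").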

Best,

Katie

NN
Author

@ktgross15 

Thanks, Katie. That makes sense. Sharing the dataset definitely works fine.

I had misunderstood the ignore_flow option.
