Managed Folder not accessible when running recipe on Containerized Execution

jax79sg
Level 2

Hi, 

I've created a custom recipe with a Dataset and a Managed Folder as input and a Dataset as output. The recipe runs well on DSS, but it doesn't work when it's configured to run on Containerized Execution. These are the steps I took and the error I received after running the recipe.

Setting up

  1. Added the plugin (with the recipe).
  2. On the plugin summary page, created a new code environment with the following parameters:
    1. Create new: Managed by DSS
    2. Python: Python36
    3. Build images for: Selected Container: mycontainer
  3. Built the new environment without errors.

Using it

  1. In my project, went to Settings, configured the project's containerized execution to use 'mycontainer', set the code env selection to my python36 env, and checked 'Prevent override by recipes'.
  2. In my project, clicked on the Dataset and selected my recipe.
  3. In the recipe, entered my Managed Folder as input (this folder points to the filesystem) and the output dataset.
  4. Ran the recipe.

Error received:

FileNotFoundError: [Errno 2] No such file or directory: '/mtn/...some path/57CFjKsj'

As far as I can tell, the Containerized Execution cannot see the files in the Managed Folder. Could you advise what is going on?

Thank you.

3 Replies
Alex_Combessie
Dataiker Alumni

Hi,

To run a recipe on Containerized Execution when using managed folders as input/output, you will need to use the Dataiku API for reading/writing files.

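In other words, the "regular" local filesystem API that you were using with local execution no longer works, since the folder's path on the DSS server is not available inside the container. Instead, you will need to interact with the managed folder through the dataiku.Folder methods that stream file contents, such as get_download_stream() and upload_stream().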
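
As a rough illustration of that approach, here is a minimal Python sketch. It assumes the dataiku.Folder methods get_download_stream(), upload_stream() and list_paths_in_partition(). The helper names and the folder name "my_input_folder" are hypothetical and not taken from the original recipe.

```python
import io


def read_bytes(folder, path):
    """Read one file from a managed folder through the streaming API."""
    # get_download_stream() does not need the folder to exist on the local
    # filesystem, unlike building a path from folder.get_path().
    with folder.get_download_stream(path) as stream:
        return stream.read()


def write_bytes(folder, path, data):
    """Write bytes to a managed folder through the streaming API."""
    folder.upload_stream(path, io.BytesIO(data))


def run_in_recipe(folder_name="my_input_folder"):  # hypothetical folder name
    # Imported here because the dataiku package is only available inside DSS
    # (or inside the container image built for the code env).
    import dataiku

    folder = dataiku.Folder(folder_name)
    for path in folder.list_paths_in_partition():
        payload = read_bytes(folder, path)
        print(path, len(payload))
```

Inside the recipe, calling run_in_recipe() with the real folder name or ID would list every file and read it without touching a local path.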

The same APIs also exist in R, as documented in the Dataiku R API reference.

Hope it helps,

Alex Combessie

jax79sg
Level 2
Author
Thanks Alex, let me try this out and get back to you.
xavsun
Level 2

Hi @Alex_Combessie, thanks for that, it's really helpful. I have another question: how can I remove files from the managed folder on containerised execution?
