Managed Folder with Container Execution

Astrogurl Registered Posts: 2

I am trying to run a Python recipe that uses a model saved in a managed folder. I understand that I have to use get_download_stream() to read the data, but the Python module I need (FAISS) cannot load the saved model from bytes. Is there a way to download the file and obtain a local path, so that I can pass that path to the module during containerized execution?


Operating system used: macOS Monterey

Best Answer

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker
    edited July 17 Answer ✓

    Hi @Astrogurl,

    The following code downloads the file to a temporary directory first, so that you can pass the local path to FAISS:

    import os.path
    import shutil
    import tempfile
    
    import dataiku
    
    folder = dataiku.Folder("FOLDER")
    
    with tempfile.TemporaryDirectory() as temp_dir:
        path = os.path.join(temp_dir, "my-file.txt")
        
        # Download the remote file to `path`
        with folder.get_download_stream("/my-file.txt") as download_stream:
            with open(path, "wb") as local_file:
                shutil.copyfileobj(download_stream, local_file)
                
        # Do stuff with the temp file here
        # It will be automatically deleted when the `temp_dir` block finishes
        print(path)

    Thanks,

    Zach
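
The approach in the answer above can be wrapped in a small reusable helper. This is only a minimal sketch: `stream_to_temp_path` is a hypothetical name, not part of the dataiku API, and the file name "my-index.faiss" in the usage comment is an assumed example. The helper accepts any readable binary stream, so it can be tried locally without a managed folder.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def stream_to_temp_path(stream, filename):
    """Copy a readable binary stream to a temporary file and yield its path.

    Hypothetical helper for illustration; not part of the dataiku API.
    The temporary directory (and the file) is deleted when the caller's
    `with` block exits.
    """
    with tempfile.TemporaryDirectory() as temp_dir:
        path = os.path.join(temp_dir, filename)
        with open(path, "wb") as local_file:
            shutil.copyfileobj(stream, local_file)
        yield path


# Usage inside a recipe (assumes a managed folder "FOLDER" that contains
# a file "my-index.faiss"; both names are placeholders):
#
#     import dataiku
#     import faiss
#
#     folder = dataiku.Folder("FOLDER")
#     with folder.get_download_stream("/my-index.faiss") as download_stream:
#         with stream_to_temp_path(download_stream, "my-index.faiss") as path:
#             index = faiss.read_index(path)
```

Anything that needs the path, such as the `faiss.read_index(path)` call, has to run inside the `with` block, because the file no longer exists once the block exits.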
