Dataiku Folder.get_path() is not working. Error: cannot perform direct filesystem access

bon2bone Registered Posts: 5

Hello, I am trying to write log messages to a file (log.txt) in logfolder, a managed folder in DSS that is connected to a SharePoint folder.

Error ==>

Line 128: managed_folder_path = dataiku.Folder("logfolder").get_path()
Job failed: Error in Python process: At line 128: <class 'Exception'>: Folder is not on the local filesystem (uses fsprovider_sharepoint-server_sharepoint-server_shared-documents), cannot perform direct filesystem access. Use the read/write API instead.

My code ==>

import logging

import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from datetime import datetime
import os.path

logging.basicConfig(level=logging.DEBUG)
managed_folder_path = dataiku.Folder("logfolder").get_path()
logging.root.addHandler(logging.FileHandler(os.path.join(managed_folder_path, "log.txt")))
logging.info("hello")


Operating system used: Unix

Best Answer

  • bon2bone
    bon2bone Registered Posts: 5
    Answer ✓

    I finally got my head around this and created a filesystem folder instead (after that, I could write more code to copy the file back to SharePoint):

    - In Flow, + Dataset > Folder.
    - Label = logfolder
    - Store into = filesystem_folders
    - Partitioning = Not partitioned
    - Set output to logfolder from Python recipe

    Code ==>

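    import logging
    import dataiku
    import os.path

    # logfolder is now a local filesystem folder, so get_path() works
    managed_folder_path = dataiku.Folder("logfolder").get_path()

    logger = logging.getLogger("mylogger")

    # drop handlers left over from a previous run
    if logger.hasHandlers():
        logger.handlers = []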

    logger.setLevel(logging.DEBUG)

    #create File for Log
    handler = logging.FileHandler(os.path.join(managed_folder_path, "log.txt"))
    handler.setLevel(logging.DEBUG)

    #log format
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)

    #adding the handler to Logging System
    logger.addHandler(handler)
    logger.error("hello")

Answers

  • Miguel Angel
    Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker

    Hi,

    The 'get_path' function can only be used on local folders, that is, managed folders stored on the DSS server's own filesystem. More information about this function can be found in this article: https://doc.dataiku.com/dss/latest/connecting/managed_folders.html#local-vs-non-local

    The error message we are getting is quite informative about it too:

    Folder is not on the local filesystem (uses fsprovider_sharepoint-server_sharepoint-server_shared-documents), cannot perform direct filesystem access

    Fortunately, it is also telling us what we can do:

    Use the read/write API instead.

    Information about the API can be found in the help: https://doc.dataiku.com/dss/latest/python-api/index.html#python-apis

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 293 Dataiker
    edited July 17

    Hi @bon2bone,

    You are receiving this error ("Folder is not on the local filesystem... Use the read/write API instead") because the get_path() method can only be called for managed folders that are stored on the local filesystem of the DSS server. I recommend taking a look at our documentation on Local vs. Non-Local Managed Folders and instead using the various read/download and write/upload APIs to access your managed folder. Please see the code below for example:

    import dataiku

    folder = dataiku.Folder("enter-folder-id/name")
    # open in binary mode so the file's bytes are uploaded unchanged
    with open("local_file_to_upload", "rb") as f:
        folder.upload_stream("name_of_file_in_folder", f)

    API reference documentation: https://doc.dataiku.com/dss/latest/python-api/managed_folders.html#managed-folders

    If you have any questions, please let us know.

    Thanks!

    Jordan

  • bon2bone
    bon2bone Registered Posts: 5

    Hello, thanks for the reply.

    For the code below, what is the local file to upload? I am writing text into log.txt only...

    Code ==>

    with open("local_file_to_upload") as f:

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 293 Dataiker
    edited July 17

    Hi @bon2bone,

    Apologies, the code I provided uploads an existing local file to the managed folder. If you want to write data directly to a file within a managed folder, you could do something like this:

    import dataiku
    
    my_str = "hello world"
    my_str_as_bytes = str.encode(my_str)
    logging_folder = dataiku.Folder("VYKX8FzU")
    with logging_folder.get_writer("log.txt") as w:
        w.write(my_str_as_bytes)

    Please let me know if that works for you.

    Thanks!

    Jordan
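
To tie this back to the original `logging` setup: one possible approach, sketched here, is to attach a standard `logging.StreamHandler` to an in-memory `io.StringIO` buffer and write the buffer to the folder once at the end with `get_writer`. The helper names `make_buffered_logger` and `flush_log_to_folder` are hypothetical, and `folder` is assumed to be a `dataiku.Folder` (only its `get_writer` method is used).

```python
import io
import logging


def make_buffered_logger(name="mylogger"):
    """Return (logger, buffer); log records are collected in memory."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    logger = logging.getLogger(name)
    logger.handlers = []      # avoid duplicate handlers when the recipe is re-run
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep these records out of the root logger
    logger.addHandler(handler)
    return logger, buffer


def flush_log_to_folder(folder, buffer, path="log.txt"):
    """Write the buffered log text to `path` in the folder (assumed dataiku.Folder)."""
    data = buffer.getvalue().encode("utf-8")
    with folder.get_writer(path) as w:
        w.write(data)


# Usage inside a DSS recipe (not run here):
#   import dataiku
#   logger, buf = make_buffered_logger()
#   logger.info("hello")
#   flush_log_to_folder(dataiku.Folder("logfolder"), buf)
```

Because the writer is opened only once, this avoids touching the remote folder for every log record, at the cost of losing the log if the recipe crashes before the flush.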

  • bon2bone
    bon2bone Registered Posts: 5

    Could I have the writer in append mode? For example, when I rerun the script a second time, log.txt should contain "hello world" twice:

    hello world
    hello world
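
This question is not answered directly in the thread. Assuming the writer replaces an existing file rather than appending to it, one possible workaround is a read-modify-write: download the current contents, add the new text, and write everything back. The sketch below uses a hypothetical helper, `append_text`, and assumes `folder` is a `dataiku.Folder` offering `list_paths_in_partition`, `get_download_stream`, and `get_writer`.

```python
def append_text(folder, path, text):
    """Append `text` to `path` in a managed folder by rewriting the whole file."""
    existing = b""
    # Listed paths are assumed to start with "/", so compare without it.
    known = {p.lstrip("/") for p in folder.list_paths_in_partition()}
    if path.lstrip("/") in known:
        with folder.get_download_stream(path) as stream:
            existing = stream.read()
    # Write the old content plus the new text back in one go.
    with folder.get_writer(path) as w:
        w.write(existing + text.encode("utf-8"))


# Usage inside a DSS recipe (not run here):
#   import dataiku
#   append_text(dataiku.Folder("logfolder"), "log.txt", "hello world\n")
```

The whole file is re-uploaded on each call and the update is not atomic, so this only suits small logs written by a single job at a time.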

  • bon2bone
    bon2bone Registered Posts: 5

    Thanks for the reply. Could you point me to an API method that is able to return the path of a non-local folder?

    On the other hand, maybe I can save locally to DSS and re-copy that log.txt file back to sharepoint?
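
The "save locally, then copy" idea could look roughly like the sketch below: log to a temporary file on the DSS server with a normal `logging.FileHandler`, then push that file into the SharePoint-backed folder with `upload_stream`. The function name `log_locally_then_upload` and the logger name are hypothetical, and `folder` is assumed to be a `dataiku.Folder`.

```python
import logging
import os
import tempfile


def log_locally_then_upload(folder, messages, remote_name="log.txt"):
    """Log `messages` to a temporary local file, then upload it to the folder."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, remote_name)
        logger = logging.getLogger("upload_sketch")
        logger.setLevel(logging.DEBUG)
        logger.propagate = False
        handler = logging.FileHandler(local_path)
        logger.addHandler(handler)
        try:
            for message in messages:
                logger.info(message)
        finally:
            # Close the handler so everything is flushed before uploading.
            logger.removeHandler(handler)
            handler.close()
        # Binary mode keeps the uploaded bytes identical to the local file.
        with open(local_path, "rb") as f:
            folder.upload_stream(remote_name, f)


# Usage inside a DSS recipe (not run here):
#   import dataiku
#   log_locally_then_upload(dataiku.Folder("logfolder"), ["hello"])
```

The temporary directory is removed when the function returns, so only the uploaded copy in the managed folder remains.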

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Thanks for sharing your solution with the rest of the Community, @bon2bone!
