Excel file import [R recipe]

B2oriel
B2oriel Dataiku DSS Core Designer, Registered Posts: 5

I have an Excel file in a managed folder, but I can't read it using:

data <- dkuManagedFolderDownloadPath("dsdsJk", "file")

Even after trying the different options for the as argument ("raw", "text", "parsed"), I get nothing.

Best Answer

  • Catalina
    Catalina Dataiker, Dataiku DSS Core Designer, Registered Posts: 135 Dataiker
    edited July 17 Answer ✓

    You can read an Excel file from a managed folder using code like the following:

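    library(dataiku)
    library(xlsx)

    # Download the file from the managed folder as raw bytes
    content <- dkuManagedFolderDownloadPath("hRJwLbQS", "xl2.xlsx", as = "raw")

    # Write the bytes to a temporary file so that read.xlsx() can open it by path
    temp_f <- tempfile(fileext = ".xlsx")
    writeBin(object = content, con = temp_f)

    # Read the sheet into a data frame
    data <- read.xlsx(temp_f, sheetName = "Sheet1")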
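
The temp-file step is there because the download returns bytes in memory, while the Excel reader expects a file path. Here is a minimal sketch of just that step, using base R only so it runs outside DSS. The helper name write_raw_to_temp and the stand-in bytes are assumptions made for illustration; they are not part of the dataiku package.

```r
# Hypothetical helper (not part of the dataiku package): write a raw vector
# to a temporary file and return the path of that file.
write_raw_to_temp <- function(content, fileext = ".xlsx") {
  stopifnot(is.raw(content))
  path <- tempfile(fileext = fileext)
  writeBin(content, path)
  path
}

# Stand-in bytes for illustration; in a recipe these would come from
# dkuManagedFolderDownloadPath(..., as = "raw").
content <- as.raw(c(0x50, 0x4b, 0x03, 0x04))
temp_f <- write_raw_to_temp(content)

# Read the bytes back to confirm that the file holds exactly what was written.
round_trip <- readBin(temp_f, what = "raw", n = file.size(temp_f))
identical(round_trip, content)
```

In a recipe, the returned path would then be passed to read.xlsx() as in the answer above.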

Answers

  • B2oriel
    B2oriel Dataiku DSS Core Designer, Registered Posts: 5

    Thank you, everything is working as I expect

  • Vasilisa
    Vasilisa Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 5 Dataiker
    edited July 17

    It's worth pointing out that Dataiku also has a method, dkuManagedFolderCopyToLocal, that copies files to a temp/file_system directory so that you can then read/write from it. It's less efficient than dkuManagedFolderDownloadPath because it copies the files instead of reading through a connection, but it might work for different file formats.

    library(dataiku)
    library(readxl)

    s3_folder_name <- "<FOLDER ID>"
    # The path in the folder to retrieve:
    s3_file_name <- "<FILE NAME>.xlsx"

    temp_folder <- tempdir()
    print(temp_folder)
    # Check that the temp directory exists
    dir.exists(temp_folder)

    # Copy the folder contents to the temp directory
    dkuManagedFolderCopyToLocal(s3_folder_name, temp_folder)

    # List the copied files
    content_temp <- list.files(temp_folder)
    print(content_temp)
    # Read the file from the temp directory (skip = 1 skips the first row)
    file_path_temp <- file.path(temp_folder, s3_file_name)
    data_frame_temp <- read_excel(path = file_path_temp, skip = 1)
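
If the copy does not place the file where you expect, the Excel reader will fail with a path error, so it can help to check for the file first. Below is a rough sketch using base R only. The helper locate_copied_file, the subdirectory name, and the placeholder file are all assumptions made for illustration, and the placeholder stands in for the real dkuManagedFolderCopyToLocal call.

```r
# Hypothetical helper (not part of the dataiku package): return the path of
# an expected file inside a local directory, or stop with a readable error.
locate_copied_file <- function(local_dir, file_name) {
  path <- file.path(local_dir, file_name)
  if (!file.exists(path)) {
    stop(
      "File '", file_name, "' not found in ", local_dir,
      ". Files present: ", paste(list.files(local_dir), collapse = ", ")
    )
  }
  path
}

# Use a dedicated subdirectory rather than tempdir() itself, so the listing
# only contains the copied files.
local_dir <- file.path(tempdir(), "managed_folder_copy")
dir.create(local_dir, showWarnings = FALSE)

# Stand-in for the copy step: create an empty placeholder file.
file.create(file.path(local_dir, "example.xlsx"))

# Succeeds for a file that is present.
found <- locate_copied_file(local_dir, "example.xlsx")

# Fails with a descriptive message for a file that is absent.
missing_msg <- tryCatch(
  locate_copied_file(local_dir, "absent.xlsx"),
  error = function(e) conditionMessage(e)
)
```

The path returned by the helper can then be handed to read_excel() in place of a hand-built file.path() call.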