Excel file import [R recipe]
B2oriel
Dataiku DSS Core Designer, Registered Posts: 5 ✭
I have an Excel file in a managed folder, but I can't read it using:
data <- dkuManagedFolderDownloadPath("dsdsJk", "file")
Even when I try the different values of the as argument ("raw", "text", "parsed"), I get nothing.
Best Answer
-
You can read an Excel file from a managed folder using a code like below:
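library(dataiku)
library(xlsx)

# Download the file from the managed folder as raw bytes
content <- dkuManagedFolderDownloadPath("hRJwLbQS", "xl2.xlsx", as = "raw")

# Write the bytes to a temporary file, then read that file as Excel
temp_f <- tempfile()
writeBin(object = content, con = temp_f)
read.xlsx(temp_f, sheetName = "Sheet1")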
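The write-then-read pattern above could be wrapped in a small helper. This is only a sketch: read_raw_with is a hypothetical name, not part of the dataiku package, and the example feeds it CSV bytes built in memory so that it runs outside DSS. In a recipe, content would instead come from dkuManagedFolderDownloadPath(..., as = "raw"), and reader would be something like function(p) read.xlsx(p, sheetName = "Sheet1").

```r
# Hypothetical helper: write a raw vector to a temp file, then pass the path to a reader.
read_raw_with <- function(content, reader, fileext = "") {
  stopifnot(is.raw(content), is.function(reader))
  tmp <- tempfile(fileext = fileext)
  on.exit(unlink(tmp), add = TRUE)  # delete the temp file when the function returns
  writeBin(content, tmp)
  reader(tmp)
}

# Stand-in for downloaded bytes (assumption: in DSS these come from the managed folder)
csv_bytes <- charToRaw("a,b\n1,2\n3,4\n")
df <- read_raw_with(csv_bytes, read.csv, fileext = ".csv")
print(df)
```

Passing a file extension can matter for readers that guess the format from the file name, which is why the helper exposes fileext.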
Answers
-
Thank you, everything is working as I expected.
-
Vasilisa Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 5 Dataiker
It's worth pointing out that Dataiku also has a method, dkuManagedFolderCopyToLocal, that copies files to a temporary directory on the local file system, which you can then read from or write to. It's less optimal than using dkuManagedFolderDownloadPath, as it copies the file instead of reading through a connection, but it might work for different file formats.

library(dataiku)
library(readxl)

s3_folder_name <- "<FOLDER ID>"
# the path in the folder to retrieve:
s3_file_name <- "<FILE NAME>.xlsx"

temp_folder <- tempdir()
print(temp_folder)
# check that the directory in tmp exists
dir.exists(temp_folder)

# Copy to the temp folder
dkuManagedFolderCopyToLocal(s3_folder_name, temp_folder)

# see the copied files
content_temp <- list.files(temp_folder)
print(content_temp)

# Read the file from the temporary folder: build its path from the file name
file_path_temp <- file.path(temp_folder, s3_file_name)
data_frame_temp <- read_excel(path = file_path_temp, skip = 1)
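Since the copy step places files in a local directory, it can help to confirm that an Excel file actually arrived before trying to read it. A minimal sketch using only base R: find_excel_files is a hypothetical helper, and the empty placeholder files created here exist only so the example runs outside DSS.

```r
# Hypothetical helper: list .xls/.xlsx files in a directory, with full paths.
find_excel_files <- function(dir) {
  list.files(dir, pattern = "\\.xlsx?$", full.names = TRUE, ignore.case = TRUE)
}

# Placeholder directory standing in for the folder copied from DSS (assumption)
demo_dir <- file.path(tempdir(), "copied_folder_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("report.xlsx", "notes.txt")))

excel_paths <- find_excel_files(demo_dir)
if (length(excel_paths) == 0) stop("No Excel file found in ", demo_dir)
print(basename(excel_paths))
```

The resulting path could then be passed to read_excel in place of a hard-coded file name.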