Reading xls and ppt files from a managed folder on an S3 connector
Hello,
I am trying to access and write files on an S3 connector.
Working in a notebook, I was able to get something that works for csv files, but I cannot find a way to do the same for xls or ppt files.
Here is the idea for csv files:
myrawfile <- dkuManagedFolderDownloadPath("myfolder", '/myfile.csv', as = "raw")
mydata <- read_delim(myrawfile, delim = ";")
When I try something similar for xls (read_xls from {readxl}, which is installed with the tidyverse):
myrawfile <- dkuManagedFolderDownloadPath("myfolder", '/myfile.xls', as = "raw")
mydata <- read_xls(myrawfile)
or with ppt (read_pptx from {officer})
my_raw_ppt <- dkuManagedFolderDownloadPath("my_folder", '/my_file.pptx', as = "raw")
my_ppt <- read_pptx(my_raw_ppt)
I get an error:
Error in file.exists(path): invalid 'file' argument
Traceback:
1. read_pptx(ppt_bin)
2. file.exists(path)
because these functions expect a file path rather than the file contents.
Answers
Hi Pascal,
I've included some example code below that should work for this. The R API does not have a way to read a file on S3 through a file path, so we have to copy the file locally to the notebook first and then read it from disk. The code below highlights this.
# Set the directory relative to the current notebook path; file is the file you want to access.
# getwd() returns the notebook's current working directory.
library(dataiku)
library(officer)
directory <- paste(getwd(), "/myfolderlocal/", sep = "")
file <- paste(directory, "samplepptx.pptx", sep = "")
# Copy to a local folder first; we don't have an API to access folder content on S3 with paths.
copy <- dkuManagedFolderCopyToLocal("myfolder", directory)
# Read the PowerPoint as normal from disk.
mydata <- read_pptx(file)
content <- pptx_summary(mydata)
content
Thank you.
Andrew
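For a single file, a possible variation on the same idea (get the content onto local disk, then read it by path) is to write the downloaded bytes to a temporary file. This is only a sketch: `raw_to_tempfile` is a hypothetical helper name, not part of the Dataiku R API, and it assumes that `dkuManagedFolderDownloadPath(..., as = "raw")` returns a raw vector. Placeholder bytes stand in for the download so the snippet runs outside DSS.

```r
# Hypothetical helper (not part of any package): write a raw vector to a
# temporary file with the given extension and return the file's path.
raw_to_tempfile <- function(bytes, ext) {
  stopifnot(is.raw(bytes))
  path <- tempfile(fileext = ext)
  writeBin(bytes, path)
  path
}

# Placeholder bytes so the sketch runs anywhere. In a DSS notebook they would
# instead come from (assumption: this call returns a raw vector):
#   dkuManagedFolderDownloadPath("myfolder", "/myfile.xls", as = "raw")
demo_bytes <- charToRaw("placeholder content")
demo_path <- raw_to_tempfile(demo_bytes, ".xls")

# The bytes are now on local disk behind an ordinary file path.
file.exists(demo_path)

# With a real download, the path can be passed to path-based readers, e.g.:
#   mydata <- readxl::read_xls(demo_path)
#   my_ppt <- officer::read_pptx(raw_to_tempfile(my_raw_ppt, ".pptx"))
```

The temporary file lives in the R session's temp directory, so it is discarded when the session ends. Copying the whole folder with dkuManagedFolderCopyToLocal, as above, is probably simpler when several files are needed.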