Writing a file to a S3 managed folder from R (eg via ggsave)

Solved!
Pascal_B
Level 2

Hello, 

I am not able to write a file to a managed folder (hosted on S3) from R. Can you help?

My current use case is using ggsave to save a plot to a .png file in a managed folder.

Reading the documentation, I found the dkuManagedFolderUploadPath function, but I cannot figure out what would be a relevant value for the "data" argument (from the doc: "data must be a connection providing the data to upload").

How can I generate this "data" within my ggplot / ggsave workflow?
Could you provide some explanation and a working toy example so I can figure out the steps?

Thanks for your help
Pascal

1 Solution
AlexT
Dataiker

Hi @Pascal_B , @tanguy ,

In your case, if you are looking to upload a ggplot PNG or PDF, you can simply save it to a temp file, then read that file and upload it to the remote (S3) managed folder. Below is a code sample:

library(dataiku)
library(ggplot2)

# Define the managed folder ID and output filename here
output_folder <- 'NP2gVabt'
output_filename <- 'testing'

# Just for testing, create a plot
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Create a temp file and save the plot to it with ggsave
png_file <- tempfile(fileext = ".png")
ggsave(png_file, device = "png", plot = p1)

# Open a read-binary connection to the temp file
tmp_png_file_connection <- file(png_file, "rb")

# Write to the managed folder
dkuManagedFolderUploadPath(output_folder, paste(output_filename, 'png', sep = '.'), tmp_png_file_connection)

# Close the connection and remove the temp file
close(tmp_png_file_connection)
unlink(png_file)

If you want to write a dataset, e.g. a CSV, you can take a similar approach: use write.csv to write to a temp file, then read that file and upload it as the "data" argument.
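For example, a minimal sketch of the CSV variant, following the same temp-file pattern (this reuses the example folder ID from the snippet above; mtcars and the filename "mtcars.csv" are just placeholder choices):

library(dataiku)

# Same example folder ID as in the PNG snippet above
output_folder <- 'NP2gVabt'

# Write a data frame (mtcars as placeholder data) to a temp CSV file
csv_file <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_file, row.names = FALSE)

# Open a read-binary connection and pass it as the "data" argument
csv_connection <- file(csv_file, "rb")
dkuManagedFolderUploadPath(output_folder, "mtcars.csv", csv_connection)

# Close the connection and remove the temp file
close(csv_connection)
unlink(csv_file)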

Let me know if this works for you or if you have any other questions.

3 Replies
tanguy

I have the same need as @Pascal_B: how do you save files with R into an S3 folder? (I have not found the equivalent of Python's streaming approach for saving files.)
