Managed folder on S3

Hi,

We are migrating Dataiku from an on-premises server to AWS. One project currently uses a managed folder containing a few CSV files as input to an R recipe. After the migration, we want to use an S3 location, accessed through an HDFS connection, for all storage. When I repoint the managed folder to the S3 HDFS connection and try to run the recipe, I get an error (see attached screenshot).

Please let me know if it is at all possible to use S3 in the above scenario. And if not, can you please point me in the right direction for the "read/write API" mentioned in the error.

Thank you.

4 Replies
Level 1
Author

Just to clarify, the current managed folder uses a Filesystem connection.

Dataiker

Hi,

Are you trying to read/write data from/to a managed folder manually by constructing a path with "get_path" or "file_path"?

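If so, you'd need to use "get_download_stream" and "upload_stream" instead for reading and writing. A local path is only available for folders stored on the local filesystem, whereas these two methods go through DSS and so also work for folders stored elsewhere, such as S3: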

https://doc.dataiku.com/dss/latest/python-api/managed_folders.html#dataiku.Folder.get_download_strea...
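
As a rough illustration of the stream-based approach, here is a minimal Python sketch. The helper names `read_csv_from_folder` and `write_csv_to_folder` are hypothetical and not part of the Dataiku API; the only Dataiku calls assumed are `get_download_stream(path)` and `upload_stream(path, f)` on a `dataiku.Folder`. The helpers take the folder object as an argument, so nothing here depends on the folder having a local path.

```python
import csv
import io


def read_csv_from_folder(folder, path, encoding="utf-8"):
    """Read a CSV file from a managed folder without using a local path.

    `folder` is assumed to be a dataiku.Folder, or any object exposing
    get_download_stream(path) that returns a binary stream.
    """
    with folder.get_download_stream(path) as stream:
        raw = stream.read()
    # Decode the bytes and parse them as CSV rows (lists of strings).
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline="")))


def write_csv_to_folder(folder, path, rows, encoding="utf-8"):
    """Write rows as a CSV file into a managed folder via upload_stream."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    payload = io.BytesIO(buffer.getvalue().encode(encoding))
    folder.upload_stream(path, payload)


# Usage inside a DSS Python recipe (requires the dataiku package):
#   import dataiku
#   folder = dataiku.Folder("WCrIUW3D")
#   rows = read_csv_from_folder(folder, "/MyFile.csv")
```

Because the file content is streamed through DSS, the same code should behave the same way whether the folder sits on a Filesystem connection or on S3.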

 

Regards

Andrey Avtomonov
R&D Engineer @ Dataiku
Level 1
Author

Hi @Andrey, thanks for this. Our data scientists have a huge R script that picks up files from a managed folder. The code looks something like this:

MyManagedFolder <- dkuManagedFolderPath("WCrIUW3D")

MyDataset <- read.csv(paste0(MyManagedFolder, "/MyFile.csv"), stringsAsFactors = FALSE, header = FALSE)

Note that the folder contains many more files like this, so ideally we would like an R-based solution, if at all possible.

Dataiker

My bad, I missed the fact that you're using R and proposed a Python solution.

In R, the equivalent solution is to use

dkuManagedFolderDownloadPath

With this function, you rely on DSS to read the data from the file storage behind the managed folder. The form in which the result is returned depends on what you pass as the "as" parameter.

Regards

Andrey Avtomonov
R&D Engineer @ Dataiku