Importing an RDS file and processing it in DSS

Go14
Go14 Registered Posts: 5 ✭✭✭

Hi

I want to import an RDS file into Dataiku. I am using a Managed Folder with S3 as storage and have uploaded the RDS file into the Managed Folder (i.e. into S3).

I have created an R recipe in which I am able to retrieve the data from the Managed Folder using the dkuManagedFolderDownloadPath R API.

I want to convert the RDS file into CSV using an R recipe, since DSS does not support the RDS format for processing. I have used the readRDS function to create a data frame from the RDS file so that I could split it into 2 CSV files.

But when I retrieve the data using the dkuManagedFolderDownloadPath API and try to create a data frame using readRDS, it throws the error "bad file argument".

It seems readRDS only accepts a file as input, but the data returned from the Managed Folder is raw content.

Can somebody help me convert the RDS file (in the Managed Folder) into CSV, so that I can create a Dataset from it in DSS?

Thanks in advance.

Note:

Since the RDS file is huge, I don't want to store it on the DSS local filesystem in order to use it with the readRDS function.

Answers

  • Liev
    Liev Dataiker Alumni Posts: 176 ✭✭✭✭✭✭✭✭

    Hi @Go14

    Loading RDS files is not supported natively in DSS, so all the conversion will need to happen in R.
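
    On the "bad file argument" error: readRDS() accepts a connection as well as a file name, so one option is to wrap the raw bytes in a connection instead of writing them to disk. The following is only a sketch, assuming the file was written by saveRDS() with its default gzip compression. The helper name read_rds_from_raw and the folder and file names in the comments are placeholders, not part of the original question. Note that the whole file still has to fit in the memory of the R process.

    ```r
    # Parse RDS content that is already in memory as a raw vector.
    # readRDS() takes a connection, so the bytes are wrapped in a
    # rawConnection; gzcon() handles the gzip compression that
    # saveRDS() applies by default (assumption: default compression).
    read_rds_from_raw <- function(raw_bytes) {
      con <- gzcon(rawConnection(raw_bytes))
      on.exit(close(con))
      readRDS(con)
    }

    # Inside a DSS R recipe the bytes would come from the managed folder.
    # "my_folder" and "data.rds" are placeholder names (assumptions):
    # library(dataiku)
    # raw_bytes <- dkuManagedFolderDownloadPath("my_folder", "data.rds", as = "raw")
    # df <- read_rds_from_raw(raw_bytes)

    # Stand-alone demonstration with bytes produced locally:
    tmp <- tempfile(fileext = ".rds")
    saveRDS(head(mtcars), tmp)
    raw_bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
    df <- read_rds_from_raw(raw_bytes)
    ```

    If the file was saved with a different compression (for example xz or bzip2), this wrapper would not apply as written.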

    If loading from S3 is an issue (which it might be, considering the size of the files), you could first copy the contents of your remote folder into a local one using this method.
    Then load the local files, convert them into data frames/CSV, and save them into datasets.
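
    Once the file is reachable on a local path, the conversion step could look roughly like the sketch below. The function name split_rds_to_csv and the split into two halves by row are assumptions for illustration, since the question does not say how the data should be divided. In a recipe, the local path might come from dkuManagedFolderPath() for a folder stored on the local filesystem, and each part could be written with dkuWriteDataset() instead of write.csv(); the folder and dataset names in the comments are placeholders.

    ```r
    # Read an RDS file from a local path, split the data frame into two
    # parts by row (an assumed split rule), and write each part as CSV.
    split_rds_to_csv <- function(rds_path, out_dir) {
      df <- readRDS(rds_path)
      half <- ceiling(nrow(df) / 2)
      parts <- list(
        part1 = df[seq_len(half), , drop = FALSE],
        part2 = df[seq_len(nrow(df) - half) + half, , drop = FALSE]
      )
      out_paths <- file.path(out_dir, paste0(names(parts), ".csv"))
      for (i in seq_along(parts)) {
        write.csv(parts[[i]], out_paths[i], row.names = FALSE)
      }
      out_paths
    }

    # In a DSS R recipe (placeholder names, assumptions):
    # library(dataiku)
    # rds_path <- file.path(dkuManagedFolderPath("local_folder"), "data.rds")
    # dkuWriteDataset(readRDS(rds_path), "output_dataset")

    # Stand-alone demonstration using a temporary file:
    rds_path <- tempfile(fileext = ".rds")
    saveRDS(head(mtcars, 6), rds_path)
    csv_paths <- split_rds_to_csv(rds_path, tempdir())
    ```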

    Good luck!
