How to delete temporary files in Flow

user354687351 Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 2 Partner

I'm connecting my Dataiku instance to a PostgreSQL server, and when I query the SQL table in a Flow, I have to store the result of the query somewhere. I don't want to store it back in the database, to avoid cluttering it, so I thought about storing it locally in a CSV file. However:

>Do these files get deleted automatically, or do I have to delete them manually? I wouldn't want to accumulate temporary files on the server.

>Does Dataiku keep the SQL metadata (column types, etc.) when extracting to CSV?

>Is it possible to store outputs in RAM or some similar fast, temporary storage?


Best Answer

  • tomas Registered, Neuron 2022 Posts: 120 ✭✭✭✭✭
    Answer ✓
    If you store the result of the query on the local file system, it is stored on the DSS server's file system (in whatever location is configured for that connection in Connections). The files that make up the dataset are deleted automatically when you delete the dataset, provided you check the checkbox to drop data. If you don't check the checkbox when removing the dataset, the files will remain on the server.
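
The same cleanup can probably be scripted instead of done through the UI. Below is a minimal sketch, assuming the Dataiku public API Python client (`dataikuapi`) is available and that the `drop_data` argument of `DSSDataset.delete` behaves as the counterpart of the drop-data checkbox. The helper name `delete_dataset_with_data`, the host, the API key and the project and dataset names are all hypothetical placeholders.

```python
def delete_dataset_with_data(client, project_key, dataset_name):
    """Delete a dataset from a project and drop its underlying data.

    `client` is assumed to be a dataikuapi.DSSClient (or, from inside DSS,
    the object returned by dataiku.api_client()).
    """
    project = client.get_project(project_key)
    dataset = project.get_dataset(dataset_name)
    # drop_data=True asks DSS to remove the stored files as well,
    # not just the dataset definition in the Flow.
    dataset.delete(drop_data=True)


# Example usage (hypothetical host, API key and names):
# import dataikuapi
# client = dataikuapi.DSSClient("https://dss.example.com:11200", "YOUR_API_KEY")
# delete_dataset_with_data(client, "MYPROJECT", "pg_query_result")
```

Passing the client in as an argument keeps the helper usable both from outside DSS and from a notebook or scenario step inside it.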

    Regarding performance, you can create a ramdisk or attach a fast SSD to the DSS server and store the dataset there. Or store it in /tmp.