Temporary files

Solved!
yinkit
Level 1

Hi 

I would like to know how and when the temporary files are cleared.
The documentation says that they are "automatically cleared", but I can still find files that are more than a month old.

Thanks !

AlexT
Dataiker

Hi,

As mentioned in the doc, most temporary files are automatically cleared when they are no longer needed. There are exceptions: third-party components such as Docker or Spark can create files in /tmp. So it's not unexpected that you may still find older files in tmp that have not been cleared.

There may be manual cleanup required depending on what data remains there. 

If you are having an issue with space in /tmp, please send a list of the files you currently see. Also, depending on your OS configuration, /tmp may be automatically cleared on reboot.
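If it helps to produce that list, here is a minimal Python sketch that reports the top-level entries of a temporary directory older than a given age. The function name `list_old_entries` and its parameters are made up for illustration and are not part of DSS; the directory path is whatever applies on your instance.

```python
import time
from pathlib import Path


def list_old_entries(tmp_dir, min_age_days=30, now=None):
    """Return (age_in_days, size_in_bytes, name) tuples for the top-level
    entries of tmp_dir that are at least min_age_days old, oldest first.

    For directories, the size is that of the directory entry itself,
    not of its contents.
    """
    now = time.time() if now is None else now
    rows = []
    for entry in Path(tmp_dir).iterdir():
        st = entry.lstat()  # lstat: do not follow symlinks
        age_days = (now - st.st_mtime) / 86400
        if age_days >= min_age_days:
            rows.append((int(age_days), st.st_size, entry.name))
    return sorted(rows, reverse=True)
```

Something like `for row in list_old_entries("/tmp"): print(*row)` would then print one line per old entry.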

Thanks,

 

yinkit
Level 1
Author

Hi @AlexT ,

Thanks for taking the time to answer my question !

To make sure we are on the same page, I am referring to the `/data/tmp` directory, which (from my understanding) is space managed by DSS.
What I expect is for DSS to handle this directory by:
- reclaiming it automatically, or otherwise
- providing a "Macro" that I can use to reclaim it myself (such as the macros for "jobs", "cache", etc.).

Both points are unclear to me, as they are not detailed in the documentation.

As you said, this folder contains various files coming from third-party components : 
- jetty-0.0.0.0-10001-dip-webapp-_dip-any-8869083011218067384.dir
- jffi8771310595204353495.tmp
- snappy-1.1.4-4a60e4fa-0bc0-47c4-8b7d-ef34bf514802-libsnappyjava.so
- exec-docker-base-image.nYfYuj
...

As this space is owned by DSS, I don't think the OS configuration will clear it on reboot.

AlexT
Dataiker

Thanks for clarifying that you are talking about the data/tmp. 

Clearing this has to be done manually: https://doc.dataiku.com/dss/latest/operations/disk-usage.html#id3

Also, please note that, as mentioned in this discussion, some files can still be written directly to /tmp by third-party components that do not honor the $TMPDIR environment variable.

We don't have a Macro available to clear the remaining data/tmp files. 

To delete them, you would either do it manually or add a cron job that also stops and starts DSS when clearing DATADIR/tmp. I hope this answers your questions.
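As a rough illustration of the cron approach, here is a minimal Python sketch that stops DSS, deletes old entries under DATADIR/tmp, and starts DSS again. It assumes the standard `DATADIR/bin/dss stop` and `DATADIR/bin/dss start` commands are how DSS is controlled on your instance. The function names, the age threshold and the `run` parameter are hypothetical choices for this example, not an official Dataiku script.

```python
import shutil
import subprocess
import time
from pathlib import Path


def clear_tmp(tmp_dir, min_age_days=0, now=None):
    """Delete top-level entries of tmp_dir last modified at least
    min_age_days ago and return the names that were removed."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    removed = []
    for entry in sorted(Path(tmp_dir).iterdir()):
        if entry.lstat().st_mtime > cutoff:
            continue  # too recent, keep it
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry, ignore_errors=True)
        else:
            entry.unlink(missing_ok=True)  # requires Python 3.8+
        removed.append(entry.name)
    return removed


def clean_with_restart(datadir, min_age_days=0, run=subprocess.run):
    """Stop DSS, clear DATADIR/tmp, then start DSS again.

    Assumption: DATADIR/bin/dss accepts "stop" and "start".
    The run parameter exists only so the commands can be replaced in tests.
    """
    dss = str(Path(datadir) / "bin" / "dss")
    run([dss, "stop"], check=True)
    try:
        return clear_tmp(Path(datadir) / "tmp", min_age_days)
    finally:
        # Restart DSS even if the cleanup raised an error.
        run([dss, "start"], check=True)
```

A small script calling `clean_with_restart(...)` with your own DATADIR could then be scheduled from cron during a maintenance window, run as the user that owns the DSS installation.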

 

yinkit
Level 1
Author

Thank you for your answer!

We will manually clean this space if it is not done automatically.

Have a good weekend.
