jupyter-run data directory

manuelberbig
Level 2

Hello,

Some of the jupyter-run directories of my individual notebooks are huge. What exactly is stored in these directories? Is it the active session of a Jupyter notebook? Some of them are very large even for notebooks I haven't used for months, and I have also unloaded the sessions for these notebooks.

I just found the description in https://doc.dataiku.com/dss/latest/operations/datadir.html. Are these directories deleted automatically by DSS?

I am looking forward to your answer!

Greetings

JordanB
Dataiker

Hi @manuelberbig,

The jupyter-run directory is not cleaned up automatically; it has to be cleaned manually by the DSS user. Note that cleaning this directory deletes the data stored in each notebook's CWD (current working directory). This storage accrues when a notebook writes a file without specifying a path: the file ends up in jupyter-run/dku-workdirs. You can delete the files under jupyter-run/dku-workdirs/, but doing so "resets" any work your users did in their respective notebooks if they wrote files locally, e.g. files downloaded with wget, saved models, etc.
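
Before deleting anything, it can help to see which working directories actually take up the space. The following is only a minimal sketch in plain Python, not a DSS tool: the function names `dir_size_bytes` and `largest_workdirs` are made up for illustration, and the path to jupyter-run/dku-workdirs inside your data directory has to be passed in by you. It only reads file sizes and deletes nothing.

```python
import os
import sys
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all files below `path` (symlinked dirs are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return total


def largest_workdirs(workdirs_root, top_n=10):
    """Return (directory name, size in bytes) pairs, largest first.

    `workdirs_root` is assumed to be <your data dir>/jupyter-run/dku-workdirs;
    only its immediate subdirectories are listed.
    """
    root = Path(workdirs_root)
    sizes = [(entry.name, dir_size_bytes(entry))
             for entry in root.iterdir() if entry.is_dir()]
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top_n]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python workdir_sizes.py <data dir>/jupyter-run/dku-workdirs
    for name, size in largest_workdirs(sys.argv[1]):
        print(f"{size / 1024 ** 2:10.1f} MiB  {name}")
```

Run it with the account that owns the data directory so that all files are readable, and check with the notebook owners before removing any of the directories it lists.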
 
In addition to cleaning the jupyter-run directory, we recommend reviewing the following documentation on safely removing nonessential files, such as old job logs and temporary files, to free up disk space. It also covers automating cleanup tasks with DSS macros:
- https://doc.dataiku.com/dss/latest/operations/disk-usage.html
- https://doc.dataiku.com/dss/latest/operations/disk-usage.html#automating-cleanup-tasks-through-dss-m...
 
Please let us know if you have any further questions.
 
Thanks!
Jordan