jupyter-run data directory

manuelberbig Registered Posts: 8

Hello,

Some of my jupyter-run directories for individual notebooks are huge. What exactly is stored in these files? Is it an active session of a Jupyter notebook? Some of these directories are very large even for notebooks I haven't used for months, and I have also unloaded the sessions for these notebooks.

I just found the description in https://doc.dataiku.com/dss/latest/operations/datadir.html. Are these directories cleaned up automatically by DSS?

I look forward to your answer!

Greetings

Answers

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker

    Hi @manuelberbig,

    The jupyter-run directory is not cleaned up automatically; a DSS user has to clean it manually. Note that cleaning this directory deletes any data stored in a notebook's current working directory (CWD). This storage accumulates when users save files from a notebook without specifying a path: such files are stored under jupyter-run/dku-workdirs. You can delete the files located in jupyter-run/dku-workdirs/, but doing so "resets" any work your users did in their respective notebooks if they wrote files locally, e.g. files downloaded with wget, saved models, etc.
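Before deleting anything, it can help to see which working directories actually take up the space. The following is only a minimal sketch, not a DSS tool: the helper names `dir_size_bytes` and `largest_workdirs` are made up for illustration, and the path you pass in is assumed to be the jupyter-run/dku-workdirs folder inside your own data directory.

```python
import os
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all regular files below `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def largest_workdirs(workdirs_root, top=10):
    """Return (size_in_bytes, directory_name) pairs, largest first.

    Only the immediate subdirectories of `workdirs_root` are listed;
    loose files directly inside it are not counted.
    """
    root = Path(workdirs_root)
    sizes = [
        (dir_size_bytes(entry), entry.name)
        for entry in root.iterdir()
        if entry.is_dir()
    ]
    return sorted(sizes, reverse=True)[:top]
```

Calling `largest_workdirs("/path/to/datadir/jupyter-run/dku-workdirs")` (with the placeholder replaced by your real data directory) returns the biggest subdirectories first, so you can check with the notebook owners before removing anything.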
    In addition to cleaning the jupyter-run directory, we recommend reviewing the following documentation on safely removing nonessential files, such as old job logs and tmp files, to clear up disk space. This includes automating cleanup tasks through the use of DSS macros:
    - https://doc.dataiku.com/dss/latest/operations/disk-usage.html
    - https://doc.dataiku.com/dss/latest/operations/disk-usage.html#automating-cleanup-tasks-through-dss-macros
    Please let us know if you have any further questions.
    Thanks!
    Jordan