git - clear notebooks before commit

Tanguy

Several times we've encountered projects that could not be exported because of "git saturation". From memory, I believe the export fails once a project's version control data exceeds 2 GB (though I think this limit has recently been raised).

After investigating, we found that the cause was committed notebooks, particularly those whose outputs come from computer vision tasks: the image data embedded in the outputs significantly inflates the project's git size.

One solution could be to clear the notebook outputs before committing them to version control, as demonstrated in this short tutorial:

https://calmcode.io/course/jupyter-lab/clearing-notebooks
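For illustration, here is a minimal sketch of what "clearing" a notebook amounts to, assuming the standard Jupyter notebook JSON format (nbformat 4), where each code cell has an `outputs` list and an `execution_count`. The function names `clear_outputs` and `clear_notebook_file` are hypothetical helpers of mine, not part of DSS or Jupyter, and this says nothing about how DSS itself stores or commits notebooks.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Strip outputs and execution counts from every code cell.

    The notebook dict is modified in place and also returned.
    Assumes the nbformat 4 layout: a top-level "cells" list.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drops embedded images, tables, logs
            cell["execution_count"] = None  # serialized as null in the .ipynb file
    return notebook


def clear_notebook_file(path: str) -> None:
    """Rewrite the .ipynb file at `path` with its outputs cleared."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    cleaned = clear_outputs(notebook)
    nb_path.write_text(
        json.dumps(cleaned, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Calling `clear_notebook_file("my_notebook.ipynb")` (a placeholder file name) before committing would keep the source cells and drop the heavy outputs. Outside DSS, `jupyter nbconvert --clear-output --inplace my_notebook.ipynb` should do the same thing.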

Is it possible to configure this behavior as the default for a project's notebooks?
