Backup
First, locate your data directory (referred to as "DATA_DIR" in our documentation). This directory contains your configuration, projects (graphs, recipes, notebooks, etc.), database connections, the filesystem_managed files, etc. Note that this directory may NOT contain all your data (for instance, data hosted in a database or on a Hadoop cluster).
Make sure no jobs are running, then stop your instance:
DATA_DIR/bin/dss stop
Compress DATA_DIR and save the archive somewhere else:
tar -zcvf your_backup.tar.gz /path/to/DATA_DIR/
Finally, restart DSS:
DATA_DIR/bin/dss start
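Before relying on an archive, it can be worth checking that it is readable. The following is a minimal sketch, not part of the DSS tooling: verify_backup is a hypothetical helper name, and it only confirms that tar and gzip can read the archive from start to end, not that its contents are complete.

```shell
#!/bin/bash
# Hypothetical helper (not a DSS command): returns 0 if tar can list every
# member of the archive, non-zero otherwise.
verify_backup() {
    local archive="$1"
    # -t lists the members without extracting; -z decompresses through gzip.
    tar -tzf "$archive" > /dev/null 2>&1
}
```

For example, verify_backup your_backup.tar.gz && echo "archive is readable" prints the message only when the listing succeeds.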
Running an automatic backup
Here is an example of a bash script you can run periodically with cron:
#!/bin/bash
# Purpose: Backup of DSS data directory
SRCDIR="/path/to/DATA_DIR"
DSTDIR="/home/backups"
TIME=$(date +"%Y-%m-%d")
FILENAME="backup-dss-data-$TIME.tar.gz"
# Stop DSS so that files do not change while they are archived
"$SRCDIR/bin/dss" stop
# Archive (preserving permissions) and compress at the maximum level
tar -cpf - "$SRCDIR" | gzip -9 > "$DSTDIR/$FILENAME"
"$SRCDIR/bin/dss" start
You could save this script in a file named backupscript.sh, make it executable (chmod +x backupscript.sh), and set up a cron task like the following one (running Monday to Friday at 6:15 am):
15 6 * * 1-5 /path/to/backupscript.sh
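When the script runs unattended, a failed tar (for instance, a full or missing destination) should not leave DSS stopped. The sketch below is one possible way to handle that, under stated assumptions: backup_dss is a hypothetical function name, the ".part" suffix is an illustrative convention, and the only DSS commands used are the dss stop and dss start shown above.

```shell
#!/bin/bash
# Sketch only: backup_dss is a hypothetical function, not a DSS command.
# Usage: backup_dss /path/to/DATA_DIR /home/backups
backup_dss() (
    # The body runs in a subshell, so "set -e" and the trap stay local to it.
    set -e
    srcdir="$1"
    dstdir="$2"
    filename="backup-dss-data-$(date +%Y-%m-%d).tar.gz"

    "$srcdir/bin/dss" stop
    # Restart DSS when the subshell exits, whether or not the archive succeeded.
    trap '"$srcdir/bin/dss" start' EXIT

    # Write under a temporary name first, so that a partial archive is never
    # mistaken for a finished backup.
    tar -cpzf "$dstdir/$filename.part" "$srcdir"
    mv "$dstdir/$filename.part" "$dstdir/$filename"
)
```

The function returns a non-zero status when the archive step fails, which a cron wrapper could use to send an alert.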
Restoring a backup
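To restore a backup, stop DSS, then replace the content of DATA_DIR with the content of the archive. With GNU tar, the backup commands above store paths without the leading "/", so extract relative to the filesystem root for the files to land back in /path/to/DATA_DIR:
DATA_DIR/bin/dss stop
tar -zxvf your_backup.tar.gz -C /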
DATA_DIR/bin/dss start
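Extracting over an existing directory merges files rather than replacing them, so it can be safer to move the current DATA_DIR aside first. The following is a rough sketch of that idea, assuming the archive was created from the same absolute DATA_DIR path (so its member names are that path without the leading "/"); restore_dss and the ".before-restore-" suffix are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# Sketch only: restore_dss is a hypothetical function, not a DSS command.
# Usage: restore_dss your_backup.tar.gz /path/to/DATA_DIR
restore_dss() (
    set -e
    archive="$1"
    datadir="$2"

    "$datadir/bin/dss" stop
    # Keep the current directory under a timestamped name instead of deleting it.
    mv "$datadir" "$datadir.before-restore-$(date +%Y%m%d%H%M%S)"
    # Member names have no leading "/", so extracting relative to "/" puts
    # the files back at their original absolute path.
    tar -xpzf "$archive" -C /
    "$datadir/bin/dss" start
)
```

Once the restored instance has been checked, the directory that was moved aside can be deleted by hand.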
Updated documentation for backing up your DSS Instance can be found here: https://doc.dataiku.com/dss/latest/operations/backups.html?highlight=backup