How do I back up my instance of Data Science Studio?

UserBird Dataiker, Alpha Tester Posts: 535 Dataiker

Best Answer

  • jrouquie
    jrouquie Dataiker Alumni Posts: 87 ✭✭✭✭✭✭✭

    Backup

    First, locate your data directory (referred to as DATA_DIR in our documentation). This directory contains your configuration, projects (graphs, recipes, notebooks, etc.), connections to databases, the filesystem_managed files, etc. Note that this directory may NOT contain all your data: data hosted on a database or on a Hadoop cluster, for instance, lives outside DATA_DIR and is not covered by this backup.

    Make sure no jobs are running, then stop your instance:


    DATA_DIR/bin/dss stop

    Compress DATA_DIR and save the archive somewhere else:


    tar -zcvf your_backup.tar.gz /path/to/DATA_DIR/

    Finally, restart the studio:


    DATA_DIR/bin/dss start
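
    Once the studio is running again, it can be worth checking that the archive is actually readable before relying on it. The following is only a sketch: verify_backup is a hypothetical helper name, not a DSS command, and it assumes a tar that reads gzip archives with -z (GNU tar or bsdtar).

```shell
# verify_backup ARCHIVE  (hypothetical helper, not part of DSS)
# Returns 0 if tar can read the whole gzip-compressed archive, non-zero otherwise.
verify_backup() {
    # -t lists the members without extracting anything; tar exits non-zero
    # when the gzip stream or the tar structure is damaged.
    tar -tzf "$1" > /dev/null 2>&1
}
```

    For example: verify_backup your_backup.tar.gz && echo "archive readable". This only shows that the archive can be read end to end, not that its contents are complete.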

    Running an automatic backup

    Here is an example of a bash script you can run periodically with cron:


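    #!/bin/bash
    # Purpose: back up the DSS data directory

    SRCDIR="/path/to/DATA_DIR"
    DSTDIR="/home/backups"
    TIME=$(date +"%Y-%m-%d")
    FILENAME="backup-dss-data-$TIME.tar.gz"

    # Stop DSS so that files do not change while they are being archived
    "$SRCDIR/bin/dss" stop

    # Ask gzip for maximum compression, then archive DATA_DIR (-p keeps permissions)
    export GZIP=-9
    tar -cpzf "$DSTDIR/$FILENAME" "$SRCDIR"

    # Restart DSS once the archive has been written
    "$SRCDIR/bin/dss" start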

    You could save this script in a file named backupscript.sh, make it executable (chmod +x backupscript.sh), and set up a cron task like the following one (running Monday to Friday at 6:15 am):


    15 6 * * 1-5 /path/to/backupscript.sh
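
    Because a scheduled job like this writes a new archive on every run, the backup directory grows until old archives are removed. A minimal sketch of a cleanup step follows; prune_backups is a hypothetical helper name, the 7-day retention in the usage note is an arbitrary choice, and -maxdepth and -delete assume GNU or BSD find.

```shell
# prune_backups DIR DAYS  (hypothetical helper, not part of DSS)
# Deletes backup-dss-data-*.tar.gz files in DIR last modified more than DAYS days ago.
prune_backups() {
    local dir="$1" days="$2"
    # -maxdepth 1 keeps find out of subdirectories; -type f and -name restrict
    # deletion to files matching the naming scheme of the backup script.
    find "$dir" -maxdepth 1 -type f -name 'backup-dss-data-*.tar.gz' \
        -mtime +"$days" -delete
}
```

    For example, calling prune_backups /home/backups 7 at the end of the backup script would keep roughly the last week of archives. Files in the directory that do not match the backup naming pattern are left alone.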

    Restoring a backup

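    To restore a backup, stop DSS, move the current DATA_DIR aside, and extract the archive in its place. tar strips the leading "/" from paths when it creates an archive, so an archive built with the commands above stores the directory as path/to/DATA_DIR; extracting with -C / puts it back at its original location:


    /path/to/DATA_DIR/bin/dss stop
    mv /path/to/DATA_DIR /path/to/DATA_DIR.old
    tar -zxvf your_backup.tar.gz -C /
    /path/to/DATA_DIR/bin/dss start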
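
    Before touching a live DATA_DIR, it may be safer to extract the archive into a scratch directory and look at what it contains. This is a sketch under assumptions: restore_to_staging is a hypothetical helper name, and the scratch path in the usage note is only an example.

```shell
# restore_to_staging ARCHIVE STAGING_DIR  (hypothetical helper, not part of DSS)
# Extracts the backup under STAGING_DIR without touching the real DATA_DIR.
restore_to_staging() {
    local archive="$1" staging="$2"
    mkdir -p "$staging" || return 1
    # -p keeps the permissions recorded in the archive; -C extracts under
    # the scratch directory instead of the current directory.
    tar -xzpf "$archive" -C "$staging"
}
```

    For example: restore_to_staging your_backup.tar.gz /tmp/dss-restore-check. The extracted tree then sits under the scratch directory at whatever path the archive recorded, where it can be inspected before the real restore.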
