Wednesday, January 4, 2012

Personal Backup Solutions

I recently decided to implement an automated backup regime (I previously favoured manual backups, quite simply because there wasn't that much to back up). Below are some of my research notes.


BackupPC has usability issues similar to some of those I've been facing while working on a personal project. It feels like they've done a one-to-one translation of the configuration file into the web interface, and even though it works well it feels a little rough around the edges. My expectation of user interface design is that in most cases (especially consumer-class applications) software should not require you to read a manual. In this case, though, some of the labels seemed confusing or contradictory and the only way to debug was to resort to the CLI. Setup was simple: install from the repo, then work through the following.

- chkconfig backuppc on
- service backuppc start
- service httpd start
- cd /etc/BackupPC
- htpasswd -cmb apache.users backuppc backuppc
- /etc/BackupPC/ is actually a valid Perl file. Configure as required
- su -s /bin/bash backuppc
- ssh-keygen -t dsa
- ssh-copy-id -i .ssh/

It uses DNS to resolve 'viable hosts' and falls back to NetBIOS multicast thereafter. If all else fails, you can change the resolution mechanism in the config file by altering the parameters for the 'nmblookup' command. You may need to change the MAIL environment variable, since by default the 'backuppc' account has no configuration files or environment variables set up (I had to because I switched from ~/Maildir to /var/log/mail/*). Use /etc/aliases to forward email to another address. The documentation needs a bit of work: some parts are skimmed over while others are quite verbose.

Had some protocol mismatch issues when the backup process was initiated. The rsync man page indicated that this may be caused by shell startup files on the remote host producing output. To debug, run 'ssh remotehost /bin/true > out.dat'; if out.dat is not empty then there are issues that need to be fixed. The symptom is an error about only a certain amount of data being received, found in the /var/log/BackupPC/* log file/s, which are also viewable in the web interface.
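The out.dat check is easy to wrap in a function so it can be rerun after each change to the remote account's startup files. This is a minimal sketch: the function name remote_shell_is_clean is made up for illustration, and it assumes key-based SSH login to the host is already working.

```shell
# Report whether a non-interactive SSH session to HOST emits stray output.
# Any bytes at all (a banner, an echo in a login script) can corrupt the rsync stream.
# remote_shell_is_clean is a hypothetical helper name.
remote_shell_is_clean() {
    host=$1
    out=$(mktemp) || return 2
    status=0
    ssh "$host" /bin/true > "$out" 2>/dev/null || status=$?
    bytes=$(wc -c < "$out" | tr -d ' ')
    rm -f "$out"
    if [ "$status" -ne 0 ]; then
        echo "ssh failed with status $status"
        return 2
    elif [ "$bytes" -eq 0 ]; then
        echo "clean"
    else
        echo "noisy: $bytes byte(s) of unexpected output"
        return 1
    fi
}
# Example, run as the backuppc user:
# remote_shell_is_clean remotehost
```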

Massive performance issues on an i3 with 4GB once the backups started. The desktop environment began to suffer significant latency, with the mouse cursor skipping halfway across the screen a number of times. Had to kill the process eventually. Looked at other options such as changing the transfer protocol (to something other than rsync). Thought about using a solution based on cpulimit that I built a while back (to deal with similar issues with bacula and other pieces of software), which would basically act like an ABS brake on CPU utilisation and also automatically change the priority of the process via scripts. Further research indicated that this issue has been alleviated or fixed in subsequent revisions of BackupPC though.
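For anyone wanting to try the throttling route, something along these lines is roughly what I mean. It is a sketch rather than my original script: throttle_pid is a hypothetical name, the 30 percent default is arbitrary, and it assumes renice, ionice and cpulimit are all installed.

```shell
# Lower the CPU and I/O priority of a running process, then cap its CPU share.
# throttle_pid is a hypothetical helper; LIMIT is a percentage as understood by cpulimit.
throttle_pid() {
    pid=$1
    limit=${2:-30}
    renice -n 19 -p "$pid" >/dev/null   # lowest CPU scheduling priority
    ionice -c 3 -p "$pid"               # idle I/O class (Linux only)
    cpulimit -p "$pid" -l "$limit" &    # keeps throttling in the background
}
# Example (the process name is an assumption, check with ps first):
# throttle_pid "$(pgrep -o -f BackupPC_dump)" 30
```

The renice and ionice calls are one-off adjustments, while cpulimit has to keep running alongside the target to pause and resume it.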


I prefer this solution over the others because, while it's not perfect, it's still not a full-blown backup solution that can be unwieldy to deal with. I remember using bacula, and there used to be some inexplicable errors in the database catalogue as well as some backup failures that couldn't be explained without delving overly deep into logfiles. Over time I figured out how to deal with them and achieved a perfect backup schedule, but to be honest I just wanted a guarantee that it would work. One thing I did like about it, though, was the bconsole CLI: a single point from which to deal with mount/unmount/restore/backup of data.

Lightweight Options

Considered other lightweight (and even desktop) options such as rdiff-backup/backintime (basically a bunch of scripts), but they were a bit unwieldy and didn't have the logging and diagnostics that BackupPC/bacula and other systems had.

Cloud/Filesystem Options

Have thought about cloud-based and filesystem-based solutions but have backed away for security/bandwidth reasons, and would prefer not to rely on the filesystem alone.


Remember Amanda from a while back. Had half configured it previously (a basic setup in an experimental environment for possible use in production). This time I decided to do a more complete setup with 'virtual tapes/slots' in a 'virtual multi tape changer machine' arrangement. Installed from the repos, copied the relevant xinetd.* file to xinetd.d and ran the following, as indicated in the crontab sample file.

amcheck -m DailySet1
amdump DailySet1
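If you would rather chain the two commands than schedule them separately, they can be wrapped so that the dump only runs when the check passes. This is a minimal sketch and not what the sample crontab does; run_amanda is a made-up name, and it relies on amcheck returning a non-zero exit status when it finds a problem.

```shell
# Run amcheck first and only start the dump if it reports no problems.
# run_amanda is a hypothetical helper; the argument is the Amanda configuration name.
run_amanda() {
    config=${1:-DailySet1}
    if amcheck -m "$config"; then       # -m mails the result instead of printing it
        amdump "$config"
    else
        echo "amcheck failed for $config, skipping amdump" >&2
        return 1
    fi
}
# Example, run as the Amanda user:
# run_amanda DailySet1
```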

Had "amanda client 10080 ack timeout fedora" errors. Packaging was slack: the xinetd.d/* files didn't make provision for properly locating amindex and other file/s, so the service on port 10080 was never started. Perhaps it was just the 64-bit version?

Need to create 'virtual tapes/slots' under '/var/amanda/vtapes/slot?'
/dumps/amanda used as temporary storage prior to dumping to (in this case virtual) tape.
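Creating the slot directories by hand gets tedious, so a small loop helps. A sketch under assumptions: make_vtape_slots is a hypothetical name, the slot count is up to you, and numbering starts at 0 only to match the slot0 path that appears in these notes. Ownership of the directories still has to be set for whichever user Amanda runs as.

```shell
# Create COUNT virtual tape slot directories (slot0 .. slot<COUNT-1>) under BASE.
# make_vtape_slots is a hypothetical helper name.
make_vtape_slots() {
    base=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        mkdir -p "$base/slot$i" || return 1
        i=$((i + 1))
    done
}
# Example (the slot count is an assumption):
# make_vtape_slots /var/amanda/vtapes 5
```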


/var/lib/amanda/.amandahosts works in a similar way to a .rhosts bypass file, controlling which users and servers can back up/restore.

'strings /var/amanda/vtapes/slot0/* | less' gives you the restore command:
'dd if=* bs=32k skip=1 | gzip -dc | tar -xpGf -'

Not pretty, even if you're doing it the 'proper way'.


As an aside, I restored to the /tmp directory. Somehow the permissions ended up wrong, which led to issues with Gnome (dealt with by setting the correct permissions on the /tmp directory). All the more reason to set up a separate restoration area.
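The dd pipeline from the tape header can be wrapped so that it always unpacks into a dedicated directory rather than /tmp. A minimal sketch, assuming GNU tar and a gzip-compressed dump as in that command; the function name and the example paths are hypothetical.

```shell
# Extract one Amanda dump file into a dedicated restore directory.
# restore_vtape_file is a hypothetical helper name.
restore_vtape_file() {
    dump=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Skip the 32k Amanda header block, decompress, then unpack with
    # permissions preserved (-p) and incremental handling (-G).
    dd if="$dump" bs=32k skip=1 2>/dev/null | gzip -dc | tar -C "$dest" -xpGf -
}
# Example (hypothetical paths):
# restore_vtape_file /var/amanda/vtapes/slot0/<dump file> /srv/restore
```

Because tar is pointed at the destination with -C, nothing lands in the current directory by accident.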

Logging layout could be streamlined. It is spread out over many different files and directories, which makes it easier to spot a particular time frame but complicates things overall. 'ls -al/multitail' are your friends here.
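ls and multitail work once you know which files to look at. As an alternative that I did not use at the time, find can narrow a scattered log tree down to the files touched in a given window. A sketch, with a hypothetical function name and an assumed log path:

```shell
# List files under DIR modified within the last MINUTES minutes, sorted by path.
# recent_logs is a hypothetical helper name.
recent_logs() {
    dir=$1
    minutes=${2:-60}
    find "$dir" -type f -mmin "-$minutes" | sort
}
# Example (the path is an assumption, adjust to wherever Amanda logs on your system):
# recent_logs /var/log/amanda 120
```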

Long in the tooth and it shows. However, there does seem to be an effort to modernise, judging by the website/Wiki. Zmanda (an updated version of amanda with a web-based management console) should definitely be at the back of your mind if you ever think about using amanda.