Web Redundancy
While I've attempted to document this in my more official emergency howto, that document has fallen out of date. These notes are more current. Please read the old emergency howto, but bear in mind that it's badly out of date.
Reboot
This is the general procedure for a reboot without loss of web service. FTP or SFTP service will be unavailable for the duration of this process, which is usually between 5 and 15 minutes.
I refer to "backup" and "main" servers. The machines are labeled Arthur1, Arthur2, Edison1, Edison2, etc. The live server is known as Arthur or Edison, and so forth. I use "backup" to describe the machine that is not currently serving and "main" to describe the machine that is. So Arthur2 can be "main" while Arthur1 is "backup", and vice versa.
- Stop logins to backup machine
- Stop changes to main machine
- Sync up machines
- Check backup machine
- Make backup machine live
- Make sure crontabs aren't insane
- Make sure quotas are enabled appropriately and run quotacheck as needed
- Make sure backups will run on both machines
- Make sure new backup machine can rsync from new active
Making SSH Work Only for Me
During the sync up step above, I need to block everyone except myself and possibly any SSH user who's tunneling for rsync.
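In /etc/ssh/sshd_config set
Banner /etc/ssh/banner
AllowUsers dmartin
and restart sshd. Then comment out AllowUsers and save the config file. The running sshd keeps the restriction until it is next restarted, but because the saved file no longer contains it, sshd will accept normal logins again after the reboot.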
The file /etc/ssh/banner should contain a message telling people what's up. That filename could be any filename.
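A minimal sketch of the lock-down steps above as two shell functions. The function names lock_sshd_config and unlock_sshd_config are made up for illustration. The sketch assumes the config file does not already set Banner or AllowUsers and has no Match blocks at the end (appended lines would otherwise land inside the last Match block). The restart command shown in the comments is also an assumption and varies by distribution.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tool.

# lock_sshd_config FILE USER
# Keep a .bak copy of FILE, then append the banner and the login restriction.
lock_sshd_config() {
    config=$1
    admin=$2
    cp "$config" "$config.bak" || return 1
    printf 'Banner /etc/ssh/banner\nAllowUsers %s\n' "$admin" >> "$config"
}

# unlock_sshd_config FILE
# Comment out every active AllowUsers line so the next sshd start is unrestricted.
# Writes through a temp file and cat so FILE keeps its owner and permissions.
unlock_sshd_config() {
    config=$1
    sed 's/^[[:space:]]*AllowUsers/#&/' "$config" > "$config.tmp" &&
        cat "$config.tmp" > "$config" &&
        rm -f "$config.tmp"
}

# Intended order on the machine being locked down (run as root):
#   lock_sshd_config /etc/ssh/sshd_config dmartin
#   sshd -t                    # syntax-check the config before restarting
#   service sshd restart       # assumption: SysV-style service command
#   unlock_sshd_config /etc/ssh/sshd_config
```

Keep an existing SSH session open while restarting sshd, so a mistake in the config can still be fixed from that session.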
MySQL
Syncing the Database
Any machine running mysql should have an account with a home at /home/system/mysql-dumper (or similar). In that directory you should find a mysql.dump.all.sql file (again, or similar). When syncing, run mysql -u root -p < /home/system/mysql-dumper/mysql.dump.all.sql (or similar).
Stopping Changes
To stop changes, you have to log in to mysql as root and run:
mysql> FLUSH TABLES WITH READ LOCK;
You must not close that session: the lock is released as soon as the session ends.
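If keeping an interactive session open is awkward, one possible alternative is to take the lock and then sleep inside the same session, so the lock is held for a fixed time. This is a sketch under that assumption, not the procedure described above; the helper name read_lock_sql is hypothetical, and SLEEP() is the standard MySQL function.

```shell
#!/bin/sh
# Sketch only: read_lock_sql is a made-up helper name.

# read_lock_sql SECONDS
# Print SQL that takes the global read lock, holds it for SECONDS, then releases it.
read_lock_sql() {
    printf 'FLUSH TABLES WITH READ LOCK; SELECT SLEEP(%d); UNLOCK TABLES;\n' "$1"
}

# Intended use, in a terminal that stays open while the sync runs elsewhere:
#   mysql -u root -p -e "$(read_lock_sql 900)"
# The lock is dropped when the sleep finishes or when the client is interrupted.
```

Pick a duration comfortably longer than the sync is expected to take, since writes resume the moment the client exits.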
Tracking Changes
I'm trying to make a versioning backup system that will let me see the changes I've made to various files. See bzr for more.
Cron
I've written two scripts, ifon and ifstate. They help cron scripts determine where they are, so they can run on specific machines or in a specific active/inactive state even if the cron script is the same across machines. The ifon script returns success if it's run on a specific host. This almost never comes up. The ifstate script determines whether the host it is on is the active member of a redundant pair. This gets used all the time. Here's an example:
0 0 * * * ifstate active && create_accounts.sh
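The real ifon and ifstate scripts aren't reproduced here. As a rough sketch of how helpers like them could work, the functions below assume the active/inactive state is recorded as a single word in a file. The path /etc/redundancy-state and the STATE_FILE variable are hypothetical, and the actual scripts may decide the state some other way.

```shell
#!/bin/sh
# Sketch only: the state-file mechanism is an assumption, not the real scripts.

# Hypothetical location of a one-line file containing "active" or "inactive".
STATE_FILE=${STATE_FILE:-/etc/redundancy-state}

# ifon HOST
# Succeed only when this machine's short hostname equals HOST.
ifon() {
    this_host=$(uname -n)
    this_host=${this_host%%.*}    # drop any domain part
    [ "$this_host" = "$1" ]
}

# ifstate STATE
# Succeed only when the first line of the state file equals STATE.
# Returns 2 if the state file cannot be read, so cron jobs stay off by default.
ifstate() {
    wanted=$1
    [ -r "$STATE_FILE" ] || return 2
    current=$(head -n 1 "$STATE_FILE")
    [ "$current" = "$wanted" ]
}
```

With helpers like these, the same crontab can be installed on both members of a pair, and a line such as ifstate active && create_accounts.sh only does work on whichever machine is currently live.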
Rsync
I use rsync to keep redundant pairs in sync. I've written my own wrapper script that handles the details. To install on a pair of machines, use install-rsync.tgz on both machines. Some tweaking is still required.
Planning to Switch
If one member of a redundant pair is active for a long time, I need to be careful that the reserve partner is actually ready. I have nagios monitoring for the big stuff, but small things can bite.
- compare lists of rpms (/root/rpms is created every night by the backup routine, so it should be useful)
- compare obvious config files
  - /etc/httpd
  - /etc/php.ini, /etc/php.d/*
  - /etc/my.cnf
  - /etc/nagios
- check bzr logs
- compare cron files
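The comparisons in the checklist above could be scripted roughly as follows. This is a sketch that assumes /root/rpms is a plain text file with one package per line and that copies of the partner's files have already been fetched locally. The function names and the paths in the comments are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper names and example paths are made up for illustration.

# compare_lists FILE_A FILE_B
# Print lines unique to either file; lines unique to FILE_B are tab-indented.
# Prints nothing when both lists contain the same entries.
compare_lists() {
    sorted_a=$(mktemp) && sorted_b=$(mktemp) || return 2
    sort "$1" > "$sorted_a"
    sort "$2" > "$sorted_b"
    comm -3 "$sorted_a" "$sorted_b"    # -3 suppresses lines common to both
    rm -f "$sorted_a" "$sorted_b"
}

# compare_configs PATH_A PATH_B
# Report which files differ between two copies of a config file or tree.
compare_configs() {
    diff -rq "$1" "$2"
}

# Intended use on the active machine, after copying from the reserve partner:
#   compare_lists /root/rpms /tmp/partner/rpms
#   compare_configs /etc/httpd /tmp/partner/etc/httpd
```

An empty result from both helpers suggests the two machines match for that item. Any output is a starting point for a closer look, not proof of a problem.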
