DHCP Failover

Tuesday, January 1, 2008

I've been setting up DHCP servers at work to use the failover feature available in ISC-DHCP (the net/isc-dhcp3-server port in FreeBSD). That allows for two servers to work together, sharing a pool of addresses and keeping track of leases handed out by both servers. The dhcpd.conf(5) manpage discusses this feature somewhat. I'll jot down some notes here that are a bit more specific about what all had to be done.

Let's assume the two DHCP servers will have the IP addresses 10.0.0.10 and 10.0.0.20 - with .10 being the 'primary' and .20 being the 'secondary' server. It doesn't really matter which is which - although the logs on the 'primary' server seem a bit more complete. We'll also assume these servers are giving out addresses from a pool of numbers 10.0.0.100-10.0.0.200, and are running DNS caches - so that the DHCP clients should be told to use them for DNS servers.

We'll also use TCP port 520 for communications between the DHCP servers, so be sure to allow for that through any firewalls.

Configuration

On the 10.0.0.10 'primary' machine, the /usr/local/etc/dhcpd.conf file might look like:

failover peer "foo" 
    {
    primary;
    mclt 1800;  # only specified in the primary
    split 128;  # only specified in the primary

    address 10.0.0.10;
    port 520;

    peer address 10.0.0.20;
    peer port 520;

    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;                
    }

option domain-name-servers 10.0.0.10, 10.0.0.20;

include "/usr/local/etc/dhcp/master.conf";

and the same file on the 'secondary' 10.0.0.20 machine is very similar:

failover peer "foo" 
    {
    secondary;

    address 10.0.0.20;
    port 520;

    peer address 10.0.0.10;
    peer port 520;

    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;                
    }

option domain-name-servers 10.0.0.20, 10.0.0.10;

include "/usr/local/etc/dhcp/master.conf";

The failover peer name, "foo" in this example, will also appear in the DHCP pool configuration, and will be used in a script change the failover state later on.

I created a directory /usr/local/etc/dhcp/ to hold the DHCP config files that will be common to both DHCP servers. That way, it's just a matter of copying the entire directory between servers when a change is made. The /usr/local/etc/dhcp/master.conf file I included from the main server config might look something like:

omapi-port 7911;

default-lease-time 16200;  # 4.5 hours
max-lease-time 16200;

subnet 10.0.0.0 netmask 255.255.255.0
        {
        option routers 10.0.0.1;

        pool
                {
                failover peer "foo";
                deny dynamic bootp clients;

                range 10.0.0.100  10.0.0.200;
                }
        }

The deny dynamic bootp clients;directive is required for any failover pool. The omapi-port 7911; directive will be useful later on for when a server needs to be put into the 'partner-down' state because the other server will be off for a while.

To sync and restart the two servers whenever there's a change to the DHCP configuration, I setup the 10.0.0.20 server to allow root logins through SSH from the root account of 10.0.0.10 using public/private keys, and then put a script named restart_dhcp on the 10.0.0.10 server that looks like:

#!/bin/sh
/usr/local/etc/rc.d/isc-dhcpd restart
scp -pr /usr/local/etc/dhcp 10.0.0.20:/usr/local/etc
ssh 10.0.0.20 /usr/local/etc/rc.d/isc-dhcpd restart

That copies the entire /usr/local/etc/dhcp directory, so if you need to break up your config into more files that get included, they'll all be copied over when you do a restart.

Failover

When one server stops unexpectedly, the remaining server will go into a communications-interrupted state, and continue offering up addresses from its half of the DHCP pools, and will renew leases it knows were given out by the other server.

If the downed server will be out for longer than the mclt value from the server config (1800 seconds (30 minutes) in the examples above). You may want to let the surviving server know that it's on its own so that it can use the entire pool of available addresses. This is done by putting the surviving server into partner-down state.

This has to be done after the other server is really down. Doing it before shutting down the other server doesn't work, because the two servers will get themselves back into a normal state very quickly, probably before you get a chance to shut the 2nd server down.

The omshell program can be used to communicate with a running DHCP server daemon to control it in various ways, including changing the failover state. I put this partner-down script on both the primary and secondary servers:

#!/bin/sh
omshell << EOF
connect
new failover-state
set name = "foo"
open
set local-state = 1
update
EOF

so when one server is going to be down for a while, I can connect to the other server and just run that script.

When the downed server comes back up, the two servers automatically start communicating and eventually get themselves back into a normal state. But only after the recovering server has spent mclt time in recover-wait state, where it renews existing leases but won't offer up new ones. So you probably wouldn't want to go into a partner-down state if the other server will be down for less than that amount of time.

Running the partner-down script when both servers are really up and running doesn't seem to do any harm, as mentioned above the two servers will quickly move back into a normal state. This can be seen by watching the DHCP logs.

Clean Failover

It's possible using OMAPI to shut down a server and have the remaining server automatically switch to "partner-down" mode in a clean way, so that when the downed server comes back up both servers quickly move to "normal" mode, without spending the mclt time in recover-wait state. This script does the trick:

#!/bin/sh
omshell << EOF
connect
new control
open
set state = 2
update
EOF

When run, it causes the dhcpd daemon on the current server to shutdown, and the dhcpd daemon on the other server takes over completely the DHCP pools.

Update: I wrote a bit more about DHCP failover, talking about how to deal with a clock sync problem when the machine boots by scheduling a dhcpd restart a few minutes after boot time.