DHCP Failover

I've been setting up DHCP servers at work to use the failover feature available in ISC-DHCP (the net/isc-dhcp3-server port in FreeBSD). That allows for two servers to work together, sharing a pool of addresses and keeping track of leases handed out by both servers. The dhcpd.conf(5) manpage discusses this feature somewhat. I'll jot down some notes here that are a bit more specific about what all had to be done.

Let's assume the two DHCP servers will have the IP addresses and - with .10 being the 'primary' and .20 being the 'secondary' server. It doesn't really matter which is which - although the logs on the 'primary' server seem a bit more complete. We'll also assume these servers are giving out addresses from a pool of numbers, and are running DNS caches - so that the DHCP clients should be told to use them for DNS servers.

We'll also use TCP port 520 for communications between the DHCP servers, so be sure to allow for that through any firewalls.


On the 'primary' machine, the /usr/local/etc/dhcpd.conf file might look like:

failover peer "foo" 
    mclt 1800;  # only specified in the primary
    split 128;  # only specified in the primary

    port 520;

    peer address;
    peer port 520;

    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;                

option domain-name-servers,;

include "/usr/local/etc/dhcp/master.conf";

and the same file on the 'secondary' machine is very similar:

failover peer "foo" 

    port 520;

    peer address;
    peer port 520;

    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;                

option domain-name-servers,;

include "/usr/local/etc/dhcp/master.conf";

The failover peer name, "foo" in this example, will also appear in the DHCP pool configuration, and will be used in a script change the failover state later on.

I created a directory /usr/local/etc/dhcp/ to hold the DHCP config files that will be common to both DHCP servers. That way, it's just a matter of copying the entire directory between servers when a change is made. The /usr/local/etc/dhcp/master.conf file I included from the main server config might look something like:

omapi-port 7911;

default-lease-time 16200;  # 4.5 hours
max-lease-time 16200;

subnet netmask
        option routers;

                failover peer "foo";
                deny dynamic bootp clients;


The deny dynamic bootp clients;directive is required for any failover pool. The omapi-port 7911; directive will be useful later on for when a server needs to be put into the 'partner-down' state because the other server will be off for a while.

To sync and restart the two servers whenever there's a change to the DHCP configuration, I setup the server to allow root logins through SSH from the root account of using public/private keys, and then put a script named restart_dhcp on the server that looks like:

/usr/local/etc/rc.d/isc-dhcpd restart
scp -pr /usr/local/etc/dhcp
ssh /usr/local/etc/rc.d/isc-dhcpd restart

That copies the entire /usr/local/etc/dhcp directory, so if you need to break up your config into more files that get included, they'll all be copied over when you do a restart.


When one server stops unexpectedly, the remaining server will go into a communications-interrupted state, and continue offering up addresses from its half of the DHCP pools, and will renew leases it knows were given out by the other server.

If the downed server will be out for longer than the mclt value from the server config (1800 seconds (30 minutes) in the examples above). You may want to let the surviving server know that it's on its own so that it can use the entire pool of available addresses. This is done by putting the surviving server into partner-down state.

This has to be done after the other server is really down. Doing it before shutting down the other server doesn't work, because the two servers will get themselves back into a normal state very quickly, probably before you get a chance to shut the 2nd server down.

The omshell program can be used to communicate with a running DHCP server daemon to control it in various ways, including changing the failover state. I put this partner-down script on both the primary and secondary servers:

omshell << EOF
new failover-state
set name = "foo"
set local-state = 1

so when one server is going to be down for a while, I can connect to the other server and just run that script.

When the downed server comes back up, the two servers automatically start communicating and eventually get themselves back into a normal state. But only after the recovering server has spent mclt time in recover-wait state, where it renews existing leases but won't offer up new ones. So you probably wouldn't want to go into a partner-down state if the other server will be down for less than that amount of time.

Running the partner-down script when both servers are really up and running doesn't seem to do any harm, as mentioned above the two servers will quickly move back into a normal state. This can be seen by watching the DHCP logs.

Clean Failover

It's possible using OMAPI to shut down a server and have the remaining server automatically switch to "partner-down" mode in a clean way, so that when the downed server comes back up both servers quickly move to "normal" mode, without spending the mclt time in recover-wait state. This script does the trick:

omshell << EOF
new control
set state = 2

When run, it causes the dhcpd daemon on the current server to shutdown, and the dhcpd daemon on the other server takes over completely the DHCP pools.

Update: I wrote a bit more about DHCP failover, talking about how to deal with a clock sync problem when the machine boots by scheduling a dhcpd restart a few minutes after boot time.

amqplib 0.2

I noticed the other day that my two RabbitMQ servers were consuming more and more memory - one had gone from an initial 22mb size to over 600mb. As I sat and watched it would grow by 4k or so at regular intervals.

I think what had happened is that I had created an exchange which received lots of messages, and then ran scripts that created automatically-named queues bound to that exchange, but defaulted to not auto-deleting them. I ran these scripts many many times, which left many many queues on the server, all swelling up with lots of messages that would never be consumed. Good thing I caught it, it might have eventually killed my server.

This message in the rabbitmq-discuss list gives useful info on how to get in and see what queues exist on a RabbitMQ server, and how big they are.

It seems to me that having the auto_delete parameter of Channel.queue_declare() default to False is a really bad idea. If you want to keep a queue around after your program exits, I think you should explicitly say so, so I changed the default to True. The Channel.exchange_declare() also has a auto_delete parameter, which I also change the default to True for consistency.

I also did some work on supporting the redirect feature of AMQP, where a server you connect to can tell you to go somewhere else, useful for balancing a cluster. I don't actually have a RabbitMQ cluster, so I put together a utility to fake an AMQP server that tells you to redirect. It works well enough to run the uniitests unchanged against it, each test case being redirected from the fake server to the real server.

With those two changes, I put out a 0.2 release, on my software page and on the Cheeseshop.

amqplib 0.1

I broke down and put together a tarball of my Python AMQP library, and stuck it up as a release 0.1 on the software section of this website, under the section py-amqplib.

Interestingly, someone hit the page and downloaded the tarball less than 3 minutes after I dropped a note about it to the RabbitMQ discussion list - so I guess there's at least some interest out there in this sort of thing :)


For some time I've been using Spread as a messaging bus between processes and machines using Python bindings, but there are a few things that make it not quite ideal for what I've been trying to do.

  • There's no access control
  • Messages are non-persistent - so if a receiver daemon is down and some important message comes through, it's SOL
  • The wire protocol is not documented, the docs basically just say use the C client library.
  • The Python bindings to the C library have a glitch of some sort when used in py-exim-localscan, I had to resort to a small ctypes wrapper to get around this.

I ran across the Advanced Message Queuing Protocol(AMQP), with RabbitMQ as one implementation of the protocol, that looks like a better fit for my needs.

There's a Python client library available named QPID, but there are a few issues with that:

  • Relies on threading, which is trouble when Python is embedded in something else, or if you want to try using it in Stackless Python
  • Lacking documentation
  • Has to load a big AMQP XML Spec file, which takes a few seconds.

I decided to take a whack at my own AMQP client, partially as a learning excercise to learn more about the protocol. I wrote a program to take the AMQP 0-8 spec file and statically generate the skeleton of a Python module, and then fleshed it out by hand. The generator is able to put lots of documentation from the spec file into Python docstrings, so the pydoc of this module is fairly decent. Because the module is statically generated, it should be easier to debug than QPID which generates lots of stuff on-the-fly. It's also much faster at making the first connection because it's not parsing the spec file. I also thew in SSL support in since it wasn't too difficult.

It has a ways to go, and some parts are probably naively conceived, but it does seem to work.

The first thing I've used it for is a syslog->AMQP bridge. I've setup my FreeBSD syslogd to feed all info or higher events to a Python daemon, which extracts the date, time, program name, host name, etc and reformats as an AMQP message and published to a 'syslog' topic exchange with the program name as the routing key.

My plan is then to write other daemons that subscribe to the 'sshd' topic for example, and then generate higher-level messages that say things like: 'block IP address xx.xx.xx.xx' in case of failed login attempts. Then i just need one daemon to listen for these firewall control message and make changes to the PF tables.

It's fun stuff. The only weak part is that there's no way to tell if the original syslog message was spoofed, but after that point, AMQP access controls should keep things trustworthy.

See py-amqplib for a Mercurial repository and eventual downloads.

Markdown and Pygments

This blog is mainly being written as Markdown text stored in a database, and I thought it would be nice to add the ability to use Pygments to add syntax highlighting to various bits of code within the entries.

There are some DjangoSnippets entries on how to do this, notably #360 which first runs text through Markdown to generate HTML and then BeautifulSoup to extract parts marked up in the original pre-Markdown text as <pre class="foo">...</pre> to be run through Pygments and then re-inserted back into the overall Markdown-generated HTML.

The problem with this is that the text within <pre>...</pre> needs to valid HTML with things like: e_mail='<foo@bar.edu>' escaped as e_mail='&lt;foo@bar.edu>', otherwise BeautifulSoup thinks in that example that you have a screwed up <foo> tag and tries to fix that up.

Making sure all the <, &, and other characters special to HTML are escaped within a large chunk of code misses out on the convenience of using Markdown. I decided to go with an arrangement in which regular Markdown code blocks are used, but if the first line begins with pygments:<lexer>, then that block is pygmentized.

So if I enter something like:

Here is some code

    if a < b:
        print a

It ends up as:

Here is some code

if a < b:
    print a

What I came up with is this derivative of Snippet #360

from htmlentitydefs import name2codepoint
from HTMLParser import HTMLParser
from markdown import markdown
from BeautifulSoup import BeautifulSoup
from pygments.lexers import LEXERS, get_lexer_by_name
from pygments import highlight
from pygments.formatters import HtmlFormatter

# a tuple of known lexer names
_lexer_names = reduce(lambda a,b: a + b[2], LEXERS.itervalues(), ())

# default formatter
_formatter = HtmlFormatter(cssclass='source')    

class _MyParser(HTMLParser):
    def __init__(self):
        self.text = []
    def handle_data(self, data):
    def handle_entityref(self, name):

def _replace_html_entities(s):
    Replace HTML entities in a string
    with their unicode equivalents.  For
    example, '&amp;' is replaced with just '&'

    mp = _MyParser()
    return u''.join(mp.text)  

def markdown_pygment(txt):
    Convert Markdown text to Pygmentized HTML

    html = markdown(txt)
    soup = BeautifulSoup(html)
    dirty = False
    for tag in soup.findAll('pre'):
        if tag.code:
            txt = tag.code.renderContents()
            if txt.startswith('pygments:'):
                lexer_name, txt = txt.split('\n', 1)
                lexer_name = lexer_name.split(':')[1]
                txt = _replace_html_entities(txt)
                if lexer_name in _lexer_names:
                    lexer = get_lexer_by_name(lexer_name, stripnl=True, encoding='UTF-8')
                    tag.replaceWith(highlight(txt, lexer, _formatter))
                    dirty = True
    if dirty:
        html = unicode(soup)

    return html

Stackless Python and Sockets

I've been intrigued by Stackless Python for a while, and finally got around to installing it one one of my machines. FreeBSD doesn't have a port available, so after creating an ezjail to isolate the installation, it was just a matter of fetching and extracting stackless-251-export.tar.bz2 and doing a standard ./configure && make && make install

The installation looks pretty much like a normal Python installation on FreeBSD, with a /usr/local/bin/python binary and libraries in /usr/local/lib/python2.5

Networking is something I especially wanted to check out with Stackless, and the examples on the Stackless website mostly make use of a stacklesssocket.py module which is a separate download. That module has unittests built in as the module's main function, but when running it on my FreeBSD 7.0-CURRENT box, it died with an exception ending in:

File "stacklesssocket.py.ok", line 286, in handle_connect
AttributeError: 'NoneType' object has no attribute 'send'

after doing some digging, I found that stacklesssocket.py has a dispatcher class which is a subclass of a class by the same name in Python's asyncore.py module. stacklesssocket.dispatcher.connect() calls asyncore.dispatcher.connect() which may directly call the object's handle_connect() method before returning back to stacklesssocket.dispatcher.connect(). However stacklesssocket.dispatcher.connect() doesn't setup that channel until after the call to asyncore.dispatcher.connect() returns. So when handle_connect() tries to send a message over a channel that doesn't exist yet, an exception is raised.

This trivial patch seems to fix the problem - only sending a message over the channel if it exists (which should only happen if there's another tasklet waiting on it back in a stacklesssocket.dispatcher.connect() method).

--- stacklesssocket.py.orig       2007-09-18 20:58:02.000835000 -0500
+++ stacklesssocket.py  2007-09-18 22:03:13.370709131 -0500
@@ -282,7 +282,7 @@

    # Inform the blocked connect call that the connection has been made.
    def handle_connect(self):
-        if self.socket.type != SOCK_DGRAM:
+        if (self.socket.type != SOCK_DGRAM) and self.connectChannel:

    # Asyncore says its done but self.readBuffer may be non-empty

With that patch, the unittests run successfully - at least on my box.

Building ports on old FreeBSDs - revised

This is a revision of an earlier post which has instructions that no longer work.

Are you running an older version of FreeBSD, and getting errors like this when you try to build a port?

"/usr/ports/Mk/bsd.port.mk", line 2416: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2416: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2416: Malformed conditional (((${OSVERSION} < 504105 || (${OSVERSION} 
    >= 600000 && ${OSVERSION} < 600103) || (${OSVERSION} >= 700000 && ${OSVERSION} < 700012)) && 
    ${PKGORIGIN} != "ports-mgmt/pkg_install") || exists(${LOCALBASE}/sbin/pkg_info))

If so, it's because the ports maintainers have started using expressions in the ports Makefiles which are not understood by the versions of make that come with old FreeBSDs.

The official recommended fix would be to upgrade your FreeBSD, but if that's not practical you can at least install a newer version of make to get by for a bit longer. This can be done in just a few minutes with two main steps: temporarily bring back an older /usr/ports/Mk which is compatible with FreeBSD 4.x - and then build and install the devel/make port which used to be present in the ports tree.

The ports tree is in CVS, so it's possible to checkout older revisions of selected directories. This FreeBSD Handbook page lists the anonymous CVS repositories available. For this example I'm going to use anoncvs@anoncvs1.FreeBSD.org:/home/ncvs We only need two small directories, so it probably doesn't really matter which one you use.

It seems like the first commit to the ports infrastructure which broke compatibility happened around Feb 5th, 2007 - so let's backup the current /usr/ports/Mk and check out one from Feb 4th:

cd /usr/ports
mv Mk Mk.original
cvs -d anoncvs@anoncvs1.FreeBSD.org:/home/ncvs co -D "04 Feb 2007" -d Mk ports/Mk

You should now be able to build some ports, at least those that don't use incompatible syntax in their individual Makefiles and don't require a newer ports infrastructure. Now, if you don't already have a devel/make port, use CVS to bring that back too, then build and install it:

cd /usr/ports/devel
cvs -d anoncvs@anoncvs1.FreeBSD.org:/home/ncvs co -D "04 Feb 2007" -d make ports/devel/make
cd make
make install
make clean

Lastly, set your system to use the new ports make in place of the old system make, and do some cleanup:

cd /usr/bin
mv make make.old
ln -s /usr/local/bin/make .

cd /usr/ports
rm -rf Mk
mv Mk.original Mk
rm -rf /usr/ports/devel/make

You should now be in better shape for trying to build new ports. 4.x isn't officially supported anymore by the ports maintainers, so there may be some individual port breakage - but at least you're over the first hurdle.

Logitech QuickCam Pro 5000 on a Mac Mini

I was recently working on finding a reasonable webcam that would work with an Intel Mac Mini, since Apple no longer sells the iSight and the prices for them on eBay are outrageous.

After reading reports that newer versions of the Logitech QuickCam Pro 5000 (labeled as "Vista Ready") worked on Macs, I picked one up and tried it out on my Intel MacBook laptop, running OSX 10.4.8 at the time I believe.

Despite what I had read on various postings, the camera didn't show up at all, although the built-in microphone worked OK, appearing as something like "Unknown USB Audio Device". The camera worked OK in Windows, so I figured the hardware was OK - and returned it to the store and started looking again.

After a while, 10.4.9 came out, and supposedly included updates that supported more webcams. I bought another QuickCam Pro 5000, and this time found that the camera and microphone worked in iChat, but the version of Skype I had at the time ( only saw the microphone.

I figured this was good enough and took it out to the person with the Mini. While there, I updated OSX to 10.4.10. When I plugged in the QuickCam, I found that the camera worked, but now the microphone didn't show up at all. When plugged into my 10.4.9 laptop, the camera and mic worked fine. Apparently Apple broke something in the 10.4.10 update. (there's a discussion of it here.)

After poking around, I found the AppleUSBAudio system extension, which seemed like a likely suspect. By replacing it with the same extension from 10.4.9, I was able to get the mic working - it went something like:

sudo -s
(type in password)

cd /System/Library/Extensions
kextunload AppleUSBAudio.kext

(backup the AppleUSBAudio.kext directory somewhere else)
(copy the 10.4.9 AppleUSBAudio.kext directory to this directory)

(permissions got changed moving between machines, fix that up)
chown -R root:wheel AppleUSBAudio.kext  

kextload AppleUSBAudio.kext

Plugged in the webcam, and now both camera and mic work. Tried a newer Skype -, and that works too.

So I think we'll be able to chat between the Mac Mini and the MacBook for a while - at least until Apple updates the OS again.

In a followup entry, I replaced this camera with a Logitech QuickCam Communicate STX.

FreeBSD, OpenBSD, and 2007 Daylight Savings Time

Just a followup to my earlier post on updating FreeBSD to handle the changes in Daylight Savings Time starting in 2007....

There are a couple mentions [1], [2] on the freebsd-stable mailing list that the /etc/localtime file is binary compatible across various versions of FreeBSD.

This means you only have to get one good copy of /etc/localtime for your timezone, either by pulling from a FreeBSD 6.2 or higher machine, or installing the misc/zoneinfo port on one older machine and run tzsetup. Once you have that good version, you can copy it around to your other FreeBSD boxes, regardless of what version they're running.

This seems to also work between FreeBSD and OpenBSD (and perhaps Net and Dragonfly). I had a single OpenBSD 3.7 machine I wanted to update, and it seems to work OK to use a /etc/localtime pulled from a FreeBSD box. On OpenBSD (3.7 at least), /etc/localtime was a symlink to a file in /usr/share/zoneinfo/. I just removed the symlink and made a new one pointing to a new timezone file I stuck somewhere on the disk. Checking for the correct DST dates in OpenBSD seems to be the same as with FreeBSD:

zdump -v /etc/localtime | grep 2007

Look for March 11th as the start date.

Building ports on old FreeBSDs

The instructions in this entry no longer work, please check out the revised entry on this issue.

I've got a couple older FreeBSD machines at work, running 4.7 and 4.8, that have started having trouble building ports. Apparently some changes have been made to the ports infrastructure that are not compatible with make on older FreeBSDs. The error message given when a running 'make' in a port directory is:

"/usr/ports/Mk/bsd.port.mk", line 2292: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2292: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2292: Malformed conditional (((${OSVERSION} < 504105 || 
      (${OSVERSION} >= 600000 && ${OSVERSION} < 600103) 
      || (${OSVERSION} >= 700000 && ${OSVERSION}     < 700012)) 
     && ${PKGORIGIN} != "ports-mgmt/pkg_install") || exists(${LOCALBASE}/sbin/pkg_info))
"/usr/ports/Mk/bsd.port.mk", line 2293: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2293: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2293: Malformed conditional ((${OSVERSION} < 504105 || 
     (${OSVERSION} >= 600000 && ${OSVERSION} < 600103) 
     || (${OSVERSION} >= 700000 && ${OSVERSION} < 700012)) 
     && ${PKGORIGIN} != "ports-mgmt/pkg_install")
"/usr/ports/Mk/bsd.port.mk", line 2308: if-less else
"/usr/ports/Mk/bsd.port.mk", line 2308: Need an operator
"/usr/ports/Mk/bsd.port.mk", line 2322: if-less endif
"/usr/ports/Mk/bsd.port.mk", line 2322: Need an operator
"/usr/ports/Mk/bsd.port.mk", line 5987: if-less endif
"/usr/ports/Mk/bsd.port.mk", line 5987: Need an operator
make: fatal errors encountered -- cannot continue

I ran into this when trying to update timezone info. The problem can be fixed by installing a newer version of make on the system.

First, backup /usr/ports/Mk/bsd.port.mk

 cd /usr/ports/Mk
 cp -p bsd.port.mk bsd.port.mk.orig

Then patch it with this file or edit with your favorite editor going to line 2292 (the first error line above), where there's a chunk of .if .if....endif .endif statements and tossing everything except this first cluster of variables being set:

PKG_CMD?=              ${LOCALBASE_REL}/sbin/pkg_create
PKG_ADD?=              ${LOCALBASE_REL}/sbin/pkg_add
PKG_DELETE?=   ${LOCALBASE_REL}/sbin/pkg_delete
PKG_INFO?=             ${LOCALBASE_REL}/sbin/pkg_info
PKG_VERSION?=          ${LOCALBASE_REL}/sbin/pkg_version

save it. Now build and install a newer version of make, "Berkeley make, back-ported to FreeBSD 4.x"

cd /usr/ports/devel/make
make install
make clean

use it instead of the old make

cd /usr/bin
mv make make.old
ln -s /usr/local/bin/make .

restore the bsd.ports.mk file you backed up

cd /usr/ports/Mk
mv bsd.port.mk.orig bsd.port.mk

You should be able to build ports again using the current ports tree infrastructure.