Fibre Channel

I'm going to be updating my home server soon, and I've often thought it would be nice to have some fast server-class hard disks for speed and reliability, maybe even arranged in a mirror because I've got a lot of stuff I wouldn't want to lose. A couple of weeks ago I started looking into Fibre Channel gear that's available on eBay and was surprised to see how cheap some of this stuff was, with several 10k rpm and 15k rpm drives advertised as new going for 1/4 or less of the retail prices. I bit the bullet and bought a bunch of HBAs, cables, and drives for under $300.

The first HBA I got was a Qlogic QLA2000 for $1.99, which seems to be a stripped down version of the QLA2100 that has a 32-bit PCI interface instead of the 2100's 64-bit PCI-X setup. The machine I'm putting this into only has regular 32-bit PCI slots, so I'm not really losing out on anything by using the cheaper card. Also got a QLA2200 which is a 64-bit card, but it works fine in a 32-bit PCI slot.

The drives are a pair of "new" 146.8GB Seagate Cheetah 10K.7 drives, part # ST3146707FC with an IBM label on them, for $230 total. To buy the SCSI versions of those drives from NewEgg at today's price would cost $430 each, so I've saved over $600 by going this route. However, I entered one of the drives' serial numbers into Seagate's warranty webpage, and found that it's not eligible for warranty through them - you must go through the OEM they sold the drives to (IBM in this case). Did some poking around on IBM's site and it's not obvious if/how you'd get warranty service through them. I guess that's the price you pay buying this type of drive.

A 5-pack of HSSDC-DB9 cables was just $20, and the final piece was a "Start" T-Card directly from CK Computer Systems for $34. (the FAQ on that site was really helpful)

Basically just plugged it all together, and fired the machine up. The Qlogic card shows a BIOS boot message saying to hit ALT-Q to get into their setup. There's one part in their BIOS utility where it scans your loop for devices - it showed the card, and then 15 blank spots. I thought I was screwed at first, but after hitting page down several times I found the drive at id 120. Wow, FC can handle a lot of devices compared to SCSI.

Booted FreeBSD 6.1 (already installed on a regular ATA disk), and saw it detect the Qlogic card with the isp(4) driver, and a da0 drive. Once I saw that working, I tried adding ispfw_load="YES" to /boot/loader.conf. On the next reboot, it paused after it detected the isp card, presumably loading the firmware that comes with FreeBSD. The relevant dmesg parts are:

isp0: <Qlogic ISP 2100 PCI FC-AL Adapter> port 0xce00-0xceff mem 
      0xfe7df000-0xfe7dffff irq 17 at device 2.0 on pci2
isp0: [GIANT-LOCKED]
----
da0 at isp0 bus 0 target 120 lun 0
da0: <IBM-SSG S0BE146 3709> Fixed Direct Access SCSI-3 device
da0: 100.000MB/s transfers, Tagged Queueing Enabled
da0: 137501MB (275154368 524 byte sectors: 255H 63S/T 17127C)
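
As a sanity check on that last line, the reported capacity is what you get from multiplying the sector count by the 524 byte sector size. A minimal sketch of the arithmetic, assuming a shell with 64-bit integer arithmetic (the variable names are just for illustration):

```sh
#!/bin/sh
# Values copied from the da0 line in the dmesg output.
sectors=275154368
sector_size=524

# Capacity in MB: total bytes divided by 1024*1024, using integer division.
size_mb=$(( $sectors * $sector_size / 1048576 ))
echo "${size_mb}MB"    # prints 137501MB
```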

The camcontrol utility seems to work with the HBA/drive combination no problem at all. However, when I tried to do a fdisk /dev/da0 it errored out with:

fdisk: can't read fdisk partition table
fdisk: /boot/mbr: length must be a multiple of sector size

Oops, didn't like the 524 byte sectors. I'll cover how I dealt with that in part 2.

USB GPS on FreeBSD

A while ago I picked up a Holux GR-213U USB GPS receiver for pretty cheap on eBay. It's worked well in Windows, even on Windows within-a-Mac using Parallels. I thought I should give it a try using gpsd on FreeBSD, since I see nobody's reported it as working or not on their hardware page.

Stuck it into one of my FreeBSD 6.1 boxes, and saw in /var/log/messages:

ugen0: Prolific Technology Inc. USB-Serial Controller, rev 1.10/3.00, addr 2

That sounded pretty good, never messed with USB serial on FreeBSD before, so wasn't sure if the /dev/ugen0 device was what gpsd needed to talk to. Turns out it wasn't. After digging for a while, tried

kldload uplcom

and then unplugged/replugged the USB receiver - and now it shows up as

ucom0: Prolific Technology Inc. USB-Serial Controller, rev 1.10/3.00, addr 2

and a /dev/cuaU0 device showed up. I guess that makes sense now that I see it working. The uplcom(4) module is required because the device is a Prolific chip, and that module also brings in the ucom(4) module automatically, which provides the tty interface (/dev/cuaU*) gpsd needs to operate. Other USB serial devices might require a different module than "uplcom" - the SEE ALSO section of the ucom man page shows other possibilities.
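
To avoid the manual kldload after every reboot, the module can presumably be loaded at boot with a line in /boot/loader.conf, the same way as any other kernel module. This is a sketch based on the usual name_load convention, not something that has been tried with this particular receiver:

```sh
# /boot/loader.conf
# Load the Prolific USB-serial driver at boot (it brings in ucom as well).
uplcom_load="YES"
```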

Tried running gpsd in debug mode with

gpsd -N -n -D 2 /dev/cuaU0

and was rewarded with lots of output from the receiver. Ran "cgps" and saw a human-friendly display of the GPS readings, but it kept flipping between 2D and 3D fix. Not sure what that's about yet, but at least the USB connection is working.

"touch: not found" in installworld

I've been updating one of my servers from FreeBSD 6.0 to 6.1, and had done a make buildworld a week or so ago, but didn't get around to actually installing it at the time. Since then, I cvsup'ed the source again, and the only real change was in /etc/rc.d/jail. I figured I didn't need to buildworld again since it's just a shell script, doesn't get compiled, and is installed by mergemaster (instead of make installworld).

When I did make installworld on my slightly outdated world, it errored out almost right away with: "touch: not found". The FreeBSD FAQ mentions that:

This error does not mean that the touch(1) utility is missing. The error is instead probably due to the dates of the files being set sometime in the future. If your CMOS-clock is set to local time you need to run the command adjkerntz -i to adjust the kernel clock when booting into single user mode.

I wasn't in single user mode, and the clock was correct - however my /usr/src/sys/conf/newvers.sh file was dated later than the world I had built, and that seemed to be causing the error. Using touch(1) to set the date back on that one file to match the world seems to have fixed the problem.
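
For anyone hitting the same thing, here's a rough sketch of how the offending files could be tracked down and reset. Both helper functions are hypothetical names made up for this example, and which file makes a good reference for "when the world was built" is an assumption you'd have to settle for your own tree:

```sh
#!/bin/sh
# Hypothetical helper: list files under a directory that are newer than a
# reference file (for example, something left over from the last buildworld).
#   $1 = directory to search, $2 = reference file
newer_than() {
    find "$1" -type f -newer "$2" -print
}

# Hypothetical helper: give a file the same modification time as the reference.
#   $1 = reference file, $2 = file to adjust
match_mtime() {
    touch -r "$1" "$2"
}
```

Something like newer_than /usr/src/sys/conf some-reference-file would list the candidates, and match_mtime would then set a file's date back to match.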

Portaudit and ezjail

Portaudit is a handy utility for FreeBSD that lets you know if any of your installed ports has a known security vulnerability. Part of the install puts a script in /usr/local/etc/periodic/security, which adds a report on ports that should be updated, to the daily security e-mail the system sends to root.

If you have jails setup on your machine, they may have their own ports installed which you'd probably also want checked by portaudit. The brute-force way to do it would be to install separate copies of portaudit inside each jail, and keep an eye on separate daily security e-mails from each jail looking for problems.

In my case, I've been running jails setup by ezjail, and didn't want to install portaudit over and over again. Instead, I came up with this minor shell script that checks each ezjail. If you save it as /usr/local/etc/periodic/security/410.portaudit_ezjail, then it'll run each day, right after the main portaudit periodic script that updates the vulnerability db and checks the main machine, and include the output in the main machine's security e-mail.

#!/bin/sh

#
# Run portaudit against packages installed in ezjails, as
# a periodic security job.
#
#
# 2006-05-05 Barry Pederson <bp@barryp.org>
#


JAIL_CONFIGDIR="/usr/local/etc/ezjail"
PACKAGE_DIR="/var/db/pkg"

# If there is a global system configuration file, suck it in.
#
if [ -r /etc/defaults/periodic.conf ]; then
    . /etc/defaults/periodic.conf
    source_periodic_confs
fi

case "${daily_status_security_portaudit_enable:-YES}" in
    [Nn][Oo])
        ;;
    *)
        for jailname in `ls $JAIL_CONFIGDIR`
        do
            . "${JAIL_CONFIGDIR}/${jailname}"
            eval rootdir=\"\$jail_${jailname}_rootdir\"

            echo
            echo "Jail: $jailname"
            echo "-------------------------"

            echo "ls ${rootdir}${PACKAGE_DIR} | xargs portaudit" |
                su -fm "${daily_status_security_portaudit_user:-nobody}"
        done
        ;;
esac

I have to admit I'm not too fluent with shell scripting, and would have been much more comfortable writing it in Python, but that's probably a bit of overkill in this case.
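
The eval line is probably the least obvious part of the script: it builds a variable name out of the jail name and then reads that variable's value. A stripped-down sketch of just that trick, where the jail_myjail_rootdir assignment is a stand-in for what sourcing the ezjail config file is assumed to define:

```sh
#!/bin/sh
# Stand-in for a variable set by sourcing /usr/local/etc/ezjail/myjail.
jail_myjail_rootdir="/data/jails/myjail"

jailname="myjail"

# After the shell's first pass this line reads:
#   rootdir="$jail_myjail_rootdir"
# and eval then executes it as a normal assignment.
eval rootdir=\"\$jail_${jailname}_rootdir\"

echo "$rootdir"    # prints /data/jails/myjail
```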


Doh! As soon as I finished writing this, I happened to check the ezjail website, and found a link to jailaudit, by Philipp Wuensche, which looks to do a similar thing but with more options, and has been submitted as a port.

Sharing a ports tree with ezjail

ezjail's ezjail-admin utility has a -P option to the update subcommand that causes it to fetch/update a ports tree into the basejail directory that all jails then share. However, if your machine already has a /usr/ports tree, that seems like a big waste of space. Why not have jails use that existing tree through mount_nullfs the same way the basejail is shared?

One of the files ezjail creates along with a new jail is /etc/fstab.jailname, that contains something like:

/data/jails/basejail /data/jails/jailname/basejail nullfs ro 0 0

(/data/jails was where I setup ezjail to store my jails)

Just add another line to that file like:

/usr/ports /data/jails/jailname/usr/ports nullfs ro 0 0

And make sure your jail has an empty /usr/ports directory (which is something you can put in a flavour if you're going to be doing this often). When your jail starts, you'll have a readonly view of the main machine's ports tree.
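
If you have several existing jails to convert, those two steps could be wrapped up in something like the following. This is only a sketch: share_ports is a made-up helper, not part of ezjail, and the paths passed to it are whatever your own setup uses.

```sh
#!/bin/sh
# Hypothetical helper, not part of ezjail: share the host's /usr/ports with
# one jail by adding a nullfs line to its fstab and creating the mount point.
#   $1 = jail name
#   $2 = directory holding the jails (e.g. /data/jails)
#   $3 = fstab file for the jail (e.g. /etc/fstab.jailname)
share_ports() {
    mountpoint="$2/$1/usr/ports"
    line="/usr/ports $mountpoint nullfs ro 0 0"

    # The jail needs an empty /usr/ports directory to mount onto.
    mkdir -p "$mountpoint"

    # Append the line only if an identical one is not already present,
    # so running this twice doesn't create duplicate mounts.
    grep -qxF "$line" "$3" 2>/dev/null || echo "$line" >> "$3"
}
```

For the example above that would be called as share_ports jailname /data/jails /etc/fstab.jailname.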

To keep both jailed and non-jailed systems from trying to put any port-building working-directories or downloaded distribution files in /usr/ports, the /etc/make.conf files (both the "real" one and the ones inside jails) should contain something like:

WRKDIRPREFIX=           /var/ports
DISTDIR=                /var/ports/distfiles
PACKAGES=               /var/ports/packages

ezjail's default flavour takes care of the jailed copies of this for you. If you make your own flavour, be sure it includes a similar /etc/make.conf

One last trick... If you're using portupgrade, run portsdb -u after updating your ports from your non-jailed environment. That way, if you're also running portupgrade inside the jail, it won't see its INDEX db as being out of date and complain that it can't fix it because the filesystem is readonly. On my machines I update using portsnap (a great tool BTW, also available to older BSDs as a port) with this trivial script:

#!/bin/sh

portsnap fetch
portsnap update

#
# Also update portupgrade database
#
portsdb -u

NAT and Jails

In experimenting with jails, I've had a need to put them on machines in which I didn't have extra public IP addresses to assign to the NIC. Turns out you can easily assign private addresses to an interface, and setup NAT (Network Address Translation) to allow the jails access to the rest of the world.

The loopback interface lo0 seems to work pretty well for this. On one machine I put ezjail on, I just picked the IP block 10.51.50.x out of my hat, and added an alias address on-the-fly with this command:

ifconfig lo0 alias 10.51.50.1 netmask 255.255.255.255

To make it happen at boot time, add this to /etc/rc.conf:

ifconfig_lo0_alias0="inet 10.51.50.1 netmask 255.255.255.255"

To setup FreeBSD's PF to NAT to the 10.51.50.x block, this went into /etc/pf.conf, after any scrub directives but before any block/pass type rules:

nat on $ext_if from 10.51.50.0/24 to any -> $ext_if

Reload the PF configuration with:

pfctl -f /etc/pf.conf
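
Since a typo in pf.conf on a remote machine can be painful, it may be worth having pfctl parse the file first and only load it if that succeeds. A small sketch of that idea, where reload_pf is a hypothetical wrapper name:

```sh
#!/bin/sh
# Hypothetical wrapper: only load the ruleset if it parses cleanly.
#   $1 = ruleset file (defaults to /etc/pf.conf)
reload_pf() {
    conf="${1:-/etc/pf.conf}"

    # -n parses the file without loading it.
    pfctl -n -f "$conf" || return 1

    # Parsing succeeded, so load it for real.
    pfctl -f "$conf"

    # Show the NAT rules that are now active.
    pfctl -s nat
}
```

After running reload_pf, the nat line for 10.51.50.0/24 should show up in the output of the last command.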

On another machine, I did mostly the same setup, except for using 127.x.x.x numbers. Not sure if there's any advantage one way or the other, both machines seemed to work pretty much the same.

ezjail really does make jails easy

Virtualization is something I've been interested in for some time, dabbling with VMWare on Windows, and eagerly awaiting Xen+BSD and AMD's Pacifica-enabled chips. FreeBSD's jail feature gives many of the same benefits but with relatively little overhead, as long as you're interested in working with the same version of FreeBSD in your "virtual" system as your "host" is running. Jails are a great way to isolate software - for security reasons, to run different versions of the same package, or just to allow yourself a sandbox to mess with that you can easily wipe out and recreate in a few seconds.

The man page for jail describes how to setup a jail by hand, which seems a bit involved. Luckily I stumbled across ezjail, which makes creating jails a breeze. Once it's setup, you can create and "boot" a fully functioning jail with just three commands. ezjail arranges things so most of the FreeBSD userland is shared between the jails, and the files unique to each jail take up as little as 2mb.

The initial setup is basically:

  1. install the port in sysutils/ezjail
  2. Add ezjail_enable="YES" to /etc/rc.conf
  3. edit /usr/local/etc/ezjail.conf to set where you want your jails created. (In my case I used /data/jails)
  4. make sure your /usr/src tree is complete
  5. run ezjail-admin update

That last command can take a lot of time (maybe hours), since it does a full make buildworld, make installworld. If you've already built your world, there's a -i parameter for skipping that step and just doing the make installworld.

Once that's all done, in your jail directory there is a basejail which contains about 130+mb of files that will be shared between jails, newjail which is a skeleton containing about 2mb of files that gets copied to any new jails you create, and flavours which is basically another set of skeleton directories that get copied over the newjail skeleton when your jail is created.

At this point, you can create and boot a jail with:

  1. ifconfig lo0 alias 127.66.0.1 netmask 255.255.255.255 or similar to give one of your network interfaces an IP the jail can use.
  2. ezjail-admin create myjail 127.66.0.1 creates a new directory (/data/jails/myjail in my case) that's a copy of newjail and sets a few other things up.
  3. /usr/local/etc/rc.d/ezjail.sh start myjail

At this point the jail is up and running. You can "log into" it by first finding out the integer id of the jail with jls, and then running jexec <jail-id> /bin/sh
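
Looking up the id by eye gets old quickly, so here's a sketch of a helper that does it. It assumes jls prints a header line followed by rows with the columns JID, IP Address, Hostname, and Path, and that the jail's hostname is the name given at creation; jail_id is a made-up function name.

```sh
#!/bin/sh
# Hypothetical helper: print the jail id for a given hostname.
# Assumes jls output is a header line followed by rows of
#   JID  IP-address  Hostname  Path
jail_id() {
    # Skip the header (NR > 1) and match on the third column.
    jls | awk -v host="$1" 'NR > 1 && $3 == host { print $1 }'
}

# Usage (commented out since it needs a running jail):
#   jexec "$(jail_id myjail)" /bin/sh
```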

There are a few things that are missing in this barebones install, mainly no /etc/resolv.conf so domain name lookups don't work, no /etc/localtime so time in the jail shows as UTC. You can fix these problems and add your own customizations easily by using a flavour (don't mess with the newjail template directory).

You can stop and wipe out your jail with

/usr/local/etc/rc.d/ezjail.sh stop myjail
ezjail-admin delete -w myjail

Then, to make a new flavour and make a jail using that flavour, something like

cd /data/jails/flavours
cp -pr default myflavour
cd myflavour/etc
cp -p /etc/resolv.conf .
cp -p /etc/localtime .
ezjail-admin create -f myflavour myjail 127.66.0.1
/usr/local/etc/rc.d/ezjail.sh start myjail

At this point, you've created a new jail with your customizations, and would use jls again to find the jail-id, and jexec to start a shell inside the running jail.

A flavour may also contain packages you wish to install upon jail creation, and commands to execute when the jail is created. Check out the ezjail.flavour file in your flavour directory. I've used it to install common useful things like bash, vim, and gmake, plus libiconv and gettext, which take a long time to build, so you don't want to repeat that for every jail.

mod_python segfault fixed

Just as a followup, it seems the segfault in mod_python on FreeBSD I mentioned before was found and fixed. Turns out to not be any kind of pointer/memory corruption like I thought, but rather a mishandled return code from an APR (Apache Portable Runtime) function. Oh well, I got to play with gdb, ddd, and valgrind a bit, which is good stuff to be familiar with.

Restoring Boot Sectors in FreeBSD

At work the other day, we had a long power outage, and afterwards one of our FreeBSD 5.2.1 boxes refused to come back up. It'd power up, go through the BIOS stuff, show the FreeBSD boot manager that lets you select which slice to boot, but when you hit F1, the screen would go black and the machine would reset.

Booted off the 5.2.1 install CD, and after entering fixit mode, was able to mount the disk and see that the files seemed to be intact. Couldn't run fsck though, the 5.2.1 CD seemed to be missing fsck_4.2bsd.

FreeSBIE 1.1 on the other hand, was able to fsck the disk, but that didn't solve the problem. Next guess was that something in the /boot directory was hosed. I'd setup the machine to do weekly dumps of the root partition to another machine, and was able to extract /boot from a few days before and pull it back onto this machine over the network using FreeSBIE, but it still wouldn't boot.

Next theory was that something in the boot sectors was bad. First tried restoring the MBR (Master Boot Record) from the copy that's kept in /boot - even though it was working well enough to show the F1 prompt to select the slice. Wanted to keep what 5.2.1 had been using, so mounted the non-booting disk readonly and made sure to have boot0cfg use the copy there instead of anything that might have been on the FreeSBIE disc.

mkdir /foo
mount -r /dev/twed0s1a /foo
boot0cfg -B -b /foo/boot/boot0 /dev/twed0
reboot

Unfortunately, that didn't help. Each slice (partition in non-BSD terminology) also has boot sectors, and to restore them, turns out you use the bsdlabel (a.k.a. disklabel) utility. Again from FreeSBIE:

mkdir /foo
mount -r /dev/twed0s1a /foo
bsdlabel -B -b /foo/boot/boot /dev/twed0s1
reboot

That did it. Apparently something in the slice's boot sectors was messed up.

Debugging mod_python with Valgrind

Other people have reported the same problem with mod_python on FreeBSD I had seen before, so I'm happy that I'm not losing my mind.

I took a stab at using Valgrind to find the problem. Didn't actually find anything, but I thought I'd jot down notes on how I went about this.

First, the Valgrind port didn't seem to work on FreeBSD 6.0. When I tried running it against the sample code in the Valgrind Quick Start guide, it didn't find anything wrong with it. Ended up finding a FreeBSD 5.4 machine, which did see the expected problem.

Next, I built the Apache 2.0.x port with: make WITH_THREADS=1 WITH_DEBUG=1, and then built mod_python which uses APXS and picks up the debug compile option from that.

Then, in the mod_python distribution, went into the test directory, and downloaded a Valgrind suppression file for Python, valgrind-python.supp, and in it uncommented the suppressions for PyObject_Free and PyObject_Realloc (otherwise the Valgrind output is full of stuff that is really OK). Then tweaked test/test.py around line 307 where it starts Apache, to insert

valgrind --tool=memcheck --logfile=/tmp/valgrind_httpd --suppressions=valgrind-python.supp

at the front of the cmd variable that's being composed to execute httpd.

Finally, ran python test.py, and then looked at /tmp/valgrind_httpd.pid#### to see the results.
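
Since Apache forks several processes, there ends up being one log file per pid. A quick sketch for pulling just the summary line out of each one, relying on Memcheck's usual "ERROR SUMMARY" line; valgrind_summary is a made-up helper name:

```sh
#!/bin/sh
# Hypothetical helper: print the "ERROR SUMMARY" line from each Valgrind log.
#   $1 = log prefix, e.g. /tmp/valgrind_httpd (Valgrind appends .pid<number>)
valgrind_summary() {
    for f in "$1".pid*; do
        # Skip the unexpanded pattern if no log files exist.
        [ -f "$f" ] || continue
        # Passing /dev/null as a second file makes grep prefix each match
        # with the name of the file it came from.
        grep 'ERROR SUMMARY' "$f" /dev/null
    done
}
```

Running valgrind_summary /tmp/valgrind_httpd then gives one line per httpd process, which makes it easy to spot which log is worth reading in full.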