RancherOS/ISOLinux/Syslinux on FreeBSD bhyve

After messing with Docker on my laptop, I thought it would be interesting to setup a VM on my FreeBSD server to run RancherOS. I've been using vm-bhyve to manage the VMs, and have been running Ubuntu without much problem, so I figured another Linux distro would be fine ... but ended up opening a whole can of worms and while I did get it running eventually, I learned more about grub and booting on bhyve than I really wanted. I thought I'd jot down some notes here for future reference.

To start with, bhyve is not a general hypervisor that can boot any PC-compatible disk or CD image you throw at it, the way something like KVM, VMWare, or Parallels can. It doesn't start a VM in 16-bit mode and go through a old-school BIOS boot sequence where it reads a Master Boot Record and executes whatever's there. It knows how to load a FreeBSD kernel, and with grub2-bhyve it can boot disks and CDs that use Grub2 - such as Ubuntu.

Unfortunately, RancherOS doesn't use grub, instead it uses Syslinux/ISOLinux on their ISO images and harddisk installations. When bhyve boots using the grub loader, it doesn't find any grub menu on the disk, and just drops you into a grub command prompt.

GNU GRUB  version 2.00

Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists possible
device or file completions.


Fortunately, the grub commandline is like a mini-os, with lots of abilities to look around the disks, and it turns out manually boot things like RancherOS.

The first command to run is:

set pager=1

so that future commands don't just scroll off the screen.


displays a list of commands, help <command> gives a short explanation. ls lets you start poking around, in this case giving:

(cd0) (cd0,msdos1) (host)

Now we're getting somewhere. Trying ls (cd0) gives

Device cd0: Filesystem type iso9660 - Label `RancherOS' - Last modification time 2018-09-19 03:09:12 Wednesday, UUID 2018-09-19-03-09-12-00 - Total size 176128 sectors

ls -l (cd0)/ gives

DIR          20180919030909 boot/
DIR          20180919030912 rancheros/

OK, a boot directory, getting closer. ls -l (cd0)/boot gives

170          20180919030909 global.cfg
66978212     20180919030909 initrd-v1.4.1
DIR          20180919030909 isolinux/
1373         20180919030909 linux-current.cfg
12734        20180919030909 rancher.png
5523216      20180919030909 vmlinuz-4.14.67-rancher2

There we go, isolinux, but no grub files, no wonder it doesn't boot. After lots and lots of messing around learning grub, I was able to get an initial boot of the CD image from the grub> prompt with:

linux (cd0)/boot/vmlinuz-4.14.67-rancher2
initrd (cd0)/boot/initrd-v1.4.1

And it started! After lots of Linux boot output I was rewarded with:

                ,        , ______                 _                 _____ _____TM
   ,------------|'------'| | ___ \               | |               /  _  /  ___|
  / .           '-'    |-  | |_/ /__ _ _ __   ___| |__   ___ _ __  | | | \ '--.
  \/|             |    |   |    // _' | '_ \ / __| '_ \ / _ \ '__' | | | |'--. \
    |   .________.'----'   | |\ \ (_| | | | | (__| | | |  __/ |    | \_/ /\__/ /
    |   |        |   |     \_| \_\__,_|_| |_|\___|_| |_|\___|_|     \___/\____/
    \___/        \___/     Linux 4.14.67-rancher2

    RancherOS #1 SMP Thu Sep 13 15:37:04 UTC 2018 rancher ttyS0
    docker-sys: eth0: lo:
rancher login:

Very cool, but what's the login? Userid is rancher, but there is no default password. According to the rancher docs, the ISO image is supposed to auto-login. Now what?

After rebooting and getting back to the grub> prompt, and digging around more, I found that cat (cd0)/boot/global.cfg showed:

APPEND rancher.autologin=tty1 rancher.autologin=ttyS0 rancher.autologin=ttyS1 rancher.autologin=ttyS1 console=tty1 console=ttyS0 console=ttyS1 printk.devkmsg=on panic=10

Ah, LInux command parameters including autologin stuff. To apply them it ended up being (again at the grub> prompt):

linux (cd0)/boot/vmlinuz-4.14.67-rancher2 rancher.autologin=tty1 rancher.autologin=ttyS0 rancher.autologin=ttyS1 rancher.autologin=ttyS1 console=tty1 console=ttyS0 console=ttyS1 printk.devkmsg=on panic=10
initrd (cd0)/boot/initrd-v1.4.1

(that commandline could probably be simplified since we can see from the banner that our VM console is ttyS0, so we probably don't need the params relating to tty1 or ttyS1) This time I got the cattle banner from above, and a beautiful:

Autologin default
[rancher@rancher ~]$

A simple sudo -s (not requiring a password) gives root access. At that point you can do whatever, including installing onto a harddisk.

To get a RancherOS harddisk installation to boot, you'd have to go through similar steps with grub in exploring around the (hd0,1) disk to find the kernel, initrd, and kernel params. The grub commands for booting can be saved permanently in the vm-bhyve config for this machine with grub_runX lines like:

grub_run0="linux (hd0,1)/boot/vmlinuz-4.14.67-rancher2 rancher.autologin=ttyS0 printk.devkmsg=on rancher.state.dev=LABEL=RANCHER_STATE rancher.state.wait panic=10 console=tty0"
grub_run1="initrd (hd0,1)/boot/initrd-v1.4.1"

So the full vm-bhyve config file looks like (in case you're wondering - I hate it when people give snippets of code but don't show where it should go exactly):

grub_run0="linux (hd0,1)/boot/vmlinuz-4.14.67-rancher2 rancher.autologin=ttyS0 printk.devkmsg=on rancher.state.dev=LABEL=RANCHER_STATE rancher.state.wait panic=10 console=tty0"
grub_run1="initrd (hd0,1)/boot/initrd-v1.4.1"

With that, my VM now boots without manual intervention, even though the virtual disk doesn't use grub.

Firefox Focus content-blocking of fonts almost drove me mad

I have a little personal webapp I use on my iPhone, that relied on Bootstrap3 glyphicons. However on my iPhone it started displaying weird emojis instead of the icons after upgrading to iOS 11. Other people's iPhones displayed everything fine, desktop browsers displayed everything fine, seemed to be just my phone, WTF?

Even looking at the BS3 components sample page I'd see emojis, WTF!?

Tried switching to open-iconic fonts, same problem (different emojis though), WTF!!?

Finally found this coment on BS3's Github saying it was due to content-blocking. Turns out I had Firefox Focus installed, and probably during the iOS upgrade I also upgraded Focus which must of coincidentally starting blocking the webfonts at that time.

Disabling content-blocking fixed the problem, yay! Just as a reference, the place to go (at least in iOS 11) is:

Settings App, scroll down to "Safari", scroll down to "Content Blockers", then in "Allow These Content Blockers:" disable "Firefox Focus"

Mozilla's Focus support page says:

Web fonts - fonts that are downloaded from the server (may slow down web pages). Web fonts are typefaces used to style the text on some web pages. Blocking Web fonts will alter the appearance of text on any pages where Web fonts are used, but all text will still display legibly.

Someone should tell them web fonts are used for more then just text, and blocking them can make your icons illegible.

Fortunately, you don't have to completely give up Focus content-blocking. In the Focus app, there's a little gear icon in the upper-right that lets you in a more fine-grained fashion enable/disable blocking of web fonts, but keep the other blocking of ad trackers, etc. After turning only web-font blocking off, and re-enabling content-blocking overall in the phone's Safari settings, I still have working icons in my little app.

Mercurial escaped colors

After upgrading Mercurial to 4.2 on my FreeBSD 10.x boxes, there was a problem in that the mercurial color extension was now enabled, and suddenly things like hg status were showing output like

ESC[0;34;1mM ESC[0mESC[0;34;1mpf.confESC[0m

after lots of digging, finally figured out it was caused by my PAGER environment variable being set to more, pretty outdated. Fixed it on-the-fly with export PAGER='less -X' and got nice colorized output. Made it permanent by editing ~/.profile and replacing a line I had setting the PAGER with a new one:

PAGER='less -X';    export PAGER

Winbind failure do to incorrect time

I had the weirdest thing suddenly start happening last night that took several hours to finally figure out was a time-related issue.

I've got an Ubuntu box that uses pam_winbind to allow for logging into a machine using an Active Directory account. Normally I connect with an SSH key, but once in when doing sudo -s I enter an AD password to become root. Last night that sudo -s suddenly stopped working.

Luckily I had another non-AD account that I could connect with, and sudo worked for that, so I could become root and poke around. The logs showed:

sudo: pam_unix(sudo:auth): authentication failure; logname=barry.pederson uid=14283 euid=0 tty=/dev/pts/0 ruser=barry.pederson rhost=  user=barry.pederson
sudo: pam_unix(sudo:auth): conversation failed
sudo: pam_unix(sudo:auth): auth could not identify password for [barry.pederson]

That was weird, I could log into other things though that used the same AD account, so I knew the password was right and the account wasn't locked out.

I hoped by the next morning, some cache thing would expire and I'd be back in business, but no dice.

Poking around some more I found if I disabled my SSH keys, I couldn't log in at all, so it was really a pam_winbind issue, not a sudo one. The logs for a SSH password login attempt were a bit more informative:

pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=xxx.yyy.zzz  user=barry.pederson
pam_winbind(sshd:auth): getting password (0x00000388)
pam_winbind(sshd:auth): pam_get_item returned a password
pam_winbind(sshd:auth): request wbcLogonUser failed: WBC_ERR_AUTH_ERROR, PAM error: PAM_AUTH_ERR (7), NTSTATUS: NT_STATUS_LOGON_FAILURE, Error message was: Logon failure
pam_winbind(sshd:auth): user 'barry.pederson' denied access (incorrect password or invalid membership)
Failed password for barry.pederson from x.x.x.x port 50655 ssh2

WTF? I know the password's right, I've been typing it all morning into other systems. I even tried wbinfo --authenticate barry.pederson on this box and it accepted my passwords.

Much time was spent Googling, trying various tweaks to smb.conf, etc. Finally, I don't remember why, I thought to check the date with ntpdate -d my.ad.server and it came back with offset -338.308573 sec. Holy crap, that's more than 5 minutes! Even though ntpd is running.

Anyhow, once the clock was fixed to be closer to the AD server, logins and sudo started working again.


Put in a actual, recognized SSL Certificate on the site, and setup redirects to run everything through that now.

Figured that was a reasonable thing to do because people are still occasionally downloading old software from this site, and the cert was free for the year (Gandi).

Hopefully by the time it expires the Let's Encrypt service will be up and running.

IPv6 World Launch Day

IPv6 Logo I've been working on getting this website up and running under IPv6, and it turned out to be somewhat involved. Firstly, I signed up with Hurricane Electric's tunnelbroker.net, to get IPv6 connectivity, because my ISP doesn't offer it yet. Setup my own DNS servers running nsd, which was a bit of a learning curve, but in the long run I think it'll be better than working with goofy DNS managers like you'd find on registrar or hosting websites. NameCheap is now letting you setup IPv6 glue records right on their website (previously you had to file a support ticket), so that made things easier.

The only big glitch I ran into is that on FreeBSD, using simply

listen [::]:80;

to listen to both IPv4 and IPv6 didn't work. When trying that, I found that any request coming in as IPv4 would give weird 403 or 404 (I don't remember which) errors, where it seemed nginx just didn't know what virtual host to go to. Linux doesn't seem to have that problem. Ended up using separate listen statements, as in:

listen 80 default_server;
listen [::]:80 default_server ipv6only=on;

for the main site, but VERY IMPORTANTLY, the remaining sites could not have the ipv6only=on directive, they just simply say

listen  80;
listen [::]:80;

(found that trick in this ServerFault page). This also has the advantage of showing proper IPv4 IP addresses in the logs, instead of IPv4-mapped IPv6 addresses such as ::ffff:, so I ended up doing the same thing on a Linux box even though it handled dual-stack by default just fine.

I also for testing purposes, made aliases

To force one protocol or the other. When you use http://barryp.org/blog/, it's not obvious which you're using.

q2java and qwpython on GitHub

Quite some time ago I worked on a couple interesting projects, q2java which embedded Java into a Quake2 server and then allowed for games to be written in Java; and qwpython, which wrapped up the QuakeWorld dedicated server engine as a Python module and came with a QuakeC -> Python translator, so existing QW games like CTF could be converted to Python and hacked on from there.

Some time back I had created Subversion repositories of the various releases of those packages (didn't really know or use VCSes back then), and had them on an Apache mod_svn server. Well, SVN is kind of a PITA, and I'd like to not have to keep that server config going, so I looked at converting those repos to something else.

First tried hgsvn, which has worked pretty decently for smaller or simpler repos, but something in the q2java one make it crap out.

Next tried the hg convert extension, which worked better, but only when using a local SVN URL (like file://...), and then didn't manage to pick up the tags correctly.

Decided to give Git a try, first with Converting a Subversion repository to Git (in seven steps). It worked, but was kind of a nasty process for a Git newbie to wade through. The tags still didn't seem quite right.

Finally, realized GitHub has its own Subversion import built right into the website, and gave that a try. Very very nice, made importing the svn repo a breeze, and seems to have gotten the tags just right. Only took two steps: enter the svn URL, fill in a translation table of svn userids to git authors. I'd recommend this highly for anyone looking to move off svn. See the q2java and qwpython results here.

PyCon 2012

Headed off for PyCon 2012 tomorrow. Last one I was at was 2007 in Dallas, can't believe it's been 5 years. Looking forward to seeing some cool stuff, and maybe playing some games in the evening.

Make sure virtualization is enabled in the BIOS

I just wasted a fair amount of time on a RedHat 6.1 box being setup to be a hypervisor with KVM, trying to figure how why when I ran virsh version it was telling me among other things

internal error Cannot find suitable emulator for x86_64

All the appropriate packages such as qemu-kvm were installed, but it just didn't seem to want to work. Finally as I was about to try reinstalling RHEL, I remoted into the actual console and saw:

kvm: disabled by bios

Doh!, and looking back in /var/log/messages the same thing was buried deep within all the boot noise. While trying to figure this out I managed to just be looking for virt or qemu in the logs and somehow didn't search for kvm. Enabled virtualization in the BIOS and everything's gravy now.

So there you go, if you're Googling that first error message and get lots of other nonsense, look for the message about the BIOS.

amqplib 1.0.0

I attended OSCON for the first time this year, and to celebrate I thought I'd wrap up the Python amqplib library a bit and consider it more-or-less finished for what it is (a simple blocking 0-8 client), and call it 1.0.0 You can find it on the in PyPi and Google Project Hosting

It's definitely a worthwhile upgrade in that it's significantly faster than amqplib 0.6.1, and has a fair number of bug fixes. Also noteworthy are support for Python 3.x (via 2to3) and IPv6