Markdown and Pygments

This blog is mainly being written as Markdown text stored in a database, and I thought it would be nice to add the ability to use Pygments to add syntax highlighting to various bits of code within the entries.

There are some DjangoSnippets entries on how to do this, notably #360 which first runs text through Markdown to generate HTML and then BeautifulSoup to extract parts marked up in the original pre-Markdown text as <pre class="foo">...</pre> to be run through Pygments and then re-inserted back into the overall Markdown-generated HTML.

The problem with this is that the text within <pre>...</pre> needs to be valid HTML, with things like e_mail='<foo@bar.edu>' escaped as e_mail='&lt;foo@bar.edu>'. Otherwise BeautifulSoup thinks, in that example, that you have a screwed-up <foo> tag and tries to fix it up.

Making sure all the <, &, and other characters special to HTML are escaped within a large chunk of code misses out on the convenience of using Markdown. I decided to go with an arrangement in which regular Markdown code blocks are used, but if the first line begins with pygments:<lexer>, then that block is pygmentized.

So if I enter something like:

Here is some code

    pygments:python
    if a < b:
        print a

It ends up as:


Here is some code

if a < b:
    print a

What I came up with is this derivative of Snippet #360:

from htmlentitydefs import name2codepoint
from HTMLParser import HTMLParser
from markdown import markdown
from BeautifulSoup import BeautifulSoup
from pygments.lexers import LEXERS, get_lexer_by_name
from pygments import highlight
from pygments.formatters import HtmlFormatter

# a tuple of known lexer names
_lexer_names = reduce(lambda a,b: a + b[2], LEXERS.itervalues(), ())

# default formatter
_formatter = HtmlFormatter(cssclass='source')    

class _MyParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.text = []
    def handle_data(self, data):
        self.text.append(data)
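    def handle_entityref(self, name):
        # named entities such as &lt; or &amp;
        self.text.append(unichr(name2codepoint[name]))
    def handle_charref(self, name):
        # numeric references such as &#39; or &#x27;, which would
        # otherwise be silently dropped
        if name.lower().startswith('x'):
            self.text.append(unichr(int(name[1:], 16)))
        else:
            self.text.append(unichr(int(name)))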

def _replace_html_entities(s):
    """
    Replace HTML entities in a string
    with their unicode equivalents.  For
    example, '&amp;' is replaced with just '&'

    """
    mp = _MyParser()
    mp.feed(s)
    mp.close()
    return u''.join(mp.text)  

def markdown_pygment(txt):
    """
    Convert Markdown text to Pygmentized HTML

    """
    html = markdown(txt)
    soup = BeautifulSoup(html)
    dirty = False
    for tag in soup.findAll('pre'):
        if tag.code:
            txt = tag.code.renderContents()
            if txt.startswith('pygments:'):
                lexer_name, txt = txt.split('\n', 1)
                lexer_name = lexer_name.split(':')[1]
                txt = _replace_html_entities(txt)
                if lexer_name in _lexer_names:
                    lexer = get_lexer_by_name(lexer_name, stripnl=True, encoding='UTF-8')
                    tag.replaceWith(highlight(txt, lexer, _formatter))
                    dirty = True
    if dirty:
        html = unicode(soup)

    return html
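
The marker handling at the heart of that function can be tried on its own. Below is a minimal Python 3 sketch, not the code this blog runs: it uses only the standard library, the helper name split_pygments_marker is made up for illustration, and html.unescape stands in for the HTMLParser subclass above.

```python
import html


def split_pygments_marker(code_text):
    """Return (lexer_name, code) for a marked block, else (None, code_text).

    Hypothetical helper for illustration only.
    """
    first_line, newline, rest = code_text.partition("\n")
    if not newline or not first_line.startswith("pygments:"):
        return None, code_text
    lexer_name = first_line.split(":", 1)[1].strip()
    # Markdown escapes characters such as < and & inside code blocks,
    # so undo that before handing the text to a highlighter.
    return lexer_name, html.unescape(rest)


block = "pygments:python\nif a &lt; b:\n    print a\n"
lexer_name, code = split_pygments_marker(block)
print(lexer_name)
print(code)
```

A block without the marker comes back unchanged with None as the lexer name, which matches the way markdown_pygment leaves ordinary code blocks alone.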

Stackless Python and Sockets

I've been intrigued by Stackless Python for a while, and finally got around to installing it on one of my machines. FreeBSD doesn't have a port available, so after creating an ezjail to isolate the installation, it was just a matter of fetching and extracting stackless-251-export.tar.bz2 and doing a standard ./configure && make && make install

The installation looks pretty much like a normal Python installation on FreeBSD, with a /usr/local/bin/python binary and libraries in /usr/local/lib/python2.5

Networking is something I especially wanted to check out with Stackless, and the examples on the Stackless website mostly make use of a stacklesssocket.py module which is a separate download. That module has unittests built in as the module's main function, but when running it on my FreeBSD 7.0-CURRENT box, it died with an exception ending in:

File "stacklesssocket.py.ok", line 286, in handle_connect
  self.connectChannel.send(None)
AttributeError: 'NoneType' object has no attribute 'send'

After doing some digging, I found that stacklesssocket.py has a dispatcher class which is a subclass of a class by the same name in Python's asyncore.py module. stacklesssocket.dispatcher.connect() calls asyncore.dispatcher.connect(), which may directly call the object's handle_connect() method before returning back to stacklesssocket.dispatcher.connect(). However, stacklesssocket.dispatcher.connect() doesn't set up that channel until after the call to asyncore.dispatcher.connect() returns. So when handle_connect() tries to send a message over a channel that doesn't exist yet, an exception is raised.

This trivial patch seems to fix the problem - only sending a message over the channel if it exists (which should only happen if there's another tasklet waiting on it back in a stacklesssocket.dispatcher.connect() method).

--- stacklesssocket.py.orig       2007-09-18 20:58:02.000835000 -0500
+++ stacklesssocket.py  2007-09-18 22:03:13.370709131 -0500
@@ -282,7 +282,7 @@

    # Inform the blocked connect call that the connection has been made.
    def handle_connect(self):
-        if self.socket.type != SOCK_DGRAM:
+        if (self.socket.type != SOCK_DGRAM) and self.connectChannel:
            self.connectChannel.send(None)

    # Asyncore says its done but self.readBuffer may be non-empty

With that patch, the unittests run successfully - at least on my box.
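
The ordering problem can be reproduced without Stackless or asyncore. This is a simplified Python 3 sketch under stated assumptions: BaseDispatcher and FakeChannel are stand-ins I made up, and the real stacklesssocket.py is structured differently. It only shows a base-class connect() calling back into handle_connect() before the subclass has created its channel.

```python
class BaseDispatcher:
    """Stand-in for asyncore.dispatcher (assumption, not the real class)."""

    def connect(self, address):
        # Simulate a connection that completes immediately, so the
        # callback runs before connect() returns to the caller.
        self.handle_connect()

    def handle_connect(self):
        pass


class FakeChannel:
    """Hypothetical stand-in for a Stackless channel; records sends."""

    def __init__(self):
        self.sent = []

    def send(self, value):
        self.sent.append(value)


class UnpatchedDispatcher(BaseDispatcher):
    connectChannel = None

    def connect(self, address):
        BaseDispatcher.connect(self, address)  # handle_connect() fires here
        self.connectChannel = FakeChannel()    # created too late

    def handle_connect(self):
        self.connectChannel.send(None)


class PatchedDispatcher(UnpatchedDispatcher):
    def handle_connect(self):
        # Same idea as the patch: only send if the channel exists.
        if self.connectChannel is not None:
            self.connectChannel.send(None)


def try_connect(dispatcher):
    try:
        dispatcher.connect(("localhost", 80))
        return "ok"
    except AttributeError as exc:
        return str(exc)


unpatched_result = try_connect(UnpatchedDispatcher())
patched_result = try_connect(PatchedDispatcher())
print(unpatched_result)
print(patched_result)
```

The unpatched class fails with the same kind of AttributeError as the traceback above, while the guarded version gets through connect() cleanly.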

Building ports on old FreeBSDs - revised

This is a revision of an earlier post which has instructions that no longer work.

Are you running an older version of FreeBSD, and getting errors like this when you try to build a port?

"/usr/ports/Mk/bsd.port.mk", line 2416: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2416: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2416: Malformed conditional (((${OSVERSION} < 504105 || (${OSVERSION} 
    >= 600000 && ${OSVERSION} < 600103) || (${OSVERSION} >= 700000 && ${OSVERSION} < 700012)) && 
    ${PKGORIGIN} != "ports-mgmt/pkg_install") || exists(${LOCALBASE}/sbin/pkg_info))

If so, it's because the ports maintainers have started using expressions in the ports Makefiles which are not understood by the versions of make that come with old FreeBSDs.
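
For anyone squinting at that error message, here is the same conditional restated as a small Python sketch. It only mirrors the expression quoted in the error. The function name is made up, the sample version numbers in the usage lines are illustrative, and it says nothing about what bsd.port.mk does in each branch.

```python
import os


def conditional_holds(osversion, pkgorigin, localbase="/usr/local"):
    """Evaluate the expression from the bsd.port.mk error message.

    Hypothetical helper; osversion is the numeric ${OSVERSION} value.
    """
    in_listed_range = (
        osversion < 504105
        or 600000 <= osversion < 600103
        or 700000 <= osversion < 700012
    )
    has_pkg_info = os.path.exists(os.path.join(localbase, "sbin", "pkg_info"))
    return (
        in_listed_range and pkgorigin != "ports-mgmt/pkg_install"
    ) or has_pkg_info


# Illustrative values only; the localbase is deliberately one that does not exist.
print(conditional_holds(480000, "devel/make", "/nonexistent-localbase"))
print(conditional_holds(620000, "devel/make", "/nonexistent-localbase"))
```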

The official recommended fix would be to upgrade your FreeBSD, but if that's not practical you can at least install a newer version of make to get by for a bit longer. This can be done in just a few minutes with two main steps: temporarily bring back an older /usr/ports/Mk which is compatible with FreeBSD 4.x - and then build and install the devel/make port which used to be present in the ports tree.

The ports tree is in CVS, so it's possible to checkout older revisions of selected directories. This FreeBSD Handbook page lists the anonymous CVS repositories available. For this example I'm going to use anoncvs@anoncvs1.FreeBSD.org:/home/ncvs We only need two small directories, so it probably doesn't really matter which one you use.

It seems like the first commit to the ports infrastructure which broke compatibility happened around Feb 5th, 2007 - so let's back up the current /usr/ports/Mk and check out one from Feb 4th:

cd /usr/ports
mv Mk Mk.original
cvs -d anoncvs@anoncvs1.FreeBSD.org:/home/ncvs co -D "04 Feb 2007" -d Mk ports/Mk

You should now be able to build some ports, at least those that don't use incompatible syntax in their individual Makefiles and don't require a newer ports infrastructure. Now, if you don't already have a devel/make port, use CVS to bring that back too, then build and install it:

cd /usr/ports/devel
cvs -d anoncvs@anoncvs1.FreeBSD.org:/home/ncvs co -D "04 Feb 2007" -d make ports/devel/make
cd make
make install
make clean

Lastly, set your system to use the new ports make in place of the old system make, and do some cleanup:

cd /usr/bin
mv make make.old
ln -s /usr/local/bin/make .

cd /usr/ports
rm -rf Mk
mv Mk.original Mk
rm -rf /usr/ports/devel/make

You should now be in better shape for trying to build new ports. 4.x isn't officially supported anymore by the ports maintainers, so there may be some individual port breakage - but at least you're over the first hurdle.

Logitech QuickCam Pro 5000 on a Mac Mini

I was recently working on finding a reasonable webcam that would work with an Intel Mac Mini, since Apple no longer sells the iSight and the prices for them on eBay are outrageous.

After reading reports that newer versions of the Logitech QuickCam Pro 5000 (labeled as "Vista Ready") worked on Macs, I picked one up and tried it out on my Intel MacBook laptop, running OSX 10.4.8 at the time I believe.

Despite what I had read on various postings, the camera didn't show up at all, although the built-in microphone worked OK, appearing as something like "Unknown USB Audio Device". The camera worked OK in Windows, so I figured the hardware was OK - and returned it to the store and started looking again.

After a while, 10.4.9 came out, and supposedly included updates that supported more webcams. I bought another QuickCam Pro 5000, and this time found that the camera and microphone worked in iChat, but the version of Skype I had at the time (2.6.0.137) only saw the microphone.

I figured this was good enough and took it out to the person with the Mini. While there, I updated OSX to 10.4.10. When I plugged in the QuickCam, I found that the camera worked, but now the microphone didn't show up at all. When plugged into my 10.4.9 laptop, the camera and mic worked fine. Apparently Apple broke something in the 10.4.10 update. (there's a discussion of it here.)

After poking around, I found the AppleUSBAudio system extension, which seemed like a likely suspect. By replacing it with the same extension from 10.4.9, I was able to get the mic working - it went something like:

sudo -s
(type in password)

cd /System/Library/Extensions
kextunload AppleUSBAudio.kext

(backup the AppleUSBAudio.kext directory somewhere else)
(copy the 10.4.9 AppleUSBAudio.kext directory to this directory)

(permissions got changed moving between machines, fix that up)
chown -R root:wheel AppleUSBAudio.kext  

kextload AppleUSBAudio.kext

Plugged in the webcam, and now both camera and mic work. Tried a newer Skype - 2.6.0.148, and that works too.

So I think we'll be able to chat between the Mac Mini and the MacBook for a while - at least until Apple updates the OS again.

In a followup entry, I replaced this camera with a Logitech QuickCam Communicate STX.

FreeBSD, OpenBSD, and 2007 Daylight Savings Time

Just a followup to my earlier post on updating FreeBSD to handle the changes in Daylight Savings Time starting in 2007....

There are a couple mentions [1], [2] on the freebsd-stable mailing list that the /etc/localtime file is binary compatible across various versions of FreeBSD.

This means you only have to get one good copy of /etc/localtime for your timezone, either by pulling it from a FreeBSD 6.2 or higher machine, or by installing the misc/zoneinfo port on one older machine and running tzsetup. Once you have that good version, you can copy it around to your other FreeBSD boxes, regardless of what version they're running.

This seems to also work between FreeBSD and OpenBSD (and perhaps Net and Dragonfly). I had a single OpenBSD 3.7 machine I wanted to update, and it seems to work OK to use a /etc/localtime pulled from a FreeBSD box. On OpenBSD (3.7 at least), /etc/localtime was a symlink to a file in /usr/share/zoneinfo/. I just removed the symlink and made a new one pointing to a new timezone file I stuck somewhere on the disk. Checking for the correct DST dates in OpenBSD seems to be the same as with FreeBSD:

zdump -v /etc/localtime | grep 2007

Look for March 11th as the start date.

Building ports on old FreeBSDs

The instructions in this entry no longer work, please check out the revised entry on this issue.

I've got a couple older FreeBSD machines at work, running 4.7 and 4.8, that have started having trouble building ports. Apparently some changes have been made to the ports infrastructure that are not compatible with make on older FreeBSDs. The error message given when running 'make' in a port directory is:

"/usr/ports/Mk/bsd.port.mk", line 2292: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2292: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2292: Malformed conditional (((${OSVERSION} < 504105 || 
      (${OSVERSION} >= 600000 && ${OSVERSION} < 600103) 
      || (${OSVERSION} >= 700000 && ${OSVERSION}     < 700012)) 
     && ${PKGORIGIN} != "ports-mgmt/pkg_install") || exists(${LOCALBASE}/sbin/pkg_info))
"/usr/ports/Mk/bsd.port.mk", line 2293: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2293: warning: String comparison operator should be either == or !=
"/usr/ports/Mk/bsd.port.mk", line 2293: Malformed conditional ((${OSVERSION} < 504105 || 
     (${OSVERSION} >= 600000 && ${OSVERSION} < 600103) 
     || (${OSVERSION} >= 700000 && ${OSVERSION} < 700012)) 
     && ${PKGORIGIN} != "ports-mgmt/pkg_install")
"/usr/ports/Mk/bsd.port.mk", line 2308: if-less else
"/usr/ports/Mk/bsd.port.mk", line 2308: Need an operator
"/usr/ports/Mk/bsd.port.mk", line 2322: if-less endif
"/usr/ports/Mk/bsd.port.mk", line 2322: Need an operator
"/usr/ports/Mk/bsd.port.mk", line 5987: if-less endif
"/usr/ports/Mk/bsd.port.mk", line 5987: Need an operator
make: fatal errors encountered -- cannot continue

I ran into this when trying to update timezone info. The problem can be fixed by installing a newer version of make on the system.

First, back up /usr/ports/Mk/bsd.port.mk:

 cd /usr/ports/Mk
 cp -p bsd.port.mk bsd.port.mk.orig

Then patch it with this file, or edit it with your favorite editor: go to line 2292 (the first error line above), where there's a chunk of .if .if....endif .endif statements, and toss everything except this first cluster of variables being set:

PKG_CMD?=              ${LOCALBASE_REL}/sbin/pkg_create
PKG_ADD?=              ${LOCALBASE_REL}/sbin/pkg_add
PKG_DELETE?=   ${LOCALBASE_REL}/sbin/pkg_delete
PKG_INFO?=             ${LOCALBASE_REL}/sbin/pkg_info
PKG_VERSION?=          ${LOCALBASE_REL}/sbin/pkg_version

Save it. Now build and install a newer version of make, "Berkeley make, back-ported to FreeBSD 4.x":

cd /usr/ports/devel/make
make install
make clean


Use it instead of the old make:

cd /usr/bin
mv make make.old
ln -s /usr/local/bin/make .

Restore the bsd.port.mk file you backed up:

cd /usr/ports/Mk
mv bsd.port.mk.orig bsd.port.mk

You should be able to build ports again using the current ports tree infrastructure.

2007 Daylight Savings Time changes and FreeBSD

Starting in 2007, the dates that daylight savings time begins and ends are changing in the US and other countries. For FreeBSD it seems versions 6.2 and higher should already know about the new DST dates. A machine can be checked with

zdump -v /etc/localtime | grep 2007

A machine with the old DST settings will show lines that begin with:

/etc/localtime  Sun Apr  1 07:59:59 2007 
/etc/localtime  Sun Apr  1 08:00:00 2007 
/etc/localtime  Sun Oct 28 06:59:59 2007 
/etc/localtime  Sun Oct 28 07:00:00 2007

which is wrong (April 1st and Oct 28th). A machine that's correct should show:

/etc/localtime  Sun Mar 11 07:59:59 2007 
/etc/localtime  Sun Mar 11 08:00:00 2007 
/etc/localtime  Sun Nov  4 06:59:59 2007 
/etc/localtime  Sun Nov  4 07:00:00 2007

March 11th and Nov 4th being the new days that DST switches.
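
Those dates fall out of the US rules themselves: the old schedule ran from the first Sunday in April to the last Sunday in October, and the new one runs from the second Sunday in March to the first Sunday in November. Here's a small standard-library Python sketch to double-check the 2007 dates against the zdump output. The helper names are my own.

```python
import datetime


def nth_sunday(year, month, n):
    """Return the date of the n-th Sunday of the given month."""
    first = datetime.date(year, month, 1)
    # weekday(): Monday is 0, Sunday is 6
    days_to_sunday = (6 - first.weekday()) % 7
    return first + datetime.timedelta(days=days_to_sunday + 7 * (n - 1))


def last_sunday(year, month):
    """Return the date of the last Sunday of the given month."""
    if month == 12:
        next_first = datetime.date(year + 1, 1, 1)
    else:
        next_first = datetime.date(year, month + 1, 1)
    last_day = next_first - datetime.timedelta(days=1)
    # Step back to the most recent Sunday (zero days if already Sunday).
    return last_day - datetime.timedelta(days=(last_day.weekday() + 1) % 7)


old_start, old_end = nth_sunday(2007, 4, 1), last_sunday(2007, 10)
new_start, new_end = nth_sunday(2007, 3, 2), nth_sunday(2007, 11, 1)
print("old rules:", old_start, old_end)
print("new rules:", new_start, new_end)
```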

Updating a FreeBSD box seems to be just a matter of installing the misc/zoneinfo port, and then running the tzsetup command, which gives you menus to pick your timezone again.

This blog has links for info on updating other OSes.

Going to PyCon 2007

I've got my plane ticket, hotel reservation and conference registration for PyCon 2007 all lined up, so I'll be headed for Texas in 6 weeks.

mod_scgi redirection

While working on a new Django project, I noticed something odd about running it under mod_scgi: if you were POSTing to a URL, /foo for example, and the view for that URL did a relative redirect, as in django.http.HttpResponseRedirect('/bar'), the 302 redirect wasn't making it back to the browser. Instead, the browser was acting like the result of POST /foo was a 200 OK followed by the data you'd receive from GET /bar, without the browser knowing that it was coming from a new location. The big drawback to this is that if you do a reload, the browser tries to POST to /foo again, instead of just GET /bar. The Django docs recommend always responding to POSTs with redirects, just for this reason.

Strictly speaking, redirects should be absolute URLs (see section 14.30 in the HTTP specs), and if you use one of those, it acts as expected. Django is full of relative redirects, but the framework at this time doesn't seem to try and convert them to absolute. There is ticket #987 in the Django Trac that talks about this a bit.
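
One way to sidestep the issue from the application side is to build the absolute URL yourself before redirecting. This is only a sketch of the URL arithmetic in Python 3, not Django code: the helper name and the example.com host are made up, and in a real view you'd take the base URL from the incoming request.

```python
from urllib.parse import urljoin


def absolute_location(request_url, location):
    """Resolve a possibly-relative redirect target against the request URL.

    Hypothetical helper for illustration.
    """
    return urljoin(request_url, location)


redirect_target = absolute_location("http://example.com/foo", "/bar")
print(redirect_target)
```

A target that is already absolute passes through urljoin unchanged, so the helper is safe to apply to every redirect.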

Browsers seem to handle relative redirects OK though, and that behavior doesn't occur with the Django test http server. Having mod_scgi conceal what Django is doing is not so good.

Digging into the mod_scgi sourcecode apache2/mod_scgi.c reveals a section of code that's causing this change:

location = apr_table_get(r->headers_out, "Location");

if (location && location[0] == '/' &&
    ((r->status == HTTP_OK) || ap_is_HTTP_REDIRECT(r->status))) {

    apr_brigade_destroy(bb);

    /* Internal redirect -- fake-up a pseudo-request */
    r->status = HTTP_OK;

    /* This redirect needs to be a GET no matter what the original
    * method was.
    */
    r->method = apr_pstrdup(r->pool, "GET");
    r->method_number = M_GET;

    ap_internal_redirect_handler(location, r);
    return OK;
}

Tossing that section of code causes mod_scgi to leave the relative redirects alone.

Full Text Searching with SQLite

I'd like to add a search feature back on this site. Previously, I had an arrangement set up with PyBlosxom and ht://Dig, but now that it's Django-powered, I'd like to do something that works directly from the database instead of crawling the site like ht://Dig did.

After looking a while at various text search engines, I remembered seeing that SQLite just added an FTS1 module in version 3.3.8, which sounds pretty easy to use. Unfortunately the FreeBSD port databases/sqlite3 doesn't build with that feature.

After poking around a bit, I got it to build with FTS manually, and after a whole bunch more messing around, came up with a patch to add an option to the port to build sqlite3 with FTS. The patch has been submitted to the FreeBSD bug tracker as ports/106281. Hopefully I have all my ducks lined up on that.


Anyhow, in testing the FTS a bit, I found one thing they only hint at on the SQLite website. There's a Porter stemmer built in, even though the wiki says: "The module does not perform stemming of any sort." You activate it by adding tokenize porter to the table declaration, for example (adapted from their example):

CREATE VIRTUAL TABLE recipe USING FTS1(name, ingredients, tokenize porter);

once you've done that, and inserted some sample data:

INSERT INTO recipe VALUES('broccoli stew', 'broccoli peppers cheese tomatoes');

the searches don't have to be quite as exact, for example:

SELECT name FROM recipe WHERE ingredients MATCH 'pepper'

hits the 'broccoli stew' recipe even though it has 'peppers' and you searched for 'pepper'.

Not sure why the Porter stemmer isn't documented in the SQLite wiki, perhaps it's still a work in progress or being changed for FTS2.
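
The same experiment can be run from Python's sqlite3 module. This is a sketch under assumptions, not what I ran above: it uses the later FTS4 module with its tokenize=porter spelling rather than FTS1, and it returns None if the SQLite library behind Python was built without FTS support. The function name is made up.

```python
import sqlite3


def search_recipes(term):
    """Return matching recipe names, or None if FTS4 is unavailable."""
    conn = sqlite3.connect(":memory:")
    try:
        try:
            conn.execute(
                "CREATE VIRTUAL TABLE recipe "
                "USING fts4(name, ingredients, tokenize=porter)"
            )
        except sqlite3.OperationalError:
            # This SQLite build does not include the FTS4 module.
            return None
        conn.execute(
            "INSERT INTO recipe (name, ingredients) VALUES (?, ?)",
            ("broccoli stew", "broccoli peppers cheese tomatoes"),
        )
        rows = conn.execute(
            "SELECT name FROM recipe WHERE ingredients MATCH ?", (term,)
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()


result = search_recipes("pepper")
print(result)
```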