Automatically backup installed FreeBSD packages

A while ago I threw together this script to automatically create package files for all installed ports on a FreeBSD box. That way, if a portupgrade doesn't work out, you can delete the broken package, and pkg_add the backup.

Stick this in /usr/local/etc/periodic/daily, and the system will automatically bundle up copies of the installed software and stick them in /usr/local/packages if they don't already exist in there.

#!/bin/sh
#
# Make sure backups exist of all installed FreeBSD packages
#
# 2005-03-20 Barry Pederson <bp@barryp.org>
#

ARCHIVE="/usr/local/packages"

#
# Figure out which pkg_tools binaries to use
#
if [ -f /usr/local/sbin/pkg_info ]
then
    PKG_TOOLS="/usr/local/sbin"
else
    PKG_TOOLS="/usr/sbin"
fi

#
# Make sure backup directory exists
#
if [ ! -d "$ARCHIVE" ]
then
    mkdir -p "$ARCHIVE"
fi

# Bail out rather than create packages in the wrong directory
cd "$ARCHIVE" || exit 1

for p in `${PKG_TOOLS}/pkg_info -E "*"`
do
    if [ ! -f "${p}.tgz" ]
    then
        ${PKG_TOOLS}/pkg_create -b "${p}"
    fi
done
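
The decision the loop makes (which installed packages still lack an archive) can also be sketched in Python. This is only an illustration, not part of the script above: the function name missing_backups is made up, and the package list is passed in as an argument (an assumption) so the logic can be tried without the pkg tools.

```python
import os


def missing_backups(installed, archive_dir):
    """Return the installed package names that have no <name>.tgz in archive_dir."""
    return [
        name
        for name in installed
        # Same test as the shell loop: is there already a <name>.tgz file?
        if not os.path.isfile(os.path.join(archive_dir, name + ".tgz"))
    ]
```

Calling missing_backups(["python-2.4.2", "zsh-4.2.6"], "/usr/local/packages") with those made-up package names would return whichever of the two has no .tgz file in the archive directory yet.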

Getting PyBlosxom SCGI working under Lighttpd

Took another whack at getting PyBlosxom/SCGI working with Lighttpd, this time with better success. (I'm still getting up-to-speed with Lighttpd). This is working with the exact same SCGI setup I was working on the other day.

To elaborate a bit, the setup I'm trying to achieve is to:

  • Have the blog live completely under "/blog/" in the URL namespace
  • Not get it confused with anything else that begins with "/blog" such as "/blog2".
  • Use "/blog/static/" URLs for serving static resources like CSS stylesheets and images off the disk (instead of running those requests through PyBlosxom's CGI code).

This is what I ended up with. It seems to work fairly well, and I'm impressed with how easy Lighttpd makes it to put together an understandable configuration.

#
# External redirection to add a trailing "/" if exactly 
# "/blog" is requested
#
url.redirect = (
                "^/blog$" => "http://barryp.org:81/blog/",
               )

#
# The PyBlosxom Blog, lives under the "/blog/" url namespace
#
$HTTP["url"] =~ "^/blog/" {
    #
    # Static resources served from the disk
    #
    $HTTP["url"] =~ "^/blog/static/" {
        alias.url = ("/blog/static/" => "/data/blog/static/")
    }

    #
    # Everything non-static goes through SCGI
    #
    $HTTP["url"] !~ "^/blog/static/" {
        scgi.server = ( "/blog" => (
                                     (
                                     "host" => "127.0.0.1",
                                     "port" => 8040,
                                     "check-local" => "disable",
                                     )
                                   )
        )
    }
}
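
To check the regular expressions against the three goals listed earlier, here is a small Python sketch that classifies a URL path the way the configuration is meant to. It is a model for reasoning about the patterns, not anything Lighttpd runs; the function name route and the labels it returns are made up for the illustration.

```python
import re


def route(path):
    """Classify a URL path the way the Lighttpd config is intended to."""
    if re.search(r"^/blog$", path):
        return "redirect"  # exactly "/blog": redirect to add the trailing "/"
    if re.search(r"^/blog/static/", path):
        return "static"    # served from disk through alias.url
    if re.search(r"^/blog/", path):
        return "scgi"      # everything else under /blog/ goes to PyBlosxom
    return "other"         # e.g. "/blog2" is not touched by the blog rules


for p in ["/blog", "/blog/", "/blog/static/style.css", "/blog/2006/01/", "/blog2"]:
    print(p, "->", route(p))
```

The point of anchoring the patterns with "^/blog/" (including the trailing slash) is that a path like "/blog2" falls through to "other" instead of being mistaken for part of the blog.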

FastCGI, SCGI, and Apache: Background and Future

Ran across Mark Mayo's blog entry: FastCGI, SCGI, and Apache: Background and Future, which discusses exactly the things I've been struggling with this weekend. I have to agree that sticking an interpreter like Python directly into Apache is a lot of trouble. I've delved into Apache sourcecode, and the mass of macros and #ifdefs is enough to send you running away screaming. To try and graft Python onto that is just begging for trouble - and I've had some experience myself with grafting interpreters onto other things.

Running your webcode in separate processes just makes a lot of sense. You have much more freedom with choice of language and version of language. You can easily run things under different userids, chrooted, in jails/zones, on completely separate machines, completely separate OSes, maybe within virtual machines running different OSes on the same hardware.

Anyhow, thought I'd mention this because Mark's writeup made a lot of sense to me and I thought it was worth keeping a link to it.

Doing things the DJB way

While doing a bit more searching for daemontools info, found the djb way website, which has some nice writeups on daemontools and djbdns (which I also use a fair amount).

mod_python segfault on FreeBSD

I've been testing mod_python 3.2.x betas as requested by the developers on their mailing list. Unfortunately there seems to be some subtle memory-related bug that only occurs on FreeBSD (or at least FreeBSD the way I normally install it along with Apache and Python).

Made some mention of it here and an almost identical problem is reported for MacOSX, even down to the value 0x58 being at the top of the backtrace.

Did a lot of poking around the core with gdb and browsing of the mod_python and Apache sourcecode, but never quite saw where the problem could be. Took another approach and started stripping down the big mod_python testsuite, and found that the test that was failing ran fine by itself, but when it ran after another test for handling large file uploads - then it would crash.

So I suspect there's a problem in a whole different area of mod_python, that's screwing something up in memory that doesn't trigger a segfault til later during the connectionhandler test. My latest post to the list covers some of that.

Running a SCGI server under daemontools

Yesterday, I was working on Running PyBlosxom through SCGI, but during that time, I was running the SCGI server by hand in a console window. Once it was working I needed to arrange a way to run this in a more permanent fashion. Daemontools seemed like an easy way to set this up, since I already had it running on my server.

Daemontools runs a process called svscan that looks for directories in /var/service (the default when installed through the FreeBSD port) that contain an executable named run. If svscan also finds a log/run executable in that directory, it starts that too and ties the two together with a pipe. Daemontools includes a multilog program that reads from the pipe (stdin), and writes out and rotates log files for you automatically.

To get PyBlosxom/SCGI running under this, I started by making a temporary directory and copying in the three files needed to run PyBlosxom through SCGI:

mkdir /tmp/pyblosxom
cd /tmp/pyblosxom

cp ~/config.py .
cp ~/wsgi_app.py .
cp ~/scgi_server.py .

(The first two files come from the PyBlosxom distribution, with config.py customized for my setup. The third file is the one I came up with yesterday.)

Next, I came up with a run script to execute the SCGI server under the www userid, with stderr tied to stdout. Daemontools has a setuidgid program that makes this pretty easy:

#!/bin/sh
exec 2>&1
exec setuidgid www ./scgi_server.py

Next, made a log subdirectory and a log/main subdirectory to hold the actual log files (owned by www).

mkdir log
mkdir log/main
chown www:www log/main

And in the log directory put another tiny run script.

#!/bin/sh
exec setuidgid www multilog t ./main

Finally, made both run scripts executable, and moved the whole thing into /var/service

chmod +x run
chmod +x log/run
cd ..
mv pyblosxom /var/service

svscan sees the new directories within a few seconds, starts up both run scripts automatically, and you're in business. See the current contents of the log with:

cat /var/service/pyblosxom/log/main/current

Stop and restart the server with the Daemontools svc utility:

cd /var/service
svc -d pyblosxom ; svc -u pyblosxom
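
For reference, the directory layout built by hand in this post can be sketched in Python. This is just a restatement of the steps above under a few assumptions: the function name build_service_dir is made up, the three PyBlosxom files are not copied, and the chown of log/main to www is left out because it needs root.

```python
import os
import stat

# The two run scripts, exactly as shown in this post.
RUN_SCRIPT = "#!/bin/sh\nexec 2>&1\nexec setuidgid www ./scgi_server.py\n"
LOG_RUN_SCRIPT = "#!/bin/sh\nexec setuidgid www multilog t ./main\n"


def build_service_dir(base):
    """Create run, log/run and log/main under base; return the script paths."""
    # log/main holds the log files written by multilog.
    os.makedirs(os.path.join(base, "log", "main"), exist_ok=True)
    scripts = {
        os.path.join(base, "run"): RUN_SCRIPT,
        os.path.join(base, "log", "run"): LOG_RUN_SCRIPT,
    }
    for path, body in scripts.items():
        with open(path, "w") as f:
            f.write(body)
        # Equivalent of "chmod +x": svscan only starts executable run files.
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return sorted(scripts)
```

Building the directory somewhere like /tmp first and moving it into /var/service afterwards, as done above, means svscan never sees a half-finished service directory.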

Running PyBlosxom through SCGI

Out of curiosity, ran the Apache Benchmark program ab on the plain CGI installation of PyBlosxom on my little server (-n 100 -c 10), and got around 1.5 requests/second. Decided to give SCGI a try, and got some better results.

Went about this based on what I had read in Deploying TurboGears with Lighttpd and SCGI. Tried Lighttpd at first, and it mostly worked, but I've got an Apache setup right now, so wanted to stick with that for the moment (and it seems a bit quicker anyhow). Basically started by loading flup with easy_install.

 
    easy_install flup

Copied the config.py and wsgi_app.py files from the PyBlosxom distribution into a directory, and added this little script into that same directory:

#!/usr/bin/env python
import sys
from flup.server.scgi_fork import WSGIServer
from wsgi_app import application

server = WSGIServer(application, 
                 scriptName='/blog', 
                 bindAddress=('127.0.0.1', 8040)
                )
ret = server.run()
sys.exit(ret and 42 or 0)

Installed mod_scgi built for Apache2 and added two lines to the config

LoadModule scgi_module libexec/apache2/mod_scgi.so

SCGIMount /blog 127.0.0.1:8040

Notice how the scriptName and bindAddress parameters in the Python code are matched in the SCGIMount Apache directive. With this setup, running the same ab benchmark yields about 10 to 15 requests/second, which is not too bad. Running the threaded SCGI server (remove the _fork from the first import line) wasn't as good, only 3 to 8 requests/second.
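
One way to picture why the mount point and scriptName have to agree: a WSGI application sees each URL split into SCRIPT_NAME (the mount point) and PATH_INFO (the rest). The sketch below shows that split in plain Python. It is not flup's or mod_scgi's actual code, and the function name split_request_path is made up for the illustration.

```python
def split_request_path(script_name, request_path):
    """Split a request path into (SCRIPT_NAME, PATH_INFO) for a mounted app.

    Returns None when the path is not under the mount point at all.
    """
    # Match the mount point itself, or anything below it separated by "/".
    if request_path == script_name or request_path.startswith(script_name + "/"):
        return script_name, request_path[len(script_name):]
    return None
```

With a script name of "/blog", a request for "/blog/2006/01" splits into "/blog" and "/2006/01". If the web server were mounted on a different prefix than the one the Python side expects, the split would not line up and the application would look for the wrong entry.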

The setup seems a bit shaky in that the benchmark values seem to keep decreasing with every run, especially in the threaded mode. So there may be some problems in my setup or in flup/scgi/pyblosxom_wsgi.

Even if it was working fine, SCGI is probably overkill for running PyBlosxom when you're not expecting a lot of traffic. And if you were, you'd probably run it with --static to generate static pages. But it was a reasonable thing to fool with for the day when you want to run a more dynamic WSGI app.

Lighttpd matching default virtual hosts

While taking a look at Lighttpd, found that it didn't seem possible to set up a condition that only acted if a host wasn't specified in the request header. Filed ticket 458 in the Lighttpd Trac, so that a config could use:

$HTTP["host"] == ""

Currently, a non-specified host is stored as a NULL, and comparisons to NULL always fail.

Otherwise Lighttpd seems fairly decent, and has some advantages for running CGIs in that you can easily suexec them to run under other userids. Apache only seems to want to do that for virtual hosts or personal user folders, not arbitrary CGIs in non-user folders.

Going to PyCon 2006

Found out this week that I get to attend PyCon 2006 on my employer's dime. Never been to anything like this before, so it should be interesting. Probably will attend the more web-related presentations.

Trying PyBlosxom

Going to give PyBlosxom a try, seems like a pretty simple system for throwing together a simple blog. Right now simple sounds pretty good. A lot of the Python-related blogs I normally end up seeing seem to use this software, so I figure it can't be too bad. Things like Zope/Plone/Turbogears seem like way overkill for just a simple one-person setup.

I'm kind of interested in the idea of a blog as a resume, so I'll try to write down some of the things I've worked on or figured out.

Previously, I've put some things on Advogato, however I always felt a bit guilty entering items that were too lengthy or not interesting enough for other readers. I guess that doesn't stop most bloggers, but at least on my own server I feel I can abuse it as much as I want.