I'd like to add a search feature back on this site. Previously, I had an arrangement set up with PyBlosxom and ht://Dig, but now that it's Django-powered, I'd like to do something that works directly from the database instead of crawling the site like ht://Dig did.
After looking a while at various text search engines, I remembered seeing that SQLite just added an FTS1 module in version 3.3.8, which sounds pretty easy to use. Unfortunately the FreeBSD port databases/sqlite3 doesn't build with that feature.
After poking around a bit, I got it to build with FTS manually, and after a whole bunch more messing around, came up with a patch to add an option to the port to build sqlite3 with FTS. The patch has been submitted to the FreeBSD bug tracker as ports/106281. Hopefully I have all my ducks lined up on that.
Anyhow, in testing the FTS a bit, I found one thing they only hint at on the SQLite website: there's a Porter stemmer built in, even though the wiki says: "The module does not perform stemming of any sort." You activate it by adding tokenize porter to the table declaration, for example (adapted from their example):
CREATE VIRTUAL TABLE recipe USING FTS1(name, ingredients, tokenize porter);
once you've done that, and inserted some sample data:
INSERT INTO recipe VALUES('broccoli stew', 'broccoli peppers cheese tomatoes');
the searches don't have to be quite as exact, for example:
SELECT name FROM recipe WHERE ingredients MATCH 'pepper';
hits the 'broccoli stew' recipe even though it has 'peppers' and you searched for 'pepper'.
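Here's a rough sketch of the same experiment driven from Python's sqlite3 module, which is closer to how a Django site would end up using it. Note the assumptions: it uses FTS5 rather than FTS1 (FTS1 isn't something you'd find in a current SQLite build), it assumes the SQLite library behind Python was compiled with FTS5, and the search() helper is just a made-up name for illustration. It builds the table once with the Porter tokenizer and once without, so the difference is visible.

```python
import sqlite3


def search(tokenizer_clause, term):
    """Build a throwaway FTS table and return the recipe names matching term."""
    conn = sqlite3.connect(":memory:")
    # Assumption: FTS5 stands in for FTS1, and this SQLite build includes it.
    conn.execute(
        "CREATE VIRTUAL TABLE recipe USING fts5(name, ingredients%s)"
        % tokenizer_clause
    )
    conn.execute(
        "INSERT INTO recipe(name, ingredients) VALUES (?, ?)",
        ("broccoli stew", "broccoli peppers cheese tomatoes"),
    )
    # FTS5 restricts a query to one column with the 'column:term' syntax.
    rows = conn.execute(
        "SELECT name FROM recipe WHERE recipe MATCH ?",
        ("ingredients:" + term,),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


# Same data, same query; only the tokenizer differs.
stemmed = search(", tokenize='porter'", "pepper")
unstemmed = search("", "pepper")
print("porter tokenizer: ", stemmed)
print("default tokenizer:", unstemmed)
```

With the stemmer, 'peppers' and 'pepper' reduce to the same token at both index and query time, which is why only the first table produces a hit.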
Not sure why the Porter stemmer isn't documented in the SQLite wiki; perhaps it's still a work in progress or being changed for FTS2.
posted at: 18:20 | tags: bugs freebsd sqlite
I've been doing a lot with Django lately, and initially set it up using mod_python as the Django docs recommend, but I still have some reservations about that kind of arrangement. I'd like to go back to running it under SCGI or something similar.
Django has built-in support for FastCGI, but after trying to install mod_fastcgi in my Apache 2.0.x setup, I decided it was a PITA. mod_scgi, by contrast, is quite easy to set up in Apache (even though the documentation is mostly nonexistent). After finding where Django implements its FastCGI support using the flup module, I saw that with just a few minor tweaks Django could be made to support all of flup's protocols, including SCGI and AJP (Apache JServ Protocol).
AJP turns out to be very interesting because it's included standard with Apache 2.2 as mod_proxy_ajp, and can work with mod_proxy_balancer - meaning you could set up multiple Django instances and have Apache share the load between them.
After testing a bit, I submitted a patch, and will probably switch to running my Django sites as AJP servers managed by daemontools and fronted by Apache 2.2.
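To give a feel for what flup's AJP support looks like from the Python side, here's a minimal sketch. This is not the patch itself: the application() function is a stand-in WSGI app (a Django site would hand over its own WSGI handler instead), serve_ajp() is a hypothetical helper name, and the address 127.0.0.1:8009 is just an assumed choice. It also assumes flup is installed wherever the server actually runs.

```python
def application(environ, start_response):
    """Stand-in WSGI app; a Django site would supply its own handler here."""
    body = b"Hello over AJP\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def serve_ajp(app, host="127.0.0.1", port=8009):
    """Serve app over AJP using flup; blocks until the server is stopped."""
    # Imported here so the module still loads on machines without flup.
    from flup.server.ajp import WSGIServer

    WSGIServer(app, bindAddress=(host, port)).run()
```

Calling serve_ajp(application) is the sort of thing a daemontools run script would end up launching; on the Apache 2.2 side, something like ProxyPass / ajp://127.0.0.1:8009/ would then hand requests to it through mod_proxy_ajp.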
posted at: 20:33 | tags: apache bugs django scgi
Just as a followup, it seems the segfault in mod_python on FreeBSD I mentioned before was found and fixed. Turns out to not be any kind of pointer/memory corruption like I thought, but rather a mishandled return code from an APR (Apache Portable Runtime) function. Oh well, I got to play with gdb, ddd, and valgrind a bit, which is good stuff to be familiar with.
posted at: 21:40 | tags: apache bugs freebsd mod_python
Other people have reported the same problem with mod_python on FreeBSD I had seen before, so I'm happy that I'm not losing my mind.
I took a stab at using Valgrind to find the problem. Didn't actually find anything, but I thought I'd jot down notes on how I went about this.
First, the Valgrind port didn't seem to work on FreeBSD 6.0. When I tried running it against the sample code in the Valgrind Quick Start guide, it didn't find anything wrong with it. Ended up finding a FreeBSD 5.4 machine, which did see the expected problem.
Next, I built the Apache 2.0.x port with make WITH_THREADS=1 WITH_DEBUG=1, and then built mod_python, which uses APXS and picks up the debug compile option from that.
Then, in the mod_python distribution, went into the test directory, and downloaded a Valgrind suppression file for Python, valgrind-python.supp, and in it uncommented the suppressions for PyObject_Free and PyObject_Realloc (otherwise the Valgrind output is full of stuff that is really OK).
Then tweaked test/test.py around line 307, where it starts Apache, to insert
valgrind --tool=memcheck --logfile=/tmp/valgrind_httpd --suppressions=valgrind-python.supp
at the front of the cmd variable that's being composed to execute httpd.
Finally, ran python test.py, and then looked at /tmp/valgrind_httpd.pid#### to see the results.
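The test.py tweak amounts to something like the following sketch. The wrap_with_valgrind() helper and the sample httpd command are made up for illustration and are not what's in mod_python's test.py; the Valgrind options are the ones from above. Newer Valgrind releases spell the log option --log-file, so adjust to whatever your version accepts.

```python
# Options as used above; --logfile is the spelling my Valgrind accepted.
VALGRIND_PREFIX = [
    "valgrind",
    "--tool=memcheck",
    "--logfile=/tmp/valgrind_httpd",
    "--suppressions=valgrind-python.supp",
]


def wrap_with_valgrind(cmd):
    """Return cmd with the Valgrind invocation put in front of it."""
    return " ".join(VALGRIND_PREFIX) + " " + cmd


# Hypothetical httpd command line, standing in for what test.py composes.
cmd = wrap_with_valgrind("/usr/local/sbin/httpd -X -f conf/httpd.conf")
print(cmd)
```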
posted at: 22:17 | tags: bugs freebsd mod_python
I've been testing mod_python 3.2.x betas as requested by the developers on their mailing list. Unfortunately there seems to be some subtle memory-related bug that only occurs on FreeBSD (or at least FreeBSD the way I normally install it along with Apache and Python).
Made some mention of it here, and an almost identical problem is reported for Mac OS X, even down to the value 0x58 being at the top of the backtrace.
Did a lot of poking around the core with gdb and browsing of the mod_python and Apache source code, but never quite saw where the problem could be. Took another approach and started stripping down the big mod_python test suite, and found that the test that was failing ran fine by itself, but when it ran after another test for handling large file uploads, it would crash.
So I suspect there's a problem in a whole different area of mod_python that's screwing something up in memory, and that doesn't trigger a segfault until later, during the connectionhandler test. My latest post to the list covers some of that.
posted at: 13:56 | tags: apache bugs freebsd mod_python unix
While taking a look at Lighttpd, I found that it didn't seem possible to set up a condition that only acted if a host wasn't specified in the request header. Filed ticket 458 in the Lighttpd Trac, so that a config could use:
$HTTP["host"] == ""
Currently, a non-specified host is stored as a NULL, and comparisons to NULL always fail.
Otherwise Lighttpd seems fairly decent, and has some advantages for running CGIs in that you can easily suexec them to run under other userids. Apache only seems to want to do that for virtual hosts or personal user folders - not arbitrary CGIs in non-user folders.
posted at: 11:36 | tags: bugs lighttpd