Date: Mon, 30 Mar 1998 16:03:30 -0500 (EST)
From: John Fieber <jfieber@indiana.edu>
To: Simon Shapiro <shimon@simon-shapiro.org>
Cc: freebsd-database@FreeBSD.ORG
Subject: Mail indexing infrastructure
Message-ID: <Pine.BSF.3.96.980330154855.8177B-100000@fallout.campusview.indiana.edu>
In-Reply-To: <XFMail.980330115211.shimon@simon-shapiro.org>

On Mon, 30 Mar 1998, Simon Shapiro wrote:

> > The FreeBSD mailing list archive is 620MB large. There are currently
> > 270,000 messages. The archive grow with 100,000 messages/year.
>
> Excellent. How many years back do we want to keep?

The current indexed archive goes back to 1994.

> Also, if the current engine is so great, how come all these people are
> excited about replacing it?

Thread retrieval and date scoping. However, most proposed solutions
involve a wholesale replacement rather than augmenting what we have,
which works pretty well, all told.

Basically, the vector-space ranked retrieval we already have, possibly
scoped by date, is the best way to start a search, followed by thread
retrieval once a promising message has been found. Wolfram's home-brew
solution for threads is more along the lines of what we need.

I have working date scoping in prototype, but there are performance
problems: freeWAIS really doesn't handle that sort of thing very well,
and I'm a bit concerned about killing www.freebsd.org with it because I
know it will be a popular feature.

I also have half a mind to provide relevance feedback (a "find more
like this..." link), but my free time is much smaller than the list of
things I have to fill it with. :(

-john

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-database" in the body of the message
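
For illustration only, here is a minimal sketch of the search flow described in the message: ranked retrieval, optionally scoped by date, followed by thread retrieval from a promising hit. It is not freeWAIS and not the archive's actual code. The `Message` record, the `search` and `thread_of` functions, and the sample archive are all hypothetical; TF-IDF with cosine similarity stands in for whatever vector-space weighting the real engine uses, and threads are assumed to be linked through an In-Reply-To style parent id.

```python
import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Message:                      # hypothetical record, not the archive's schema
    msg_id: str
    parent_id: Optional[str]        # assumed In-Reply-To link; None for a thread root
    sent: date
    text: str


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def search(messages, query, since=None, until=None):
    """Rank messages by TF-IDF cosine similarity, optionally scoped by date."""
    docs = {m.msg_id: Counter(tokenize(m.text)) for m in messages}
    n = len(docs)
    df = Counter(term for tf in docs.values() for term in tf)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vec(tf):
        # Terms unknown to the collection get weight 0.
        return {t: f * idf.get(t, 0.0) for t, f in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(Counter(tokenize(query)))
    hits = []
    for m in messages:
        # Date scoping: skip anything outside the requested window.
        if since and m.sent < since:
            continue
        if until and m.sent > until:
            continue
        score = cosine(q, vec(docs[m.msg_id]))
        if score > 0:
            hits.append((score, m.msg_id))
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [msg_id for _, msg_id in hits]


def thread_of(messages, msg_id):
    """Return the ids of every message in msg_id's thread, oldest first."""
    by_id = {m.msg_id: m for m in messages}
    children = defaultdict(list)
    for m in messages:
        if m.parent_id in by_id:
            children[m.parent_id].append(m.msg_id)
    # Walk up to the thread root, guarding against reference cycles.
    root, seen = msg_id, {msg_id}
    while by_id[root].parent_id in by_id and by_id[root].parent_id not in seen:
        root = by_id[root].parent_id
        seen.add(root)
    # Collect the root and all of its descendants.
    found, stack = set(), [root]
    while stack:
        cur = stack.pop()
        if cur not in found:
            found.add(cur)
            stack.extend(children[cur])
    return sorted(found, key=lambda i: (by_id[i].sent, i))


# Made-up sample data, purely for demonstration.
ARCHIVE = [
    Message("a1", None, date(1996, 5, 1), "SCSI disk driver panic on boot"),
    Message("a2", "a1", date(1996, 5, 2), "Try the new SCSI driver patch"),
    Message("b1", None, date(1998, 3, 1), "Mail indexing with freeWAIS"),
    Message("a3", "a2", date(1996, 5, 3), "Patch works, thanks"),
]

if __name__ == "__main__":
    hits = search(ARCHIVE, "scsi driver", until=date(1997, 1, 1))
    print(hits)
    print(thread_of(ARCHIVE, hits[0]))
```

The sketch filters by date during scoring for simplicity. On an archive of the size discussed here, a real implementation would presumably apply the date scope through the index rather than by scanning every message.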