Date:      Mon, 30 Mar 1998 16:03:30 -0500 (EST)
From:      John Fieber <jfieber@indiana.edu>
To:        Simon Shapiro <shimon@simon-shapiro.org>
Cc:        freebsd-database@FreeBSD.ORG
Subject:   Mail indexing infrastructure
Message-ID:  <Pine.BSF.3.96.980330154855.8177B-100000@fallout.campusview.indiana.edu>
In-Reply-To: <XFMail.980330115211.shimon@simon-shapiro.org>

On Mon, 30 Mar 1998, Simon Shapiro wrote:

> > The FreeBSD mailing list archive is 620MB. There are currently
> > 270,000 messages. The archive grows by 100,000 messages/year.
> 
> Excellent.  How many years back do we want to keep?

The current indexed archive goes back to 1994.

> Also, if the current engine is so great, how come all these people are
> excited about replacing it?

Thread retrieval and date scoping.  However, most proposed
solutions involve a wholesale replacement rather than augmenting
what we have, which works pretty well, all told.

Basically, the vector-space ranked retrieval we already have,
possibly scoped by date, is the best way to start a search,
followed by thread retrieval once a promising message has been
found. Wolfram's home-brew solution for threads is more along the
lines of what we need.
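
As a rough illustration of that flow (not the freeWAIS setup itself,
and not Wolfram's thread code), here is a minimal sketch: rank messages
by TF-IDF cosine similarity, optionally restricted to a date range, then
pull the whole thread for a chosen hit by following reply links. The
toy MESSAGES data, the field layout, and the function names search()
and thread_of() are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict
from datetime import date

# Hypothetical toy archive: (message_id, in_reply_to, date_sent, text)
MESSAGES = [
    ("m1", None, date(1997, 5, 1), "freewais index rebuild is slow"),
    ("m2", "m1", date(1997, 5, 2), "try splitting the index by year"),
    ("m3", None, date(1998, 3, 1), "scsi driver panic on boot"),
    ("m4", "m2", date(1997, 5, 3), "splitting the index helped"),
]


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(messages):
    """Build one sparse TF-IDF vector per message, plus the IDF table."""
    tfs = {mid: Counter(tokenize(text)) for mid, _, _, text in messages}
    df = Counter(term for tf in tfs.values() for term in tf)
    n = len(tfs)
    idf = {term: math.log(n / df[term]) + 1.0 for term in df}
    vecs = {mid: {t: c * idf[t] for t, c in tf.items()} for mid, tf in tfs.items()}
    return vecs, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, messages, start=None, end=None):
    """Ranked retrieval; start/end (inclusive) scope the hits by date."""
    vecs, idf = tfidf_vectors(messages)
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    hits = []
    for mid, _, sent, _ in messages:
        if start is not None and sent < start:
            continue
        if end is not None and sent > end:
            continue
        score = cosine(q, vecs[mid])
        if score > 0:
            hits.append((score, mid))
    return [mid for _, mid in sorted(hits, reverse=True)]


def thread_of(mid, messages):
    """Walk up to the thread root, then list the thread depth-first."""
    parent = {m: p for m, p, _, _ in messages}
    children = defaultdict(list)
    for m, p, _, _ in messages:
        if p is not None:
            children[p].append(m)
    root = mid
    while parent.get(root) in parent:  # stop at a root or a missing parent
        root = parent[root]
    ordered, stack = [], [root]
    while stack:
        current = stack.pop()
        ordered.append(current)
        stack.extend(reversed(children[current]))
    return ordered
```

A search such as search("index", MESSAGES, end=date(1997, 5, 1)) narrows
the ranked hits to the date range, and thread_of() on any hit then
returns every message in its thread. Filtering by date inside the
scoring loop is only the simplest option; a real index would want to
avoid scoring out-of-range messages at all.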

I have working date scoping in prototype, but there are
performance problems--freeWAIS really doesn't handle that sort of
thing very well and I'm a bit concerned about killing
www.freebsd.org with it because I know it will be a popular
feature.

I also have half a mind to provide relevance feedback (a "find
more like this..." link) but my free time is much smaller than
the things I have to fill it with.  :(
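
One simple way to approximate a "find more like this..." link is to
treat the chosen message's own term vector as the query and rank
everything else against it. The sketch below does only that; it is a
simplification, not full relevance feedback with reweighting, and the
DOCS data and the more_like_this() name are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy archive: message_id -> text
DOCS = {
    "m1": "freewais index rebuild is slow",
    "m2": "rebuild the freewais index nightly",
    "m3": "scsi driver panic on boot",
}


def vectors(docs):
    """Sparse TF-IDF vector for each message."""
    tfs = {mid: Counter(text.lower().split()) for mid, text in docs.items()}
    df = Counter(term for tf in tfs.values() for term in tf)
    n = len(tfs)
    return {
        mid: {t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        for mid, tf in tfs.items()
    }


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def more_like_this(seed, docs, limit=10):
    """Rank the other messages by similarity to the seed message."""
    vecs = vectors(docs)
    scored = [(cosine(vecs[seed], v), mid) for mid, v in vecs.items() if mid != seed]
    scored = [(s, mid) for s, mid in scored if s > 0]
    scored.sort(reverse=True)
    return [mid for _, mid in scored[:limit]]
```

Here more_like_this("m1", DOCS) returns only the message that shares
terms with m1, and a message with no overlapping terms gets an empty
list.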

-john

