Date:      Mon, 30 Mar 1998 12:37:11 -0800 (PST)
From:      Simon Shapiro <shimon@simon-shapiro.org>
To:        John Fieber <jfieber@indiana.edu>
Cc:        freebsd-database@FreeBSD.ORG
Subject:   RE: Mailing list search interface
Message-ID:  <XFMail.980330123711.shimon@simon-shapiro.org>
In-Reply-To: <Pine.BSF.3.96.980330134953.7795A-100000@fallout.campusview.indiana.edu>


On 30-Mar-98 John Fieber wrote:
> On Mon, 30 Mar 1998, Simon Shapiro wrote:
> 
>> Truth be told, currently PostgreSQL uses Unix files to store its
>> indices and tables, so performance is not all that it could be.  I am
> 
> A properly constructed index for a full text database (read: NOT
> glimpse) requires very little disk I/O for most queries.  E.g.,
> prefix trie hashing requires about two reads per search term in
> the query.  I just read a paper describing some optimization that
> reduces that to one read about 50% of the time.

A picture starts emerging here, folks.  We normalize the normalizable and
then build a datatype which knows to do dictionary based searches on the
text.
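
To make the idea concrete, here is a minimal sketch of a dictionary-based
search over message text.  It assumes a plain in-memory inverted index; it is
not PostgreSQL's datatype machinery and not the trie hashing scheme John
describes.  The names normalize and DictionaryIndex are hypothetical, made up
for this illustration.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class DictionaryIndex:
    """Hypothetical in-memory inverted index: word -> set of message ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, msg_id, text):
        # Record every normalized word of the message in the dictionary.
        for word in normalize(text):
            self.postings[word].add(msg_id)

    def search(self, query):
        """Return ids of messages that contain every word in the query."""
        words = normalize(query)
        if not words:
            return set()
        # One dictionary lookup per search term, then intersect the results.
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result


index = DictionaryIndex()
index.add(1, "PostgreSQL stores its indices in Unix files")
index.add(2, "Full text index with trie hashing")
index.add(3, "Unix files and full text search")
print(index.search("unix files"))
```

The point of the sketch is only that a query costs one lookup per term; a real
datatype would keep the dictionary on disk and add stemming, ranking and so on.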

The excellent news here is that disk I/O per record can be reduced.  That
lets us easily use more than one Unix instance/host per database, which in
turn gives us the memory and CPU bandwidth.  This can turn really useful
real fast.

BTW, when considering text/scripts/database alternatives, think not only
about generating the search indices, but about querying them too.  Decent
RDBMS engines cache these things very well, in userspace.
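
As a rough illustration of what userspace caching buys at query time, here is
a small sketch of an LRU page cache sitting in front of a slow reader.  It is
an assumption-laden toy, not how any particular RDBMS implements its buffer
cache; PageCache and fake_disk_read are hypothetical names.

```python
from collections import OrderedDict


class PageCache:
    """Hypothetical userspace LRU cache in front of a slow page reader."""

    def __init__(self, read_page, capacity):
        self.read_page = read_page   # function: page number -> page contents
        self.capacity = capacity
        self.pages = OrderedDict()   # oldest entry first, newest last
        self.misses = 0              # how often we had to go to "disk"

    def get(self, page_no):
        if page_no in self.pages:
            self.pages.move_to_end(page_no)   # mark as most recently used
            return self.pages[page_no]
        self.misses += 1
        data = self.read_page(page_no)
        self.pages[page_no] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict least recently used
        return data


def fake_disk_read(page_no):
    # Stand-in for a real disk read (assumption for this sketch).
    return "page-%d" % page_no


cache = PageCache(fake_disk_read, capacity=2)
for page_no in (1, 2, 1, 1, 3, 2):
    cache.get(page_no)
print("reads that went to disk:", cache.misses)
```

Repeated hits on hot index pages never reach the disk, which is the effect a
flat-file or script-based search has to reimplement for itself.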


----------


Sincerely Yours, 

Simon Shapiro
Shimon@Simon-Shapiro.ORG                      Voice:   503.799.2313



