Date: Mon, 30 Mar 1998 12:37:11 -0800 (PST)
From: Simon Shapiro <shimon@simon-shapiro.org>
To: John Fieber <jfieber@indiana.edu>
Cc: freebsd-database@FreeBSD.ORG
Subject: RE: Mailing list search interface
Message-ID: <XFMail.980330123711.shimon@simon-shapiro.org>
In-Reply-To: <Pine.BSF.3.96.980330134953.7795A-100000@fallout.campusview.indiana.edu>

On 30-Mar-98 John Fieber wrote:
> On Mon, 30 Mar 1998, Simon Shapiro wrote:
>
>> Truth must be told, currently PostgreSQL uses Unix files to store its
>> indices and tables, so performance is not all that it could be. I am
>
> A properly constructed index for a full text database (read: NOT
> glimpse) requires very little disk i/o for most queries. E.g.,
> prefix trie hashing requires about two reads per search term in
> the query. I just read a paper describing some optimization that
> reduces that to one read about 50% of the time.

A picture starts emerging here, folks. We normalize the normalizable and
then build a datatype which knows how to do dictionary-based searches on
the text. The excellent news here is that disk I/O per record can be
reduced. This allows us to easily utilize more than one Unix
instance/host per database, which gives us the memory and CPU bandwidth.
This can turn really useful real fast.

BTW, when considering text/scripts/database alternatives, think not only
about generating the search indices, but about querying them too. Decent
RDBMS engines cache these things very well, in userspace.

----------

Sincerely Yours,

Simon Shapiro
Shimon@Simon-Shapiro.ORG    Voice: 503.799.2313

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-database" in the body of the message