Date: Tue, 11 Oct 2005 22:21:35 +0200
From: Wolfram Schneider <wosch@FreeBSD.org>
To: Tim Wilde <twilde@dyndns.com>
Cc: www@FreeBSD.org
Subject: Re: Using Yahoo! or Google search bar instead of search.cgi
Message-ID: <434C1ECF.4090608@FreeBSD.org>
In-Reply-To: <Pine.BSF.4.63.0510101337140.4465@manganese.bos.dyndns.org>
References: <Pine.BSF.4.63.0510101337140.4465@manganese.bos.dyndns.org>

Tim Wilde wrote:
> (Apologies for breaking threading, just joined freebsd-www so I don't
> have the appropriate messages for a References: header.)
>
> As I mentioned in my earlier post, I think an even bigger problem than
> the one Murray mentioned can be observed by the fact that a search for
> "kernel" returns no results at all.

My guess at what happens here: "kernel" is a very common word
(believe it or not). Google has 18,900 hits for the word "kernel"
on www.freebsd.org.

Common words (e.g. "a", "the", "an", "www", "is") are usually
ignored by search engines to save space or to speed up searches.
These are known as "stop words". Even Google has stop words.

From memory, search.cgi has a dynamic stop word list: words
that hit the limit of 20,000 are ignored.

-Wolfram

> At DynDNS, we recently started indexing our site using ht://Dig
> (http://www.htdig.org/), and have been very happy with the flexibility
> it provides for tuning search results to get the most relevant
> matches. It is also a true spider, crawling the website over HTTP
> rather than searching on the filesystem as the current search.cgi
> seems to do.

-- 
Wolfram Schneider <wosch@FreeBSD.org> http://wolfram.schneider.org
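
A minimal sketch of the dynamic stop word idea described above, assuming the index can report a hit count per word. This is not search.cgi's actual code: the function names, the sample counts, and the exact comparison against the limit are all hypothetical, and only the 20,000 figure comes from the message (recalled from memory there).

```python
# Hypothetical illustration of a dynamic stop word list.
# Only the 20,000 limit is taken from the message; everything else is assumed.
STOP_LIMIT = 20000


def build_stop_words(hit_counts, limit=STOP_LIMIT):
    """Return the set of words whose hit count reaches the limit."""
    return {word for word, hits in hit_counts.items() if hits >= limit}


def filter_query(query, stop_words):
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [term for term in query.lower().split() if term not in stop_words]


# Made-up hit counts, for illustration only.
hit_counts = {"kernel": 20000, "the": 54000, "geom": 310}
stop_words = build_stop_words(hit_counts)
terms = filter_query("the kernel geom", stop_words)
```

Under these assumptions, a query consisting only of a word that reached the limit is left with no terms at all, which would look to the user like a search that returns no results.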
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?434C1ECF.4090608>