Date:      Tue, 9 Jul 2002 22:16:42 +0200
From:      Brad Knowles <brad.knowles@skynet.be>
To:        Eric Anderson <anderson@centtech.com>, Ross Lippert <ripper@eskimo.com>
Cc:        joseph@randomnetworks.com, freebsd-doc@freebsd.org, freebsd-chat@freebsd.org
Subject:   Re: Beta FreeBSD search engine
Message-ID:  <a05111b3cb950f4f23d0e@[10.0.1.15]>
In-Reply-To: <3D2B43EF.955661FC@centtech.com>
References:  <200207091944.MAA05507@eskimo.com> <3D2B43EF.955661FC@centtech.com>

At 3:13 PM -0500 2002/07/09, Eric Anderson wrote:

>  Ok, all good thoughts.. One question:
>
>  How can I determine a language for a page by looking at it?

	You need dictionaries of words in various languages.  Then you do
a sort | uniq of all the words in the document and compare the result
against each language dictionary.  The dictionary with the highest
number of hits is most likely the language in which the document is
written.
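
	As a rough illustration of that idea, here is a minimal sketch in
Python.  The tiny word lists, the DICTIONARIES table, and the function
names unique_words and guess_language are all made up for this example;
a real setup would load full dictionaries from word-list files.  The
set of lowercased words plays the role of the sort | uniq step.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
# A real setup would load complete dictionaries, one per language.
DICTIONARIES = {
    "english": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "german": {"der", "die", "und", "ist", "das", "nicht", "ein", "zu"},
    "french": {"le", "la", "et", "est", "les", "des", "un", "une"},
}


def unique_words(text):
    """Return the set of distinct lowercased words (the 'sort | uniq' step)."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def guess_language(text, dictionaries=DICTIONARIES):
    """Count dictionary hits per language and return (best_guess, hit_counts).

    best_guess is None when no dictionary matches any word.
    """
    words = unique_words(text)
    hits = {lang: len(words & vocab) for lang, vocab in dictionaries.items()}
    best = max(hits, key=hits.get)
    return (best if hits[best] > 0 else None), hits


if __name__ == "__main__":
    language, counts = guess_language("The cat is in the house and it is warm")
    print(language, counts)
```

	With word lists this small the result is only a guess; short pages
and pages that mix languages would need larger dictionaries or some
minimum hit threshold before the answer could be trusted.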

-- 
Brad Knowles, <brad.knowles@skynet.be>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
     -Benjamin Franklin, Historical Review of Pennsylvania.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-doc" in the body of the message



