From: "Brandon S. Allbery KF8NH"
Date: Mon, 20 Mar 2006 12:26:23 -0500
To: Vivek Khera
Cc: Kamikaze, LoN, freebsd-stable@freebsd.org
Subject: Re: utf-8 support in libc?
Message-Id: <43B3539D-F388-4D88-9D3C-F14B36CB2FA7@ece.cmu.edu>
In-Reply-To: <1E32EDEA-55BD-4E5A-A1F6-9AEE92A6A5A6@khera.org>

On Mar 20, 2006, at 12:21, Vivek Khera wrote:

> I expect that to happen. What I'm more curious about is the
> collating speed, i.e., how fast the sorting and string
> comparison functions are. The claim here is that in *BSD these are
> somehow not fast. I'm not sure whether that is a BSD issue or a
> Postgres issue of not taking advantage of the BSD functions properly.

I don't think that's the issue, so much as that FreeBSD *doesn't support* UTF-8 collation, so the database has to use its own collation libraries, which may be slower than platform-optimized ones. (en_US.UTF-8/LC_COLLATE is symlinked to a US-ASCII collation sequence, which is identical to binary order. This is incorrect for UTF-8; sorting UTF-8 text properly takes all kinds of strange extra handling.)

-- 
brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
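
To illustrate the gap between binary order and a language-aware collation, here is a minimal Python sketch. It is not from the original message: `binary_sort` and `locale_sort` are hypothetical helper names, the word list is made up, and whether the `en_US.UTF-8` locale is installed (and what order it yields) depends on the platform. On a system whose LC_COLLATE table is effectively binary, both calls would be expected to return the same order.

```python
import locale


def binary_sort(words):
    """Sort by raw UTF-8 bytes, i.e. the order a binary collation table gives."""
    return sorted(words, key=lambda w: w.encode("utf-8"))


def locale_sort(words, name="en_US.UTF-8"):
    """Sort using the platform's LC_COLLATE rules for `name`.

    Returns None if the locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    try:
        # strxfrm maps each string to a key that compares per the locale.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


words = ["zebra", "Apple", "éclair", "apple", "Zebra"]

# Binary order: all uppercase ASCII first, then lowercase, then non-ASCII.
print(binary_sort(words))  # ['Apple', 'Zebra', 'apple', 'zebra', 'éclair']

# Platform order: result varies by system; None if the locale is missing.
print(locale_sort(words))
```

In binary order "Zebra" sorts ahead of "apple" and "éclair" lands after "zebra", which is rarely what a reader of sorted text expects; a real collation table is what corrects this.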