From owner-freebsd-ports@FreeBSD.ORG Thu Jan 10 02:11:20 2008 Return-Path: Delivered-To: ports@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2AC3716A417; Thu, 10 Jan 2008 02:11:20 +0000 (UTC) (envelope-from alex.kovalenko@verizon.net) Received: from vms040pub.verizon.net (vms040pub.verizon.net [206.46.252.40]) by mx1.freebsd.org (Postfix) with ESMTP id 1047713C458; Thu, 10 Jan 2008 02:11:20 +0000 (UTC) (envelope-from alex.kovalenko@verizon.net) Received: from [10.0.3.231] ([70.21.165.95]) by vms040.mailsrvcs.net (Sun Java System Messaging Server 6.2-6.01 (built Apr 3 2006)) with ESMTPA id <0JUE00EOVOS10IB0@vms040.mailsrvcs.net>; Wed, 09 Jan 2008 20:12:35 -0600 (CST) Date: Wed, 09 Jan 2008 21:09:05 -0500 From: "Alexandre \"Sunny\" Kovalenko" In-reply-to: <1199927795.304.70.camel@shumai.marcuscom.com> To: Joe Marcus Clarke Message-id: <1199930945.46097.11.camel@RabbitsDen> MIME-version: 1.0 X-Mailer: Evolution 2.12.3 FreeBSD GNOME Team Port Content-type: text/plain Content-transfer-encoding: 7bit References: <1199893999.756.29.camel@RabbitsDen> <1199900104.304.28.camel@shumai.marcuscom.com> <1199925635.9959.10.camel@RabbitsDen> <1199927795.304.70.camel@shumai.marcuscom.com> Cc: ports@freebsd.org, gnome@freebsd.org, Alexander Nedotsukov Subject: Re: [patch] glib20, UTF-8 and string collation X-BeenThere: freebsd-ports@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting software to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Jan 2008 02:11:20 -0000 On Wed, 2008-01-09 at 20:16 -0500, Joe Marcus Clarke wrote: > On Wed, 2008-01-09 at 19:40 -0500, Alexandre "Sunny" Kovalenko wrote: > > On Wed, 2008-01-09 at 12:35 -0500, Joe Marcus Clarke wrote: > > > On Wed, 2008-01-09 at 10:53 -0500, Alexandre "Sunny" Kovalenko wrote: > > > > I have seen recent commit WRT string collation in devel/glib20 by > > > > marcus, so I have decided to check if there is an interest to fix SEGV > > > > in g_utf8_collate when it is given 8-bit non-UTF-8 string(s) to collate. > > > > > > Any commits I have made in the area of UTF-8 are completely accidental. > > > I am not the UTF-8 guy. Both bland and jylefort have expressed interest > > > in this. Perhaps one of them will comment. > > > > I hope so. Just in case, they would decide to, I have reduced the > > situation to the small program below. I get > > > > GLib-CRITICAL **: g_convert: assertion `str != NULL' failed > > > > and no core dump from this simple program, whereas Evolution manages to > > pass NULL to strcoll further down in g_utf8_collate and get SEGV for its > > pains. > > That sounds like a no-no for Evolution to be dereferencing a NULL > pointer. Hopefully they'd fix this to prevent the problem. It's not Evolution, it is glib, specifically g_utf8_collate, which would call strcoll(3) blindly on the return of g_utf8_normalize inside gunicollate.c. And now, I can get core dumped out of this simple program as well, merely by setting CHARSET=en_US.UTF-8 (I had it is ASCII in the terminal window, which would trigger different path within g_utf8_collate). > > > > > Conversely, if the answer still is "Evolution should not have done > > that", I will happily crawl back under my rock and keep my patch > > locally. > > I can't imagine you're alone in this. But then again, any Cyrillic mail > that comes my way is always spam, so what do I know. More importantly, it is UTF-8 spam -- in order to trigger this, you need KOI8-R or CP1251, and in the sorted column to boot. I suspect that Latin1 or ShiftJIS would do the trick too. Now, how about this: would you be amenable to this Really Harmless(tm) patch, which merely adds error checking along the lines used in the same function, about dozen lines up ;) --- glib/gunicollate.c.B 2008-01-09 20:48:25.000000000 -0500 +++ glib/gunicollate.c 2008-01-09 20:49:35.000000000 -0500 @@ -166,6 +166,9 @@ str1_norm = g_utf8_normalize (str1, -1, G_NORMALIZE_ALL_COMPOSE); str2_norm = g_utf8_normalize (str2, -1, G_NORMALIZE_ALL_COMPOSE); + g_return_val_if_fail (str1_norm != NULL, 0); + g_return_val_if_fail (str2_norm != NULL, 0); + if (g_get_charset (&charset)) { result = strcoll (str1_norm, str2_norm); I can add it to your files/extra-patch-glib_gunicollate.c, or package it separately -- I really hate it when I start Evolution after portupgrade to write some E-mails real quick, only to find out that I have forgotten to patch glib... again. -- Alexandre "Sunny" Kovalenko