Date:      Wed, 09 Jan 2008 21:09:05 -0500
From:      "Alexandre \"Sunny\" Kovalenko" <alex.kovalenko@verizon.net>
To:        Joe Marcus Clarke <marcus@marcuscom.com>
Cc:        ports@freebsd.org, gnome@freebsd.org, Alexander Nedotsukov <bland@freebsd.org>
Subject:   Re: [patch] glib20, UTF-8 and string collation
Message-ID:  <1199930945.46097.11.camel@RabbitsDen>
In-Reply-To: <1199927795.304.70.camel@shumai.marcuscom.com>
References:  <1199893999.756.29.camel@RabbitsDen> <1199900104.304.28.camel@shumai.marcuscom.com> <1199925635.9959.10.camel@RabbitsDen> <1199927795.304.70.camel@shumai.marcuscom.com>


On Wed, 2008-01-09 at 20:16 -0500, Joe Marcus Clarke wrote: 
> On Wed, 2008-01-09 at 19:40 -0500, Alexandre "Sunny" Kovalenko wrote:
> > On Wed, 2008-01-09 at 12:35 -0500, Joe Marcus Clarke wrote: 
> > > On Wed, 2008-01-09 at 10:53 -0500, Alexandre "Sunny" Kovalenko wrote:
> > > > I have seen a recent commit WRT string collation in devel/glib20 by
> > > > marcus, so I decided to check whether there is any interest in fixing
> > > > the SEGV in g_utf8_collate when it is given 8-bit non-UTF-8 string(s)
> > > > to collate.
> > > 
> > > Any commits I have made in the area of UTF-8 are completely accidental.
> > > I am not the UTF-8 guy.  Both bland and jylefort have expressed interest
> > > in this.  Perhaps one of them will comment.
> > 
> > I hope so. Just in case they decide to, I have reduced the
> > situation to the small program below. I get 
> > 
> > GLib-CRITICAL **: g_convert: assertion `str != NULL' failed
> > 
> > and no core dump from this simple program, whereas Evolution manages to
> > pass NULL to strcoll further down in g_utf8_collate and get SEGV for its
> > pains.
> 
> That sounds like a no-no for Evolution to be dereferencing a NULL
> pointer.  Hopefully they'd fix this to prevent the problem.

It's not Evolution, it is glib: specifically g_utf8_collate in
gunicollate.c, which calls strcoll(3) blindly on the return value of
g_utf8_normalize. And now I can get a core dump out of this simple
program as well, merely by setting CHARSET=en_US.UTF-8 (I had it as
ASCII in the terminal window, which triggers a different path within
g_utf8_collate).

> 
> > 
> > Conversely, if the answer still is "Evolution should not have done
> > that", I will happily crawl back under my rock and keep my patch
> > locally.
> 
> I can't imagine you're alone in this.  But then again, any Cyrillic mail
> that comes my way is always spam, so what do I know.

More importantly, it is UTF-8 spam -- in order to trigger this, you need
KOI8-R or CP1251, and in the sorted column to boot. I suspect that
Latin1 or ShiftJIS would do the trick too.

Now, how about this: would you be amenable to this Really Harmless(tm)
patch, which merely adds error checking along the lines used in the same
function, about a dozen lines up ;)

--- glib/gunicollate.c.B 2008-01-09 20:48:25.000000000 -0500
+++ glib/gunicollate.c	2008-01-09 20:49:35.000000000 -0500
@@ -166,6 +166,9 @@
   str1_norm = g_utf8_normalize (str1, -1, G_NORMALIZE_ALL_COMPOSE);
   str2_norm = g_utf8_normalize (str2, -1, G_NORMALIZE_ALL_COMPOSE);
 
+  g_return_val_if_fail (str1_norm != NULL, 0);
+  g_return_val_if_fail (str2_norm != NULL, 0);
+
   if (g_get_charset (&charset))
     {
       result = strcoll (str1_norm, str2_norm);

I can add it to your files/extra-patch-glib_gunicollate.c, or package 
it separately -- I really hate it when I start Evolution after portupgrade
to write some E-mails real quick, only to find out that I have forgotten
to patch glib... again.
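
For anyone who wants to see the shape of the guard without a GLib tree
at hand, here is a minimal sketch in plain C. It is not the GLib code:
is_valid_utf8, normalize_or_null and guarded_collate are hypothetical
stand-ins. The assumption is only that the normalize step hands back
NULL for input that is not valid UTF-8, and that the collate step must
check for that before calling strcoll(3), returning 0 as the patch
above does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: structural UTF-8 check only (lead byte followed by
 * the right number of continuation bytes). It does not reject overlong
 * forms or surrogates, which a real validator would. */
static int is_valid_utf8(const unsigned char *p)
{
    while (*p) {
        int extra;
        if (*p < 0x80)                extra = 0;
        else if ((*p & 0xE0) == 0xC0) extra = 1;
        else if ((*p & 0xF0) == 0xE0) extra = 2;
        else if ((*p & 0xF8) == 0xF0) extra = 3;
        else return 0;              /* stray continuation or invalid lead */
        p++;
        while (extra-- > 0) {
            /* A NUL here also fails the test, so we never run past the end. */
            if ((*p & 0xC0) != 0x80)
                return 0;
            p++;
        }
    }
    return 1;
}

/* Hypothetical stand-in for the normalize step: NULL on invalid input.
 * (The real function returns a newly allocated string; this sketch just
 * returns the input pointer to stay short.) */
static const char *normalize_or_null(const char *s)
{
    return is_valid_utf8((const unsigned char *)s) ? s : NULL;
}

/* Collation with the NULL guard placed before strcoll(3). */
static int guarded_collate(const char *a, const char *b)
{
    const char *an = normalize_or_null(a);
    const char *bn = normalize_or_null(b);

    if (an == NULL || bn == NULL)
        return 0;                   /* same fallback value as in the patch */

    return strcoll(an, bn);
}
```

With this in place, guarded_collate("\xf0\xd2\xc9\xd7\xc5\xd4", "abc")
(the first argument being KOI8-R bytes) returns 0 instead of handing a
NULL to strcoll. The trade-off is that undecodable strings compare as
equal to everything, which is crude but does not crash.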

-- 
Alexandre "Sunny" Kovalenko



