Date:      Wed, 09 Jan 2008 10:53:19 -0500
From:      "Alexandre \"Sunny\" Kovalenko" <alex.kovalenko@verizon.net>
To:        ports@freebsd.org, gnome@freebsd.org
Subject:   [patch] glib20, UTF-8 and string collation
Message-ID:  <1199893999.756.29.camel@RabbitsDen>


--=-blBbqgpbioDwTwFoAlam
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

I saw the recent commit by marcus regarding string collation in
devel/glib20, so I decided to check whether there is any interest in
fixing the SEGV in g_utf8_collate that occurs when it is given 8-bit
non-UTF-8 string(s) to collate.

A good (but by no means the only) example of this is using Evolution to
open a mailbox with a mix of KOI-8, CP1251 and UTF-8 message subjects
and sorting the messages by subject. Admittedly, I do not know whether
particular characters trigger the crash or whether any mix would do.
vova at fbsd ru posted a test-case mailbox under the link below.

The full discussion, including my first approach to fixing this problem,
can be found here:

http://bugzilla.gnome.org/show_bug.cgi?id=492389

A slightly different approach is attached to this e-mail.

Without either patch, my Evolution dumps core on start-up.

The first patch was rejected by the GNOME folks with the recommendation
"Don't do that", which, unfortunately, is not that easy to follow ;)
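
For illustration, following "Don't do that" on the caller's side would
amount to roughly the sketch below: check every string before it reaches
the collation routine and fall back to a byte-wise comparison otherwise.
This is only a sketch under my own assumptions, not code from glib or
Evolution. The names is_valid_utf8 and safe_subject_compare are
hypothetical, and the collation function is passed in as a pointer so
that the example builds without GLib; in real code one would presumably
use g_utf8_validate for the check and pass g_utf8_collate.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the NUL-terminated string s is
 * well-formed UTF-8, 0 otherwise. Rejects overlong encodings,
 * surrogates and code points above U+10FFFF. */
int is_valid_utf8(const char *s)
{
    /* Smallest code point allowed for 0..3 continuation bytes. */
    static const unsigned int min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned int cp;
        int extra, i;

        if (*p < 0x80) {               /* plain ASCII byte */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {
            cp = *p & 0x1F; extra = 1;
        } else if ((*p & 0xF0) == 0xE0) {
            cp = *p & 0x0F; extra = 2;
        } else if ((*p & 0xF8) == 0xF0) {
            cp = *p & 0x07; extra = 3;
        } else {
            return 0;                  /* stray continuation or bad lead byte */
        }

        p++;
        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)   /* also catches a premature NUL */
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
    }
    return 1;
}

/* Hypothetical wrapper: use the supplied collation function only when
 * both inputs are valid UTF-8; otherwise compare the raw bytes. */
int safe_subject_compare(const char *a, const char *b,
                         int (*collate)(const char *, const char *))
{
    if (is_valid_utf8(a) && is_valid_utf8(b))
        return collate(a, b);
    return strcmp(a, b);
}
```

The catch is that every caller that sorts user-supplied strings has to
remember to do this, which is why I would rather see the library itself
survive bad input.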

Any comments from people with knowledge of GNOME, UTF-8 and string
collation will be greatly appreciated.

I am not subscribed to ports@; please make sure to keep me on the CC
list when replying.

-- 
Alexandre "Sunny" Kovalenko

--=-blBbqgpbioDwTwFoAlam
Content-Disposition: attachment; filename=gunidecomp.c.patch
Content-Type: text/x-patch; name=gunidecomp.c.patch; charset=UTF-8
Content-Transfer-Encoding: 7bit

--- glib/gunidecomp.c.BAK	2008-01-09 09:07:46.000000000 -0500
+++ glib/gunidecomp.c	2008-01-09 09:17:04.000000000 -0500
@@ -22,6 +22,7 @@
 #include "config.h"
 
 #include <stdlib.h>
+#include <string.h>
 
 #include "glib.h"
 #include "gunidecomp.h"
@@ -528,6 +529,14 @@
   result = g_ucs4_to_utf8 (result_wc, -1, NULL, NULL, NULL);
   g_free (result_wc);
 
+  // Upstream callers rely on the returned pointer being valid
+  // and dump core if it is not (witness the collation routine).
+#define NOT_VALID_UTF8_STRING	"*** This was not a valid UTF-8 string ***"
+  if(!result)
+  {
+    result = g_malloc(strlen(NOT_VALID_UTF8_STRING) + 1);
+    strcpy(result, NOT_VALID_UTF8_STRING);
+  }
   return result;
 }
 

--=-blBbqgpbioDwTwFoAlam--



