Date:      Wed, 19 Jun 2013 15:13:37 GMT
From:      Corinna Vinschen <vinschen@redhat.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   misc/179721: char<->wchar_t mismatch in glob(3), fnmatch(3), regexec(3)
Message-ID:  <201306191513.r5JFDbXa054868@oldred.freebsd.org>
Resent-Message-ID: <201306191520.r5JFK0Ta039071@freefall.freebsd.org>


>Number:         179721
>Category:       misc
>Synopsis:       char<->wchar_t mismatch in glob(3), fnmatch(3), regexec(3)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 19 15:20:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Corinna Vinschen
>Release:        none
>Organization:
Red Hat
>Environment:
CYGWIN_NT-6.2-WOW64 VMBERT864 1.7.21(0.266/5/3) 2013-06-17 10:34 i686 Cygwin

>Description:
Hi,

It seems there's a mismatch between char and wchar_t in the glob(3)
functionality.  I stumbled over this problem because Cygwin uses
FreeBSD's glob, fnmatch, and regcomp code.

All three functions convert input strings to wide characters and do
their tests and comparisons on the wide char representation.  All three
functions call the __collate_range_cmp function in some scenario
(glob, for instance, in match() when a range pattern is handled).

However, while all three functions operate on wchar_t chars, the
__collate_range_cmp function in locale/collcmp.c converts the
characters to char and calls strcoll_l on them.  This results in a
comparison which only works with ASCII chars, but not with the full
Unicode character range.
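
To illustrate the effect, here is a minimal sketch, not the libc code
itself.  The names narrow_range_cmp and wide_range_cmp are hypothetical
stand-ins, and the sketch uses the plain strcoll/wcscoll calls in the
current locale rather than the internal collation table.  It also
assumes that storing an out-of-range value in a char keeps only the
low-order byte, as it does on common platforms.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical stand-in for the narrow path: each code point is stored
 * in a char before the comparison, so only the low-order byte survives.
 * (Assumption: out-of-range values wrap, as on common platforms.)
 */
static int
narrow_range_cmp(int c1, int c2)
{
	char s1[2] = { (char)c1, '\0' };
	char s2[2] = { (char)c2, '\0' };

	assert(c1 >= 0 && c2 >= 0);	/* code points are non-negative */
	return (strcoll(s1, s2));
}

/* Hypothetical wide counterpart: the code points are kept intact. */
static int
wide_range_cmp(int c1, int c2)
{
	wchar_t s1[2] = { (wchar_t)c1, L'\0' };
	wchar_t s2[2] = { (wchar_t)c2, L'\0' };

	assert(c1 >= 0 && c2 >= 0);	/* code points are non-negative */
	return (wcscoll(s1, s2));
}
```

U+0141 and U+0241 share the low byte 0x41, so in the default "C" locale
the narrow variant should treat them as equal, while the wide variant
still orders U+0141 before U+0241.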

An easy solution might be to call wcscoll_l from __collate_range_cmp,
but __collate_range_cmp is also called from other places, namely
from vfscanf, with char input.  Therefore the best way out might be
to introduce something along the lines of a __wcollate_range_cmp
function, as outlined below.
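
For illustration only, the range test that the three callers perform
can be sketched with a wide comparison as follows.  The helper names
wchar_collate_cmp and in_collating_range are hypothetical, and the
sketch relies on the public wcscoll in the current locale instead of
the internal table, so it is not a drop-in replacement for the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of libc: compare two single wide
 * characters by collation order in the current locale.  Local buffers
 * are used, so the helper keeps no state between calls.
 */
static int
wchar_collate_cmp(wchar_t c1, wchar_t c2)
{
	wchar_t s1[2] = { c1, L'\0' };
	wchar_t s2[2] = { c2, L'\0' };

	/* A NUL code point would yield an empty string, so reject it. */
	assert(c1 != L'\0' && c2 != L'\0');
	return (wcscoll(s1, s2));
}

/* True if lo <= test <= hi in collation order, as in a [lo-hi] range. */
static bool
in_collating_range(wchar_t lo, wchar_t test, wchar_t hi)
{
	return (wchar_collate_cmp(lo, test) <= 0 &&
	    wchar_collate_cmp(test, hi) <= 0);
}
```

In the default "C" locale, in_collating_range(0x0100, 0x0150, 0x017F)
should hold, while a test character of 0x0241 should fall outside the
same range.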
>How-To-Repeat:

>Fix:
See the attached patch for a suggestion.

Patch attached with submission follows:

Index: lib/libc/gen/fnmatch.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/gen/fnmatch.c,v
retrieving revision 1.21
diff -u -p -r1.21 fnmatch.c
--- lib/libc/gen/fnmatch.c	17 Nov 2012 01:49:24 -0000	1.21
+++ lib/libc/gen/fnmatch.c	19 Jun 2013 15:12:34 -0000
@@ -285,8 +285,8 @@ rangematch(pattern, test, flags, newp, p
 
 			if (table->__collate_load_error ?
 			    c <= test && test <= c2 :
-			       __collate_range_cmp(table, c, test) <= 0
-			    && __collate_range_cmp(table, test, c2) <= 0
+			       __wcollate_range_cmp(table, c, test) <= 0
+			    && __wcollate_range_cmp(table, test, c2) <= 0
 			   )
 				ok = 1;
 		} else if (c == test)
Index: lib/libc/gen/glob.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/gen/glob.c,v
retrieving revision 1.36
diff -u -p -r1.36 glob.c
--- lib/libc/gen/glob.c	12 Apr 2013 00:41:52 -0000	1.36
+++ lib/libc/gen/glob.c	19 Jun 2013 15:12:34 -0000
@@ -836,8 +836,8 @@ match(Char *name, Char *pat, Char *paten
 				if ((*pat & M_MASK) == M_RNG) {
 					if (table->__collate_load_error ?
 					    CHAR(c) <= CHAR(k) && CHAR(k) <= CHAR(pat[1]) :
-					       __collate_range_cmp(table, CHAR(c), CHAR(k)) <= 0
-					    && __collate_range_cmp(table, CHAR(k), CHAR(pat[1])) <= 0
+					       __wcollate_range_cmp(table, CHAR(c), CHAR(k)) <= 0
+					    && __wcollate_range_cmp(table, CHAR(k), CHAR(pat[1])) <= 0
 					   )
 						ok = 1;
 					pat += 2;
Index: lib/libc/locale/collate.h
===================================================================
RCS file: /home/ncvs/src/lib/libc/locale/collate.h,v
retrieving revision 1.17
diff -u -p -r1.17 collate.h
--- lib/libc/locale/collate.h	17 Nov 2012 01:49:29 -0000	1.17
+++ lib/libc/locale/collate.h	19 Jun 2013 15:12:34 -0000
@@ -73,6 +73,7 @@ u_char	*__collate_substitute(struct xloc
 int	__collate_load_tables(const char *);
 void	__collate_lookup(struct xlocale_collate *, const u_char *, int *, int *, int *);
 int	__collate_range_cmp(struct xlocale_collate *, int, int);
+int	__wcollate_range_cmp(struct xlocale_collate *, int, int);
 #ifdef COLLATE_DEBUG
 void	__collate_print_tables(void);
 #endif
Index: lib/libc/locale/collcmp.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/locale/collcmp.c,v
retrieving revision 1.20
diff -u -p -r1.20 collcmp.c
--- lib/libc/locale/collcmp.c	17 Nov 2012 01:49:29 -0000	1.20
+++ lib/libc/locale/collcmp.c	19 Jun 2013 15:12:34 -0000
@@ -50,3 +50,13 @@ int __collate_range_cmp(struct xlocale_c
 	l.components[XLC_COLLATE] = (struct xlocale_component *)table;
 	return (strcoll_l(s1, s2, &l));
 }
+int __wcollate_range_cmp(struct xlocale_collate *table, int c1, int c2)
+{
+	static wchar_t s1[2], s2[2];
+
+	s1[0] = c1;
+	s2[0] = c2;
+	struct _xlocale l = {{0}};
+	l.components[XLC_COLLATE] = (struct xlocale_component *)table;
+	return (wcscoll_l(s1, s2, &l));
+}
Index: lib/libc/regex/regcomp.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/regex/regcomp.c,v
retrieving revision 1.42
diff -u -p -r1.42 regcomp.c
--- lib/libc/regex/regcomp.c	2 Mar 2013 01:08:09 -0000	1.42
+++ lib/libc/regex/regcomp.c	19 Jun 2013 15:12:34 -0000
@@ -789,10 +789,10 @@ p_b_term(struct parse *p, cset *cs)
 				(void)REQUIRE((uch)start <= (uch)finish, REG_ERANGE);
 				CHaddrange(p, cs, start, finish);
 			} else {
-				(void)REQUIRE(__collate_range_cmp(table, start, finish) <= 0, REG_ERANGE);
+				(void)REQUIRE(__wcollate_range_cmp(table, start, finish) <= 0, REG_ERANGE);
 				for (i = 0; i <= UCHAR_MAX; i++) {
-					if (   __collate_range_cmp(table, start, i) <= 0
-					    && __collate_range_cmp(table, i, finish) <= 0
+					if (   __wcollate_range_cmp(table, start, i) <= 0
+					    && __wcollate_range_cmp(table, i, finish) <= 0
 					   )
 						CHadd(p, cs, i);
 				}


>Release-Note:
>Audit-Trail:
>Unformatted:


