Message-Id: <201405051450.s45EorXF037442@svn.freebsd.org>
From: "Pedro F. Giffuni"
Date: Mon, 5 May 2014 14:50:53 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org
Subject: svn commit: r265361 - stable/10/lib/libc/locale

Author: pfg
Date: Mon May 5 14:50:53 2014
New Revision: 265361
URL: http://svnweb.freebsd.org/changeset/base/265361

Log:
  MFC r265095, r265167;
  citrus: Avoid invalid code points.

  The UTF-8 decoder should not accept byte sequences which decode to
  Unicode code points U+D800 to U+DFFF (UTF-16 surrogates).[1]

  Contrary to the original OpenBSD patch, we do pass U+FFFE and U+FFFF;
  both values are valid "non-characters" [2] and must be mapped through
  UTFs.

  [1] http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
  [2] http://www.unicode.org/faq/private_use.html

  Reported by:	Stefan Sperling [1]
  Thanks to:	jilles [2]
  Obtained from:	OpenBSD

Modified:
  stable/10/lib/libc/locale/utf8.c

Modified: stable/10/lib/libc/locale/utf8.c
==============================================================================
--- stable/10/lib/libc/locale/utf8.c	Mon May 5 14:50:44 2014	(r265360)
+++ stable/10/lib/libc/locale/utf8.c	Mon May 5 14:50:53 2014	(r265361)
@@ -203,6 +203,13 @@ _UTF8_mbrtowc(wchar_t * __restrict pwc,
 		errno = EILSEQ;
 		return ((size_t)-1);
 	}
+	if (wch >= 0xd800 && wch <= 0xdfff) {
+		/*
+		 * Malformed input; invalid code points.
+		 */
+		errno = EILSEQ;
+		return ((size_t)-1);
+	}
 	if (pwc != NULL)
 		*pwc = wch;
 	us->want = 0;
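
For illustration only, the effect of the new range check can be sketched
outside libc. The function below is a hypothetical stand-alone decoder,
not the code in utf8.c: the names sketch_decode_utf8 and
sketch_self_check are invented here, the 4-byte / U+10FFFF limit is an
assumption taken from RFC 3629 rather than from this commit, and
truncated input is simply treated as an error instead of being carried
over in conversion state. It only shows that the surrogate range is
rejected with EILSEQ while U+FFFE and U+FFFF still decode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libc): decode one UTF-8 sequence
 * from s, with n bytes available, into *out.  Returns the number of
 * bytes consumed, or (size_t)-1 with errno set to EILSEQ.
 */
size_t
sketch_decode_utf8(const unsigned char *s, size_t n, uint32_t *out)
{
	uint32_t wch, lbound;
	size_t want, i;

	if (n == 0)
		goto bad;
	if (s[0] < 0x80) {
		wch = s[0]; want = 1; lbound = 0;
	} else if ((s[0] & 0xe0) == 0xc0) {
		wch = s[0] & 0x1f; want = 2; lbound = 0x80;
	} else if ((s[0] & 0xf0) == 0xe0) {
		wch = s[0] & 0x0f; want = 3; lbound = 0x800;
	} else if ((s[0] & 0xf8) == 0xf0) {
		wch = s[0] & 0x07; want = 4; lbound = 0x10000;
	} else
		goto bad;
	if (n < want)
		goto bad;	/* simplification: no partial-input state */
	for (i = 1; i < want; i++) {
		if ((s[i] & 0xc0) != 0x80)
			goto bad;	/* not a continuation byte */
		wch = (wch << 6) | (s[i] & 0x3f);
	}
	if (wch < lbound || wch > 0x10ffff)
		goto bad;	/* overlong form or out of assumed range */
	if (wch >= 0xd800 && wch <= 0xdfff)
		goto bad;	/* UTF-16 surrogates, as in the commit */
	*out = wch;
	return (want);
bad:
	errno = EILSEQ;
	return ((size_t)-1);
}

/* Exercises the boundary cases named in the commit log. */
void
sketch_self_check(void)
{
	const unsigned char d800[] = { 0xed, 0xa0, 0x80 };	/* U+D800 */
	const unsigned char fffe[] = { 0xef, 0xbf, 0xbe };	/* U+FFFE */
	uint32_t cp = 0;

	assert(sketch_decode_utf8(d800, sizeof(d800), &cp) == (size_t)-1);
	assert(sketch_decode_utf8(fffe, sizeof(fffe), &cp) == 3);
	assert(cp == 0xfffe);
}
```

Calling sketch_self_check() from a test program should pass silently.
The surrogate test is placed after the overlong check so that, as in the
diff, it operates on the fully assembled code point rather than on
individual bytes.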