From owner-svn-src-all@FreeBSD.ORG Wed Apr 30 21:43:26 2014
Message-ID: <53616E78.3010301@freebsd.org>
Date: Wed, 30 Apr 2014 16:43:20 -0500
From: Pedro Giffuni
To: Jilles Tjoelker
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r265095 - head/lib/libc/locale
References: <201404291525.s3TFPvmt097589@svn.freebsd.org> <20140430211028.GA61757@stack.nl>
In-Reply-To: <20140430211028.GA61757@stack.nl>

On 04/30/14 16:10, Jilles Tjoelker wrote:
> On Tue, Apr 29, 2014 at 03:25:57PM +0000, Pedro F. Giffuni wrote:
>> Author: pfg
>> Date: Tue Apr 29 15:25:57 2014
>> New Revision: 265095
>> URL: http://svnweb.freebsd.org/changeset/base/265095
>>
>> Log:
>>   citrus: Avoid invalid code points.
>>
>>   From the OpenBSD log:
>>   The UTF-8 decoder should not accept byte sequences which decode to unicode
>>   code positions U+D800 to U+DFFF (UTF-16 surrogates), U+FFFE, and U+FFFF.
>>
>>   http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
>>   http://unicode.org/faq/utf_bom.html#utf8-4
>>
>>   Reported by: Stefan Sperling
>>   Obtained from: OpenBSD
>>   MFC after: 5 days
>>
>> Modified:
>>   head/lib/libc/locale/utf8.c
>>
>> Modified: head/lib/libc/locale/utf8.c
>> ==============================================================================
>> --- head/lib/libc/locale/utf8.c	Tue Apr 29 15:12:23 2014	(r265094)
>> +++ head/lib/libc/locale/utf8.c	Tue Apr 29 15:25:57 2014	(r265095)
>> @@ -203,6 +203,14 @@ _UTF8_mbrtowc(wchar_t * __restrict pwc,
>>  			errno = EILSEQ;
>>  			return ((size_t)-1);
>>  		}
>> +		if ((wch >= 0xd800 && wch <= 0xdfff) ||
>> +		    wch == 0xfffe || wch == 0xffff) {
>> +			/*
>> +			 * Malformed input; invalid code points.
>> +			 */
>> +			errno = EILSEQ;
>> +			return ((size_t)-1);
>> +		}
>>  		if (pwc != NULL)
>>  			*pwc = wch;
>>  		us->want = 0;
>
> Hmm, I think U+FFFE and U+FFFF should be passed through normally.
> According to http://www.unicode.org/faq/private_use.html they are
> "noncharacters" (basically a more private variant of private-use
> characters) and must be mapped through UTFs.
>
> The part that rejects U+D800 to U+DFFF is definitely correct, though.
> http://unicode.org/faq/utf_bom.html#utf8-4 tells to do only that.
>
> The part about U+FFFE and U+FFFF in
> http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8 seems out of date.
> Note the last modified date of that page: 2009-05-11.
>
> On another note, everything above U+0010FFFF should perhaps be rejected
> since those codes, which cannot be encoded in UTF-16, were excluded from
> Unicode and ISO 10646.
>

Thank you! I will fix the UTF-8 part soon.

Pedro.
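
For reference, a minimal sketch of the range check the review above argues for: reject the UTF-16 surrogates and anything above U+10FFFF, but let the noncharacters U+FFFE and U+FFFF through. The helper name utf8_codepoint_ok is hypothetical and is not part of utf8.c; this is an illustration of the proposal under those assumptions, not the code that was eventually committed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from utf8.c): report whether a value produced
 * by a UTF-8 decoder is acceptable under the rules suggested in the review.
 */
bool
utf8_codepoint_ok(uint32_t wch)
{
	/* UTF-16 surrogates must never appear in UTF-8. */
	if (wch >= 0xd800 && wch <= 0xdfff)
		return (false);
	/* Values above U+10FFFF are outside Unicode and ISO 10646. */
	if (wch > 0x10ffff)
		return (false);
	/* Noncharacters such as U+FFFE and U+FFFF pass through. */
	return (true);
}
```

In a decoder along the lines of _UTF8_mbrtowc, a false result would map to the existing error path shown in the diff (errno = EILSEQ and a return of (size_t)-1).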