Date:      Sun, 3 Jul 2005 01:12:37 +0000 (UTC)
From:      "R. Imura" <imura@FreeBSD.org>
To:        src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   cvs commit: src/sys/sys iconv.h
Message-ID:  <200507030112.j631Cbxt039620@repoman.freebsd.org>

imura       2005-07-03 01:12:37 UTC

  FreeBSD src repository

  Modified files:
    sys/sys              iconv.h 
  Log:
  Switch Unicode charset name from "ISO-10646-UCS-2" to "UTF-16BE".
  Using ISO-10646-UCS-2 will cause a problem when we use our own
  iconv functions in the future, or port an iconv implementation
  other than GNU libiconv.
  
  Each vendor treats "UCS-2" as follows, and the byte order is
  vendor specific:
  
   - Solaris 8 iconv
    Little Endian with BOM
  
   - HP-UX iconv
    Big Endian
  
   - NetBSD/i386 1.6 iconv
    Little Endian
  
   - GNU libiconv
    Big Endian
  
   - glibc(RedHat AS 2.1 x86) iconv
    Little Endian
  
   - IANA
    Name: ISO-10646-UCS-2
    MIBenum: 1000
    Source: the 2-octet Basic Multilingual Plane, aka Unicode
            this needs to specify network byte order: the standard
            does not specify (it is a 16-bit integer space)
    Alias: csUnicode
  
   - MSDN
    Little Endian
    http://msdn.microsoft.com/library/en-us/cpref/html/frlrfsystemtextencodingclassgetencodingtopic2.asp
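
  To illustrate why the byte order matters, the following is a minimal
  sketch, not code from sys/iconv.h. The helper names (read_u16_be,
  read_u16_le, detect_bom) are hypothetical. It shows that the same two
  octets decode to different code units depending on the assumed byte
  order, and how a leading BOM (U+FEFF) can disambiguate when present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names for illustration; not part of sys/iconv.h. */
enum byte_order { ORDER_UNKNOWN, ORDER_BIG, ORDER_LITTLE };

/* Interpret two octets as one 16-bit code unit, big endian. */
static uint16_t
read_u16_be(const uint8_t *p)
{
	assert(p != NULL);
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Interpret the same two octets as one 16-bit code unit, little endian. */
static uint16_t
read_u16_le(const uint8_t *p)
{
	assert(p != NULL);
	return (uint16_t)((p[1] << 8) | p[0]);
}

/* Detect the byte order from a leading BOM (U+FEFF), if there is one. */
static enum byte_order
detect_bom(const uint8_t *p, size_t len)
{
	if (p == NULL || len < 2)
		return ORDER_UNKNOWN;
	if (p[0] == 0xFE && p[1] == 0xFF)
		return ORDER_BIG;
	if (p[0] == 0xFF && p[1] == 0xFE)
		return ORDER_LITTLE;
	return ORDER_UNKNOWN;	/* no BOM: the order must be known out of band */
}
```

  For the octets 0x30 0x42, read_u16_be() yields 0x3042 while
  read_u16_le() yields 0x4230, so a bare "UCS-2" label is not enough to
  decode the data unambiguously.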
  
  Using UTF-16BE is harmless now, because:
  - It is the same as UCS-2 within the 2-byte range (U+0000 - U+FFFF).
  - The kernel code of each file system (cd9660, msdosfs, ntfs) assumes
    that a Unicode character is 2 bytes at this time.
  - UDF has only a 2-byte range of Unicode filenames.
  - It is defined in RFC 2781.
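
  The equivalence within the 2-byte range can be sketched as follows.
  This is an illustrative example under stated assumptions, not the
  kernel's conversion code; the function name bmp_to_utf16be is
  hypothetical. For a code point in U+0000 - U+FFFF, UTF-16BE is simply
  the 16-bit value with the high octet first. Surrogate values and code
  points above U+FFFF are rejected here, since that is where UTF-16 and
  UCS-2 would differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one code point as UTF-16BE, restricted to the 2-byte range.
 * Returns the number of octets written (always 2 on success), or 0 if
 * the code point cannot be represented as a single 16-bit unit.
 * Hypothetical helper for illustration; not part of sys/iconv.h.
 */
static size_t
bmp_to_utf16be(uint32_t cp, uint8_t out[2])
{
	assert(out != NULL);
	if (cp > 0xFFFF)
		return 0;	/* would need a surrogate pair in UTF-16 */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;	/* surrogate values are not characters */
	out[0] = (uint8_t)(cp >> 8);	/* high octet first: big endian */
	out[1] = (uint8_t)(cp & 0xFF);
	return 2;
}
```

  For example, U+3042 is written as the octets 0x30 0x42, which matches
  a big-endian UCS-2 encoding of the same character.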
  
  So I believe it's time to make this change before the new RELENG_6
  branch starts. :)
  
  Approved by:    re (scottl)
  
  Revision  Changes    Path
  1.11      +1 -1      src/sys/sys/iconv.h


