Date:      Wed, 05 Nov 2008 16:39:11 -0700 (MST)
From:      "M. Warner Losh" <imp@bsdimp.com>
To:        ivoras@gmail.com
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, des@FreeBSD.org
Subject:   Re: svn commit: r184691 - head/sys/compat/linprocfs
Message-ID:  <20081105.163911.420518480.imp@bsdimp.com>
In-Reply-To: <9bbcef730811051526p3a978848uf904a149cb81fbce@mail.gmail.com>
References:  <200811051508.mA5F89XD030040@svn.freebsd.org> <20081105.150108.1649771743.imp@bsdimp.com> <9bbcef730811051526p3a978848uf904a149cb81fbce@mail.gmail.com>

In message: <9bbcef730811051526p3a978848uf904a149cb81fbce@mail.gmail.com>
            "Ivan Voras" <ivoras@gmail.com> writes:
: 2008/11/5 M. Warner Losh <imp@bsdimp.com>:
: > In message: <200811051508.mA5F89XD030040@svn.freebsd.org>
: >            Dag-Erling Smorgrav <des@FreeBSD.org> writes:
: > :   utf-8
: >
: > Is there some reason to prefer utf-8 over the 8-bit iso character set
: > we were using?
: 
: Reason? You mean you actually *like* 8-bit code pages in the first place? :)

Liked?  Not necessarily.  Understood: yes.  Just didn't have a clue
why the change.

: As a person from a country that has during its history decided it
: really needs 3-4 dots and dashes in its alphabet that make it (the
: alphabet) not representable in ASCII, and who has had Many Fun Days
: converting between various 8-bit code pages, ISO standard or not, and
: especially with deducing which code page is actually being used as all
: bytes are created equal (and Microsoft just *had* to tweak two letters
: from iso8859-2 into Latin2), I welcome UTF-8 with a warm room, a beer,
: peanuts and a backrub.

Hmmmm.  peanuts....

: UTF-8 (as opposed to old 8-bit code pages which need to die as soon as
: possible and UTF-16 which got itself messed up with endianness) is
: unambiguous. A sequence of proper UTF-8 bytes (and UTF-8 has a
: structure so not every random collection of bytes with the 8th bit set
: is proper UTF-8) can always be linked to the same letter.
: 
: This is why there's such a big push to get systems to properly support
: UTF-8. FreeBSD had a SoC project this year that was supposed to
: properly implement Unicode collations (and thus collation of UTF-8
: strings) but it looks dead or in a dormant state right now (though I
: didn't follow it attentively).

That makes sense.
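
To make the two points above concrete, here is a minimal sketch in
Python (not part of the commit under discussion). It assumes Python's
standard "iso8859_2" and "cp1250" codecs as stand-ins for the ISO and
Microsoft code pages mentioned; the helper name is_valid_utf8 is
hypothetical and only for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# One byte with the 8th bit set: its meaning depends on which 8-bit
# code page the reader guesses, and nothing in the byte says which.
ambiguous = b"\xb9"
as_iso = ambiguous.decode("iso8859_2")  # ISO 8859-2 reading
as_cp = ambiguous.decode("cp1250")      # Windows-1250 reading

# The same letter in UTF-8 is a structured two-byte sequence:
# a lead byte followed by a continuation byte.
s_caron_utf8 = "\u0161".encode("utf-8")

# A lone continuation byte is not well-formed UTF-8, so a stray
# 8-bit byte is rejected rather than silently misread.
lone_byte_ok = is_valid_utf8(ambiguous)
```

The two code-page readings of the same byte disagree, while the UTF-8
bytes either decode to one specific letter or fail to decode at all.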

Warner


