Date:      Mon, 16 Feb 2009 22:05:17 +0200
From:      Mihai Donțu <mihai.dontu@gmail.com>
To:        freebsd-questions@freebsd.org
Cc:        Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>, Daniel Leal <dleal@webvolution.net>
Subject:   Re: accents in file names
Message-ID:  <200902162205.17644.mihai.dontu@gmail.com>
In-Reply-To: <E2340929-392C-48C0-B8B6-F5527C5A249D@mac.com>
References:  <499498A4.4000103@webvolution.net> <20090212235015.U97916@wojtek.tensor.gdynia.pl> <E2340929-392C-48C0-B8B6-F5527C5A249D@mac.com>

On Friday 13 February 2009, Chuck Swiger wrote:
> On Feb 12, 2009, at 2:50 PM, Wojciech Puchar wrote:
> >>> accented letter to my freebsd box, the accented letter simply
> >>> disappear.
> >>
> >> UFS supports 8-bit characters except for "/" and "\0", but you also
> >> need to run a terminal with UTF8 support and use a correct font to
> >> view such things.
> >
> > why? i use ISO-8859-2
>
> You've answered "why" when you state that you set up a locale which
> supports ISO Latin-X charset.  If you are running in the default C/
> POSIX locale, using the US-ASCII character set and a font that only
> knows about 7-bit ASCII glyphs, then you won't get accented characters.
>
> > UFS doesn't deal with encoding at all, just store what you give
>
> That's right, which means you need to use filenames encoded in UTF8
> rather than in arbitrary Unicode.

UTF-8 is what we prefer these days, but the filesystem can handle anything 
that is ASCII compatible (like you said: Shift_JIS, EUC-JP etc.).
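
To illustrate why the encoding has to be ASCII compatible, here is a small
Python sketch. The helper name usable_as_ufs_name is made up for this example,
and it only checks the two forbidden bytes mentioned above ("/" and "\0"); it
says nothing about length limits or any other filesystem rule.

```python
def usable_as_ufs_name(raw: bytes) -> bool:
    # Hypothetical helper: a name component may hold any byte
    # except "/" and NUL (the rule quoted earlier in the thread).
    return len(raw) > 0 and b"/" not in raw and b"\x00" not in raw


name = "filé.txt"

utf8 = name.encode("utf-8")         # multi-byte, but never produces NUL or a stray "/"
latin2 = name.encode("iso-8859-2")  # one byte per character
utf16 = name.encode("utf-16-le")    # every ASCII letter is followed by a NUL byte

for label, raw in (("UTF-8", utf8), ("ISO-8859-2", latin2), ("UTF-16LE", utf16)):
    print(label, raw, usable_as_ufs_name(raw))
```

The UTF-16 form fails the check because of the embedded NUL bytes, which is
why a byte-oriented filesystem cannot take it as-is.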

Now, I assume Daniel was copying "filé.txt" from a non-UFS filesystem (a
Windows box, FAT32, NTFS etc.) to UFS, because this is the only case I can
think of in which such a problem might appear.
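
As a rough sketch of how an accented letter could go missing in such a copy,
assume the name arrives as ISO-8859-1 bytes and whatever displays or converts
it expects UTF-8 or plain ASCII. That is only a guess at the mechanism, not a
diagnosis of Daniel's setup.

```python
# Assumption: the name arrives as ISO-8859-1 bytes from the non-UFS side.
raw = "filé.txt".encode("iso-8859-1")

# Read as UTF-8: the lone 0xE9 byte is invalid and becomes U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

# Read as 7-bit ASCII, dropping what does not fit: the letter vanishes.
as_ascii = raw.decode("ascii", errors="ignore")

# Read with the encoding it was written in: the name survives.
round_trip = raw.decode("iso-8859-1")

print(raw)
print(as_utf8)
print(as_ascii)
print(round_trip)
```

The bytes on disk are the same in all three cases; only the interpretation
changes.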

> People in Asia tend to want UTF-16 
> or UTF-32 encoding (although historical encodings like Big5, Shift-
> JIS, and now GB18030 for China are still rather popular, and those are
> multibyte encodings), and things like gcc's implementation of
> widechars or Python are standardizing on UTF-32.

-- 
Mihai Donțu


