Date:      Wed, 17 Jun 1998 20:55:58 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        kline@tao.thought.org (Gary Kline)
Cc:        michaelh@cet.co.jp, Matthew.Alton@anheuser-busch.com, FreeBSD-fs@FreeBSD.ORG, Scott.Smallie@anheuser-busch.com, Hackers@FreeBSD.ORG
Subject:   Re: Filesystem Development Toolkit
Message-ID:  <199806172055.NAA28313@usr01.primenet.com>
In-Reply-To: <199806171819.LAA06262@tao.thought.org> from "Gary Kline" at Jun 17, 98 11:19:58 am

> 	If Terry's FS-based Unicode support would fit into this, 
> 	it'd be interesting.  So far I'm working on localizations
> 	via the locale catalogs.  This may be a short-term solution
> 	and a broader, global solution may be a FS with wchar_t 
> 	support.
> 
> 	Any thoughts; or am I too far off-course?

Unicode cn_pnbuf code is part of the problem for the NTFS, which, like
VFAT, has multiple namespaces which must be kept coherent, and whose
coherency can't be implemented via late-binding.

The bigger problem here is that the nameidata structure is not treated
as relatively opaque, except for the name spaces an FS is interested
in accessing.

Because the cn_pnbuf is freed at random locations in the kernel, this
dictates the implementation of VOPs which utilize the nameidata, such
as VOP_LOOKUP, VOP_RENAME, VOP_LINK, VOP_CREATE, VOP_UNLINK, etc.

In order to be able to deal with both the Unicode and the DOS code
page based 8.3 name at the same time, the path needs to be broken up
into a parsed path structure, wherein separate components are grouped.
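
As a rough illustration of the parsed path idea, here is a userland
sketch in C.  None of these names exist in the kernel: struct
parsed_path, struct pp_component, enum pp_ns and pp_parse() are all
hypothetical, and the per-name-space slots are left empty on the
assumption that each FS would fill in only the forms it cares about.

```c
#include <assert.h>
#include <stddef.h>

#define PP_MAXCOMP 32	/* arbitrary limit, for the sketch only */

/* Hypothetical name spaces an FS might want a component in. */
enum pp_ns { PP_NS_UNICODE, PP_NS_DOS83, PP_NS_MAX };

/* One path component; 'name' points into the caller's buffer. */
struct pp_component {
	const char *name;
	size_t len;
	void *ns_name[PP_NS_MAX];	/* per-name-space forms, unset here */
};

struct parsed_path {
	size_t ncomp;
	struct pp_component comp[PP_MAXCOMP];
};

/*
 * Split 'path' on '/' into components without copying.
 * Returns 0 on success, -1 if there are too many components.
 */
static int
pp_parse(const char *path, struct parsed_path *pp)
{
	const char *p, *start;
	int ns;

	assert(path != NULL && pp != NULL);
	pp->ncomp = 0;
	p = path;
	while (*p != '\0') {
		while (*p == '/')		/* skip runs of slashes */
			p++;
		if (*p == '\0')
			break;
		start = p;
		while (*p != '\0' && *p != '/')
			p++;
		if (pp->ncomp == PP_MAXCOMP)
			return (-1);
		pp->comp[pp->ncomp].name = start;
		pp->comp[pp->ncomp].len = (size_t)(p - start);
		for (ns = 0; ns < PP_NS_MAX; ns++)
			pp->comp[pp->ncomp].ns_name[ns] = NULL;
		pp->ncomp++;
	}
	return (0);
}
```

The point of the sketch is only that the components are held together
in one structure with one owner, so a single place allocates and frees
it, instead of each VOP having to know when the buffer goes away.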

For the initial case, where we are still passing an 8 bit string to
those system calls that take paths, the easiest conversion is a direct
mapping to code page 0 (ISO 8859-1) in the Unicode 16/8 set (or the
ISO 10646 32/8 set).  System calls that operate purely on Unicode
objects can come later, and the legacy support can then be pushed into
libc (open( 8bit:8859-1, ) -> uopen( 16bit:Unicode, )).  POSIX
compatibility is an issue that can be dealt with in the library.
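
The direct mapping is simple because the first 256 Unicode code points
are the ISO 8859-1 characters, so each byte is just zero-extended.  A
minimal sketch in C, assuming a 16 bit target; latin1_to_ucs2() is a
made-up name, not an existing libc or kernel function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Widen a NUL-terminated ISO 8859-1 string into 16 bit Unicode code
 * units.  'dstlen' is the size of 'dst' in code units, including room
 * for the terminator.  Returns the number of code units written, not
 * counting the terminator; output is truncated if 'dst' is too small.
 */
static size_t
latin1_to_ucs2(const char *src, uint16_t *dst, size_t dstlen)
{
	size_t i;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return (0);
	for (i = 0; src[i] != '\0' && i + 1 < dstlen; i++) {
		/* Go through unsigned char so bytes >= 0x80 don't sign-extend. */
		dst[i] = (uint16_t)(unsigned char)src[i];
	}
	dst[i] = 0;
	return (i);
}
```

A legacy open() wrapper in libc could do this widening on its path
argument before calling the Unicode entry point, which is why the old
code takes the hit in this arrangement.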

Alternately, a wchar_t encoding could have a "magic" introducer that
is prepended to strings, and ignored except when the strings are
passed to system calls (open( char *) vs. open( wchar_t *) prototypes,
as in C++ namespace overloading).  Maybe "0wchar_t0 r e a l s t r i n g"
for strings declared "wide" (_W"realstring").

It doesn't really matter; it depends on who you want to take the hit
(I prefer hitting the old code).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
