From owner-freebsd-hackers Wed Jun 17 13:56:30 1998
Return-Path:
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id NAA04840
	for freebsd-hackers-outgoing; Wed, 17 Jun 1998 13:56:30 -0700 (PDT)
	(envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from smtp02.primenet.com (daemon@smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id NAA04774;
	Wed, 17 Jun 1998 13:56:15 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.8.8/8.8.8) id NAA18054;
	Wed, 17 Jun 1998 13:56:06 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201)
	via SMTP by smtp02.primenet.com, id smtpd018027; Wed Jun 17 13:56:04 1998
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id NAA28313;
	Wed, 17 Jun 1998 13:55:58 -0700 (MST)
From: Terry Lambert
Message-Id: <199806172055.NAA28313@usr01.primenet.com>
Subject: Re: Filesystem Development Toolkit
To: kline@tao.thought.org (Gary Kline)
Date: Wed, 17 Jun 1998 20:55:58 +0000 (GMT)
Cc: michaelh@cet.co.jp, Matthew.Alton@anheuser-busch.com,
	FreeBSD-fs@FreeBSD.ORG, Scott.Smallie@anheuser-busch.com,
	Hackers@FreeBSD.ORG
In-Reply-To: <199806171819.LAA06262@tao.thought.org> from "Gary Kline" at Jun 17, 98 11:19:58 am
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> If Terry's FS-based Unicode support would fit into this,
> it'd be interesting. So far I'm working on localizations
> via the locale catalogs. This may be a short-term solution
> and a broader, global solution may be a FS with wchar_t
> support.
>
> Any thoughts; or am I too far off-course?

Unicode cn_pnbuf code is part of the problem for the NTFS, which,
like VFAT, has multiple namespaces which must be kept coherent, and
whose coherency can't be implemented via late-binding.
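
To make the two-namespace point concrete, here is a minimal sketch in
plain userland C, not kernel code. The struct path_component type, the
COMP_MAX limit, and both function names are hypothetical and exist only
for illustration; nothing here is taken from the FreeBSD sources. It
shows one path component carrying a 16-bit Unicode name and an 8.3 name
together, and relies on the fact that the first 256 Unicode code points
match ISO 8859-1, so widening an 8-bit name is a zero-extension of each
byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COMP_MAX 255    /* assumed per-component limit, illustration only */

/* Hypothetical parsed path component: both names travel together. */
struct path_component {
    uint16_t uni[COMP_MAX + 1];     /* 16-bit Unicode name, 0-terminated */
    size_t   uni_len;               /* length in 16-bit units */
    char     dos[8 + 1 + 3 + 1];    /* 8.3 name "NAME.EXT"; may be empty */
};

/*
 * Widen an 8-bit ISO 8859-1 string to 16-bit Unicode by zero-extending
 * each byte. Returns the number of units written (excluding the
 * terminator), or (size_t)-1 if dst is too small.
 */
size_t
latin1_to_ucs2(const char *src, uint16_t *dst, size_t dstcap)
{
    size_t i;

    if (dstcap == 0)
        return (size_t)-1;
    for (i = 0; src[i] != '\0'; i++) {
        if (i + 1 >= dstcap)        /* keep room for the terminator */
            return (size_t)-1;
        dst[i] = (uint16_t)(unsigned char)src[i];
    }
    dst[i] = 0;
    return i;
}

/*
 * Fill in a component from a legacy 8-bit name and an optional 8.3
 * name. Returns 0 on success, -1 if either name does not fit.
 */
int
component_init(struct path_component *pc, const char *name,
    const char *dos83)
{
    size_t n;

    if (dos83 != NULL && strlen(dos83) >= sizeof(pc->dos))
        return -1;
    n = latin1_to_ucs2(name, pc->uni, COMP_MAX + 1);
    if (n == (size_t)-1)
        return -1;
    pc->uni_len = n;
    if (dos83 != NULL)
        strcpy(pc->dos, dos83);     /* length checked above */
    else
        pc->dos[0] = '\0';
    return 0;
}
```

A caller would fill one such record per component, for example
component_init(&pc, "readme.txt", "README.TXT"), so that code needing
either name finds it in the same place. How the 8.3 name is actually
generated, and who owns the storage, is deliberately left out.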

The bigger problem here is that the nameidata structure is not treated
as relatively opaque, except for the name spaces an FS is interested
in accessing. Because the cn_pnbuf is freed at random locations in
the kernel, this dictates implementation for VOPs which utilize the
nameidata, such as VOP_LOOKUP, VOP_RENAME, VOP_LINK, VOP_CREATE,
VOP_UNLINK, etc.

In order to be able to deal with both the Unicode and the DOS code
page based 8.3 name at the same time, the path needs to be broken
up into a parsed path structure, wherein separate components are
grouped.

For the initial case, where we are still passing an 8 bit string to
those system calls that take paths, the easiest conversion is a
direct mapping to code page 0 (ISO 8859-1) in the Unicode 16/8 set
(or the ISO 10646 32/8 set).

System calls that operate purely on Unicode objects can come later,
and the legacy support can be pushed into libc (open( 8bit:8859-1, )
-> uopen( 16bit:Unicode, )) later. POSIX compatibility is an issue
that can be dealt with in the library.

Alternately, a wchar_t encoding could have a "magic" introducer that
is prepended to strings, and ignored except when the strings are
passed to system calls (open( char *) vs. open( wchar_t *) prototypes,
as in C++ namespace overloading). Maybe "0wchar_t0 r e a l s t r i n g"
for strings declared "wide". (_W"realstring").

It doesn't really matter; it depends on who you want to take the hit
(I prefer hitting the old code).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.