From owner-freebsd-i18n Tue Jun 19 9:46:13 2001 Delivered-To: freebsd-i18n@freebsd.org Received: from tx.citynet.net (tx.citynet.net [208.154.179.12]) by hub.freebsd.org (Postfix) with ESMTP id D98EB37B401 for ; Tue, 19 Jun 2001 09:46:08 -0700 (PDT) (envelope-from jasonf@citynet.net) Received: from Neptune ([63.145.134.12]) by tx.citynet.net (8.11.3/8.11.3=Outbound) with SMTP id f5JGjiO29380 for ; Tue, 19 Jun 2001 12:45:44 -0400 Message-ID: <002101c0f8df$870e9850$0200000a@Neptune> From: "Jason Francis" To: Subject: FreeBSD Unicode support Date: Tue, 19 Jun 2001 12:47:28 -0400 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_001E_01C0F8BD.FF3C6770" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2919.6700 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2462.0000 Sender: owner-freebsd-i18n@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

This is a multi-part message in MIME format.

------=_NextPart_000_001E_01C0F8BD.FF3C6770 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

One of the arguments made by Microsoft for switching the Hotmail frontend to Windows 2000 was the need for foreign language support.

"Hotmail had the requirement to launch in new markets, and did not want to continue to invest in keeping the FreeBSD locale tables up to date and other maintenance activities. China and Japan are two important growing markets for MSN, so multibyte character sets had to be supported. FreeBSD lacked the necessary Unicode support."

I want to know if there is any real merit to these claims, or if it's just more marketing drivel from Microsoft.

I often use Central European characters, but I have never needed to do so under my FreeBSD systems.
I have noticed, however, that an ls on my mp3 shares reveals that characters such as ě, š, č, ř, ž, ý, á, í, é, §, and ů appear as question marks. This lack of support is preventing me from moving solely to a FreeBSD environment.

Is work being done to bring Unicode support to FreeBSD that will allow it to have better support for globalization and foreign languages?
------=_NextPart_000_001E_01C0F8BD.FF3C6770-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message From owner-freebsd-i18n Tue Jun 19 17:31:17 2001 Delivered-To: freebsd-i18n@freebsd.org Received: from mail.mk.bsdclub.org (adsl3219.ea.rim.or.jp [202.247.149.219]) by hub.freebsd.org (Postfix) with ESMTP id 2000537B401 for ; Tue, 19 Jun 2001 17:31:13 -0700 (PDT) (envelope-from motoyuki@mk.bsdclub.org) Received: from sakura.mk.bsdclub.org (sakura.mk.bsdclub.org [3ffe:505:2022:0:2a0:c9ff:fe20:9aff]) by mail.mk.bsdclub.org (8.11.3+3.4W/3.7W/smtpfeed 1.12) with ESMTP/inet6 id f5K0VBL26880; Wed, 20 Jun 2001 09:31:11 +0900 (JST) Received: from sakura.mk.bsdclub.org (localhost.mk.bsdclub.org [127.0.0.1]) by sakura.mk.bsdclub.org (8.11.4/3.7W) with ESMTP/inet id f5K0VBq54775; Wed, 20 Jun 2001 09:31:11 +0900 (JST) Message-Id: <200106200031.f5K0VBq54775@sakura.mk.bsdclub.org> To: "Jason Francis" Cc: freebsd-i18n@FreeBSD.ORG Subject: Re: FreeBSD Unicode support From: Motoyuki Konno X-Mailer: mh-e on Mule 2.3 / Emacs 19.34.1 References: <002101c0f8df$870e9850$0200000a@Neptune> Mime-Version: 1.0 (generated by tm-edit 7.106) Content-Type: text/plain; charset=US-ASCII Date: Wed, 20 Jun 2001 09:31:11 +0900 Sender: owner-freebsd-i18n@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Hi, "Jason Francis" wrote: > One of the arguments made by microsoft for switching the Hotmail > frontend to Windows 2000 was the need for foreign language support. > > "Hotmail had the requirement to launch in new markets, and did not want > to continue to invest in keeping the FreeBSD locale tables up to date > and other maintenance activities. China and Japan are two important > growing markets for MSN, so multibyte character sets had to be > supported. FreeBSD lacked the necessary Unicode support." 
>
> I want to know if there is any real merit to these claims, or if it's
> just more marketing drivel from Microsoft.

> Is work being done to bring Unicode support to FreeBSD that will allow
> it to have better support for globalization and foreign languages?

I agree with the idea "Unicode support is important". But I think Unicode cannot solve the "multibyte character support" problem.

Is Unicode widely used in Japan? The answer is NO. Microsoft Windows* users in Japan use SJIS (SJIS is also known as Microsoft Kanji code) for plain text. Most FreeBSD users in Japan use EUC, SJIS or ISO-2022-JP (also known as JIS code). ISO-2022-JP is the standard for e-mail.

# For example, I use:
# EUC for UNIXen (including *BSD), SJIS for MS Windows,
# ISO-2022-JP for e-mail.

P.S. Unicode support is one of the goals of the Citrus project. Please see the "Policy" section of http://citrus.bsdclub.org/.

-- ------------------------------------------------------------------------ Motoyuki Konno motoyuki@bsdclub.org (Home) motoyuki@FreeBSD.ORG (FreeBSD Project) http://www.freebsd.org/~motoyuki/ (WWW)

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Tue Jun 19 18:29:49 2001 Delivered-To: freebsd-i18n@freebsd.org Received: from post.webmailer.de (natpost.webmailer.de [192.67.198.65]) by hub.freebsd.org (Postfix) with ESMTP id DB07C37B401 for ; Tue, 19 Jun 2001 18:29:46 -0700 (PDT) (envelope-from freebsd-ml@econos.de) Received: from stefan-bt (p3E9B8E77.dip.t-dialin.net [62.155.142.119]) by post.webmailer.de (8.9.3/8.8.7) with SMTP id DAA19223 for ; Wed, 20 Jun 2001 03:29:45 +0200 (MET DST) From: Stefan Hoffmeister To: freebsd-i18n@FreeBSD.ORG Subject: Re: FreeBSD Unicode support Date: Wed, 20 Jun 2001 03:28:56 +0200 Organization: Econos Message-ID: References: <002101c0f8df$870e9850$0200000a@Neptune> <200106200031.f5K0VBq54775@sakura.mk.bsdclub.org> In-Reply-To:
<200106200031.f5K0VBq54775@sakura.mk.bsdclub.org> X-Mailer: Forte Agent 1.8/32.548 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-i18n@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

: On Wed, 20 Jun 2001 09:31:11 +0900, Motoyuki Konno wrote:

>Most of FreeBSD users in Japan
>use EUC, SJIS or ISO-2022-JP (also known as JIS code).
>ISO-2022-JP is the standard for e-mail.

Please excuse my ignorance of the Japanese language (and Asian languages in general) - but is there a concept of "upper-case" and "lower-case" in these languages?

The reason I am asking is the existing MBCS support in FreeBSD: tolower and toupper exist, but are only defined on ASCII (and with a lot of luck on 8 bit as well - haven't looked). This implies that tolower and toupper will almost certainly not work on any multi-byte character (unless the encoding happens to be laid out in a way that supports it).

>Is Unicode widely used in Japan ? The answer is NO.

Does it matter whether Unicode is widely used in Japan, when the internal (!) representation of a tool is Unicode, with renditions to whatever the user prefers?

Shouldn't the question rather be: How "easy" is it to "process" the language? And UCS-2 / UCS-4 should be pretty easy to process?
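The point about tolower and toupper working on single bytes can be illustrated with a short sketch. It is not FreeBSD code and makes no claim about what any FreeBSD release ships; it assumes a C library that provides the standard `<wchar.h>` and `<wctype.h>` interfaces, and the helper name `mb_toupper` is made up for this example. The idea is to convert the multibyte string to wide characters, case-map each whole character, and convert back.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: upper-case a multibyte string that is encoded in
 * the current locale's character set. Returns a newly allocated string
 * that the caller must free(), or NULL on invalid input or allocation
 * failure.
 */
char *mb_toupper(const char *src)
{
    /* POSIX behaviour: a NULL destination returns the required count. */
    size_t n = mbstowcs(NULL, src, 0);
    if (n == (size_t)-1)
        return NULL;

    wchar_t *wide = malloc((n + 1) * sizeof *wide);
    if (wide == NULL)
        return NULL;
    mbstowcs(wide, src, n + 1);

    /* Case-map one whole character at a time, never one byte. */
    for (size_t i = 0; i < n; i++)
        wide[i] = (wchar_t)towupper((wint_t)wide[i]);

    /* Convert back; the byte length may differ from the input. */
    size_t out_len = wcstombs(NULL, wide, 0);
    if (out_len == (size_t)-1) {
        free(wide);
        return NULL;
    }
    char *out = malloc(out_len + 1);
    if (out != NULL)
        wcstombs(out, wide, out_len + 1);
    free(wide);
    return out;
}
```

A program would normally call `setlocale(LC_CTYPE, "")` before using this, so that the conversions follow the user's encoding. Without that call it runs in the "C" locale, where only the ASCII letters are case-mapped.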
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Wed Jun 20 10:44:30 2001 Delivered-To: freebsd-i18n@freebsd.org Received: from nsmail.corel.com (smtp.corel.com [206.47.20.225]) by hub.freebsd.org (Postfix) with ESMTP id 0905337B403 for ; Wed, 20 Jun 2001 10:44:27 -0700 (PDT) (envelope-from andrec@corel.com) Received: from andrecwin2000 ([120.150.2.101]) by nsmail.corel.com (Netscape Messaging Server 3.61) with SMTP id AAA74E4 for ; Wed, 20 Jun 2001 13:44:25 -0400 Message-ID: <06b201c0f9c9$2500b540$65029678@andrecwin2000> From: "Andre Charbonneau" To: Subject: wide character classification funcitons Date: Wed, 20 Jun 2001 13:39:46 -0700 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_06AF_01C0F98E.786F82A0" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4133.2400 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400 Sender: owner-freebsd-i18n@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

This is a multi-part message in MIME format.

------=_NextPart_000_06AF_01C0F98E.786F82A0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

Hi,

I am currently running FreeBSD 4.3 and, correct me if I'm wrong, there seems to be no support for the wide character classification functions, such as iswalpha, iswdigit, iswupper, etc.

Do I have to install a specific package to get these functions on my system? Will they be included in future releases of FreeBSD? Are there any other functions I can use instead that will do the job?

(I need this functionality because I'm currently trying to implement a function that will classify characters in a string made of Unicode characters.)
Thanks,
Andre Charbonneau
Software developer/architect, globalization
------=_NextPart_000_06AF_01C0F98E.786F82A0-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message From owner-freebsd-i18n Wed Jun 20 16:45:15 2001 Delivered-To: freebsd-i18n@freebsd.org Received: from mail.mk.bsdclub.org (adsl3219.ea.rim.or.jp [202.247.149.219]) by hub.freebsd.org (Postfix) with ESMTP id 0206D37B406 for ; Wed, 20 Jun 2001 16:45:08 -0700 (PDT) (envelope-from motoyuki@mk.bsdclub.org) Received: from sakura.mk.bsdclub.org (sakura.mk.bsdclub.org [3ffe:505:2022:0:2a0:c9ff:fe20:9aff]) by mail.mk.bsdclub.org (8.11.3+3.4W/3.7W/smtpfeed 1.12) with ESMTP/inet6 id f5KNj6L89286; Thu, 21 Jun 2001 08:45:06 +0900 (JST) Received: from sakura.mk.bsdclub.org (localhost.mk.bsdclub.org [127.0.0.1]) by sakura.mk.bsdclub.org (8.11.4/3.7W) with ESMTP/inet id f5KNj6q64062; Thu, 21 Jun 2001 08:45:06 +0900 (JST) Message-Id: <200106202345.f5KNj6q64062@sakura.mk.bsdclub.org> To: Stefan Hoffmeister Cc: freebsd-i18n@FreeBSD.ORG Subject: Re: FreeBSD Unicode support From: Motoyuki Konno X-Mailer: mh-e on Mule 2.3 / Emacs 19.34.1 References: <002101c0f8df$870e9850$0200000a@Neptune> <200106200031.f5K0VBq54775@sakura.mk.bsdclub.org> Mime-Version: 1.0 (generated by tm-edit 7.106) Content-Type: text/plain; charset=US-ASCII Date: Thu, 21 Jun 2001 08:45:06 +0900 Sender: owner-freebsd-i18n@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Stefan Hoffmeister wrote: > : On Wed, 20 Jun 2001 09:31:11 +0900, Motoyuki Konno wrote: > > >Most of FreeBSD users in Japan > >use EUC, SJIS or ISO-2022-JP (also known as JIS code). > >ISO-2022-JP is the standard for e-mail. > > Please excuse my ignorance of the Japanese language (and Asian languages > in general) - but is there a concept of "upper-case" and "lower-case" in > these languages? Yes. 
There is no "upper-case" or "lower-case" concept in Japanese *native* characters such as Kanji, Kata-kana, and Hira-gana. But there are "alphabet" characters in Japanese character sets. Japanese character sets include: Kanji, Kata-kana, Hira-gana, Alphabet ([A-Za-z]), Number ([0-9]), Greek characters, etc.

'A' is 0x41 in ASCII. There is another 'A' character in Japanese character sets. For example, multibyte-'A' is 0x2341 in JIS.

According to "CJKV Information Processing", published by O'Reilly, Chinese and Korean character sets also include "alphabet" characters.

> >Is Unicode widely used in Japan ? The answer is NO.
>
> Does it matter whether Unicode is widely used in Japan, when the internal
> (!) representation of a tool is Unicode, with renditions to whatever the
> user prefers?

> Shouldn't the question rather be: How "easy" is it to "process" the
> language? And UCS-2 / UCS-4 should be pretty easy to process?

There are difficulties with this solution, because there are problems around conversion between Unicode and non-Unicode encodings. For example, please refer to the following page: http://www.debian.or.jp/~kubota/unicode-symbols.html

One of the goals of the Citrus project is "CSI: Charset Independent". Any kind of encoding (encoding is the process of mapping a character to a numeric value) is acceptable.

I've heard that there is a session on the Citrus Project in the FREENIX track on June 29. If you are interested in BSD's multilingual support and are going to USENIX, please attend this session.
See http://www.usenix.org/events/usenix01/tech/freenix.html -- ------------------------------------------------------------------------ Motoyuki Konno motoyuki@bsdclub.org (Home) motoyuki@FreeBSD.ORG (FreeBSD Project) http://www.freebsd.org/~motoyuki/ (WWW) To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message From owner-freebsd-i18n Thu Jun 21 8:10:49 2001 Delivered-To: freebsd-i18n@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id C95E037B401 for ; Thu, 21 Jun 2001 08:10:45 -0700 (PDT) (envelope-from keichii@iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 40E3559228; Thu, 21 Jun 2001 10:10:45 -0500 (CDT) Date: Thu, 21 Jun 2001 10:10:45 -0500 From: "Michael C . Wu" To: Motoyuki Konno Cc: Jason Francis , freebsd-i18n@FreeBSD.ORG Subject: Re: FreeBSD Unicode support Message-ID: <20010621101045.A63569@peorth.iteration.net> Reply-To: "Michael C . Wu" Mail-Followup-To: "Michael C . Wu" , Motoyuki Konno , Jason Francis , freebsd-i18n@FreeBSD.ORG References: <002101c0f8df$870e9850$0200000a@Neptune> <200106200031.f5K0VBq54775@sakura.mk.bsdclub.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200106200031.f5K0VBq54775@sakura.mk.bsdclub.org>; from motoyuki@bsdclub.org on Wed, Jun 20, 2001 at 09:31:11AM +0900 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-i18n@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jun 20, 2001 at 09:31:11AM +0900, Motoyuki Konno scribbled: | Hi, | | "Jason Francis" wrote: | > One of the arguments made by microsoft for switching the Hotmail | > frontend to Windows 2000 was the need for foreign language support. 
| > | > "Hotmail had the requirement to launch in new markets, and did not want | > to continue to invest in keeping the FreeBSD locale tables up to date | > and other maintenance activities. China and Japan are two important | > growing markets for MSN, so multibyte character sets had to be | > supported. FreeBSD lacked the necessary Unicode support." | > | > I want to know if there is any real merit to these claims, or if it's | > just more marketing drivel from Microsoft. I always wonder about the people who think that Asian people do not need to read their own language or something... | > Is work being done to bring Unicode support to FreeBSD that will allow | > it to have better support for globalization and foreign languages? Yes. | I agree with the idea "Unicode support is important". But I think | Unicode cannot solve the "multibyte character support" problem. | | Is Unicode widely used in Japan ? The answer is NO. Microsoft | Windows* users in Japan use SJIS (SJIS is also known as Microsoft | Kanji code) for plain text. Most of FreeBSD users in Japan | use EUC, SJIS or ISO-2022-JP (also known as JIS code). | ISO-2022-JP is the standard for e-mail. Because Unicode and UTF-8 are too unwieldy... For example, how do we delete a character that has a variable length? | P.S. | Unicode support is one of the goal of Citrus project. | Please see "Policy" section of http://citrus.bsdclub.org/. Linux already has Unicode locales while we are lightyears behind. :( IBM has the Linux UTF-8 locales for download. -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. 
| +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message From owner-freebsd-i18n Thu Jun 21 11: 6:38 2001 Delivered-To: freebsd-i18n@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 44D5737B406 for ; Thu, 21 Jun 2001 11:06:36 -0700 (PDT) (envelope-from keichii@iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id B834E59228; Thu, 21 Jun 2001 13:06:35 -0500 (CDT) Date: Thu, 21 Jun 2001 13:06:35 -0500 From: "Michael C . Wu" To: Andre Charbonneau Cc: freebsd-i18n@FreeBSD.ORG Subject: Re: wide character classification funcitons Message-ID: <20010621130635.A69109@peorth.iteration.net> Reply-To: "Michael C . Wu" Mail-Followup-To: "Michael C . Wu" , Andre Charbonneau , freebsd-i18n@FreeBSD.ORG References: <06b201c0f9c9$2500b540$65029678@andrecwin2000> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <06b201c0f9c9$2500b540$65029678@andrecwin2000>; from andrec@corel.com on Wed, Jun 20, 2001 at 01:39:46PM -0700 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-i18n@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jun 20, 2001 at 01:39:46PM -0700, Andre Charbonneau scribbled: | Hi, | I am currently running freeBSD 4.3 and correct me if I'm wrong but there seems to be no support for the wide character classification functions, such as iswalpha, iswdigit, iswupper etc... | | Do I have to install a specific package to get these functions on my system? | Will it be included in future releases of freeBSD? | Is there any other functions I can use instead that will do the job? 
| | (I need this functionality because I'm currently trying to implement a functions that will classify characters in a string made of unicode characters.) Only in 5.0-current and newer... I will MFC this stuff soon into RELENG_4 if no one objects -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. | +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message
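For readers following the wide character classification thread: the kind of routine Andre describes might look like the sketch below. It assumes a system whose C library provides `<wctype.h>` (per the reply above, only 5.0-current at the time of this thread), and the struct and function names are invented for illustration rather than taken from any FreeBSD source.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical tally of character classes found in a wide string. */
struct wclass_counts {
    size_t alpha;  /* letters, as reported by iswalpha() */
    size_t upper;  /* upper-case characters, counted in addition to alpha */
    size_t digit;  /* decimal digits */
    size_t space;  /* white space */
    size_t other;  /* everything else */
};

/* Classify every character of s according to the current locale. */
struct wclass_counts classify_wide(const wchar_t *s)
{
    struct wclass_counts c = {0, 0, 0, 0, 0};

    for (; *s != L'\0'; s++) {
        wint_t wc = (wint_t)*s;

        if (iswupper(wc))
            c.upper++;

        if (iswalpha(wc))
            c.alpha++;
        else if (iswdigit(wc))
            c.digit++;
        else if (iswspace(wc))
            c.space++;
        else
            c.other++;
    }
    return c;
}
```

The results depend on the active locale, so a caller that wants non-ASCII letters recognised would first select a suitable locale with `setlocale(LC_CTYPE, "")`. In the default "C" locale only the ASCII ranges are classified as letters and digits.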