Date: Wed, 5 May 1999 21:49:34 -0500
From: "G. Adam Stanislav" <adam@whizkidtech.net>
To: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
Cc: freebsd-hackers@freebsd.org
Subject: Re: wc* routines
Message-ID: <19990505214934.B217@whizkidtech.net>
In-Reply-To: <199905041711.VAA04689@tejblum.dnttm.rssi.ru>; from Dmitrij Tejblum on Tue, May 04, 1999 at 09:11:45PM +0400
References: <199905041711.VAA04689@tejblum.dnttm.rssi.ru>

On Tue, May 04, 1999 at 09:11:45PM +0400, Dmitrij Tejblum wrote:
> I don't like your idea that WEOF == INT_MIN. Apparently, everyone else
> have WEOF == -1 (== EOF), and there is no reason why we should not too.
> I don't know about "debugging purposes". WEOF == EOF should allow more
> code sharing with existing libc. Note that FreeBSD already have some
> very sparse and nonstandard (but functional) wchar support.

Now that I have actually started coding, I agree with you. :-) I changed it to (-1).

> Note that a major portion of <wctype.h> already almost implemented in FreeBSD:
> plain ctype functions work with wide characters. So it should be fairly
> easy to write an almost working <wctype.h>. (BTW, it is somewhere on my
> ToDo list for quite some time, but now not that far from the top).

It is fairly easy regardless. :-)

It is different from plain ctype functions though. For example, iswdigit(ch) must return TRUE if ch is a digit in Devanagari or Chinese or anything else. It also must be locale independent. If ch is a digit in Unicode (or on any plane of its ISO 10646 extension), then it is a digit even if one's locale does not know about it.

I have evaluated two existing packages today. They both include the functionality of <wctype.h> and more (the authors thought the C standard did not go far enough). I exchanged email with both authors. They both suggested I should include either library in the base distribution and have the C routines be a front end to it.

I liked the idea at first, but, having thought about it for a couple hours, I am now more inclined to place both libraries into the ports collection and continue working on my routines.

One good thing about evaluating those two packages is that I noticed one of them is using very much the same algorithms I have come up with. It is good to get an affirmation that I am moving on the right track. :-)

The main problem with doing the front end is that we would either have to include one of their libraries into the C library, thus adding to it things that simply do not belong there, or we have to link programs with their library just to use wctype functions from the standard C library, which would open a whole can of worms.

Nevertheless, I have posted links to their libraries on the web page and am open to comments.

I have also discovered an important link today: It is no longer necessary to spend $305 to get your own copy of the ISO 10646 standard: It can be downloaded from the web either in MS Word format (yeah, right) or as a PostScript file. The link is now listed on the page, which, again, is at http://www.whizkidtech.net/i18n/wc/.

I will also need to get some input on some "philosophical" questions. Namely, I will need to build several tables for the wctype.h functionality. The thing is that the standard is open: New codes can and will be added to it. I need to decide whether to hardcode the tables or place them into files.

At this point, I am leaning toward the hardcoded solution for several reasons: A file can be misplaced or lost, or even corrupted; the changes do not happen too often; the changes do not affect major languages and are of little consequence to most computer users (so if Egyptian hieroglyphics are added to plane 1 as planned, Egyptologists will need to update their C libraries, while us mortals may pretty much ignore it); it is just as easy to download an update of the C library as an update of several files.

For what it's worth, I will need to write some utilities for my own use, utilities to create the code for tables. So any time they add some new code of interest to only a small group of people, the group can use the utilities on their own computers, and simply recompile the library even if I am on vacation, or whatever.
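
To make the hardcoded-table option concrete, here is a minimal sketch of a table-driven, locale-independent digit test. It is only an illustration of the idea, not the code from the routines under development: the names wc_range, digit_ranges, wc_in_table and wc_is_digit are hypothetical, and the table lists only a handful of decimal-digit ranges rather than everything in ISO 10646.

```c
#include <stddef.h>

/* Hypothetical record: one inclusive run of code points. */
struct wc_range {
    unsigned long first;
    unsigned long last;
};

/*
 * Hardcoded table, sorted by code point.
 * Illustrative subset only, NOT a complete list of digits.
 */
static const struct wc_range digit_ranges[] = {
    { 0x0030, 0x0039 },  /* ASCII digits */
    { 0x0660, 0x0669 },  /* Arabic-Indic digits */
    { 0x0966, 0x096F },  /* Devanagari digits */
    { 0x0E50, 0x0E59 },  /* Thai digits */
    { 0xFF10, 0xFF19 },  /* fullwidth digits */
};

/* Binary search over a sorted range table; the locale is never consulted. */
static int wc_in_table(unsigned long ch, const struct wc_range *tab, size_t n)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (ch < tab[mid].first)
            hi = mid;          /* look in the lower half */
        else if (ch > tab[mid].last)
            lo = mid + 1;      /* look in the upper half */
        else
            return 1;          /* ch falls inside this range */
    }
    return 0;
}

/* Hypothetical stand-in for a wide-character digit test. */
int wc_is_digit(unsigned long ch)
{
    return wc_in_table(ch, digit_ranges,
                       sizeof digit_ranges / sizeof digit_ranges[0]);
}
```

With a layout like this, a table-generating utility would only need to emit a new sorted array initializer; the lookup code stays the same when the standard adds code points.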

Adam