From owner-freebsd-bugs Mon Jul 14 12:14:14 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id MAA08642 for bugs-outgoing; Mon, 14 Jul 1997 12:14:14 -0700 (PDT)
Received: (from jmb@localhost) by hub.freebsd.org (8.8.5/8.8.5) id MAA08634; Mon, 14 Jul 1997 12:14:09 -0700 (PDT)
From: "Jonathan M. Bresler"
Message-Id: <199707141914.MAA08634@hub.freebsd.org>
Subject: Re: ispunct(3) [was: FreeBSD-2.1.1]
To: wollman@khavrinen.lcs.mit.edu (Garrett Wollman)
Date: Mon, 14 Jul 1997 12:14:09 -0700 (PDT)
Cc: jmb@FreeBSD.ORG, freebsd-bugs@FreeBSD.ORG
In-Reply-To: <199707141651.MAA07262@khavrinen.lcs.mit.edu> from "Garrett Wollman" at Jul 14, 97 12:51:26 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Garrett,

i am sure that you are correct, but i would prefer more portable code.
UCHAR_MIN seems to be in short supply. (UCHAR_MAX might be too).
isascii() is everywhere. i need to buy a copy of greg lehey's book.

kryten: {46} uname -a
FreeBSD kryten.yyy.zzz 2.2-STABLE FreeBSD 2.2-STABLE #0: Wed Jun 25 16:53:36 EDT 1997 jmb@kryten.yyy.zzz:/usr/src/sys/compile/KRYTEN i386
kryten: {47} which cc
/usr/bin/cc
kryten: {48} cc --version
2.7.2.1
kryten: {49} cc /tmp/a.c
/tmp/a.c: In function `main':
/tmp/a.c:10: `UCHAR_MIN' undeclared (first use this function)
/tmp/a.c:10: (Each undeclared identifier is reported only once
/tmp/a.c:10: for each function it appears in.)
kryten: {55} grep UCHAR /usr/include/* /usr/include/machine/*
/usr/include/machine/limits.h:#define UCHAR_MAX 255 /* max value for an unsigned char */

arcue1(14)% uname -a
SunOS arcue1 5.5.1 Generic_103640-08 sun4u sparc SUNW,Ultra-Enterprise
arcue1(15)% which cc
/opt/SUNWspro/bin/cc
arcue1(16)% cc -V
cc: SC4.0 18 Oct 1995 C 4.0
usage: cc [ options] files.
Use 'cc -flags' for details
arcue1(17)% cc /tmp/a.c
"/tmp/a.c", line 10: undefined symbol: UCHAR_MIN
cc: acomp failed for /tmp/a.c

xxx: {30} % uname -a
BSD/OS xxx.yyy.zzz 2.1 BSDI BSD/OS 2.1 Kernel #1: Thu Jan 9 05:02:29 EST 1997 root@xxx.yyy.zzz:/usr/src/sys/compile/DIGI i386
xxx: {31} % which cc
/usr/bin/cc
xxx: {32} % cc -v
gcc version 1.42
xxx: {33} % cc /tmp/a.c
/tmp/a.c: In function main:
/tmp/a.c:10: `UCHAR_MIN' undeclared (first use this function)
/tmp/a.c:10: (Each undeclared identifier is reported only once
/tmp/a.c:10: for each function it appears in.)
xxx: {34} % grep UCHAR /usr/include/* /usr/include/machine/*
/usr/include/machine/limits.h:#define UCHAR_MAX 255 /* max value for an unsigned char */

Garrett Wollman wrote:
>
> < said:
>
> > ispunct() is only useful for ASCII input.
> > the correct way to use ispunct() and the rest of the functions
> > listed in ctype(3) is to call isascii() first
>
> BZZZZT!
>
> There is no such thing as isascii() in Standard C. The domain of all
> of the ctype(3) functions, as Bruce noted earlier in this thread, is
> [UCHAR_MIN,UCHAR_MAX] union {EOF}.
>
>
> wollman@khavrinen(173)$ cat >foo.c
> #include
> #include
> #include
>
> int
> main(void)
> {
>     int c;
>
>     for (c = UCHAR_MIN; c < UCHAR_MAX; c++) {
>         if (ispunct(c))
>             putchar(c);
>     }
>     putchar('\n');
>     if (ispunct(EOF))
>         printf("EOF\n");
>     return 0;
> }
> wollman@khavrinen(174)$ cc -o foo foo.c
> wollman@khavrinen(175)$ ./foo
> !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿×÷
> wollman@khavrinen(176)$
>
> In case you can't read the high-bit characters there, they are all the
> punctuation characters from the ISO 8859-1 (``Latin 1'') character
> set.
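
The compile failures above are expected: Standard C's <limits.h> defines UCHAR_MAX but no UCHAR_MIN, because the smallest unsigned char value is always 0. A minimal sketch of walking the same ctype(3) domain without UCHAR_MIN and without isascii() follows. The names count_punct() and is_punct_char() are hypothetical helpers for illustration, not taken from the thread. The loop uses <= so that UCHAR_MAX itself is tested, which the quoted loop's < skips.

```c
#include <ctype.h>
#include <limits.h>

/*
 * Hypothetical helper: count the values in [0, UCHAR_MAX] that the
 * current locale classifies as punctuation.  The lower bound is the
 * literal 0, since Standard C provides no UCHAR_MIN macro.
 */
int
count_punct(void)
{
    int c, n = 0;

    for (c = 0; c <= UCHAR_MAX; c++) {
        if (ispunct(c))
            n++;
    }
    return n;
}

/*
 * Hypothetical helper: classify a plain char safely.  Plain char may
 * be signed, so the value is converted to unsigned char first to keep
 * it inside the domain of the ctype functions.
 */
int
is_punct_char(char ch)
{
    return ispunct((unsigned char)ch) != 0;
}
```

Calling is_punct_char('!') yields 1 and is_punct_char('a') yields 0; what is_punct_char() returns for a high-bit character depends on the locale in effect.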