From owner-freebsd-chat Thu Apr 11 3:19:36 2002
Delivered-To: freebsd-chat@freebsd.org
Received: from bast.unixathome.org (bast.unixathome.org [216.187.105.150])
    by hub.freebsd.org (Postfix) with ESMTP id C6D8437B41C
    for ; Thu, 11 Apr 2002 03:19:31 -0700 (PDT)
Received: from wocker (wocker.unixathome.org [192.168.0.99])
    by bast.unixathome.org (Postfix) with ESMTP id 3E6283F30;
    Thu, 11 Apr 2002 06:20:24 -0400 (EDT)
From: "Dan Langille"
Organization: DVL Software Limited
To: Terry Lambert
Date: Thu, 11 Apr 2002 06:19:29 -0400
MIME-Version: 1.0
Subject: Re: what are these characters please?
Reply-To: dan@langille.org
Cc: chat@freebsd.org
In-reply-to: <3CB4FBFB.9D2AC7E0@mindspring.com>
X-mailer: Pegasus Mail for Windows (v4.01)
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Content-description: Mail message body
Message-Id: <20020411102024.3E6283F30@bast.unixathome.org>
Sender: owner-freebsd-chat@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On 10 Apr 2002 at 19:59, Terry Lambert wrote:

> Dan Langille wrote:
> > I found these characters in a recent cvs-all commit:
> >
> > 20 20 20 20 20 5b 53 75 62 6d 69 74 74 65 64 20 |     [Submitted |
> > 62 79 3a 20 56 69 6c 6c 65 20 53 6b 79 74 74 1b |by: Ville Skytt.|
> > 2c 41 64 1b 28 42 20 3c 76 69 6c 6c 65 2e 73 6b |,Ad.(B <ville.sk|
> > 79 74 74 61 40 69 6b 69 2e 66 69 3e 5d 0a 20 20 |ytta@iki.fi>].  |
> >
> > When viewed under vi, I get:
> >
> > Ville Skytt^[,Ad^[(B
>
> ANSI character set selector escape sequence for 7 bit representation
> of 8 bit characters.
>
> If I had to guess, I would say "eth", which is a "D" with a bar in it,
> unlike "thorn", which is an "O" with a forward slash through it.  8-).
>
> Obviously a deficiency in the encapsulation of a cut-and-paste
> that was not attributed by encoding, because CVS commit logs are
> not MIME encapsulated.

Given that I'm trying to process the cvs-all messages into XML documents
(using the Perl module XML::Writer, which does not do any encoding beyond
characters such as >, <, etc.), any suggestions as to how to deal with
such characters?  I've been looking through CPAN but I suspect I'm using
the wrong search criteria ("encoding").  Any clues?

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message
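
One possible way to handle the bytes quoted in the hex dump, sketched in
Python for illustration rather than Perl (the same steps should carry over).
It rests on an assumption that the thread does not confirm: that ESC , A
switches to the upper half of ISO 8859-1 carried in 7 bits (as Emacs-style
ISO-2022 7-bit text does) and that ESC ( B switches back to ASCII. Under that
reading the byte 0x64 stands for 0xE4, which is a-diaeresis (U+00E4) rather
than eth. The function names are hypothetical, and only these two escape
sequences are handled.

```python
from xml.sax.saxutils import escape

ESC = b"\x1b"
# Assumption: only these two designations occur, as in the quoted bytes.
LATIN1_HIGH = ESC + b",A"   # following 7-bit bytes stand for 0xA0-0xFF
ASCII = ESC + b"(B"         # back to plain ASCII


def decode_7bit_latin1(raw: bytes) -> str:
    """Drop the two escape sequences and restore the 8-bit characters."""
    out = bytearray()
    high = False
    i = 0
    while i < len(raw):
        if raw.startswith(LATIN1_HIGH, i):
            high = True
            i += len(LATIN1_HIGH)
        elif raw.startswith(ASCII, i):
            high = False
            i += len(ASCII)
        else:
            byte = raw[i]
            # In "high" mode a printable 7-bit byte maps to byte | 0x80.
            if high and 0x20 <= byte <= 0x7F:
                byte |= 0x80
            out.append(byte)
            i += 1
    return out.decode("latin-1")


def to_xml_text(raw: bytes) -> str:
    """Escape &, <, > and emit non-ASCII as numeric character references."""
    text = escape(decode_7bit_latin1(raw))
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = b"by: Ville Skytt\x1b,Ad\x1b(B <ville.skytta@iki.fi>]"
print(to_xml_text(sample))
```

Writing numeric character references keeps the output pure ASCII, so the
resulting XML stays valid whatever encoding the document declares. Writing
the decoded text as UTF-8 instead would work equally well if the XML
declaration says so.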