Date:      20 Feb 1999 01:38:49 +0100
From:      Kai.Grossjohann@CS.Uni-Dortmund.DE
To:        Sue Blake <sue@welearn.com.au>
Cc:        Mark Ovens <marko@uk.radan.com>, questions@FreeBSD.ORG
Subject:   Re: cleaning a text file
Message-ID:  <864sohkixy.fsf@slowfox.frob.org>
In-Reply-To: Sue Blake's message of "Tue, 16 Feb 1999 11:49:59 +1100"
References:  <19990215201056.19929@welearn.com.au> <Pine.BSF.3.91.990215010943.20451F-100000@dsinw.com> <19990216095232.J2207@lemis.com> <19990216103740.60271@welearn.com.au> <19990216002703.A337@localhost> <19990216114959.08931@welearn.com.au>

Sue Blake <sue@welearn.com.au> writes:

  > On Tue, Feb 16, 1999 at 12:27:03AM +0000, Mark Ovens wrote:
  > > 
  > > First you need to identify the offending characters.
  > 
  > Indeed. That is my sole problem.

Well, search forward for the following regex:

  [^a-z0-9A-Z_+= \t\r\n-]

If the search stops on a character that's ok, add it to the list and
search again.  After all, there are only 256 possible byte values, and
some of them will be bad.  So you won't have to add characters often.
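
If you'd rather not search by hand, the same idea can be sketched as a
small script.  This is only an illustration, assuming the file fits in
memory and is treated as raw bytes; the function names count_suspects
and first_offsets are made up for this example, and the character class
is the one from the regex above.

```python
import re
from collections import Counter

# Same allowlist as the regex above, applied to raw bytes.
# Anything NOT in this class is treated as a suspect byte.
SUSPECT = re.compile(rb"[^a-z0-9A-Z_+= \t\r\n-]")


def count_suspects(data: bytes) -> Counter:
    """Count how often each byte outside the allowlist occurs."""
    return Counter(m.group(0) for m in SUSPECT.finditer(data))


def first_offsets(data: bytes) -> dict:
    """Map each suspect byte to the offset of its first occurrence."""
    seen = {}
    for m in SUSPECT.finditer(data):
        seen.setdefault(m.group(0), m.start())
    return seen


def report(data: bytes) -> list:
    """Build one summary line per suspect byte, most frequent first."""
    offsets = first_offsets(data)
    return [
        f"0x{byte[0]:02x} {byte!r}: {n} time(s), first at offset {offsets[byte]}"
        for byte, n in count_suspects(data).most_common()
    ]
```

To try it, read the file with open(path, "rb").read() and print the
lines that report() returns.  Whenever a reported byte turns out to be
harmless, add it to the character class and run it again.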

Or are you saying you're looking at Japanese or Chinese text with
multibyte characters?  Then you're screwed, since there are far too
many legitimate characters to list by hand.

kai
-- 
I like _b_o_t_h kinds of music.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message



