Date:      Tue, 16 Feb 1999 12:15:00 +1100
From:      Sue Blake <sue@welearn.com.au>
To:        Dan Nelson <dnelson@emsphone.com>
Cc:        Greg Lehey <grog@lemis.com>, rick hamell <hamellr@dsinw.com>, freebsd-questions@FreeBSD.ORG
Subject:   Re: cleaning a text file
Message-ID:  <19990216121500.33635@welearn.com.au>
In-Reply-To: <19990215185722.A21817@dan.emsphone.com>; from Dan Nelson on Mon, Feb 15, 1999 at 06:57:22PM -0600
References:  <19990215201056.19929@welearn.com.au> <Pine.BSF.3.91.990215010943.20451F-100000@dsinw.com> <19990216095232.J2207@lemis.com> <19990216103740.60271@welearn.com.au> <19990215185722.A21817@dan.emsphone.com>

On Mon, Feb 15, 1999 at 06:57:22PM -0600, Dan Nelson wrote:
> In the last episode (Feb 16), Sue Blake said:
> > The problem is that I don't know which funny characters exist in the
> > file, if any. I want to find out what they are, so I can search for
> > them and eyeball them before killing them.
> 
> How about something like 
> 
> grep "[^ -~]" file.txt
> 
> That will print any lines that have characters outside the standard
> printable ascii set.  Then you can look at the oddball letters and
> figure out appropriate replacement characters.

Hey, yeah, that'd be a great first check, enough to give it a clean
bill of health or deal with a few characters that are easily spotted.

Don Read sent this one too:

fold -w1 yourfile.txt |sort |uniq | grep -v "[A-Za-z0-9]"

which seems to do the trick. It's very slow, but it works.


With either or both of these, it's just a matter of finding the
character among what's pulled out, determining its character number,
checking its context in the file and making a decision about
substitution, then running tr or doing a replace with a text editor.
For the most common case, where there is nothing wrong with the file,
it's possible to confirm that the file is OK as is.
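
The steps above could be scripted roughly as follows. This is only a
sketch: the function names list_odd_bytes, show_context and replace_byte
are made up for illustration and do not come from anyone in this thread.
It assumes that tab and newline count as acceptable characters, that the
byte being examined is neither NUL nor newline, and that grep accepts -a
(a GNU/BSD extension, not POSIX).

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the octal code of each distinct byte that is not printable
# ASCII, tab, or newline. One code per line; no output means clean.
list_odd_bytes() {
    LC_ALL=C tr -d '\11\12\40-\176' < "$1" |  # drop everything acceptable
        od -An -v -b |                        # dump the leftovers in octal
        tr -s ' ' '\n' |                      # one code per line
        grep . |                              # discard empty lines
        LC_ALL=C sort -u
}

# Show the numbered lines that contain the byte with octal code $2.
# -a makes grep treat the file as text even if it looks binary.
show_context() {
    LC_ALL=C grep -a -n "$(printf "\\$2")" "$1"
}

# Write the file to stdout with octal byte $2 replaced by character $3.
replace_byte() {
    LC_ALL=C tr "\\$2" "$3" < "$1"
}
```

Typical use would be "list_odd_bytes yourfile.txt" first, then
"show_context yourfile.txt 351" for each code it reports, and finally
"replace_byte yourfile.txt 351 e > cleaned.txt" once a substitute has
been chosen. Stripping the acceptable bytes with tr before sorting
avoids the one-character-per-line step, so it should be quicker than
the fold pipeline on a large file.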


Reidar Bratsberg mentioned a utility called pep which might be good,
but so far I haven't been able to randomly press the right buttons to
make it compile. Experiments continue.


-- 

Regards,
        -*Sue*-




