Date:      Tue, 16 Feb 1999 01:00:15 +0000
From:      Mark Ovens <marko@uk.radan.com>
To:        Sue Blake <sue@welearn.com.au>
Cc:        questions@FreeBSD.ORG
Subject:   Re: cleaning a text file
Message-ID:  <19990216010015.A190@localhost>
In-Reply-To: <19990216002703.A337@localhost>; from Mark Ovens on Tue, Feb 16, 1999 at 12:27:03AM +0000
References:  <19990215201056.19929@welearn.com.au> <Pine.BSF.3.91.990215010943.20451F-100000@dsinw.com> <19990216095232.J2207@lemis.com> <19990216103740.60271@welearn.com.au> <19990216002703.A337@localhost>

On Tue, Feb 16, 1999 at 12:27:03AM +0000, Mark Ovens wrote:
> On Tue, Feb 16, 1999 at 10:37:40AM +1100, Sue Blake wrote:
> > 
> > The problem is that I don't know which funny characters exist in the
> > file, if any. I want to find out what they are, so I can search for
> > them and eyeball them before killing them.
> > 
> > 
> > Just knowing which characters they are would give me many solutions
> > immediately. There still doesn't seem to be a way to find this out :-(
> > 
> 
> First you need to identify the offending characters. Use od(1) or
> hexdump(1) to identify them and then work out a filter.
> 
> Are they all extended ASCII (>127) chars, or are some of them
> control (<32) chars? You could possibly use awk(1) as a filter,
> or write a simple C prog using isprint() and isspace().
> 
> HTH
> 

As soon as I'd sent my previous message I remembered something. If
you have (or can lay your hands on) a copy of The Unix Programming
Environment by Kernighan & Pike you will find, starting on p172,
this very problem addressed, complete with several (short) C code
listings which give you the option to print the offending characters
as octal codes or to strip them from the file.
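
For anyone without the book to hand, here is a minimal sketch of that
kind of filter. It is not the book's listing; the function name
vis_stream and its strip flag are made up for illustration, and it
assumes the default C locale, where bytes above 127 do not count as
printable.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy `in` to `out`. Bytes that are neither
 * printable nor whitespace are either dropped (strip != 0) or written
 * as a backslash followed by three octal digits (strip == 0).
 * Returns the number of such bytes seen.
 */
long vis_stream(FILE *in, FILE *out, int strip)
{
    long odd = 0;
    int c;

    assert(in != NULL && out != NULL);

    /* getc() returns the byte as an unsigned char value, so it is
     * safe to pass straight to the <ctype.h> functions. */
    while ((c = getc(in)) != EOF) {
        if (isprint(c) || isspace(c)) {
            putc(c, out);
        } else {
            odd++;
            if (!strip)
                fprintf(out, "\\%03o", (unsigned)c);
        }
    }
    return odd;
}
```

Called from main() as vis_stream(stdin, stdout, 0) it shows the
offending characters in place; with a non-zero third argument it
strips them instead.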

> > Maybe there's a long way... somehow put a line-feed after each character
> > in the file (with sed?) and then sort it and look at the top and bottom
> > of the sorted file.
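
The sort-and-look idea above comes to much the same thing as counting
how often each byte value occurs and listing the unexpected ones. A
rough sketch of that, with made-up function names (count_bytes,
report_odd) and again assuming the default C locale:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: tally how often each byte value occurs in `in`. */
void count_bytes(FILE *in, unsigned long counts[256])
{
    int c;

    assert(in != NULL);
    for (c = 0; c < 256; c++)
        counts[c] = 0;
    while ((c = getc(in)) != EOF)
        counts[c]++;
}

/*
 * Print one line (octal code, hex code, count) for every byte value
 * that occurs at least once and is neither printable nor whitespace.
 * Returns the number of distinct such values.
 */
int report_odd(const unsigned long counts[256], FILE *out)
{
    int c, kinds = 0;

    assert(out != NULL);
    for (c = 0; c < 256; c++) {
        if (counts[c] == 0 || isprint(c) || isspace(c))
            continue;
        fprintf(out, "\\%03o  0x%02x  %lu\n",
                (unsigned)c, (unsigned)c, counts[c]);
        kinds++;
    }
    return kinds;
}
```

Calling count_bytes(stdin, counts) and then report_odd(counts, stdout)
from main() gives a short list of the funny characters to search for.
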
> > 
> > -- 
> > 
> > Regards,
> >         -*Sue*-
> > 
> > 
> > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > with "unsubscribe freebsd-questions" in the body of the message
> > 
> 
> -- 
>       FreeBSD - The Power To Serve http://www.freebsd.org
>       My Webpage http://www.users.globalnet.co.uk/~markov
> _______________________________________________________________
> Mark Ovens, CNC Apps Engineer, Radan Computational Ltd. Bath UK
> CAD/CAM solutions for Sheetmetal Working Industry
> mailto:marko@uk.radan.com                  http://www.radan.com
> 
> 
> 

-- 
      FreeBSD - The Power To Serve http://www.freebsd.org
      My Webpage http://www.users.globalnet.co.uk/~markov
_______________________________________________________________
Mark Ovens, CNC Apps Engineer, Radan Computational Ltd. Bath UK
CAD/CAM solutions for Sheetmetal Working Industry
mailto:marko@uk.radan.com                  http://www.radan.com




