Date:      Fri, 5 Sep 2008 12:27:12 -0400
From:      "Mark B." <mkbucc@gmail.com>
To:        "Giorgos Keramidas" <keramida@ceid.upatras.gr>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: How to delete non-ASCII chars in file
Message-ID:  <59f4cb420809050927w71fea733mcf7a2071c24cdc93@mail.gmail.com>
In-Reply-To: <87vdxa4p2p.fsf@kobe.laptop>
References:  <59f4cb420809050714i16ebe30bmd9f325592f05516e@mail.gmail.com> <87vdxa4p2p.fsf@kobe.laptop>

On Fri, Sep 5, 2008 at 10:58 AM, Giorgos Keramidas
<keramida@ceid.upatras.gr> wrote:

> $ echo '^Fhello^F' | sed -e 's/[^[:print:]]*//' | hd
> 00000000  68 65 6c 6c 6f 06 0a                              |hello..|
> 00000007
> $

Thanks.

> The matching pattern is wrong.  You need `[^[:print:]]'.  The character
> class of printable characters is `[:print:]', and you can negate the
> pattern with `[^xxxx]' where `xxxx' is the character class; hence the
> extra pair of brackets in `[^[:print:]]'.
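
A minimal sketch of the negated class in use, assuming a POSIX sh and sed.
The function name strip_nonprintable is made up for illustration and does not
come from the thread. The hd output quoted above still shows a trailing 06
byte, which is consistent with the substitution applying only once per line,
so this sketch adds the "g" flag to remove every match.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): delete every non-printable
# character from each line read on standard input.
strip_nonprintable() {
    # LC_ALL=C pins [:print:] to the ASCII printable set (an assumption
    # made here so the result does not depend on the user's locale).
    # [^[:print:]] matches one character outside that class, and the
    # trailing "g" repeats the substitution for every match on the line.
    LC_ALL=C sed -e 's/[^[:print:]]//g'
}

# printf emits the control character ACK (octal 006) before and after
# the word, standing in for the literal ^F typed in the original command.
printf '\006hello\006\n' | strip_nonprintable
```

Piping the result through hd, as in the quoted command, is one way to check
that no 06 bytes remain.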

In case you are interested, I've patched the re_format man page to include
this example.  I had read the page, and it says :print: is the "name of the
character class."  I think a concrete example helps clarify things.

A follow-up question: is it possible to use that statement in a (BSD) Makefile?
A straight cut-and-paste didn't work, and I couldn't figure out the escaping
needed to make it work.

Thanks,

m

cd to /usr/src/lib/libc/regex/ and apply this patch.

--- /dev/null Fri Sep  5 12:12:21 2008
+++ re_format.7        Fri Sep  5 12:18:29 2008
@@ -288,6 +288,10 @@
 A locale may provide others.
 A character class may not be used as an endpoint of a range.
 .Pp
+To match all characters not in a class, use a bracket expression
+like this:
+.Ql [^[:print:]] .
+.Pp
 There are two special cases\(dd of bracket expressions:
 the bracket expressions
 .Ql [[:<:]]


