Date:      Mon, 2 Sep 2013 17:09:33 +0200 (CEST)
From:      Damian Weber <dweber@htw-saarland.de>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        FreeBSD Current <freebsd-current@FreeBSD.org>, freebsd-standards@FreeBSD.org
Subject:   Re: bug with special bracket expressions in regular expressions
Message-ID:  <alpine.BSF.2.00.1309021706470.24899@magritte.htw-saarland.de>
In-Reply-To: <5224A693.3000904@FreeBSD.org>
References:  <5224A693.3000904@FreeBSD.org>

On Mon, 2 Sep 2013, Andriy Gapon wrote:

> re_format(7) says:
>      There are two special cases‡ of bracket expressions: the bracket
>      expressions '[[:<:]]' and '[[:>:]]' match the null string at the
>      beginning and end of a word respectively.  A word is defined as a
>      sequence of word characters which is neither preceded nor followed by
>      word characters.  A word character is an alnum character (as defined by
>      ctype(3)) or an underscore.  This is an extension, compatible with but
>      not specified by IEEE Std 1003.2 ("POSIX.2"), and should be used with
>      caution in software intended to be portable to other systems.
> 
> However I observe the following:
> $ echo "cd0 cd1 xx" | sed 's/cd[0-9][^ ]* *//g'
> xx
> $ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9][^ ]* *//g'
> cd1 xx
> 
> In my opinion '[[:<:]]' should not affect how the pattern is matched in this case.
> 
> Any thoughts, suggestions?

There are two simpler expressions whose difference I don't understand either
(tested on 8.4-PRERELEASE):

$ echo "cd0 cd1 xx" | sed 's/cd[0-9] //g'
xx
$ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9] //g'
cd1 xx
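
For comparison, here is a minimal sketch of the result I would expect from
the second command, written in Python rather than sed. It assumes that
'[[:<:]]' can be modelled as a zero-width assertion "not preceded by a word
character, followed by a word character"; the function name
strip_cd_words is made up for illustration, and Python's re module is only
an analogue here, not the FreeBSD regex(3) code.

```python
import re

# Assumption: '[[:<:]]' (start of word) is modelled as a zero-width
# assertion: no word character before, a word character after.
START_OF_WORD = r"(?<!\w)(?=\w)"

# Analogue of: sed 's/[[:<:]]cd[0-9] //g'
PATTERN = re.compile(START_OF_WORD + r"cd[0-9] ")


def strip_cd_words(text: str) -> str:
    """Remove every 'cdN ' that begins at the start of a word."""
    # re.sub scans the original string, so the character before each
    # later match is still visible to the look-behind.
    return PATTERN.sub("", text)


if __name__ == "__main__":
    print(strip_cd_words("cd0 cd1 xx"))
```

With this model both "cd0 " and "cd1 " are removed, because each is preceded
either by the start of the line or by a space, which matches the output of
the first sed command above.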

-- Damian