Date:      Mon, 02 Sep 2013 19:45:02 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        FreeBSD Current <freebsd-current@FreeBSD.org>
Subject:   Re: bug with special bracket expressions in regular expressions
Message-ID:  <5224C08E.1070404@FreeBSD.org>
In-Reply-To: <5224A693.3000904@FreeBSD.org>
References:  <5224A693.3000904@FreeBSD.org>

on 02/09/2013 17:54 Andriy Gapon said the following:
> 
> re_format(7) says:
>      There are two special cases‡ of bracket expressions: the bracket expres‐
>      sions ‘[[:<:]]’ and ‘[[:>:]]’ match the null string at the beginning and
>      end of a word respectively.  A word is defined as a sequence of word
>      characters which is neither preceded nor followed by word characters.  A
>      word character is an alnum character (as defined by ctype(3)) or an
>      underscore.  This is an extension, compatible with but not specified by
>      IEEE Std 1003.2 (“POSIX.2”), and should be used with caution in software
>      intended to be portable to other systems.
> 
> However I observe the following:
> $ echo "cd0 cd1 xx" | sed 's/cd[0-9][^ ]* *//g'
> xx
> $ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9][^ ]* *//g'
> cd1 xx
> 
> In my opinion '[[:<:]]' should not affect how the pattern is matched in this case.

It seems that the code works like this:
- first it matches "cd0 " and "removes" it
- then it passes the remainder, "cd1 xx", for matching with a flag saying that
  this is not the real start of the string
- thus the matching code
 o knows that this is not a real line start, so it cannot match [[:<:]]
   on that basis alone
 o does _not_ know which character preceded the start of the given
   substring, so it cannot tell whether [[:<:]] could match there

So the second match fails and "cd1 " is left in place.
I am not sure whether this is an internal problem of regex(3) or a problem
with how sed(1) uses regex(3).
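
The steps above can be illustrated with a small toy model. This is only a
sketch of the behaviour I am guessing at, not the actual regex(3) or sed(1)
code; the names is_word_char, bow_matches and notbol are made up for the
illustration.

```python
def is_word_char(c):
    # re_format(7): a word character is an alnum character or an underscore.
    return c.isalnum() or c == "_"


def bow_matches(buf, pos, notbol=False):
    """Toy model: can [[:<:]] match at offset pos of buf?

    notbol is a hypothetical stand-in for "the start of buf is not a real
    line start".
    """
    if pos >= len(buf) or not is_word_char(buf[pos]):
        return False  # a word must start here
    if pos == 0:
        # No previous character is visible. If the caller says this is not
        # a real line start, the matcher has nothing to go on and refuses.
        return not notbol
    # The previous character is visible, so it can simply be inspected.
    return not is_word_char(buf[pos - 1])


line = "cd0 cd1 xx"
rest = line[4:]  # "cd1 xx", what remains after "cd0 " is removed

print(bow_matches(line, 0))               # first match, real line start
print(bow_matches(rest, 0, notbol=True))  # second attempt, substring only
print(bow_matches(line, 4))               # same position, full line visible
```

In this model the second call is the only one that fails, and it fails purely
because the preceding space is no longer visible to the matcher.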

-- 
Andriy Gapon


