Date:      Thu, 2 Aug 2012 08:04:47 -0600 (MDT)
From:      Warren Block <wblock@wonkity.com>
To:        RW <rwmaillists@googlemail.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: buggy awk regex handling?
Message-ID:  <alpine.BSF.2.00.1208020759350.80875@wonkity.com>
In-Reply-To: <20120802141738.62ef1e45@gumby.homeunix.com>
References:  <743721353.9443.1343906452119.JavaMail.sas1@172.29.249.242> <20120802141738.62ef1e45@gumby.homeunix.com>

On Thu, 2 Aug 2012, RW wrote:

> On Thu, 02 Aug 2012 13:20:52 +0200
> kaltheat wrote:
>
>> I tried to replace three letters with three letters by awk using the
>> sub-routine. I assumed that my regular expression does mean the
>> following:
>>
>> match if three letters of any letter of alphabet occurs anywhere in
>> input
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
>> AbC
>>
>> As you can see the result was unexpected.
>> When I try doing it for at least one letter, it works:
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
>> cBa
>> ...
>> What am I doing wrong?
>> Or is awk buggy?
>
> Traditional awk implementations don't support {n}, but I think POSIX
> implementations should.

Running the same command with gawk instead of awk bears that out.
Printing the return value of sub() (the number of substitutions
performed) makes the difference a little clearer:

% echo AbC | awk '{print sub(/[[:alpha:]]{3}/,"cBa"); print;}'
0
AbC

% echo AbC | gawk '{print sub(/[[:alpha:]]{3}/,"cBa"); print;}'
1
cBa
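
To check which behavior a particular awk has, the same test can be wrapped in a small probe.  This is only a sketch: awk_supports_intervals is a made-up helper name, and it assumes the awk being tested accepts [[:alpha:]] at all, as the one above does.

```shell
# Hypothetical helper: succeeds if the awk command named in $1 treats {3}
# as an interval expression, i.e. sub() reports one substitution on "AbC".
awk_supports_intervals() {
    [ "$(echo AbC | "$1" '{ print sub(/[[:alpha:]]{3}/, "cBa") }' 2>/dev/null)" = "1" ]
}

# Probe whichever of these happen to be installed.
for impl in awk gawk; do
    if ! command -v "$impl" >/dev/null 2>&1; then
        echo "$impl: not installed"
    elif awk_supports_intervals "$impl"; then
        echo "$impl: {n} supported"
    else
        echo "$impl: {n} not supported"
    fi
done
```

The result depends entirely on which implementations and versions are installed, so no particular output should be expected.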

sed can handle it when given -E for extended regular expressions:

% echo AbC | sed -E 's/[[:alpha:]]{3}/cBa/'
cBa
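
If it has to be awk and {n} is not available, one possible workaround is to write the repetition out by hand.  A minimal sketch, assuming only that the awk in use supports [[:alpha:]] (the + example quoted above suggests it does); the variable name result is just for illustration:

```shell
# Spell out three character classes in a row instead of using {3}.
# First output line is the return value of sub(), second is the record.
result=$(echo AbC | awk '{ n = sub(/[[:alpha:]][[:alpha:]][[:alpha:]]/, "cBa"); print n; print }')
echo "$result"
```

This gets unwieldy for large counts, but it avoids depending on interval support.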


