Date:      Mon, 2 Oct 2000 23:35:07 +0100
From:      Mark Ovens <marko@freebsd.org>
To:        Christopher Rued <c.rued@xsb.com>
Cc:        "Andresen,Jason R." <jandrese@mitre.org>, freebsd-questions@FreeBSD.ORG
Subject:   Re: Perl question
Message-ID:  <20001002233507.A252@parish>
In-Reply-To: <14808.63902.442934.667120@chris.xsb.com>; from c.rued@xsb.com on Mon, Oct 02, 2000 at 05:09:50PM -0400
References:  <14808.52583.347797.384055@chris.xsb.com> <20001002191537.G252@parish> <20001002192617.I252@parish> <39D8D5D9.67A3074B@mitre.org> <14808.63902.442934.667120@chris.xsb.com>

On Mon, Oct 02, 2000 at 05:09:50PM -0400, Christopher Rued wrote:
> Andresen,Jason R. writes:
>  > > BTW, your RE should have a ``*'' as well:
>  > > 
>  > >         /x.*?y/
>  > > 
>  > 
>  > Maybe, it depends on exactly what he was trying to get.
>  > 
>  > If he wants the first 3-character match where x and y are the first
>  > and third characters respectively, then x.y is exactly what you want.
>  > The smallest set of characters that have x and y as boundary values?
>  > Then your x.*?y is correct.  The smallest set of characters that have
>  > x and y as boundaries and have at least one character in between them?
>  > x.+?y is needed.
> 
> The RE I used was precisely what I wanted: x.y (an `x' followed by
> exactly one character followed by a `y').
> 
> When I run the following:
> 
>     #!/usr/bin/perl
>     $a = "xayxbyxcyxdy";
>     @s = $a =~ /x.y/;
>     print "\@s is @s\n";
> 
> I get:
>    
>     @s is 1
> 
> 
> 
> So, I seem to be getting the truth value rather than the first match
> in the string.  If, however, I wrap the entire RE in parentheses
> (make it a subexpression) like so:
> 

Well, () is more than just a subexpression. Besides grouping, it causes
whatever is matched to be remembered so that it can be recalled later (using
\1, \2, etc. inside the pattern, or $1, $2, etc. after the match), similar
to \(...\) in sed(1).
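
To illustrate the remembering, here is a minimal sketch. The variable names
$str and $doubled are made up for the example, and the second test string is
not from your script; it is only there to show \1 at work:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = "xayxbyxcyxdy";

# Parentheses capture: after a successful match, $1 holds the text
# matched by the first group.
if ($str =~ /(x.y)/) {
    print "captured: $1\n";
}

# \1 refers back to the first group from inside the same pattern.
# Here it requires the same character after both occurrences of x.
my $doubled = "xayxay";
if ($doubled =~ /x(.)yx\1y/) {
    print "repeated middle character: $1\n";
}
```

The first test should print "captured: xay" and the second "repeated middle
character: a".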

>     #!/usr/bin/perl
>     $a = "xayxbyxcyxdy";
>     @s = $a =~ /(x.y)/;
>     print "\@s is @s\n";
> 
> I get the results I wanted to begin with:
> 
>     @s is xay
> 
> (I discovered this shortly after I sent the first message about this).
> 
> 
> 
> What confuses me is that if I specify the global option, I do not need
> to use a subexpression.  For example, if I run the following code:
> 
>     #!/usr/bin/perl
>     $a = "xayxbyxcyxdy";
>     @s = $a =~ /x.y/g;
>     print "\@s is @s\n";
> 
> I get:
> 
>     @s is xay xby xcy xdy
> 
> 
> So, this leaves me with a couple of questions, the main one being:
>   Why the different treatment for single matches and global
>   matches?
> 
> and a less important one:
>   Why is there no way to have the first match assigned to a scalar,
>   since we can be sure that there will be at most one match returned?
> 

AIUI, the construct ``$a =~ /x.y/;'' just returns TRUE or FALSE and thus is
used in if():

	if ($a =~ /x.y/) {

		.....
	}

I would guess that if you specify a global match, or use () to memorize the
match then perl(1) saves it because it is reasonable to assume that you
require more than TRUE or FALSE.

If you get a definitive answer I'd be interested in knowing what it is.

> 
> 
> If anyone can explain this, and/or answer the questions posed above,
> I'd appreciate it.
> 
> 						-Chris

-- 
		4.4 - The number of the Beastie
________________________________________________________________
51.44°N  FreeBSD - The Power To Serve http://www.freebsd.org
2.057°W  My Webpage http://ukug.uk.freebsd.org/~mark
mailto:marko@freebsd.org                http://www.radan.com






