Date: Tue, 30 Dec 2008 17:48:02 -0800
From: Gary Kline <kline@thought.org>
To: freebsd-questions@freebsd.org
Subject: Re: well, blew it... sed or perl q again.
Message-ID: <20081231014802.GB46220@thought.org>
In-Reply-To: <20081230211633.GA24525@marge.bs.l>
References: <20081230193111.GA32641@thought.org> <20081230211633.GA24525@marge.bs.l>
On Tue, Dec 30, 2008 at 10:16:33PM +0100, Bertram Scharpf wrote:
> Hi Gary,
>
> On Tuesday, 30 Dec 2008, 11:31:14 -0800, Gary Kline wrote:
> > The problem is that there are many, _many_ embedded
> > "<A HREF="http://whatever> Site</A> in my hundreds, or
> > thousands, of files. I only want to delete the
> > "http://<junkfoo.com>" lines, _not_ the other Href links.
> >
> > sed or perl?
>
> Ruby. Untested:
>
> $ ruby -i.bak -pe 'next if ~/href="([^"]*)"/i and $1 == "http://example.com"' somefile.html
>
> Probably you want to do something more sophisticated.
>
> Bertram

Hi Bertram,

Well, after about 45 minutes of mousing cut/paste/edit, plus editing scripts, I ain't there yet. If I use the

perl -e 'print unless "/m/http:/" || eof; close ARGV if eof' *.htm

there are no errors, but new.htm is == new.htm.bak; in other words, it looks like a partial match on just "http" fails. Don't know why. I'm pretty sure the entire "<A HREF="http://foobar.com"> xxx </A>" would do it.

Roland, the double quotes were necessary, it seems. Maybe I'll try parens.

gary

>
> --
> Bertram Scharpf
> Stuttgart, Deutschland/Germany
> http://www.bertram-scharpf.de

--
Gary Kline kline@thought.org http://www.thought.org Public Service Unix
http://jottings.thought.org http://transfinite.thought.org
The 2.17a release of Jottings: http://jottings.thought.org/index.php
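
A note on the perl attempt quoted above: as written, "/m/http:/" is a double-quoted string literal, not a pattern match, and a non-empty string is always true in Perl, so `print unless ...` never prints and no line is ever tested against "http". That alone would leave the output unchanged if the command was actually run with -p (the flags used are not shown in the message, so this part is a guess). The filtering Bertram describes can be sketched in Ruby roughly as follows. The method name drop_link_lines, the sample text, and the junkfoo.com URL are illustrative assumptions, not anything taken from Gary's files, and the sketch assumes each unwanted link sits on its own line with a properly quoted href.

```ruby
# Hypothetical helper (name is illustrative): return `text` without the
# lines that contain an href attribute whose value is exactly `url`.
def drop_link_lines(text, url)
  kept = text.each_line.reject do |line|
    # Collect every href value on this line (case-insensitive, so HREF
    # matches too) and check whether the unwanted URL is among them.
    line.scan(/href="([^"]*)"/i).flatten.include?(url)
  end
  kept.join
end

# Made-up sample input: one plain line, one junk link, one wanted link.
sample = <<~HTML
  <p>Keep this line.</p>
  <A HREF="http://junkfoo.com"> Site</A>
  <A HREF="http://www.thought.org"> Home</A>
HTML

# Prints the sample without the junkfoo.com line.
puts drop_link_lines(sample, "http://junkfoo.com")
```

Comparing the captured href value for equality, rather than matching a bare "http", is what keeps the other links in place. Running something like this over real files would still need a loop that reads each file and writes a backup first.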
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20081231014802.GB46220>