Date: Tue, 27 Dec 2005 10:04:45 -0600
From: David Kelly <dkelly@hiwaay.net>
To: Jack Stone <antennex@hotmail.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: a SED need
Message-ID: <20051227160445.GA56368@Grumpy.DynDNS.org>
In-Reply-To: <BAY106-F1673797A89767CF16F02ECC370@phx.gbl>
References: <BAY106-F1673797A89767CF16F02ECC370@phx.gbl>

On Tue, Dec 27, 2005 at 09:18:56AM -0600, Jack Stone wrote:
> I have some HTML files with hundreds of URLs that I need to modify using a
> search/replace string. I assume that SED(1) is the right tool to use, but
> every syntax I've tried has not worked.
>
> Here is what I'm trying to do:
> Change full URLs to relative paths, in other words, chop off the
> "http://www.example.com/" portion:
>
> From this:
> <li><a href="http://www.example.com/model/many.html">
> To this:
> <li><a href="model/many.html">
>
> I think it is the slashes and quotes that are giving me fits as I'm very
> much a novice on SED(1) syntax.

I'm sure sed is the right high-powered production tool for getting the
job done, but I find such things easier in awk. I'm sure many say the
same about perl. Sed, awk, perl is the evolutionary order.

Save this as something like "example.awk" and chmod +x to make it
executable for easy reuse. Or you could "awk -f example.awk input > output".

By saving the script to a file you avoid having to escape characters for
the shell (the rules differ between csh and sh) on top of escaping them
for the RE parser. The backslash escapes below make sure each character
is matched literally rather than interpreted as an RE operator.

The script contains two rules. The first has a pattern that matches the
lines you want to change. That regular expression is repeated in gsub(),
where the match is replaced with the plain text you want. "print" outputs
the line, and "next" ends processing of that input line so the second
rule isn't tried. The second rule has no pattern, so it matches every
line and prints everything the first rule skipped.

#!/usr/bin/awk -f
/<a href=\"http:\/\/www\.example\.com\// {
	gsub(/<a href=\"http:\/\/www\.example\.com\//, "<a href=\"")
	print
	next
}
{ print }

-- 
David Kelly N4HHE, dkelly@HiWAAY.net
========================================================================
Whom computers would destroy, they must first drive mad.
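
For comparison, the same substitution can be sketched in sed, which is
what the original question asked about. This is only a minimal sketch and
not part of the message above: the function name strip_prefix is made up
for illustration, and using "|" as the delimiter of the s command is one
way to avoid escaping the slashes in the URL. The single quotes keep the
shell from touching the double quotes inside the expression.

```shell
#!/bin/sh
# strip_prefix is a hypothetical helper name, not from the original post.
# It reads HTML on stdin and writes it to stdout with the
# "http://www.example.com/" prefix removed from <a href="..."> attributes.
strip_prefix() {
    # "|" is the s-command delimiter, so "/" needs no escaping.
    # The dots are escaped so they match literal dots only.
    sed 's|<a href="http://www\.example\.com/|<a href="|g'
}

# Example: feed one line through the filter.
printf '%s\n' '<li><a href="http://www.example.com/model/many.html">' | strip_prefix
```

With real files this would be used as "strip_prefix < input.html >
output.html", the same input/output pattern as the awk command above.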