Date:      Fri, 13 Mar 1998 15:50:31 -0500
From:      "Gary Palmer" <gpalmer@FreeBSD.ORG>
To:        Donald Burr <dburr@POBoxes.com>
Cc:        FreeBSD Ports <freebsd-ports@FreeBSD.ORG>, FreeBSD Questions <freebsd-questions@FreeBSD.ORG>
Subject:   Re: Squid: Proxying for fun and profit 
Message-ID:  <21582.889822231@gjp.erols.com>
In-Reply-To: Your message of "Fri, 13 Mar 1998 07:20:37 PST." <XFMail.980313072037.dburr@POBoxes.com> 

Donald Burr wrote in message ID
<XFMail.980313072037.dburr@POBoxes.com>:
> -----BEGIN PGP SIGNED MESSAGE-----
> The catch, though, is that I don't want this automatic fetching to cross
> site boundaries.  For example, let's say I'm indexing
> http://www.freebsd.org, and I get along to a page mentioning a new device
> driver doohickey by Acme Computer (http://www.acme.com/).  I would like it
> to skip over www.acme.com -- i.e., only index www.freebsd.org pages.
> Obviously, this is so that my index thing doesn't run wild and try to
> download the entire Web to my computer, which I don't want!  [I do
> have a lot of disk space, but not THAT much! -- like Steven Wright said,
> "You can't have everything -- where would you put it?"]
> 
> Is there anything available (either in ports, or a Perl script that
> someone hacked up, etc.) that will do this?

Use a hacked-up version of webcopy that doesn't write to disk. You can make
webcopy use your proxy host, and it won't walk outside the hostname or path
that you start it on. That'll preload the pages into your proxy's cache very
nicely.
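
For illustration only, here is a rough Python sketch of the same idea. It is
not webcopy itself, and every name in it (in_scope, extract_links, preload,
max_pages) is made up for this example. It assumes a plain HTTP proxy, fetches
each page through it, throws the body away instead of writing it to disk, and
only follows links that stay on the starting host and under the starting
directory.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlsplit
from urllib.request import ProxyHandler, build_opener


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def in_scope(url, start_url):
    """True if url is on start_url's host and under its directory."""
    u, s = urlsplit(url), urlsplit(start_url)
    # Directory part of the starting path ("" and "/x.html" both give "/").
    base_dir = s.path.rsplit("/", 1)[0] + "/"
    return (u.scheme in ("http", "https")
            and u.hostname == s.hostname
            and (u.path or "/").startswith(base_dir))


def extract_links(html, page_url):
    """Return absolute, fragment-free URLs for every link in the page."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(page_url, href)).url for href in parser.links]


def preload(start_url, proxy, max_pages=100):
    """Fetch in-scope pages through the proxy; nothing is saved locally."""
    opener = build_opener(ProxyHandler({"http": proxy}))
    seen, queue = set(), [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            with opener.open(url, timeout=30) as resp:
                body = resp.read()  # read so the proxy fetches the whole page
                is_html = resp.headers.get_content_type() == "text/html"
        except OSError:
            continue  # skip pages that fail to load
        if is_html:
            for link in extract_links(body.decode("latin-1"), url):
                if link not in seen and in_scope(link, start_url):
                    queue.append(link)
    return seen
```

Something like preload("http://www.freebsd.org/", "http://localhost:3128")
would then warm the cache, where the proxy address is a placeholder for
wherever your Squid actually listens. The max_pages cap is an extra safety net
against a runaway crawl; the host and path check in in_scope is what keeps it
from wandering off to www.acme.com.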

Gary
--
Gary Palmer                                          FreeBSD Core Team Member
FreeBSD: Turning PC's into workstations. See http://www.FreeBSD.ORG/ for info


