From owner-freebsd-questions Fri Mar 13 12:51:01 1998
Return-Path:
Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id MAA09316 for freebsd-questions-outgoing; Fri, 13 Mar 1998 12:51:01 -0800 (PST) (envelope-from owner-freebsd-questions@FreeBSD.ORG)
Received: from gjp.erols.com (alex-va-n008c243.moon.jic.com [206.156.18.253]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id MAA09288; Fri, 13 Mar 1998 12:50:47 -0800 (PST) (envelope-from gjp@gjp.erols.com)
Received: from gjp.erols.com (localhost.erols.com [127.0.0.1]) by gjp.erols.com (8.8.8/8.8.7) with ESMTP id PAA21586; Fri, 13 Mar 1998 15:50:31 -0500 (EST) (envelope-from gjp@gjp.erols.com)
X-Mailer: exmh version 2.0.1 12/23/97
To: Donald Burr
cc: FreeBSD Ports, FreeBSD Questions
From: "Gary Palmer"
Subject: Re: Squid: Proxying for fun and profit
In-reply-to: Your message of "Fri, 13 Mar 1998 07:20:37 PST."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 13 Mar 1998 15:50:31 -0500
Message-ID: <21582.889822231@gjp.erols.com>
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Donald Burr wrote in message ID :
> -----BEGIN PGP SIGNED MESSAGE-----
> The catch, though, is that I don't want this automatic fetching to cross
> site boundaries. For example, let's say I'm indexing
> http://www.freebsd.org, and I get along to a page mentioning a new device
> driver doohickey by Acme Computer (http://www.acme.com/). I would like it
> to skip over www.acme.com --ie only index www.freebsd.org pages.
> Obviously, this is so that my index thing doesn't run wild and try and
> download the entire Web to my computer, which I don't want! [I do
> have a lot of disk space, but not THAT much! -- like Steven Wright said,
> "You can't have everything -- where would you put it?"]
>
> Is there anything available (either in ports, or a Perl script that
> someone hacked up, etc.) that will do this?

Use a hacked up version of webcopy that doesn't write to disk.
You can make webcopy use your proxy host, and it won't walk outside the
hostname or path that you start it on. That'll preload the pages on your
proxy very nicely.

Gary
--
Gary Palmer                                          FreeBSD Core Team Member
FreeBSD: Turning PC's into workstations. See http://www.FreeBSD.ORG/ for info
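For readers without a patched webcopy to hand, the idea in the reply (fetch through the proxy, stay on the starting hostname and path, keep nothing on disk) can be roughed out in a few lines of Python. This is only a sketch of the technique, not webcopy and not anything posted in the thread: the function names (extract_links, in_scope, preload) are made up for illustration, and the scope rule (same host, same starting directory) is one assumed reading of "hostname or path".

```python
# Sketch only: not webcopy itself. extract_links, in_scope, and preload are
# hypothetical names; the scope rule below is an assumption.
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlsplit
import urllib.request


class _LinkParser(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute, fragment-free URLs for every <a href> in html."""
    parser = _LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, h)).url for h in parser.hrefs]


def in_scope(start_url, candidate_url):
    """True if candidate is on the start host and under its directory."""
    start, cand = urlsplit(start_url), urlsplit(candidate_url)
    if cand.scheme not in ("http", "https"):
        return False
    if cand.hostname != start.hostname:
        return False
    # Directory of the start URL: "/docs/index.html" -> "/docs/"
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    return (cand.path or "/").startswith(base_dir)


def preload(start_url, proxy_url, max_pages=100):
    """Fetch in-scope pages through the proxy and discard the bodies."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    )
    seen, queue = {start_url}, [start_url]
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.pop(0)
        try:
            with opener.open(url, timeout=30) as resp:
                body = resp.read()  # held in memory only, never written to disk
                is_html = resp.headers.get_content_type() == "text/html"
        except OSError:
            continue  # skip pages that cannot be fetched
        fetched += 1
        if not is_html:
            continue  # nothing to parse for links
        for link in extract_links(body.decode("latin-1"), url):
            if link not in seen and in_scope(start_url, link):
                seen.add(link)
                queue.append(link)
    return fetched
```

Something like preload("http://www.freebsd.org/", "http://localhost:3128") would then walk the site through the proxy; the proxy address here is an assumption, so substitute wherever your Squid actually listens. The max_pages cap is there so a mistake in the scope check can't run wild.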