Date:      Wed, 19 Mar 2003 11:18:48 +0000
From:      Wayne Pascoe <freebsd@penguinpowered.org.uk>
To:        freebsd-questions@freebsd.org
Subject:   Checking an out of date website
Message-ID:  <20030319111848.GA76626@marvin.penguinpowered.org.uk>

Hi all,

I'm wondering if someone can recommend a tool to help me do the
following:

I have a very large website (2.4GB), and much of its content is out of
date or no longer used.

What I want is a tool that will download a copy of the website, so that
I can see which bits are still linked to and part of the site hierarchy.

If I match the list of files from this tool against the list of files on
the server, the difference should be all the files that are not linked
to from the site and cannot be reached by navigation. In theory, I could
then safely delete these files.
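
The comparison described above could be sketched roughly as follows, in
Python. This is only an illustration under some assumptions: the spider
is assumed to have saved its copy into a local mirror directory with the
same relative paths as the server's document root, and the names
list_files and unlinked_files are made up for this example.

```python
import os
import sys


def list_files(root):
    """Return the relative path of every file under root, using '/' separators."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return found


def unlinked_files(server_files, crawled_files):
    """Files that exist on the server but were never fetched by the spider."""
    return sorted(set(server_files) - set(crawled_files))


if __name__ == "__main__":
    # Hypothetical usage: python compare.py /path/to/docroot /path/to/mirror
    doc_root, mirror = sys.argv[1], sys.argv[2]
    for path in unlinked_files(list_files(doc_root), list_files(mirror)):
        print(path)
```

The output is only a list of candidates for deletion. Anything the
spider could not reach would show up in it as well, so the list would
need checking by hand before removing anything.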

I've found a couple of tools that can mostly do this. My problem is
that much of the navigation is done in Flash, so most of the link
checkers / spiders can't follow these links.

This could mean that the copy I end up with is missing parts of the
site that are still active.

Does anyone have any advice for a solution to this problem?

Thanks in advance,

-- 
Wayne Pascoe




