Date:      Fri, 4 Nov 2005 11:29:12 -0600
From:      Kirk Strauser <kirk@strauser.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: Fast diff command for large files?
Message-ID:  <200511041129.17912.kirk@strauser.com>
In-Reply-To: <436B8ADF.4000703@mac.com>
References:  <200511040956.19087.kirk@strauser.com> <436B8ADF.4000703@mac.com>

On Friday 04 November 2005 10:22, Chuck Swiger wrote:

> Multigigabyte?  Find another approach to solving the problem, a text-based
> diff is going to require excessive resources and time.  A 64-bit platform
> with 2 GB of RAM & 3GB of swap requires ~1000 seconds to diff ~400MB.

There really aren't many options.  For the patient, here's what's happening:

Our legacy application runs on FoxPro.  Our web application runs on a
PostgreSQL database that's a mirror of the FoxPro tables.

We do the mirroring by running a program that dumps the FoxPro tables out as
tab-delimited files.  Thus far, we'd been using PostgreSQL's "copy from"
command to read those files into the database.  In reality, though, a very,
very small percentage of rows in those tables actually change.  So, I wrote
a program that takes the output of diff and converts it into a series of
"delete" and "insert" commands; benchmarking shows that this is roughly 300
times faster in our use.
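
For readers who want to picture the conversion step, here is a minimal
sketch in Python.  It is not the actual program described above, only an
illustration under assumptions: the input is default-format "diff old new"
output, rows are tab-delimited, and the first field is a unique key.  The
names diff_to_sql and sql_quote, the table "customers", and the key column
"id" are all hypothetical.

```python
import sys


def sql_quote(value):
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def diff_to_sql(diff_lines, table, key_column):
    """Turn default-format `diff old new` output into SQL statements.

    Assumption: rows are tab-delimited and the first field is a unique key.
    """
    deletes, inserts = [], []
    for line in diff_lines:
        line = line.rstrip("\n")
        if line.startswith("< "):
            # Row present only in the old dump: remove it by key.
            key = line[2:].split("\t")[0]
            deletes.append(
                f"DELETE FROM {table} WHERE {key_column} = {sql_quote(key)};"
            )
        elif line.startswith("> "):
            # Row present only in the new dump: insert all of its fields.
            values = ", ".join(sql_quote(f) for f in line[2:].split("\t"))
            inserts.append(f"INSERT INTO {table} VALUES ({values});")
        # Hunk headers ("3c3", "5a6") and "---" separators carry no row data.
    # Emit every delete first so a changed row never collides with its old key.
    return deletes + inserts


if __name__ == "__main__":
    # Hypothetical usage: diff old.tsv new.tsv | python diff_to_sql.py
    for statement in diff_to_sql(sys.stdin, "customers", "id"):
        print(statement)
```

A changed row shows up in the diff as one "<" line and one ">" line, so it
becomes a delete followed by an insert.  The quoting here is deliberately
simple and treats every field as text; a real loader would more likely pass
the values as query parameters through its database driver.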

And that's why I need a fast diff.  Even if it takes as long as the database
bulk loads, we can run it on another server and use 20 seconds of CPU for
PostgreSQL instead of 45 minutes.  The practical upshot is that the
database will never get sluggish, even if the other "diff server" is loaded
to the gills.
-- 
Kirk Strauser



