Date:      Tue, 4 Nov 2003 14:25:20 -0800
From:      Kris Kennaway <kris@obsecurity.org>
To:        ticso@cicely.de
Cc:        Kris Kennaway <kris@obsecurity.org>
Subject:   Re: New alpha 5.x bug
Message-ID:  <20031104222520.GA72254@rot13.obsecurity.org>
In-Reply-To: <20031104221251.GJ42463@cicely12.cicely.de>
References:  <20031101103955.GA42891@rot13.obsecurity.org> <20031104031740.GA67484@rot13.obsecurity.org> <20031104124826.GH42463@cicely12.cicely.de> <20031104175552.GA70699@rot13.obsecurity.org> <20031104221251.GJ42463@cicely12.cicely.de>

On Tue, Nov 04, 2003 at 11:12:51PM +0100, Bernd Walter wrote:
> On Tue, Nov 04, 2003 at 09:55:52AM -0800, Kris Kennaway wrote:
> > On Tue, Nov 04, 2003 at 01:48:27PM +0100, Bernd Walter wrote:
> > > I can't speak for this problem yet, because my test systems are a bit
> > > older, but speaking for the pipe corruption:
> > > I did lots of bzip2, tar, scp, nfs(client) without noticing any
> > > sign of problem.
> > > What is so special with the port cluster?
> > > I have no clue about its design.
> >
> > It does lots of parallel package builds (untar, pkg_add, compile, tar)
> > and NFS copying.
>
> Any special NFS options?
> tcp, udp, v2, v3, IPv4, IPv6?

Here is a typical mount -v:

axp7# mount -v
216.136.204.23:/a/nfs/alpha/5.dir1 on / (nfs, read-only, fsid 00ff000404000000)
devfs on /dev (devfs, local, fsid 01ff000303000000)
/dev/md0c on /etc (ufs, local, writes: sync 732 async 64400, reads: sync 55109 async 8784, fsid 13d19a3f414b62cc)
/dev/md1c on /var (ufs, local, writes: sync 338 async 35923, reads: sync 22620 async 0, fsid 19d19a3f8ce9f280)
/dev/md2c on /tmp (ufs, local, writes: sync 12 async 20, reads: sync 13 async 0, fsid 1bd19a3fa96729ea)
/dev/da0e on /a (ufs, local, soft-updates, writes: sync 137415 async 13950800, reads: sync 12817244 async 903873, fsid f1d29a3f5b723dc6)
bento:/var/portbuild on /var/portbuild (nfs, fsid 02ff000404000000)
bento:/var/portbuild/alpha/5/ports on /a/tmp/5/chroot/24703/a/ports (nfs, read-only, fsid ebff120404000000)
bento:/var/portbuild/alpha/5/src on /a/tmp/5/chroot/24703/usr/src (nfs, read-only, fsid ecff120404000000)
bento:/var/portbuild/alpha/5/doc on /a/tmp/5/chroot/24703/usr/opt/doc (nfs, read-only, fsid edff120404000000)
devfs on /a/tmp/5/chroot/24703/dev (devfs, local, fsid eeff120303000000)
bento:/var/portbuild/alpha/5/ports on /a/tmp/5/chroot/25765/a/ports (nfs, read-only, fsid efff120404000000)
bento:/var/portbuild/alpha/5/src on /a/tmp/5/chroot/25765/usr/src (nfs, read-only, fsid f0ff120404000000)
bento:/var/portbuild/alpha/5/doc on /a/tmp/5/chroot/25765/usr/opt/doc (nfs, read-only, fsid f1ff120404000000)
devfs on /a/tmp/5/chroot/25765/dev (devfs, local, fsid f2ff120303000000)

The NFS mounts are nfsv3,intr,ro.

> Just to get the picture complete.
> The build is local and the package is then copied to a NFS server on
> which it has a corrupted CRC?

From my memory of tests I ran a few months ago, the bzip2 CRC is
corrupted when the package is created locally.  The package is copied
to the server via scp.

> Is the bzip2 CRC wrong, or the tar CRC (does tar have a CRC?), or both?

Again from memory, the file is truncated, and there might be some
garbage (e.g. zeros) at the end.
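
For anyone wanting to tell these failure modes apart on a given package
file, here is a minimal sketch in Python. It is only an illustration, not
something the package cluster runs; the function name check_bzip2 and the
four result labels are made up for this example. It relies on the standard
bz2 module: a stream that ends early leaves the decompressor without an
end-of-stream marker, and bytes after a complete stream show up in
unused_data.

```python
import bz2


def check_bzip2(data: bytes) -> str:
    """Classify a bzip2 byte string (hypothetical helper, for illustration).

    Returns one of: "ok", "truncated", "trailing-garbage", "corrupt".
    """
    dec = bz2.BZ2Decompressor()
    try:
        dec.decompress(data)
    except OSError:
        # libbz2 rejected the data: bad magic, bad block CRC, etc.
        return "corrupt"
    if not dec.eof:
        # All input was consumed but the end-of-stream marker never arrived.
        return "truncated"
    if dec.unused_data:
        # A complete stream was followed by extra bytes (e.g. zeros).
        return "trailing-garbage"
    return "ok"


if __name__ == "__main__":
    good = bz2.compress(b"package contents " * 1000)
    for label, blob in [
        ("intact", good),
        ("cut short", good[:-10]),
        ("zero padded", good + b"\0" * 16),
    ]:
        print(label, "->", check_bzip2(blob))
```

A stream that is cut short and then padded with zeros may be reported as
either "truncated" or "corrupt", depending on where the cut falls, so the
labels are a rough guide rather than a diagnosis.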

> Can you say how likely such a corruption is?

On the last build 42 packages were corrupted out of about 7500.

> Are other packages compiled during copying a package file to the server?

Yes.  Typically there are 5 builds running at a time on the client
machines.

> Are the building machines memory stressed while creating the bz file or
> while copying it?

The machines are definitely busy (building other packages) while the
package is created and copied, although the machines should not be
paging.

> Really - it's hard to believe that pipe itself is the problem.
> I do lots of buildworlds with CFLAGS=-pipe and a corruption would
> very likely stop building.

I know that the problem began between 5.1-R and August 6, but I have
not been able to track it down beyond this.  There was work on both
pipes and VM in that time period, which is why I am suspicious of
both.
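
One way to exercise the pipe path in isolation would be to push a known
payload through a pipe and compare length and checksum on the reading
side. The following is a rough sketch under that assumption, not a
reproduction of the actual failure; pipe_roundtrip_ok is a hypothetical
name, and a single clean pass on an idle machine would say little about
behaviour under the parallel build load described above.

```python
import hashlib
import os
import threading


def pipe_roundtrip_ok(payload: bytes, chunk: int = 65536) -> bool:
    """Write payload into a pipe from one thread, read it back in another,
    and report whether length and SHA-256 match (illustrative helper)."""
    r, w = os.pipe()

    def writer() -> None:
        # Closing the write end on exit lets the reader see EOF.
        with os.fdopen(w, "wb") as f:
            f.write(payload)

    t = threading.Thread(target=writer)
    t.start()

    digest = hashlib.sha256()
    total = 0
    with os.fdopen(r, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            digest.update(buf)
            total += len(buf)
    t.join()

    expected = hashlib.sha256(payload).digest()
    return total == len(payload) and digest.digest() == expected


if __name__ == "__main__":
    # Repeat with random data; any False would indicate a mismatch.
    results = [pipe_roundtrip_ok(os.urandom(1 << 20)) for _ in range(20)]
    print("all round trips matched:", all(results))
```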

Kris