Date:      Fri, 6 Oct 2006 21:20:20 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Vivek Khera <vivek@khera.org>
Cc:        stable@freebsd.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: ffs snapshot lockup
Message-ID:  <20061006182020.GH26993@deviant.kiev.zoral.com.ua>
In-Reply-To: <F314836D-7A2D-416A-8705-33F6E2EE6196@khera.org>
References:  <555B84D2-520F-44D6-84D6-CF9CE7EE47C7@khera.org> <20060922203654.GA65693@xor.obsecurity.org> <847DD3A5-D5DD-4D3E-B755-64B13D1DA506@khera.org> <20061003084315.GA89654@deviant.kiev.zoral.com.ua> <40CE3CF0-49D2-4335-A0B8-34B5251E9E19@khera.org> <20061005083027.GK89654@deviant.kiev.zoral.com.ua> <5178C89F-B645-4A82-A7C9-FC09D458FE30@khera.org> <20061006073950.GD26993@deviant.kiev.zoral.com.ua> <20061006175714.GA15880@xor.obsecurity.org> <F314836D-7A2D-416A-8705-33F6E2EE6196@khera.org>


On Fri, Oct 06, 2006 at 02:11:05PM -0400, Vivek Khera wrote:
>
> On Oct 6, 2006, at 1:57 PM, Kris Kennaway wrote:
>
> >>This is very strange. Your 3 instances of getty were just reading the
> >>tty input, and all suspect processes (like sshd) are waiting on net
> >>events. No processes are blocked on the fs. One nfsd is serving
> >>the request, and dump is active.
> >
> >To repeat something I said earlier: when creating a snapshot
> >(e.g. which dump -L does), the entire system may become unresponsive
> >until the snapshot completes, which can take many minutes.
>
> I know snapshot takes a while -- we're used to that.
>
> >How long are you waiting before pronouncing the system deadlocked?
> >
>
> 10's of minutes.
>
> >What does ^T on the console (e.g. when trying to log in), show you?
>
There was no snapshot creation in progress. The snapshot had already been
made, and dump was happily running at the moment captured in the script.

> nothing.  the console is non-responsive.  the remote shells are
> non-responsive to any input.
>
> I'm now convinced it was all stemming from some bug in the bge driver (at
> least for my specific chipset.)  Last night I put in an old spare
> 3c905 NIC and turned off the motherboard bge via BIOS.
>
> I can't make the machine lock up at all, even with the watchdog
> running, and doing level0 dumps.
>
> Also, even though this NIC is only 10/100 and the prior was running
> at GigE speed, the system is *way* more responsive to network
> operations.  For example, when I logged in this morning my IMAP mail
> client took barely a second or so to open my inbox, whereas before
> it would take upwards of 10 seconds.
>
> This machine was always this way since it was first set up running
> 5.3.  I can't believe I lived with it for so long...  I'd like to
> find a nice stable GigE NIC for it, since I know that the onboard bge
> is definitely sub-optimal with FreeBSD.  Dell's diagnostics don't
> find any hardware fault, for what that's worth.
>
> Curiously, I have a handful of other Dell servers at the office which
> all have bge and run just great at GigE speed to the same switch.
>
> If it does lock up again, I'll be sure to let you know!
>
Was this system patched with the changes I sent you?






Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20061006182020.GH26993>