Date:      Tue, 16 Jan 2007 16:21:52 -0500
From:      Kris Kennaway <kris@obsecurity.org>
To:        Willem Jan Withagen <wjw@withagen.nl>
Cc:        Scott Oertel <freebsd@scottevil.com>, Willem Jan Withagen <wjw@digiware.nl>, freebsd-stable@freebsd.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: running mksnap_ffs
Message-ID:  <20070116212152.GB1041@xor.obsecurity.org>
In-Reply-To: <45AD3BA4.8090505@withagen.nl>
References:  <200701161934.l0GJY1mh057095@ambrisko.com> <45AD3507.402@withagen.nl> <20070116203739.GA343@xor.obsecurity.org> <45AD3BA4.8090505@withagen.nl>

On Tue, Jan 16, 2007 at 09:55:00PM +0100, Willem Jan Withagen wrote:
> Kris Kennaway wrote:
> ......
>
> >>>The file-system would come to a stop, processes stuck on bio, snap-shots
> >>>not finishing etc.  This was caused by the system running out of usable
> >>>buffers.  The change forces them to be flushed every so often.  This is
> >>>independent of locking.  10 might be too aggressive.  Some scaling of
> >>>nbuf would probably be better.
> >>When I run mksnap_ffs it runs to the point where ANY access to the
> >>filesystem gives that process a lockup.
> >
> >Yes, that is expected.  Actually it begins when something accesses the
> >directory in which the snapshot is being made, since that causes the
> >parent directory to be locked...then something tries to access the
> >parent directory, which eventually cascades back to the root.
> >
> >>Getting the file system back is only thru "hard reboot". Trying to do it
> >>the gentle way locks the whole system.
> >
> >Or waiting until the snapshot operation finishes.  You (still) haven't
> >determined that it's actually hanging as opposed to just waiting for
> >the snapshot operation to finish.
>
> True, and that is what I was referring to.
>
> * I've let it run for 12 hours on 1,5T (that's why I asked for other
> 	experiences)
> * I looked at diskstats with gstat:
> 	it turned out that everything was idle for > 5 minutes
>
> Then I concluded that it was locked.

OK, that does sound like it's deadlocked.  You could try Doug's patch,
or it might be another (unknown) condition.  If so, you'll need to do
some additional debugging with a serial console to figure out what is
wrong.

Kris



