Date: Tue, 16 Jan 2007 16:21:52 -0500
From: Kris Kennaway
To: Willem Jan Withagen
Cc: Scott Oertel, Willem Jan Withagen, freebsd-stable@freebsd.org, Kris Kennaway
Subject: Re: running mksnap_ffs
Message-ID: <20070116212152.GB1041@xor.obsecurity.org>
In-Reply-To: <45AD3BA4.8090505@withagen.nl>

On Tue, Jan 16, 2007 at 09:55:00PM +0100, Willem Jan Withagen wrote:
> Kris Kennaway wrote:
> ......
>
> >>>The file-system would come to a stop, processes stuck on bio, snap-shots
> >>>not finishing etc. This was caused by the system running out of usable
> >>>buffers. The change forces them to be flushed every so often. This is
> >>>independent of locking. 10 might be too aggressive. Some scaling of
> >>>nbuf would probably be better.
> >>When I run mksnap_ffs it runs to the point where ANY access to the
> >>filesystem gives that process a lockup.
> >
> >Yes, that is expected. Actually it begins when something accesses the
> >directory in which the snapshot is being made, since that causes the
> >parent directory to be locked...then something tries to access the
> >parent directory, which eventually cascades back to the root.
> >
> >>Getting the file system back is only thru "hard reboot". Trying to do it
> >>the gentle way locks the whole system.
> >
> >Or waiting until the snapshot operation finishes. You (still) haven't
> >determined that it's actually hanging as opposed to just waiting for
> >the snapshot operation to finish.
>
> True, and that is what I was referring to.
>
> * I've let it run for 12 hours on 1,5T (that's why I asked for other
>   experiences)
> * I looked at diskstats with gstat:
>   that turned out that everything was idle for > 5 minutes
>
> Then I concluded that it was locked.

OK, that does sound like it's deadlocked. You could try Doug's patch,
or it might be another (unknown) condition. If so, you'll need to do
some additional debugging with a serial console to figure out what is
wrong.

Kris
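The "idle for > 5 minutes" check described above can be made a bit more systematic. What follows is only a rough sketch, not anything from the thread: it assumes you have noted down disk activity samples while mksnap_ffs was running, as (seconds since start, operations per second) pairs. That tuple format, the function names and the 300-second default are all made up for illustration; this is not gstat's output format. A long idle stretch only suggests a deadlock, it does not prove one.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical sample format: (seconds_since_start, ops_per_second).
# This is an assumption for illustration, not something gstat emits.
Sample = Tuple[float, float]


def longest_idle_stretch(samples: Sequence[Sample],
                         busy_threshold: float = 0.0) -> float:
    """Return the longest run, in seconds, in which every sample was idle.

    A sample counts as idle when its ops/s value is <= busy_threshold.
    Samples are expected in increasing time order.
    """
    longest = 0.0
    idle_start: Optional[float] = None
    for timestamp, ops in samples:
        if ops <= busy_threshold:
            if idle_start is None:
                idle_start = timestamp  # an idle run begins here
            longest = max(longest, timestamp - idle_start)
        else:
            idle_start = None  # any activity ends the idle run
    return longest


def looks_stalled(samples: Sequence[Sample],
                  min_idle_seconds: float = 300.0) -> bool:
    """True if the disks stayed idle for at least min_idle_seconds."""
    return longest_idle_stretch(samples) >= min_idle_seconds


if __name__ == "__main__":
    # One sample per minute, no disk activity at all for six minutes.
    quiet = [(60.0 * i, 0.0) for i in range(7)]
    print(looks_stalled(quiet))
```

If a burst of I/O shows up in the middle of the samples, the idle run restarts from there, so a slow but still progressing snapshot is less likely to be mistaken for a hang.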