From: Willem Jan Withagen
Date: Tue, 16 Jan 2007 21:55:00 +0100
To: Kris Kennaway
Cc: freebsd-stable@freebsd.org, Scott Oertel, Willem Jan Withagen
Subject: Re: running mksnap_ffs
Message-ID: <45AD3BA4.8090505@withagen.nl>
In-Reply-To: <20070116203739.GA343@xor.obsecurity.org>

Kris Kennaway wrote:
......
>>> The file system would come to a stop, processes stuck on bio, snapshots
>>> not finishing etc. This was caused by the system running out of usable
>>> buffers. The change forces them to be flushed every so often. This is
>>> independent of locking. 10 might be too aggressive. Some scaling of
>>> nbuf would probably be better.
>> When I run mksnap_ffs it runs to the point where ANY access to the
>> filesystem gives that process a lockup.
>
> Yes, that is expected. Actually it begins when something accesses the
> directory in which the snapshot is being made, since that causes the
> parent directory to be locked...then something tries to access the
> parent directory, which eventually cascades back to the root.
>
>> Getting the file system back is only possible through a "hard reboot".
>> Trying to do it the gentle way locks the whole system.
>
> Or waiting until the snapshot operation finishes. You (still) haven't
> determined that it's actually hanging as opposed to just waiting for
> the snapshot operation to finish.

True, and that is what I was referring to.
 * I've let it run for 12 hours on 1.5 TB
   (that's why I asked for other experiences).
 * I looked at the disk statistics with gstat:
   everything was idle for > 5 minutes.
Then I concluded that it was locked.

If you can give me a fair estimate of the time, and it is less than 1 day,
I'll be willing to let it sit for that long. But I'm not going to wait
forever. :)

--WjW