From owner-freebsd-current@FreeBSD.ORG Tue Sep 30 15:11:51 2003
To: "Marc G. Fournier"
In-Reply-To: Message from "Marc G. Fournier" <20030930183437.Y94686@ganymede.hub.org>
Date: Tue, 30 Sep 2003 15:11:48 -0700
From: "Kevin Oberman"
Message-Id: <20030930221148.54E7E5D07@ptavv.es.net>
cc: freebsd-current@freebsd.org
Subject: Re: Improvements to fsck performance in -current ...?

> Date: Tue, 30 Sep 2003 18:42:21 -0300 (ADT)
> From: "Marc G. Fournier"
> Sender: owner-freebsd-current@freebsd.org
>
> Due to an electrician flipping the wrong circuit breaker this morning, I
> had my servers go down hard ...
> they are all -STABLE, with one of the four
> taking a *very* long time to fsck:
>
> jupiter# ps aux | grep fsck
> root 361 99.0 2.3 95572 95508 p0 R+ 4:21PM 121:13.21 fsck -y /dev/da0s1h
> jupiter# date
> Tue Sep 30 18:37:02 ADT 2003
> jupiter#
>
> Now, CPU time is rising, so I figure it's still working away, and fsck
> shows:
>
> jupiter# fsck -y /dev/da0s1h
> ** /dev/da0s1h
> ** Last Mounted on /vm
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
>
> so it isn't finding any errors ...
>
> A friend of mine asked if we had a journalling file system, which I told
> him no, as I don't believe we do ... but are/have there been any
> improvements to fsck in -CURRENT to improve performance on large file
> systems (this is a 6x36G RAID5 system)? Does UFS2 address any of this?
>
> I've actually had a 6x18gig RAID5 file system once take 11+hrs to fsck ...
> and when it was completed, everything seemed fine, with no reports of any
> file or directory corruption ... it obviously did a good job of checking
> the file system, just hate the lengthy downtime ...

Current has two major changes that speed up fsck. The most significant is
background operation of fsck on file systems with soft updates enabled.
Because of the way soft updates work, you are assured of metadata
consistency on reboot, so the file systems can be mounted and used
immediately, with fsck started in the background about a minute after
the system comes up. Until that fsck completes, it is possible (likely)
that some free blocks on the file system will be unavailable, but that
should be the only issue.

The other change is a significant speedup in the time fsck takes to run.
On my little 30 MB /usr partition it now takes only seconds to fsck vs.
about 2 minutes when I was running V4 on the system. On huge systems, I
suspect the speedup is even more significant, but I don't know for sure.
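
For reference, here is a minimal sketch of the configuration side of background fsck. It assumes the rc.conf knobs `background_fsck` and `background_fsck_delay`; the delay knob in particular may not exist on every -CURRENT snapshot, so check both against rc.conf(5) on your system before relying on them. Soft updates themselves are a per-file-system setting, normally toggled with tunefs(8) while the file system is unmounted.

```shell
# /etc/rc.conf fragment (sketch only; verify the names in rc.conf(5) on your release)
background_fsck="YES"        # allow the boot-time check to defer eligible file systems to a background fsck
background_fsck_delay="60"   # assumed knob: seconds to wait after boot before the background fsck starts
```

The 60-second value simply mirrors the "about a minute" delay described above; it is not a tuning recommendation.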

I suspect that these enhancements may both require that soft updates be
enabled for the file systems.

-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net			Phone: +1 510 486-8634