From owner-freebsd-current@FreeBSD.ORG Thu Oct 2 13:51:22 2003
Message-Id: <200310022051.h92Kp6N1039873@gw.catspoiler.org>
Date: Thu, 2 Oct 2003 13:51:06 -0700 (PDT)
From: Don Lewis
To: tlambert2@mindspring.com
In-Reply-To: <3F7C5A33.A13BC11C@mindspring.com>
cc: freebsd-current@FreeBSD.org
Subject: Re: Improvements to fsck performance in -current ...?
List-Id: Discussions about the use of FreeBSD-current

On 2 Oct, Terry Lambert wrote:
> Jens Rehsack wrote:
>> Kevin Oberman wrote:
>> > Current has two major changes re speeding up fsck.
>> >
>> > The most significant is the background operation of fsck on file
>> > system with soft updates enabled. Because of the way softupdates
>> > works, you are assured of metadata consistency on reboot, so the file
>> > systems can be mounted and used immediately with fsck started up in
>> > the background about a minute after the system comes up.
>>
>> Be careful what you promise :-)
>> Most new disks have an own disk cache and some of them have a
>> write cache enabled.
>> In case of a hardware failure (or power failure) this data may get
>> lost and the disk's metadata isn't consistent. It's only when no
>> write cache below the system is active.
>
> Actually, write caching is not so much the problem, as the disk
> reporting that the write has completed before the contents of
> the transaction saved in the write cache have actually been
> committed to stable storage.
>
> Unfortunately, IDE disks do not permit disconnected writes, due
> to a bug in the original IDE implementation, which has been
> carried forward for [insert no good reason here].
>
> Therefore IDE disks almost universally lie to the driver any
> time write caching is enabled on an IDE drive.
>
> In most cases, if you use SCSI, the problem will go away.

Nope, they "lie" as well unless you turn off the WCE bit. Fortunately,
with tagged command queuing there is very little performance penalty
for doing this in most cases. The main exception is when you run
newfs, which talks to the raw partition and only has one command
outstanding at a time.

Back in the days when our SCSI implementation would spam the console
whenever it reduced the number of tagged openings because the drive
indicated that its queue was full, I'd see the number of tagged
openings stay at 63 if write caching was disabled, but the number
would drop significantly under load (50%?) if write caching was
enabled. I always suspected that the drive's cache was full of data
for write commands that it had indicated to the host as being complete
even though the data hadn't been written to stable storage.

Unfortunately, SCSI drives all seem to ship with the WCE bit set,
probably for "benchmarking" reasons, so I always have to remember to
turn this bit off whenever I install a new drive.
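
For readers unfamiliar with the WCE bit discussed above: it is the Write Cache Enable flag in the SCSI Caching mode page (page code 0x08), stored in byte 2, bit 2 of that page. The following is only a minimal sketch of testing and clearing that bit, assuming the raw page bytes have already been fetched with a MODE SENSE by some other tool. The function names and the sample page contents are hypothetical and are not taken from the message or from any FreeBSD utility.

```python
CACHING_PAGE_CODE = 0x08  # SCSI Caching mode page
WCE_MASK = 0x04           # byte 2, bit 2: Write Cache Enable


def _check_caching_page(page: bytes) -> None:
    """Reject buffers that cannot be a Caching mode page."""
    if len(page) < 3:
        raise ValueError("mode page too short")
    # The low six bits of byte 0 hold the page code; the top bit is PS.
    if page[0] & 0x3F != CACHING_PAGE_CODE:
        raise ValueError("not a Caching mode page")


def wce_enabled(page: bytes) -> bool:
    """Return True if the drive reports write caching as enabled."""
    _check_caching_page(page)
    return bool(page[2] & WCE_MASK)


def clear_wce(page: bytes) -> bytes:
    """Return a copy of the page with the WCE bit turned off."""
    _check_caching_page(page)
    out = bytearray(page)
    out[2] &= ~WCE_MASK & 0xFF
    return bytes(out)


if __name__ == "__main__":
    # Made-up 20-byte page: PS set, page code 0x08, length 0x12, WCE set.
    sample = bytes([0x88, 0x12, 0x14]) + bytes(17)
    print(wce_enabled(sample))             # True
    print(wce_enabled(clear_wce(sample)))  # False
```

The sketch only manipulates a byte buffer; actually writing the modified page back to a drive requires a MODE SELECT issued through the operating system's SCSI pass-through interface (on FreeBSD, the mode page editor in camcontrol(8) is the usual route), which is outside what this example attempts.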