From: David Sze
To: Matthias Buelow
Cc: Greg Barniskis, uzi@bmby.com, freebsd-stable@freebsd.org
Date: Fri, 17 Jun 2005 13:57:26 -0500
Subject: Re: FreeBSD MySQL still WAY slower than Linux
Message-ID: <20050617185726.GD94284@mail.distrust.net>
In-Reply-To: <200506171620.j5HGKxwW042819@drjekyll.mkbuelow.net>

On Fri, Jun 17, 2005 at 06:20:59PM +0200, Matthias Buelow wrote:
> David Sze writes:
>
> >CentOS uses ext3 by default.
> >How does having a journal help if the
> >journal is stored on the same async filesystem? Unless the journal
> >writes are guaranteed sync.
>
> The journal guarantees that the filesystem will always be consistent. If
> a journal entry doesn't make it to disk, the operation has never
> happened; and the journal entry won't get removed, until the metadata
> update has been performed. So the worst thing that could happen is, that
> the same operation will be performed twice, once normally, and once at
> log replay on reboot. This is not an issue, since such metadata
> operations (delete file from directory, write a value into superblock,
> etc.) are usually idempotent.
>
> That's the basic function of all journalled filesystems, and that's why
> you don't need to run fsck on them. You don't need to write the journal
> synchronously, you can do these things in groups.
>
> The softupdates mechanism does something similar; only it doesn't
> maintain an on-disk journal, and hence needs fsck after boot to fix up
> the free block bitmaps and stuff (basically performing a garbage
> collection on the filesystem, which, unfortunately, is pretty slow).

I'm not sure filesystem consistency alone is "good enough". Say your
bank's database crashes right after you make a deposit. When it comes
back up it's consistent, but only up to 5 minutes before the crash due
to the async mount. For this type of application, something in the
system has to be keeping a "journal" on a sync mount in order for
recovery to be both consistent and correct.
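
To make the distinction concrete, here is a minimal sketch of the two
points above: replaying an idempotent journal entry twice is harmless,
but an entry that never reached the disk is simply gone after a crash.
It is a toy in-memory model, not how ext3 or softupdates are
implemented; the class ToyJournalFS, the function demo() and the
"balance" key are all made up for illustration.

```python
class ToyJournalFS:
    """Toy model (hypothetical): 'disk' state survives a crash, 'cache' does not."""

    def __init__(self):
        self.disk_journal = []  # journal entries that reached the disk
        self.disk_meta = {}     # metadata that reached the disk
        self.cache = []         # journal entries still only in memory

    def log(self, key, value, sync=False):
        """Journal an idempotent update ("set key to value")."""
        self.cache.append((key, value))
        if sync:
            self.flush()        # sync write: entry is on disk before we return

    def flush(self):
        """Group commit: push all cached journal entries to disk at once."""
        self.disk_journal.extend(self.cache)
        self.cache.clear()

    def apply_some(self, n):
        """Apply the first n on-disk journal entries to the metadata."""
        for key, value in self.disk_journal[:n]:
            self.disk_meta[key] = value

    def crash_and_replay(self):
        """Lose the cache, then replay the whole on-disk journal."""
        self.cache.clear()
        for key, value in self.disk_journal:
            self.disk_meta[key] = value  # repeating a "set" changes nothing
        return dict(self.disk_meta)


def demo(sync_deposit):
    fs = ToyJournalFS()
    fs.log("balance", 100)
    fs.flush()
    fs.apply_some(1)                           # applied once before the crash
    fs.log("balance", 150, sync=sync_deposit)  # the deposit
    return fs.crash_and_replay()               # applied again during replay


print(demo(sync_deposit=False))  # {'balance': 100}
print(demo(sync_deposit=True))   # {'balance': 150}
```

In both runs the replayed state is consistent, and applying the first
entry twice does no harm. Only the run where the deposit's journal
entry was written synchronously is also correct from the depositor's
point of view.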