From owner-freebsd-current@FreeBSD.ORG Wed Jul 10 19:34:24 2013
Message-Id: <201307101934.r6AJYIIR014432@chez.mckusick.com>
To: Adrian Chadd
Subject: Re: Kernel crash during heavy disk access
Date: Wed, 10 Jul 2013 12:34:18 -0700
From: Kirk McKusick
Cc: Benjamin Kaduk, Jeff Roberson, Eric Camachat, current@freebsd.org
List-Id: Discussions about the use of FreeBSD-current

> Date: Tue, 9 Jul 2013 18:29:01 -0700
> Subject: Re: Kernel crash during heavy disk access
> From: Adrian Chadd
> To: Benjamin Kaduk, Jeff Roberson,
>     Kirk McKusick
> Cc: Eric Camachat, current@freebsd.org
>
> Well, best to tell kirk and jeffr.
>
> Jeffr wrote the journaling stuff.
>
> .. but I thought they knew there's still problems?
>
> -adrian

Jeff has fixed all the journaling issues that we have some way of
reproducing. We do still have some reports of "problems", but they
come with only a vague description and nothing that we can use to
reproduce them on our systems.

One of the inherent characteristics of any type of journaling is
that once it thinks that it has fixed something, it never goes back
and checks it again later. So, if some inconsistency gets into your
filesystem through a media error or an earlier journaling bug, it
will stay there and continue to plague you until a full fsck is run
to clean it up.

So, if you are getting filesystem-related crashes, the first thing
you should do is a full check (fsck -f) to make sure that you are
starting from a clean state. After that, if you find that the
journaling is not keeping it consistent, please send Jeff and me a
report of what you are doing, what problems it creates, and, most
importantly, a transcript of a run of `fsck_ffs -d' first using the
journal and then a second time with a full check (fsck_ffs -f -d) so
that we can try to analyse what is going wrong. Note that you need
to run fsck_ffs explicitly because the fsck front end will not pass
the -d (debug output) flag through to fsck_ffs.

	Kirk McKusick

> On 9 July 2013 17:48, Benjamin Kaduk wrote:
>> On Tue, 9 Jul 2013, Adrian Chadd wrote:
>>
>>> On 9 July 2013 09:24, Eric Camachat wrote:
>>>>
>>>> On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Try doing a full, non-journal fsck.
>>>>>
>>>>> -adrian
>>>>
>>>>
>>>> Thank you, it fixed the problem!
>>>> Does it mean journal didn't work?
>>>
>>>
>>> Yup :(
>>
>>
>> So, you are going to tell Kirk about it?
>>
>> -Ben
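
For anyone putting together the report requested in the message, the
two fsck_ffs runs can be sketched roughly as follows. This is only an
illustration: the function name fsck_debug_cmds and the device path
/dev/ada0p2 are hypothetical placeholders, and only the -d and -f
flags come from the message itself. The function prints the commands
rather than running them, so it is safe to try anywhere.

```shell
#!/bin/sh
# Print the two fsck_ffs invocations described in the message, in order.
# The device path is a hypothetical placeholder supplied by the caller.
fsck_debug_cmds() {
    dev=$1
    # First pass: debug output, with fsck_ffs using the journal.
    printf 'fsck_ffs -d %s\n' "$dev"
    # Second pass: forced full check, again with debug output.
    printf 'fsck_ffs -f -d %s\n' "$dev"
}

# Example: show the commands for a hypothetical device.
fsck_debug_cmds /dev/ada0p2
```

The commands are run in this order so that the journal-based pass is
recorded before the full check repairs anything. Each run's output
would need to be saved (for example by redirecting it to a file) to
produce the transcripts, and the filesystem should not be mounted
read-write while it is being checked.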