Date:      Wed, 25 Mar 2015 10:12:01 -0700
From:      Kirk McKusick <mckusick@mckusick.com>
To:        Benjamin Kaduk <kaduk@MIT.EDU>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Delete a directory, crash the system 
Message-ID:  <201503251712.t2PHC1R8090290@chez.mckusick.com>
In-Reply-To: <alpine.GSO.1.10.1503250018030.22210@multics.mit.edu> 

> Date: Wed, 25 Mar 2015 00:25:19 -0400 (EDT)
> From: Benjamin Kaduk <kaduk@MIT.EDU>
> To: Da Rock <freebsd-fs@herveybayaustralia.com.au>
> Subject: Re: Delete a directory, crash the system
> Cc: freebsd-fs@freebsd.org, mckusick@freebsd.org
> 
> On Tue, 24 Mar 2015, Da Rock wrote:
> 
>> On 03/25/15 00:16, Benjamin Kaduk wrote:
>> Not precisely, but the message is just a flash and there is no copying of it.
>> Anyway, inode 4 is the .sujournal file as expected; this means there is an
>> issue with the softupdates. Could this be narrowing it down (the OP to this
>> was also in this age of enlightenment, SU came in with 8.x didn't it?)?
> 
> Ah, SU+J could be quite relevant.  Soft-update journalling was enabled by
> default for a period of time, but I believe it was disabled because there
> were some scenarios where it was destabilizing.  CC-ing Kirk to improve on
> my lousy memory.

As far as I know SU+J is still on by default.

> Do you remember what version was used to install the system in question
> (i.e., create the filesystem in question)?  Please show the output of
> 'tunefs -p <filesystem>'
> 
>> So I did some fiddling with fsck, fsdb, find and stat; and got nowhere. I ran
>> fsck again and it gave me not much again. It did hint at some files in the
>> ports tree, so I cleaned up the ports tree to fresh install point, ran fsck
>> again and rebooted. So far so good, but I'm keeping my fingers crossed still.
> 
> It is probably important to note that 'fsck -F' and saying 'no' to "USE
> JOURNAL?" is the most relevant fsck invocation.
> 
>> This doesn't help the panics - they're still a pita when they happen. It does
>> help me resolve the issue this time though. But initiating this error in
>> testing is damn near impossible. What can we document here as a way to gather
>> data to determine how to resolve this issue? Given my luck with this, it's
>> bound to happen again at some point :)
> 
> I think actual diagnosis is beyond my expertise/time commitment at the
> moment.  I suspect that using tunefs to disable softupdate journalling
> will be a workaround, if that is what you are really interested in.
> 
> I'll let Kirk decide if he wants to debug more, but the answer may well be
> "no" if you're not running the latest ufs from -current.
> 
> -Ben

The suggestion to disable journalling is a good one. Journalling fixes
only consistency errors that it knows about and cannot handle media errors.
The sorts of panics you are getting are usually caused by media errors.
So disabling journalling and checking all metadata after crashes (which is
what fsck does) should minimize your problems.

	Kirk McKusick
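
For anyone following the thread, the workaround discussed above (inspect the
file system with tunefs, disable soft-update journalling, then force a full
fsck) might look roughly like the sketch below. It is not taken from the
original messages. The device name /dev/ada0p2 and the helper names run and
disable_suj_and_check are hypothetical, and the sketch assumes the file system
is unmounted or mounted read-only, which tunefs requires before it will change
anything. By default it only prints the commands; set DRY_RUN=0 to execute
them.

```shell
#!/bin/sh
# Sketch only: DEV is a hypothetical device name; substitute your own.
# Assumes the file system is unmounted or mounted read-only.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
DEV="${DEV:-/dev/ada0p2}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

disable_suj_and_check() {
    run tunefs -p "$1"          # show current settings, including SU+J state
    run tunefs -j disable "$1"  # turn off soft-update journalling
    run fsck_ffs -f "$1"        # force a full metadata check
}

disable_suj_and_check "$DEV"
```

Running it with the defaults shows the three commands in order, so they can be
reviewed before anything is changed on disk.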


