FreeBSD Mail Archives

Date:      Tue, 27 Dec 2011 13:53:55 -0800
From:      David Thiel <lx@redundancy.redundancy.org>
To:        freebsd-current@freebsd.org
Subject:   SU+J systems do not fsck themselves
Message-ID:  <20111227215330.GI45484@redundancy.redundancy.org>

I've now had multiple machines running SU+J (9.0-RC3 on amd64 and i386,
and earlier 9-CURRENT on ppc) start having unexplained panics and
crashes related to disk I/O. When I end up running a full fsck, it
keeps turning out that the disk is dirty and corrupted, but SU+J has no
mechanism in place to detect and fix this. A bgfsck never happens, but a
manual fsck in single-user mode does indeed fix the crashing and weird
behavior. Others have tested their SU+J volumes and found them to have
errors as well. This makes me super nervous.
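
For anyone wanting to test a volume the same way, here is a minimal
sketch of a forced, read-only check. It assumes fsck_ffs(8) with its
documented -f (check even if marked clean) and -n (answer "no" to every
prompt, so nothing is written) flags. The helper names and the device
path are hypothetical, and whether -f bypasses the journal on a given
release is an assumption worth confirming against the man page. Results
are only meaningful if the file system is unmounted or mounted
read-only, e.g. in single-user mode.

```python
import subprocess


def full_check_command(device):
    """Build a forced, read-only fsck_ffs command line for one UFS device.

    -f: check the file system even if it is marked clean.
    -n: answer "no" to all prompts, so the disk is not modified.
    """
    return ["fsck_ffs", "-f", "-n", device]


def check_device(device):
    """Run the check and return the exit status of fsck_ffs unchanged."""
    result = subprocess.run(full_check_command(device))
    return result.returncode
```

Calling check_device("/dev/ada0p2") (a placeholder device name) prints
the usual fsck phases to the terminal; any inconsistencies it lists
would still need a real repair run without -n.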

Basically, the way SU+J seems to operate is this:

http://redundancy.redundancy.org/fscklog2

"Oh hey, I see you shut down uncleanly, let's check everything looks 
good, off you go, whee"

Until I actually go and fsck, when I get:

http://redundancy.redundancy.org/fscklog1

So, I understand that journalling doesn't remove the occasional need
for a fsck (though I never had this problem with gjournal), but
without a way for the system to detect that a fsck is necessary, this
seems like a pretty much guaranteed recipe for data corruption, and it
seems to offer little to no benefit over plain SU+fsck, or even just
mounting async.

So: is everyone else seeing this? Am I misunderstanding how SU+J should
be used? How should the error resolution process really happen?

Thanks,
David
