Date:      Tue, 17 Dec 2002 12:34:01 -0800
From:      Kirk McKusick <mckusick@beastie.mckusick.com>
To:        Nate Lawson <nate@root.org>
Cc:        current@freebsd.org
Subject:   Re: Data corruption in soft updates? 
Message-ID:  <200212172034.gBHKY159017983@beastie.mckusick.com>
In-Reply-To: Your message of "Tue, 17 Dec 2002 12:14:12 PST." <Pine.BSF.4.21.0212171151140.50202-100000@root.org> 

Please send me a `dumpfs /usr | head -50' output of the filesystem
under the current system. Then clean it up with fsck and run the
same command again. Finally, boot up under the old kernel and
collect the output both before and after cleaning with fsck. What I am
looking for is changes in the reported size of the filesystem,
because that size getting out of sync is what is causing these problems.

The basic deal is that the old UFS1 superblock stored the filesystem
size in a 32-bit field. The new UFS1 superblock stores the filesystem
size in a new (previously unused) 64-bit field. When you mount a
UFS1 filesystem on a new kernel, it copies the 32-bit size field
to the 64-bit field. At that point the filesystem size is in both
places and should work equally well on old or new kernels. However,
the mount does not update the 64-bit size field on any of the alternate
superblocks. So, somehow, your using an alternate and copying it into
the standard location is losing the update made to the size field.
I am not sure how that is happening, but I am hoping to catch
the step in your work with the alternates at which it happens,
so I can cover that hole.

	Kirk McKusick
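
[Editorial sketch, not part of the original message.] The size-field
behaviour described above can be modelled in a few lines of Python. This
is only an illustration under stated assumptions: ToySuperblock, its two
fields, and the size value 2_000_000 are hypothetical stand-ins, not the
real UFS1 structures or any real filesystem's numbers. It shows how an
alternate written before the upgrade carries no 64-bit size, so copying
it over the primary drops the value set at mount time. It does not claim
to reproduce the actual bug, which the message says is not yet understood.

```python
from dataclasses import dataclass, replace


@dataclass
class ToySuperblock:
    """Hypothetical, heavily simplified superblock: only the two size fields."""
    old_size: int   # stands in for the 32-bit size field read by old kernels
    size: int = 0   # stands in for the 64-bit size field read by new kernels


def mount_on_new_kernel(primary: ToySuperblock) -> None:
    """Copy the 32-bit size into the 64-bit field, on the primary only."""
    primary.size = primary.old_size


def restore_from_alternate(alternate: ToySuperblock) -> ToySuperblock:
    """Model copying an alternate superblock over the primary (e.g. with dd)."""
    return replace(alternate)  # a field-for-field copy of the alternate


def demo() -> tuple:
    primary = ToySuperblock(old_size=2_000_000)  # made-up size for illustration
    alternate = replace(primary)                 # alternate predates the upgrade
    mount_on_new_kernel(primary)                 # only the primary gets the 64-bit size
    size_after_mount = primary.size
    primary = restore_from_alternate(alternate)  # the 64-bit size is lost again
    return size_after_mount, primary.size


if __name__ == "__main__":
    after_mount, after_restore = demo()
    print("64-bit size after mount on new kernel:", after_mount)
    print("64-bit size after restoring alternate:", after_restore)
```

In this toy model the two size fields agree after the mount and disagree
again after the restore, which is the kind of change the requested dumpfs
comparisons are meant to reveal.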

=-=-=-=-=-=

Date: Tue, 17 Dec 2002 12:14:12 -0800 (PST)
From: Nate Lawson <nate@root.org>
To: Kirk McKusick <mckusick@beastie.mckusick.com>
cc: current@freebsd.org
Subject: Re: Data corruption in soft updates? 
In-Reply-To: <200212100309.gBA39h59001465@beastie.mckusick.com>
X-ASK-Info: Whitelist match

On Mon, 9 Dec 2002, Kirk McKusick wrote:
> It appears that you are getting all those errors (BAD block)
> because fsck thinks that your filesystem is smaller than it
> really is. If you do a dumpfs on the filesystem and check
> the size (about line 5), I expect that you will find that
> all those bad blocks exceed that size. It might be interesting
> to check one or more of the alternate blocks to see if they
> have a different size. If so, using an alternate should help.
> If not, then the question is why all those out of range blocks 
> were allocated.

I booted an older kernel (Dec. 4) and ran fsck_ffs -b 32.  It repaired a
few simple errors (summary info bad).  I then copied the alt sblock to the
default location with dd.  I reran fsck to make sure the sblock was copied
correctly and it came up clean.  Everything was fine.

I rebooted into multiuser with the old kernel and everything worked fine.  
I did a full buildkernel with srcs as of yesterday at 5 pm without any bad
block messages.  But after I rebooted with that new kernel, it tried to
correct the sblockloc again and my system started having the same
problem.  The uname and dmesg output is below.

-Nate

FreeBSD 5.0-CURRENT #1: Mon Dec 16 18:05:56 PST 2002

/: correcting fs_sblockloc from 4 to 8192
bad block 1553167, ino 386832
/usr: optimization changed from TIME to SPACE
bad block 1553152, ino 387421
pid 42 (syncer), uid 0 inumber 387421 on /usr: bad block
bad block 1551181, ino 383169
pid 42 (syncer), uid 0 inumber 383169 on /usr: bad block
bad block 1632087, ino 383281
pid 42 (syncer), uid 0 inumber 383281 on /usr: bad block
bad block 1616355, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1623472, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1551227, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1552592, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1555160, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1555208, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1550776, ino 383200
pid 42 (syncer), uid 0 inumber 383200 on /usr: bad block
bad block 1551208, ino 383198
pid 42 (syncer), uid 0 inumber 383198 on /usr: bad block
bad block 1551209, ino 383241
pid 42 (syncer), uid 0 inumber 383241 on /usr: bad block
bad block 1553153, ino 387219
pid 42 (syncer), uid 0 inumber 387219 on /usr: bad block
bad block 1552704, ino 389415
pid 42 (syncer), uid 0 inumber 389415 on /usr: bad block
bad block 1552707, ino 390100
pid 42 (syncer), uid 0 inumber 390100 on /usr: bad block
bad block 1639665, ino 391119
pid 42 (syncer), uid 0 inumber 391119 on /usr: bad block
bad block 1553170, ino 391111
pid 42 (syncer), uid 0 inumber 391111 on /usr: bad block
bad block 1553431, ino 391118
pid 42 (syncer), uid 0 inumber 391118 on /usr: bad block
bad block 1553405, ino 391122
pid 42 (syncer), uid 0 inumber 391122 on /usr: bad block
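
[Editorial sketch, not part of the original message.] The suggestion
quoted earlier in the thread, that the bad blocks all lie beyond the size
the superblock reports, can be checked mechanically against a log like
the one above. The Python below is a minimal sketch: the function names
are made up for illustration, and the filesystem size passed in must come
from the reader's own dumpfs output (the thread never states it). It
assumes the block numbers in the log and the reported size are in the
same units.

```python
import re
from collections import defaultdict

# Matches kernel lines of the form "bad block 1553167, ino 386832".
BAD_BLOCK_RE = re.compile(r"^bad block (\d+), ino (\d+)$")


def parse_bad_blocks(log_text):
    """Return a list of (block, inode) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = BAD_BLOCK_RE.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


def blocks_beyond(pairs, fs_size):
    """Return the block numbers that are at or past the reported size."""
    return [block for block, _ in pairs if block >= fs_size]


def blocks_by_inode(pairs):
    """Group block numbers by the inode that references them."""
    grouped = defaultdict(list)
    for block, inode in pairs:
        grouped[inode].append(block)
    return dict(grouped)


# First few lines of the log above, used as sample input.
SAMPLE = """\
bad block 1553167, ino 386832
/usr: optimization changed from TIME to SPACE
bad block 1553152, ino 387421
pid 42 (syncer), uid 0 inumber 387421 on /usr: bad block
bad block 1551181, ino 383169
pid 42 (syncer), uid 0 inumber 383169 on /usr: bad block
"""

if __name__ == "__main__":
    pairs = parse_bad_blocks(SAMPLE)
    hypothetical_size = 1552000  # placeholder; substitute the size from dumpfs
    print("blocks at or past the reported size:",
          blocks_beyond(pairs, hypothetical_size))
    print("blocks per inode:", blocks_by_inode(pairs))
```

If every block in the log is at or past the size reported for one kernel
but below the size reported for the other, that would be consistent with
the size fields having gone out of sync.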

