Date:      Sat, 7 Mar 2009 22:55:50 GMT
From:      Dieter <freebsd@sopwith.solgatos.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/132397: reboot causes filesystem corruption
Message-ID:  <200903072255.n27Mtos4024466@www.freebsd.org>
Resent-Message-ID: <200903072300.n27N05RS079446@freefall.freebsd.org>


>Number:         132397
>Category:       kern
>Synopsis:       reboot causes filesystem corruption
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Mar 07 23:00:05 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Dieter
>Release:        7.1
>Organization:
>Environment:
7.1-RELEASE amd64
>Description:
FreeBSD 7.1
amd64
soft updates, disk write cache off

System was running fine.
I typed reboot.
It printed a *very* long string of buffers-remaining counts (see
below); usually the string is very short (one line).
It gave up with 3 buffers left to go.  (Why?)
A filesystem is now corrupt, and running fsck on it causes a panic.

Through the miracle of soft updates filesystems can now survive
power outages and the reset button with no damage, but an
orderly reboot causes unfixable corruption?
THIS IS COMPLETELY UNACCEPTABLE AND NEEDS TO BE FIXED!

Q1) What "timed out"?  (see below)  kproc_shutdown() maybe?

Q2) Why does the number of buffers sometimes go up?

Q3) Why doesn't it get all the buffers synced out to disk?
It is like something is still generating dirty buffers, but
aren't all processes killed before it gets to this point?

I suppose I could crank up
static int kproc_shutdown_wait = 60;
in kern_shutdown.c, but that just gives it more time.  It could
still fail.

kern_kthread.c says:
 * Advise a kernel process to suspend (or resume) in its main loop.
 * Participation is voluntary.

Voluntary, eh?  Q4) Could something refuse to suspend and cause this?
Why is this allowed?

in kern_shutdown.c:
/*
 * With soft updates, some buffers that are
 * written will be remarked as dirty until other
 * buffers are written.
 */
 for (iter = pbusy = 0; iter < 20; iter++) {

I suppose the "20" could be increased, but again, it could still fail.
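
For illustration, here is a minimal user-space sketch of how a bounded
retry loop like this could behave.  It is not the kernel code.  It
assumes the counter restarts whenever the busy count drops below the
previous sample; the excerpt above shows only the loop header, so that
reset rule is an assumption.  model_sync_loop(), samples, limit and
passes are hypothetical names made up for the sketch.

```c
#include <stddef.h>

/*
 * User-space model of a bounded "sync until clean" loop.
 *
 * ASSUMPTION: the retry counter is reset whenever the busy count is
 * lower than the previous sample.  The report quotes only the for-loop
 * header, so this rule is a guess, not confirmed kernel behaviour.
 *
 * samples[] stands in for successive busy-buffer counts (hypothetical
 * input).  Returns the last count seen; 0 means everything was synced.
 * *passes receives the number of samples consumed.
 */
int
model_sync_loop(const int *samples, size_t nsamples, int limit, size_t *passes)
{
	size_t i = 0;
	int iter, pbusy, nbusy = 0;

	for (iter = pbusy = 0; iter < limit && i < nsamples; iter++) {
		nbusy = samples[i++];
		if (nbusy == 0)
			break;		/* all buffers synced */
		if (nbusy < pbusy)
			iter = 0;	/* progress was made: count again */
		pbusy = nbusy;
	}
	*passes = i;
	return (nbusy);
}
```

Under that assumption, a series that keeps dipping (3 2 3 2 ...) never
reaches the limit, which would be consistent with a very long line of
counts, while a constant 3 gives up after 20 passes with 3 buffers
still busy.  Raising the limit changes neither outcome in kind.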

Q5) Does this buffer syncing not follow the soft updates protocol?
If not, why not?

There is something fundamentally wrong here, but I don't know what.
Barring a disk write error, which is not the case here, this should
never happen.


Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...1 3 1 3 2 0 1 2 0 1 0 2 2 0 2 2 2 2 2 2 0 2 0
2 2 2 2 0 2 2 2 2 0 1 0 2 0 2 2 2 2 0 2 0 2 1 0 2 2 0 2 0 1 2 2 2 1 0 1 2 2
timed out
Syncing disks, buffers remaining... 40 40 34 33 33 17 17 11 10 10 2 1 1 37 39 17
 17 5 5 39 40 23 23 8 8 2 1 1 37 38 17 17 5 5 40 41 23 23 8 7 7 1 1 34 35 17 17
7 6 6 1 31 35 23 22 22 21 21 9 9 2 1 1 27 26 26 21 21 21 1 1 27 24 24 21 21 10 1
0 4 3 3 30 28 28 22 21 21 21 30 22 21 21 21 1 1 27 25 25 22 22 10 10 3 3 27 24 2
3 23 21 21 7 6 6 30 27 26 26 21 21 21 31 22 22 21 21 8 7 7 1 1 26 22 22 21 21 9
9 2 2 28 25 25 21 21 21 1 1 25 22 22 21 21 9 9 2 2 28 26 26 21 21 21 1 1 26 24 2
4 21 21 21 30 22 21 21 21 3 2 2 28 26 25 25 21 21 21 1 1 26 23 23 21 21 21 31 22
 22 21 21 10 9 9 2 2 28 25 24 24 21 21 10 10 2 2 29 26 26 21 21 21 1 1 27 24 24
21 21 21 31 23 22 22 21 21 7 7 31 30 29 29 22 22 21 21 10 9 9 3 2 2 28 26 25 25
21 21 21 1 1 26 23 23 21 21 10 10 3 3 30 27 27 21 21 21 2 2 29 26 26 21 21 21 1
31 36 23 23 21 21 10 10 3 3 30 28 27 27 21 21 21 1 1 26 23 23 21 21 10 10 2 1 1
27 24 23 23 21 21 6 6 31 28 27 27 21 21 21 2 1 1 27 25 25 21 21 21 29 21 21 21 2
 2 28 25 25 21 21 21 31 22 22 21 21 9 9 3 2 2 29 27 26 26 21 21 21 1 1 27 24 24
21 21 21 31 23 22 22 21 21 9 9 3 3 30 27 27 21 21 21 1 1 27 24 24 22 22 21 21 5
5 31 28 28 21 21 21 3 3 28 25 25 21 21 21 1 1 26 21 21 21 2 2 29 26 25 25 21 21
10 10 2 2 28 24 24 21 21 10 10 1 1 27 24 23 23 21 21 9 8 8 1 1 27 24 24 21 21 10
 9 9 2 1 1 25 22 22 21 21 9 9 3 2 2 28 24 23 23 21 21 9 9 3 3 30 26 26 21 21 21
1 1 26 22 22 21 21 10 10 4 3 3 30 27 26 26 21 21 21 31 22 21 21 21 4 4 30 26 26
21 21 21 1 1 27 23 23 21 21 10 10 3 3 28 25 25 21 21 21 1 31 35 22 22 21 21 10 1
0 3 3 28 26 25 25 21 21 21 1 1 26 23 23 21 21 21 31 22 21 21 21 3 3 29 26 26 21
21 21 1 1 26 23 23 21 21 7 7 1 1 26 23 23 21 21 9 9 1 40 46 37 37 33 32 32 26 26
 14 14 5 5 40 40 24 23 23 6 6 1 38 44 36 36 31 31 24 24 10 10 3 3 39 41 24 24 14
 14 5 5 40 41 24 24 9 9 3 3 39 41 25 25 10 9 9 4 4 40 41 25 25 10 10 2 2 38 39 2
3 23 7 7 1 1 36 37 8 8 1 1 38 37 37 32 31 31 25 24 24 9 9 3 3 36 36 22 16 16 6 6
 31 28 28 22 21 21 21 31 22 21 21 21 31 22 21 21 21 3 2 2 28 25 24 24 21 21 21 1
 1 26 22 22 21 21 9 9 2 1 1 27 25 24 24 21 21 21 1 1 25 22 22 21 21 9 9 3 2 2 29
 26 25 25 21 21 21 1 1 26 23 23 21 21 10 10 3 3 29 26 26 21 21 21 1 1 26 23 23 2
1 21 21 31 23 22 22 21 21 10 10 4 4 29 26 26 21 21 21 1 1 26 23 23 21 21 21 1 31
 35 22 22 21 21 9 9 1 1 27 24 23 23 21 21 21 1 1 26 22 22 21 21 9 8 8 2 1 1 27 2
5 24 24 21 21 7 7 29 27 26 26 21 21 21 31 22 22 21 21 8 7 7 1 1 26 23 22 22 21 2
1 10 10 4 4 30 27 27 21 21 21 1 1 27 24 24 21 21 21 1 31 35 22 21 21 21 4 3 3 20
 17 16 16 15 14 15 5 5 2 2 3 3 3 3 5 3 3 5 3 3 3 5 3 2 5 3 3 4 3 3 2 1 1 3 4 3 3
 3 3 3 1 1 3 3 5 3 3 4 4 3 3 3 3 3 5 5 3 2 3 3 3 3 5 3 2 5 3 3 4 3 3 4 3 3 5 4 4
 3 2 5 3 2 3 2 2 3 3 3 3 5 3 3 4 3 3 5 4 3 3 3 5 3 2 3 3 3 3 5 3 3 3 5 4 4 3 3 3
 3 5 1 1 3 3 3 3 2 3 3 2 2 3 3 3 3 3 3 3 2 2 5 3 3 3 4 3 3 3 3 2 3 2 5 3 3 3 3 3
 3 2 2 2 3 2 3 3 2 2 3 3 3 3 2 2 3 3 3 3 3 3 3 3 3 2 2 5 3 3 3 3 2 2 3 5 3 3 3 3
 3 2 2 3 3 2 2 2 2 3 3 2 2 3 2 2 5 3 2 2 3 5 1 2 3 3 3 4 5 3 2 2 3 2 2 3 3 2 2 5
 3 3 3 3 2 3 5 3 3 5 3 3 2 2 2 3 3 3 3 3 3 2 3 3 5 3 2 5 3 2 3 3 3 3 3 2 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 6 3 3 3 3 5 3 3 3 3 3 3 3 3 2 2 3 5 3 2 5 3 2 3 3 3 3
 3 3 3 3 3 2 3 3 3 3 5 3 2 5 3 2 3 3 3 3 5 3 2 3 3 3 3 3 3 3 3 3 5 3 3 5 3 3 5 5
 3 3 3 3 3 5 5 3 3 5 3 3 5 3 3 5 3 3 5 1 1 3 4 5 3 3 3 5 3 3 3 3 3 3 3 4 3 2 3 3
 3 5 3 2 3 3 3 4 3 3 3 3 3 3 3 4 3 2 3 3 3 3 3 3 3 3 3 3 3 5 3 3 2 2 3 5 3 2 5 3
 2 3 3 5 3 3 3 3 3 5 2 5 3 3 4 3 3 3 5 3 2 5 3 3 3 3 5 3 3 3 4 3 3 3 3 3 5 3 3 3
 3 4 3 3 3 5 3 3 5 4 4 5 3 2 5 3 2 3 5 3 3 3 3 3 3 3 3 3 3 3 5 3 3 1 3 3 5 4 4 3
 2 3 3 3 3 3 5 3 3 5 3 3 4 3 2 3 3 5 3 3 2 1 1 5 3 3 3 5 4 4 3 2 5 3 3 3 3 3 3 3
 5 3 2 3 2 3 3 3 3 5 3 3 3 3 3 3 3 3 3 3 3 3 3 3 5 3 3 3 5 3 3 3 4 3 3 4 3 3 3 3
 5 2 3 3 3 2 3 3 3 3 5 3 2 3 4 3 2 3 1 1 3 4 4 3 3 3 4 3 3 3 3 2 2 4 3 3 4 3 3 3
 2 2 3 3 3 3 3 2 2 3 3 3 3 2 2 4 3 3 3 2 2 3 3 3 3 3 3 3 2 3 3 2 2 3 3 2 2 3 3 3
 2 2 3 3 2 2 3 3 2 3 2 2 3 3 3 3 3 3 3 3 2 2 2 3 3 3 3 4 3 3 2 2 3 3 3 3 3 3 2 3
 4 2 2 1 1 3 4 3 3 3 3 2 2 4 3 3 3 3 3 2 2 3 2 4 2 2 2 3 3 3 3 2 2 3 3 3 3 3 3 3
 3 3 3 3 3

I wasn't able to capture all of it.

>How-To-Repeat:
unknown
>Fix:
unknown

>Release-Note:
>Audit-Trail:
>Unformatted:


