Date: Tue, 16 Jan 2001 15:23:40 +0000 (GMT)
From: Andrew Gordon
X-Sender: arg@server.arg.sj.co.uk
To: freebsd-stable@freebsd.org
Subject: Vinum incidents.

I have a server with 5 identical SCSI drives, arranged as a single RAID-5
volume using vinum (and softupdates).  This is exported with
NFS/Samba/Netatalk/Econet to clients of various types; the root, usr and
var partitions are on a small IDE drive (there are no local users or
application processes).  The machine has a serial console.
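The vinum configuration is essentially the following (quoting from memory,
so the stripe size and the subdisk lengths may not be exactly right):

    drive drive0 device /dev/da0s1a
    drive drive1 device /dev/da1s1a
    drive drive2 device /dev/da2s1a
    drive drive3 device /dev/da3s1a
    drive drive4 device /dev/da4s1a
    volume home
      plex org raid5 512k
        sd length 0 drive drive0
        sd length 0 drive drive1
        sd length 0 drive drive2
        sd length 0 drive drive3
        sd length 0 drive drive4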
This has been working reliably for a couple of months, running -stable
from around the time of 4.2-release.  On 1st January, I took advantage of
the low load to do an upgrade to the latest -stable.  Since then, there
have been two incidents (probably not in fact related to the upgrade)
where vinum has not behaved as expected:

1) Phantom disc error
---------------------

Vinum logged:

Jan  2 01:59:26 serv20 /kernel: home.p0.s0: fatal write I/O error
Jan  2 01:59:26 serv20 /kernel: vinum: home.p0.s0 is stale by force
Jan  2 01:59:26 serv20 /kernel: vinum: home.p0 is degraded

However, there was no evidence of any actual disc error - nothing was
logged on the console, in dmesg or in any other log file.  The system
would have been substantially idle at that time of night, except that the
daily cron jobs would have just been starting.  A "vinum start home.p0.s0"
some time later successfully revived the plex, and the system then ran
uninterrupted for two weeks.  Does this suggest some sort of out-of-range
block number bug somewhere?

2) Recovery problems
--------------------

This morning, a technician accidentally(!) unplugged the cable between the
SCSI card and the drive enclosure while the system was busy.  The console
showed a series of SCSI errors, culminating in a panic.  Although the
machine is configured to dump to the IDE drive, no dump was saved
(possibly because someone locally pressed the reset button).  In any case,
this panic was probably not very interesting.

On reboot, it failed the automatic fsck due to unexpected softupdates
inconsistencies.  A manual fsck worked OK, with only a modest number of
incorrect block count / unreferenced file errors but a huge number of
"allocated block/frag marked free" errors.  A second fsck produced no
errors, so I mounted the filesystem and continued.  Sometime during this,
the following occurred:

(da3:ahc0:0:3:0): READ(10). CDB: 28 0 0 22 1d d6 0 0 2 0
(da3:ahc0:0:3:0): MEDIUM ERROR info:221dd6 asc:11,0
(da3:ahc0:0:3:0): Unrecovered read error sks:80,35
Jan 16 09:17:40 serv20 /kernel: home.p0.s3: fatal read I/O error
Jan 16 09:17:40 serv20 /kernel: vinum: home.p0.s3 is crashed by force
Jan 16 09:17:40 serv20 /kernel: vinum: home.p0 is degraded
(da3:ahc0:0:3:0): READ(10). CDB: 28 0 0 22 8 3a 0 0 2 0
(da3:ahc0:0:3:0): MEDIUM ERROR info:22083a asc:11,0
(da3:ahc0:0:3:0): Unrecovered read error sks:80,35
Jan 16 09:17:41 serv20 /kernel: home.p0.s3: fatal read I/O error
Jan 16 09:17:42 serv20 /kernel: vinum: home.p0.s3 is stale by force

These were real errors (reproducible by reading from da3s1a with 'dd'), so
I fixed them by writing zeros over most of the drive, then verified by
dd-ing /dev/da3s1a to /dev/null.  Since this now read OK, I tried to
revive the subdisk with "vinum start home.p0.s3".  Vinum reported that it
was reviving, then reported all the working drives "crashed by force", and
the machine locked solid (no panic or dump; it needed the reset button):

Jan 16 09:48:28 serv20 /kernel: vinum: drive drive3 is up
Jan 16 09:48:46 serv20 /kernel: vinum: home.p0.s0 is crashed by force
Jan 16 09:48:46 serv20 /kernel: vinum: home.p0 is corrupt
Jan 16 09:48:46 serv20 /kernel: vinum: home.p0.s1 is crashed by force
Jan 16 09:48:46 serv20 /kernel: vinum: home.p0.s2 is crashed by force
Jan 16 09:48:46 serv20 /kernel: vinum: home.p0.s4 is crashed by force
Jan 16 09:48:46 serv20 /kernel: vinum: home.p0 is faulty
Jan 16 09:48:46 serv20 /kernel: vinum: home is down

On reboot, the vinum volume was broken:

vinum: /dev is mounted read-only, not rebuilding /dev/vinum
Warning: defective objects
V home          State: down     Plexes:       1 Size:         68 GB
P home.p0    R5 State: faulty   Subdisks:     5 Size:         68 GB
S home.p0.s0    State: crashed  PO:        0  B Size:         17 GB
S home.p0.s1    State: crashed  PO:      512 kB Size:         17 GB
S home.p0.s2    State: crashed  PO:     1024 kB Size:         17 GB
S home.p0.s3    State: R 0%     PO:     1536 kB Size:         17 GB
                *** Start home.p0.s3 with 'start' command ***
S home.p0.s4    State: crashed  PO:     2048 kB Size:         17 GB

I used 'vinum start' on home.p0.s[0124], and the plex came back in
degraded mode; after fsck it mounted OK.  On booting to multi-user mode, I
noticed that all the drives were marked 'down', even though the volume and
most of the subdisks were 'up' (and a quick check in the console
scroll-back showed that it had also been in this state before the previous
attempt to revive):

vinum -> l
5 drives:
D drive0        State: down     Device /dev/da0s1a      Avail: 0/17500 MB (0%)
D drive1        State: down     Device /dev/da1s1a      Avail: 0/17500 MB (0%)
D drive2        State: down     Device /dev/da2s1a      Avail: 0/17500 MB (0%)
D drive3        State: down     Device /dev/da3s1a      Avail: 0/17500 MB (0%)
D drive4        State: down     Device /dev/da4s1a      Avail: 0/17500 MB (0%)

1 volumes:
V home          State: up       Plexes:       1 Size:         68 GB

1 plexes:
P home.p0    R5 State: degraded Subdisks:     5 Size:         68 GB

5 subdisks:
S home.p0.s0    State: up       PO:        0  B Size:         17 GB
S home.p0.s1    State: up       PO:      512 kB Size:         17 GB
S home.p0.s2    State: up       PO:     1024 kB Size:         17 GB
S home.p0.s3    State: R 0%     PO:     1536 kB Size:         17 GB
                *** Start home.p0.s3 with 'start' command ***
S home.p0.s4    State: up       PO:     2048 kB Size:         17 GB

This time I used 'vinum start' on drive[0-4] before doing 'vinum start' on
home.p0.s3, and the revive succeeded, taking 10 minutes or so.
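For the record, the sequence that eventually worked was approximately this
(the dd block size is immaterial):

    # rewrite the failing region with zeros (I actually zeroed most of
    # the drive), then check that the whole device now reads cleanly
    dd if=/dev/zero of=/dev/da3s1a bs=64k
    dd if=/dev/da3s1a of=/dev/null bs=64k

    # bring the drive objects back up before reviving the subdisk
    vinum start drive0
    vinum start drive1
    vinum start drive2
    vinum start drive3
    vinum start drive4
    vinum start home.p0.s3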
Some minutes later, the machine panicked (this time saving a dump):

IdlePTD 3166208
initial pcb at 282400
panicstr: softdep_lock: locking against myself
panic messages:
---
panic: softdep_setup_inomapdep: found inode

(kgdb) where
#0  0xc014dd1a in dumpsys ()
#1  0xc014db3b in boot ()
#2  0xc014deb8 in poweroff_wait ()
#3  0xc01e6b49 in acquire_lock ()
#4  0xc01eae02 in softdep_fsync_mountdev ()
#5  0xc01eef0e in ffs_fsync ()
#6  0xc01edc16 in ffs_sync ()
#7  0xc017b42b in sync ()
#8  0xc014d916 in boot ()
#9  0xc014deb8 in poweroff_wait ()
#10 0xc01e792c in softdep_setup_inomapdep ()
#11 0xc01e44a4 in ffs_nodealloccg ()
#12 0xc01e352b in ffs_hashalloc ()
#13 0xc01e3186 in ffs_valloc ()
#14 0xc01f4e6f in ufs_makeinode ()
#15 0xc01f2824 in ufs_create ()
#16 0xc01f5029 in ufs_vnoperate ()
#17 0xc01b1e43 in nfsrv_create ()
#18 0xc01c6b2e in nfssvc_nfsd ()
#19 0xc01c6483 in nfssvc ()
#20 0xc022b949 in syscall2 ()
#21 0xc02207b5 in Xint0x80_syscall ()
#22 0x8048135 in ?? ()

After this reboot (again requiring a manual fsck) the system appears to be
working normally, but again the drives are all marked 'down':

serv20[arg]% vinum l
5 drives:
D drive0        State: down     Device /dev/da0s1a      Avail: 0/17500 MB (0%)
D drive1        State: down     Device /dev/da1s1a      Avail: 0/17500 MB (0%)
D drive2        State: down     Device /dev/da2s1a      Avail: 0/17500 MB (0%)
D drive3        State: down     Device /dev/da3s1a      Avail: 0/17500 MB (0%)
D drive4        State: down     Device /dev/da4s1a      Avail: 0/17500 MB (0%)

1 volumes:
V home          State: up       Plexes:       1 Size:         68 GB

1 plexes:
P home.p0    R5 State: up       Subdisks:     5 Size:         68 GB

5 subdisks:
S home.p0.s0    State: up       PO:        0  B Size:         17 GB
S home.p0.s1    State: up       PO:      512 kB Size:         17 GB
S home.p0.s2    State: up       PO:     1024 kB Size:         17 GB
S home.p0.s3    State: up       PO:     1536 kB Size:         17 GB
S home.p0.s4    State: up       PO:     2048 kB Size:         17 GB