Date:      Fri, 21 Jan 2000 01:35:33 -0500 (EST)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Greg Lehey <grog@lemis.com>
Cc:        freebsd-questions@FreeBSD.org, cjclark@home.com
Subject:   Re: Recoverving/reviving a 'stale' subdisk under vinum
Message-ID:  <200001210635.BAA73206@server.baldwin.cx>
In-Reply-To: <20000121105518.N481@mojave.worldwide.lemis.com>


On 21-Jan-00 Greg Lehey wrote:
> On Thursday, 20 January 2000 at 19:15:43 -0500, Crist J. Clark wrote:
>> On Thu, Jan 20, 2000 at 01:56:07PM -0500, John H. Baldwin wrote:
>>> I've read the vinum(4) and vinum(8) manpages as well as the webpages at
>>> www.lemis.com/~grog/vinum.html, and while they are very good as far as
>>> setup and configuration info, I haven't been able to find a lot of info
>>> about recovering.  I have a stale subdisk that I can't get to recover no
>>> matter how many different start commands I try.  I've tried starting the
>>> volume, the plex, and the subdisk itself with no success.
>>>
>>> # vinum list
>>> Configuration summary
>>>
>>> Drives:         3 (4 configured)
>>> Volumes:        1 (4 configured)
>>> Plexes:         1 (8 configured)
>>> Subdisks:       3 (16 configured)
>>>
>>> D vinumdrive0           State: up       Device /dev/da1s1e      Avail: 0/8683 MB (0%)
>>> D vinumdrive1           State: up       Device /dev/da2s1e      Avail: 0/8683 MB (0%)
>>> D vinumdrive2           State: up       Device /dev/da3s1e      Avail: 0/8683 MB (0%)
>>>
>>> V ftp_mirror            State: up       Plexes:       1 Size:         25 GB
>>>
>>> P ftp_mirror.p0       S State: corrupt  Subdisks:     3 Size:         25 GB
>>>
>>> S ftp_mirror.p0.s0      State: up       PO:        0  B Size:       8683 MB
>>> S ftp_mirror.p0.s1      State: up       PO:      256 kB Size:       8683 MB
>>> S ftp_mirror.p0.s2      State: stale    PO:      512 kB Size:       8683 MB
>>>
>>> # vinum start ftp_mirror.p0.s2
>>> Can't start ftp_mirror.p0.s2: Device busy (16)
> 
> Hmm.  That shouldn't happen.

Well, that's comforting. :)

>> You have to 'stop' everything first. (I might be overkilling here,
>> but better safe...)
> 
> No, that's not safe.  That would mean taking down the volume.

Err, oops.  I already did this and it worked.  I've already fsck'd the volume and have
it in use right now.

> I haven't seen this before.  How about the information I ask for in
> the web page?

Ok, here's what I do have, but I did fix it using the above hackishness, so some of it
may not apply.

# uname -a
FreeBSD raven.XXXXX 3.3-STABLE FreeBSD 3.3-STABLE #0: Mon Dec  6 16:25:01 EST 1999
     root@snowcow.XXXXX:/usr/source/src/sys/compile/RAVEN  i386

You already have the output of 'vinum list' above.  Here's some of vinum_history, although it
doesn't include any of the return values, so I don't think it will be of much use:

20 Jan 2000 12:39:55.489661 *** vinum started ***
20 Jan 2000 12:39:55.540632 start 
20 Jan 2000 12:39:55.820518 *** Created devices ***
20 Jan 2000 12:40:12.649217 *** vinum started ***
20 Jan 2000 12:40:13.502406 help 
20 Jan 2000 12:40:25.188145 ls 
20 Jan 2000 13:10:31.321216 start 
20 Jan 2000 13:10:47.978917 start ftp_mirror.p0.s2 
20 Jan 2000 13:10:50.980012 stop

That is what I did when I first brought the machine back up.

20 Jan 2000 16:21:53.536302 *** vinum started ***
20 Jan 2000 16:21:53.537010 stop ftp_mirror.p0 
20 Jan 2000 16:21:58.984393 *** vinum started ***
20 Jan 2000 16:21:58.985133 list 
20 Jan 2000 16:22:06.561902 *** vinum started ***
20 Jan 2000 16:22:06.562622 stop ftp_mirror.p0.s2 
20 Jan 2000 16:22:17.000952 *** vinum started ***
20 Jan 2000 16:22:17.005242 stop -f ftp_mirror.p0.s2 
20 Jan 2000 16:22:21.145993 *** vinum started ***
20 Jan 2000 16:22:21.146744 list 
20 Jan 2000 16:22:40.709634 *** vinum started ***
20 Jan 2000 16:22:40.710394 start ftp_mirror 
20 Jan 2000 16:22:54.393075 *** vinum started ***
20 Jan 2000 16:22:54.393778 start ftp_mirror.p0.s0 
20 Jan 2000 16:23:00.238272 *** vinum started ***
20 Jan 2000 16:23:00.239015 list 
20 Jan 2000 16:23:09.552251 *** vinum started ***
20 Jan 2000 16:23:09.552963 start ftp_mirror.p0.s1 
20 Jan 2000 16:23:16.193159 *** vinum started ***
20 Jan 2000 16:23:16.193896 start ftp_mirror.p0.s2 

That is how I "fixed" it.
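
In case it helps anyone reading the history, here is a rough sketch in Python (not part of vinum) that strips the "*** vinum started ***" marker lines and leaves only the commands that were typed. The function name extract_commands and the regular expression are my own assumptions about the layout, based only on the excerpt above; the real vinum_history format may differ in ways this excerpt doesn't show.

```python
import re

# Lines copied from the second vinum_history excerpt above.
HISTORY = """\
20 Jan 2000 16:21:53.536302 *** vinum started ***
20 Jan 2000 16:21:53.537010 stop ftp_mirror.p0
20 Jan 2000 16:21:58.984393 *** vinum started ***
20 Jan 2000 16:21:58.985133 list
20 Jan 2000 16:22:06.561902 *** vinum started ***
20 Jan 2000 16:22:06.562622 stop ftp_mirror.p0.s2
20 Jan 2000 16:22:17.000952 *** vinum started ***
20 Jan 2000 16:22:17.005242 stop -f ftp_mirror.p0.s2
20 Jan 2000 16:22:21.145993 *** vinum started ***
20 Jan 2000 16:22:21.146744 list
20 Jan 2000 16:22:40.709634 *** vinum started ***
20 Jan 2000 16:22:40.710394 start ftp_mirror
20 Jan 2000 16:22:54.393075 *** vinum started ***
20 Jan 2000 16:22:54.393778 start ftp_mirror.p0.s0
20 Jan 2000 16:23:00.238272 *** vinum started ***
20 Jan 2000 16:23:00.239015 list
20 Jan 2000 16:23:09.552251 *** vinum started ***
20 Jan 2000 16:23:09.552963 start ftp_mirror.p0.s1
20 Jan 2000 16:23:16.193159 *** vinum started ***
20 Jan 2000 16:23:16.193896 start ftp_mirror.p0.s2
"""

# Assumed layout: "<day> <month> <year> <hh:mm:ss.usec> <rest of line>".
LINE_RE = re.compile(r"^(\d{1,2} \w{3} \d{4}) (\d{2}:\d{2}:\d{2}\.\d+) (.*)$")


def extract_commands(text):
    """Return (time, command) pairs, skipping '*** ... ***' marker lines."""
    commands = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore anything that doesn't look like a history entry
        _date, time, rest = match.groups()
        rest = rest.strip()
        if rest.startswith("***"):
            continue  # status marker, not a typed command
        commands.append((time, rest))
    return commands


if __name__ == "__main__":
    for time, command in extract_commands(HISTORY):
        print(time, command)
```

Run as a script, it prints one line per command in the order they were issued, which makes the stop / stop -f / start sequence easier to see.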

However, the drive seems to have fallen over again (*sigh*) with the following
kernel messages:

Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): SCB 0x96 - timed out while idle, LASTPHASE == 0x1, SEQADDR == 0xa
Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): Queuing a BDR SCB
Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): Bus Device Reset Message Sent
Jan 20 23:28:38 raven /kernel: (da2:ahc1:0:1:0): no longer in timeout, status = 34b
Jan 20 23:28:38 raven /kernel: ahc1: Bus Device Reset on A:1. 1 SCBs aborted

Note that I didn't get this message until after the machine had been up for
a while; the kernel found the drive fine during boot:

ahc1: <Adaptec aic7890/91 Ultra2 SCSI adapter> rev 0x00 int a irq 10 on pci2.9.0
ahc1: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
 ...
da2 at ahc1 bus 0 target 1 lun 0
da2: <SEAGATE ST39173LW 5960> Fixed Direct Access SCSI-2 device 
da2: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da2: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
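
As a sanity check on the sizes, here is a small Python sketch of the arithmetic. It assumes that "MB" in both the probe line and the vinum listing means 2^20 bytes and that the reported figures are truncated to whole units; those are my assumptions, not something the output states. The helper names are made up for illustration.

```python
SECTOR_BYTES = 512          # from the da2 probe line
MIB = 1024 * 1024           # assumption: "MB" here means 2**20 bytes


def size_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Whole megabytes in a given number of sectors (truncated)."""
    return sectors * sector_bytes // MIB


def chs_sectors(heads, sectors_per_track, cylinders):
    """Sectors addressable through the reported C/H/S geometry."""
    return heads * sectors_per_track * cylinders


if __name__ == "__main__":
    drive_mb = size_mb(17783240)
    print("da2 size in MB:", drive_mb)
    print("sectors covered by 255H 63S/T 1106C:", chs_sectors(255, 63, 1106))
    # Three equal subdisks in the one plex, as in the vinum listing.
    plex_mb = 3 * drive_mb
    print("plex size in MB:", plex_mb, "or about", plex_mb // 1024, "GB")
```

Under those assumptions the sector count works out to the 8683 MB shown for da2, and three such subdisks give roughly the 25 GB reported for the plex. The C/H/S geometry covers slightly fewer sectors than the raw count.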

Hope this is enough info, and hope your stay in India is going well.

> Greg

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.cslab.vt.edu/~jobaldwi/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/





