From owner-freebsd-scsi Sat Dec 2 15:49:33 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from red.gradwell.net (red.gradwell.net [195.149.39.8])
	by hub.freebsd.org (Postfix) with SMTP id 00D5F37B400
	for ; Sat, 2 Dec 2000 15:49:30 -0800 (PST)
Received: (qmail 21294 invoked from network); 2 Dec 2000 23:49:26 -0000
Received: from ystwyth.demon.co.uk (HELO vaio.gradwell.com) (158.152.144.35)
	by pop3.gradwell.net with SMTP; 2 Dec 2000 23:49:26 -0000
Message-Id: <5.0.0.25.0.20001202233356.0366b2d8@pop3.gradwell.net>
X-Sender: postmaster%pop3.peterg.org.uk@pop3.gradwell.net
X-Mailer: QUALCOMM Windows Eudora Version 5.0
Date: Sat, 02 Dec 2000 23:49:24 +0000
To: Mike Smith
From: Peter Gradwell
Subject: Re: Mylex DAC960 Driver "online/offline"
Cc: freebsd-scsi@freebsd.org
In-Reply-To: <200012022339.eB2NdWF21371@mass.osd.bsdi.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi Mike,

At 15:39 02/12/2000 -0800, Mike Smith wrote:
> > What does this message really mean?
>
>It means that the controller is telling us that the drive is offline.
>Then that it's online. Then that it's offline again.
>
>You don't say what the time intervals between these messages are; you can
>get the 'drive offline' message from either the status poll (once per
>second) or if an I/O operation is sent to a drive that the controller
>reports as offline. The 'drive online' message only comes from the
>status poll though.

It was occurring without any apparent activity, about once per second, so
I would guess it was from the status poll.

>Can you describe your configuration? I can try to reproduce the
>situation here and see if it's not possible that there's a bug in the
>driver confusing the status between your two drives. I have to say,
>though, that the fact that the controller thinks that one of your system
>drives is offline when you claim it's a mirror is a bit troubling.

OK, as an update to the situation: I was able to get to the Mylex BIOS
(there are 250 miles between me and the machine, you see!) via a serial
console, and discovered that it had marked two drives offline.

We have:

  3 x 18 gig disks, of which two are bonded in a RAID 1 pack and one is
      a hot spare
  2 x 36 gig disks, bonded in a RAID 0 pack.

Everything apart from /var/spool/news is on the RAID 1 pack. (Yeah, it's a
news server.)

One of the 18 gig disks and one of the 36 gig disks were marked offline.

I believe that when the 18 gig disk was marked offline, the RAID card
rebuilt its redundancy data onto the hot spare disk and carried on. I say
that because the 18 gig disk which is offline was part of the RAID 1 pack,
and there is now no hot spare.

*So, that's good.*

So, we hard reset the machine and it booted. However, the symptoms
described previously persisted. We couldn't log in via ssh or on the
console, as it was unresponsive.

* This worries me. I would hope the machine would take the loss of
/v/s/news gracefully, and carry on.

So, when I accessed the BIOS this morning, I tried, as an "experiment", to
put the 36 gig disk back online and rebooted. After running fsck for a bit
(is there a journaling file system for FreeBSD?!) the machine is now
running OK.

I have yet to schedule a reboot to mark the currently offline 18 gig disk
as the hot spare. I think I will be able to do this.

I am worried that the controller randomly marks the drives offline. Mylex
tell me this happens when it loses contact with the drives. They are
internal drives, well screwed into a big case, nicely racked in a locked
cabinet in Telehouse Europe. From what I can gather, no one accessed the
rack. It appears they aren't disconnected anyway, because I can mark them
online and we're go again.

I'd be happy to provide more information if it helps. Directed questions
work best!

thanks
peter

--
peter gradwell; online @ http://www.gradwell.com/peter/