From owner-freebsd-scsi Sat Aug 12 15:20:55 2000
Date: Sun, 13 Aug 2000 00:20:37 +0200
From: Bernd Walter
To: Joe Modjeski
Cc: "'Bernd Walter'", "'freebsd-scsi@freebsd.org'"
Subject: Re: to Vinum or not to Vinum
Message-ID: <20000813002036.A40322@cicely7.cicely.de>
References: <00101B7A7FDDD311A89500A0CC56C79048B9@MS1>
In-Reply-To: <00101B7A7FDDD311A89500A0CC56C79048B9@MS1>; from jmodjeski@ms1.northlink.com on Sat, Aug 12, 2000 at 01:11:39PM -0700

On Sat, Aug 12, 2000 at 01:11:39PM -0700, Joe Modjeski wrote:
>
> > On Thu, Aug 10, 2000 at 12:09:47PM -0700, Joe Modjeski wrote:
> > > Currently we have 3 Compaq Proliant 1600R servers with 6 9.1
> > > Ultra3 drives in each. We are attempting (very unsuccessfully)
> > > to do Raid5 with vinum. We get fatal trap 12 errors very
> > > regularly and after a few reboots the vinum volume is so chewed
> > > up that we end up having to rebuild the system. I tracked down
> > > the majority of the problems to the /etc/security script. I
> > > believe it is about the 6th or 7th line down where it starts the
> > > find run. The box starts off fine but after about 1 minute it
> > > starts to hit all the drives at once then BLAM!!
> > > It gives me the error.
> >
> > Are your fatal trap 12 errors kernel panics?
> > If yes do you see some SCSI error messages directly before
> > this happens?
>
> Yes they are kernel panics. And Yes there are always SCSI errors.
>
> BAD DSA ( SOME_HEX_NUMBER ) in queue
> SCSI BUS RESET DETECTED sym0:0:-1:-1
>
> The above isn't exact. The message conveniently misses the logs. I
> can get the exact messages if you would like. I am trying to avoid
> crashing the box as much as possible. :)

The exact error, including the hex codes, is important for
distinguishing between a bus error and something in the code.

> The drives are Hotswap and it does appear that they get
> "Disconnected" when the error happens. It is however no specific. In
> my original vinum setup I was spanning the raid across all 6 drives.
> Then it was consistant with drive 0. I though that was reason for
> the trouble so I changed the configuration to the one included in
> the previous message.
>
> I have compiled a debug kernel in an effort to get a dump and now
> the fatal trap 12 kernel panics are less the SCSI errors that go
> along with them are more consistant.

Do you mean that you sometimes get SCSI errors without a panic
directly afterwards?
Are you still using the sym controller, or does that behaviour occur
with the ahc card you mentioned?

-- 
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de