From owner-freebsd-scsi  Sat Aug 12 15:20:55 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from mail.du.gtn.com (mail.du.gtn.com [194.77.9.57])
	by hub.freebsd.org (Postfix) with ESMTP id 8DF0337BA94
	for <freebsd-scsi@freebsd.org>; Sat, 12 Aug 2000 15:20:50 -0700 (PDT)
	(envelope-from ticso@mail.cicely.de)
Received: from mail.cicely.de (cicely.de [194.231.9.142])
	by mail.du.gtn.com (8.11.0.Beta3/8.11.0.Beta3) with ESMTP id e7CMKdB06168
	(using TLSv1/SSLv3 with cipher EDH-RSA-DES-CBC3-SHA (168 bits) verified OK);
	Sun, 13 Aug 2000 00:20:45 +0200 (MET DST)
Received: (from ticso@localhost)
	by mail.cicely.de (8.11.0.Beta1/8.11.0.Beta1) id e7CMKbd40346;
	Sun, 13 Aug 2000 00:20:37 +0200 (CEST)
Date: Sun, 13 Aug 2000 00:20:37 +0200
From: Bernd Walter <ticso@mail.cicely.de>
To: Joe Modjeski <jmodjeski@ms1.northlink.com>
Cc: "'Bernd Walter'" <ticso@mail.cicely.de>,
	"'freebsd-scsi@freebsd.org'" <freebsd-scsi@freebsd.org>
Subject: Re: to Vinum or not to Vinum
Message-ID: <20000813002036.A40322@cicely7.cicely.de>
References: <00101B7A7FDDD311A89500A0CC56C79048B9@MS1>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <00101B7A7FDDD311A89500A0CC56C79048B9@MS1>; from jmodjeski@ms1.northlink.com on Sat, Aug 12, 2000 at 01:11:39PM -0700
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sat, Aug 12, 2000 at 01:11:39PM -0700, Joe Modjeski wrote:
>  
> > On Thu, Aug 10, 2000 at 12:09:47PM -0700, Joe Modjeski wrote:
> > > Currently we have 3 Compaq Proliant 1600R servers with 6 9.1 Ultra3 drives
> > > in each.  We are attempting (very unsuccessfully) to do Raid5 with vinum.
> > > We get fatal trap 12 errors very regularly and after a few reboots the vinum
> > > volume is so chewed up that we end up having to rebuild the system.  I
> > > tracked down the majority of the problems to the /etc/security script.  I
> > > believe it is about the 6th or 7th line down where it starts the find run.
> > > The box starts off fine but after about 1 minute it starts to hit all the
> > > drives at once then BLAM!! It gives me the error.
> > 
> > Are your fatal trap 12 errors kernel panics?
> > If yes do you see some SCSI error messages directly before this happens?
> 
> Yes they are kernel panics.  And Yes there are always SCSI errors.
> 
> BAD DSA ( SOME_HEX_NUMBER ) in queue
> SCSI BUS RESET DETECTED sym0:0:-1:-1
> 
> The above isn't exact.  The message conveniently misses the logs.  I can get
> the exact messages if you would like.  I am trying to avoid crashing the box
> as much as possible. :)

The exact error message, including the hex codes, is important for
distinguishing between a bus error and a problem in the code.

> The drives are Hotswap and it does appear that they get "Disconnected" when
> the error happens.  It is however not specific to one drive.  In my original
> vinum setup I was spanning the raid across all 6 drives.  Then it was
> consistent with drive 0.  I thought that was the reason for the trouble so I
> changed the configuration to the one included in the previous message.
> 
> I have compiled a debug kernel in an effort to get a dump and now the fatal
> trap 12 kernel panics are less frequent, while the SCSI errors that go along
> with them are more consistent.

Do you mean you sometimes get SCSI errors without a panic directly afterwards?
Are you still using the sym controller, or is that behaviour with the ahc
card you mentioned?

-- 
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de

