Date:      Tue, 7 Oct 1997 14:50:03 -0700 (PDT)
From:      Zach Heilig <zach@gaffaneys.com>
To:        freebsd-bugs
Subject:   Re: kern/4684: crash on very heavy disk activity.
Message-ID:  <199710072150.OAA25767@hub.freebsd.org>

The following reply was made to PR kern/4684; it has been noted by GNATS.

From: Zach Heilig <zach@gaffaneys.com>
To: Stefan Esser <se@freebsd.org>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/4684: crash on very heavy disk activity.
Date: Tue, 7 Oct 1997 16:41:35 -0500

 I did not see my other reply come over the bugs list [from a while back], so
 I will paraphrase my other reply...
 
 On Sun, Oct 05, 1997 at 11:04:22AM +0200, Stefan Esser wrote:
 > > ncr0 <ncr 53c875 fast20 wide scsi> rev 1 int a irq 11 on pci0:10
 > > sd0(ncr0:0:0): M_DISCONNECT received, but datapointer not saved:
 > > 	data=701b4 save=e40016b0 goal=e40016d4.
 
 > Hmm, the drive disconnected during the probe ...
 > Does this happen on each boot ?
 
 Yes, this happens on every boot.  I haven't actually noticed any other problems
 with sd0 (or sd1) though.
 
 > > Here are the last few console messages before the reboot:
 ...
 > > This was during both an rm -rf of a large tree on sd2s1e and a cvs checkout
 > > from the cvs repository I keep on that slice.
 
 > The command failed because of lack of agreement on the amount of 
 > data requested. The drive stayed in a data phase, when there was 
 > either no more data to deliver to it, or no more buffer space to
 > store the data read (depending on whether this happened during a
 > read or a write).
 
 I wonder if there was a bus reset, and the jaz drive responded by just dropping
 everything it was doing (and returned/wrote incomplete data).
 
 > This (together with the disconnect of your UW drive) indicates 
 > there is a SCSI bus problem. SCSI strobe pulses got lost or 
 > duplicated.
 
 Just for clarification, I only have 50 pin scsi devices.  These devices hang
 off of a 50 pin port on the scsi card.  The card will take ultra-wide devices,
 but I do not have any.
 
 > What's the (total!) length of your SCSI bus ?
 > (Internal plus external, number of connectors, if any, terminators ?)
 
 The devices were on a 38" cable with 5 connectors: 8" between each of the four
 at one end, and 14" between the single connector and that group.  They were in
 the order card -14"- sd0 -8"- sd1 -8"- sd2 -8"- cd0.
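
 As a rough check of the figures above, the segment lengths can be summed and
 compared with the single-ended bus length limits usually quoted for SCSI.
 This is only a sketch: the limits in max_bus_length_m() are commonly cited
 values and are an assumption, not something stated in this thread, and all
 names are made up for illustration.

```python
# Segment lengths in inches, in bus order, as listed in the message:
# card -14"- sd0 -8"- sd1 -8"- sd2 -8"- cd0
SEGMENTS_IN = [
    ("card", "sd0", 14),
    ("sd0", "sd1", 8),
    ("sd1", "sd2", 8),
    ("sd2", "cd0", 8),
]

INCH_TO_M = 0.0254


def max_bus_length_m(clock_mhz, device_count):
    """Return an assumed single-ended cable limit in metres.

    ASSUMPTION: commonly quoted figures (6 m at 5 MHz, 3 m at 10 MHz,
    and for Fast-20 3 m with up to 4 devices, otherwise 1.5 m).
    They are not taken from the message.
    """
    if clock_mhz <= 5:
        return 6.0
    if clock_mhz <= 10:
        return 3.0
    return 3.0 if device_count <= 4 else 1.5


total_in = sum(length for _, _, length in SEGMENTS_IN)
total_m = total_in * INCH_TO_M
device_count = len(SEGMENTS_IN) + 1  # host adapter plus four targets

for clock_mhz in (5, 10, 20):
    limit_m = max_bus_length_m(clock_mhz, device_count)
    verdict = "within" if total_m <= limit_m else "over"
    print(f"{clock_mhz} MHz: {total_m:.2f} m is {verdict} the assumed {limit_m} m limit")
```

 Under those assumed limits the cable length alone does not look excessive,
 which would point at termination or a marginal device rather than sheer
 length.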
 
 cd0 has terminators installed.
 sd2 has only automatic termination.
 [cd0 and sd2 are no longer connected.]
 sd1 had termination disabled.
 sd0 has termination and term power disabled.
 The card has only one setting [automatic termination].
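
 The usual rule is that a SCSI bus is terminated at its two physical ends and
 nowhere else.  A minimal sketch of checking a chain against that rule follows.
 The function name and the True/False/None encoding are hypothetical; the card
 is assumed to be terminating because it sits at one end, and sd2 is marked
 unknown because its automatic setting cannot be read from the message.

```python
def check_termination(chain):
    """Check a bus against the 'terminate both ends only' rule.

    chain: list of (name, terminated) tuples in physical order, where
    terminated is True, False, or None when the state is unknown.
    Returns a list of human-readable problems (empty if none found).
    """
    problems = []
    last = len(chain) - 1
    for i, (name, terminated) in enumerate(chain):
        at_end = i in (0, last)
        if terminated is None:
            problems.append(f"{name}: termination state unknown")
        elif at_end and not terminated:
            problems.append(f"{name}: at end of bus but not terminated")
        elif not at_end and terminated:
            problems.append(f"{name}: terminated in the middle of the bus")
    return problems


# Original four-device chain (card assumed to auto-terminate).
original = [("card", True), ("sd0", False), ("sd1", False),
            ("sd2", None), ("cd0", True)]

# Later two-drive setup with termination re-enabled on sd1.
two_drive = [("card", True), ("sd0", False), ("sd1", True)]

print(check_termination(original))
print(check_termination(two_drive))
```

 With these inputs the only open question in the original chain is what sd2's
 automatic termination actually does when it is not the last device.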
 
 I tried 3 other cables [all with connections for 2 devices].  I re-enabled
 termination on sd1, and only connected sd0 and sd1.  sd0 still disconnected
 on boot for all three cables.  These cables were all around 24" long.  I left
 one of these other cables installed [the one that came with the card].
 
 > Could you try with a much reduced data rate (say 5MHz), just to
 > make sure it is not caused by the bus cable ?
 
 This didn't seem to make any difference.  It took longer to induce a crash,
 but it still did crash.
 
 Ok, I'm pretty much convinced this is actually a cable [or device] problem.
 Unless the ncr driver is sending a reset that is messing up the jaz drive,
 there doesn't seem to be much on the software side that can fix these things.
 
 Even though I don't seem to be having problems with sd0, that is the device
 that was added not too long ago, and there were no problems at all before
 that.  When I have time, I'll switch the drive to the end and enable all its
 termination options [and disable the termination on the other devices], and
 see if that helps any.
 
 -- 
 Zach Heilig
 We know you are a good friend, but we have to charge you for our services just
 like our other customers.  Actually, we don't like charging our friends, but
 we did a study of our clientele and discovered none of our enemies do business
 with us.  [seen in a lawyer's office].


