Date:      Wed, 19 Jul 95 12:18:42 MDT
From:      terry@cs.weber.edu (Terry Lambert)
To:        hm@altona.hamburg.com
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: FreeBSD 2.0.5 system hangs
Message-ID:  <9507191818.AA28521@cs.weber.edu>
In-Reply-To: <m0sYUkJ-00001FC@ernie.altona.hamburg.com> from "Hellmuth Michaelis" at Jul 19, 95 10:44:47 am

> Hi,
> 
> as Justin just wrote, the core team can only fix bugs it knows about, so:
> 
> I experience 2 types of total system hangs under 2.0.5-Release,
> 
> 1) in an xterm, while scrolling, the system sometimes, and totally
>    unreproducibly, just hangs. This seems to occur more often the smaller
>    the font used and/or the larger the xterm is, or rather the more
>    there is to scroll.
>    This also happened from time to time under 1.1.5.1 and was one of the
>    reasons i wanted to upgrade.
>    When this happens, the machine is totally frozen so there is not even
>    a chance to look from another side into the machine.

Move sio3 off of irq 7.

IRQ 7 is the garbage interrupt for untrapped interrupts.

Arguably, all BSD interrupts should be soft-vectored so that there is
no such thing as a "garbage" interrupt.  The ability to add and remove
chain items would help in equipment autodetection.
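
As a rough illustration of what I mean by soft-vectored chains, here is
a minimal user-space sketch in C.  It is not FreeBSD code: the names
irq_link, irq_chain_add, irq_chain_remove, irq_dispatch and null_trap
are all made up for this example, and a real kernel would also have to
block interrupts around the list updates.  The idea is only that each
IRQ owns a list of handlers, that links can be added and removed at run
time, and that an interrupt nobody claims is reported as a stray
rather than falling through to a fixed "garbage" vector.

```c
#include <assert.h>
#include <stddef.h>

#define NIRQ 16                 /* two cascaded 8259s on an AT-class machine */

/* A handler returns nonzero if it claimed the interrupt. */
typedef int (*irq_handler_t)(void *arg);

/* One link in a per-IRQ handler chain (hypothetical structure). */
struct irq_link {
    irq_handler_t    handler;
    void            *arg;
    struct irq_link *next;
};

static struct irq_link *irq_chain[NIRQ];

/* Add a link at the head of the chain for the given IRQ. */
int irq_chain_add(int irq, struct irq_link *link)
{
    assert(link != NULL && link->handler != NULL);
    if (irq < 0 || irq >= NIRQ)
        return -1;
    link->next = irq_chain[irq];
    irq_chain[irq] = link;
    return 0;
}

/* Remove a link from the chain; returns -1 if it was not on it. */
int irq_chain_remove(int irq, struct irq_link *link)
{
    struct irq_link **pp;

    if (irq < 0 || irq >= NIRQ)
        return -1;
    for (pp = &irq_chain[irq]; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == link) {
            *pp = link->next;
            link->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Walk the chain; a return of 0 means nobody claimed it (a stray). */
int irq_dispatch(int irq)
{
    struct irq_link *l;
    int claimed = 0;

    if (irq < 0 || irq >= NIRQ)
        return 0;
    for (l = irq_chain[irq]; l != NULL; l = l->next)
        if (l->handler(l->arg))
            claimed++;
    return claimed;
}

/* A "null device" trap: claims the interrupt and throws it away.
 * If arg is non-NULL it is taken to point at an int hit counter. */
int null_trap(void *arg)
{
    if (arg != NULL)
        (*(int *)arg)++;
    return 1;
}
```

With something like this, an autodetection pass could hang a temporary
link such as null_trap on an otherwise unused IRQ, poke the hardware,
see whether the link fired, and then remove it again.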

Nevertheless, since you have nothing on irq 2, the video card may well
be generating IRQ 2 on vertical retrace (a typical result of card level
scroll commands, since they wait for vertical retrace).

Potentially, you could also resolve this by putting a trap to a null
device (or throwing a printer driver) on irq 2, or by a jumper setting
on the video card.

> 2) Disk i/o hangs, sometimes with the access LED on the controller on, some-
>    times off. The machine is operational as long as one does not "touch"
>    the disks, so i would be able to search for something if someone would
>    tell me where to search and what to search for.
> 
>    always found under 1.1.5.1.

I'm a little upset that there is not a controller identification message
from the SCSI controller; I can only tell by its name that it is some kind
of Adaptec controller.  Because of the EISA and other messages, and the
fact that you were using the thing under 1.1.5.1, I'm going to guess
that it's an Adaptec 1740 (no floppy on board) or an Adaptec 1742.

The important issue here is, I think, firmware revision and EISA
configuration utility settings.

In particular, Adaptec shipped all of its 174x boards with a "3.0"
EISA config disk, and they have a "3.1" EISA config disk that has
better settings available in it.

I would not suggest changing translation modes (one of the features
of the new setup disks is access to "advanced translation") unless
you are adding a big drive that needs to be accessed by DOS (and if
you modify this setting, be prepared to reinstall all drives using
the old translation!).

I would, however, suggest looking at bus timing and disconnect,
especially with regard to the Archive Viper drive.  Setting the
bus transfer rate down on the offending peripheral(s) will probably
fix your problem.

If you disable disconnect, make sure that the kernel you are
running has tagged command queueing turned off, since it relies
on disconnect and doesn't compute transitive closure across the
call graph to ensure against deadlock.

And yes, SCSI failure recovery is currently under design discussion
on the hackers list, which will probably allow you to run with
hiccups instead of hangs (but wouldn't it be better to set the hardware
so that it fixes your problem now, and when the recovery code is in
place later, you don't have to deal with hiccups?).

BTW, UnixWare defaults the transfer rate to the second lowest for
all Adaptec controllers to guard against just this type of problem.
Is it an acceptable trade?  Probably not, if the majority of SCSI
hardware out there doesn't have problems at higher rates.


PS: Check your SCSI II cables and Active termination (had to say it
before Rod jumped in 8-)).


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.


