Date: Wed, 19 Jul 95 12:18:42 MDT
From: terry@cs.weber.edu (Terry Lambert)
To: hm@altona.hamburg.com
Cc: freebsd-hackers@freebsd.org
Subject: Re: FreeBSD 2.0.5 system hangs
Message-ID: <9507191818.AA28521@cs.weber.edu>
In-Reply-To: <m0sYUkJ-00001FC@ernie.altona.hamburg.com> from "Hellmuth Michaelis" at Jul 19, 95 10:44:47 am
> Hi,
>
> as Justin just wrote, the core team can only fix bugs it knows about, so:
>
> I experience 2 types of total system hangs under 2.0.5-Release,
>
> 1) in an xterm, while scrolling, the system sometimes and totally
>    unreproducible just hangs. This seems to occur more often the smaller
>    the used font and/or the larger the xterm is, or better the more
>    amount to scroll.
>    This also happened from time to time under 1.1.5.1 and was one of the
>    reasons i wanted to upgrade.
>    When this happens, the machine is totally frozen so there is not even
>    a chance to look from another side into the machine.

Move sio3 off of irq 7. IRQ 7 is the garbage interrupt for untrapped
interrupts.

Arguably, all BSD interrupts should be soft-vectored so that there is no
such thing as a "garbage" interrupt. The ability to add and remove chain
items would help in equipment autodetection.

Nevertheless, since you have nothing on irq 2, potentially, the video
card is generating IRQ 2 on vertical retrace (a typical result of card
level scroll commands, since they wait for vertical retrace).

Potentially, you could also resolve this by putting a trap to a null
device (or throwing a printer driver) on irq 2, or by a jumper setting
on the video card.

> 2) Disk i/o hangs, sometimes with the access LED on the controller on,
>    sometimes off. The machine is operational as long as one does not
>    "touch" the disks, so i would be able to search for something if
>    someone would tell me where to search and what to search for.
>
> always found under 1.1.5.1.

I'm a little upset that there is not a controller identification message
from the SCSI controller; I can only tell by its name that it is some
kind of Adaptec controller. Because of the EISA and other messages, and
the fact that you were using the thing under 1.1.5.1, I'm going to guess
that it's an Adaptec 1740 (no floppy on board) or an Adaptec 1742.
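The soft-vectoring idea mentioned above for irq handling can be pictured with a minimal userland sketch in C. This is only an illustration under stated assumptions, not FreeBSD kernel code: every name in it (irq_chain_add, irq_chain_remove, irq_dispatch, irq_stray_count, null_trap) is hypothetical. Each irq owns a chain of handlers, items can be added and removed at run time, and an interrupt that nobody claims is simply counted as stray instead of falling through to a "garbage" vector.

```c
#include <stdlib.h>

#define NIRQ 16                      /* assumption: 16 ISA-style irq lines */

/* Hypothetical handler type: returns nonzero if it claimed the interrupt. */
typedef int (*irq_fn)(void *arg);

struct irq_link {
    irq_fn           fn;
    void            *arg;
    struct irq_link *next;
};

static struct irq_link *irq_chain[NIRQ];   /* one handler chain per irq  */
static unsigned long    irq_stray[NIRQ];   /* unclaimed interrupts seen  */

/* Add a handler to the front of the chain for irq. Returns 0 on success. */
int irq_chain_add(int irq, irq_fn fn, void *arg)
{
    struct irq_link *l;

    if (irq < 0 || irq >= NIRQ || fn == NULL)
        return -1;
    l = malloc(sizeof(*l));
    if (l == NULL)
        return -1;
    l->fn = fn;
    l->arg = arg;
    l->next = irq_chain[irq];
    irq_chain[irq] = l;
    return 0;
}

/* Remove the first matching (fn, arg) item. Returns 0 if one was found. */
int irq_chain_remove(int irq, irq_fn fn, void *arg)
{
    struct irq_link **pp;

    if (irq < 0 || irq >= NIRQ)
        return -1;
    for (pp = &irq_chain[irq]; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->fn == fn && (*pp)->arg == arg) {
            struct irq_link *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 0;
        }
    }
    return -1;
}

/* Walk the chain for irq; count the interrupt as stray if nobody claims it.
 * Returns the number of handlers that claimed it. */
int irq_dispatch(int irq)
{
    struct irq_link *l;
    int claimed = 0;

    if (irq < 0 || irq >= NIRQ)
        return 0;
    for (l = irq_chain[irq]; l != NULL; l = l->next)
        if (l->fn(l->arg))
            claimed++;
    if (claimed == 0)
        irq_stray[irq]++;
    return claimed;
}

/* Number of stray interrupts recorded so far on irq. */
unsigned long irq_stray_count(int irq)
{
    return (irq < 0 || irq >= NIRQ) ? 0 : irq_stray[irq];
}

/* A "trap to a null device": claims the interrupt and does nothing else. */
int null_trap(void *arg)
{
    (void)arg;
    return 1;
}
```

In this sketch, calling irq_chain_add(2, null_trap, NULL) plays the role of the null trap on irq 2 suggested above: a later irq_dispatch(2) is claimed rather than counted as stray. An autodetection probe could add itself to a chain in the same way, watch for its device, and remove itself afterwards.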
The important issue here is, I think, firmware revision and EISA
configuration utility settings. In particular, Adaptec shipped all of
its 174x boards with a "3.0" EISA config disk, and they have a "3.1"
EISA config disk that has better settings available in it.

I would not suggest changing translation modes (one of the features of
the new setup disks is access to "advanced translation") unless you are
adding a big drive that needs to be accessed by DOS (and if you modify
this setting, be prepared to reinstall all drives using the old
translation!).

I would, however, suggest looking at bus timing and disconnect,
especially with regard to the Archive Viper drive. Setting the bus
transfer rate down on the offending peripheral(s) will probably fix
your problem.

If you disable disconnect, make sure that the kernel you are running
has tagged command queueing turned off, since it relies on disconnect
and doesn't compute transitive closure across the call graph to ensure
against deadlock.

And yes, SCSI failure recovery is currently under design discussion on
the hackers list, which will probably allow you to run with hiccups
instead of hangs (but wouldn't it be better to set the hardware so that
it fixes your problem now, and when the recovery code is in place later,
you don't have to deal with hiccups?).

BTW, UnixWare defaults the transfer rate to the second lowest for all
Adaptec controllers to guard against just this type of problem. Is it
an acceptable trade? Probably not, if the majority of SCSI hardware out
there doesn't have problems at higher rates.

PS: Check your SCSI II cables and active termination (had to say it
before Rod jumped in 8-)).

					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.