Date:      Tue, 07 Dec 1999 20:01:39 +0800
From:      Peter Wemm <peter@netplex.com.au>
To:        Ed Hall <edhall@screech.weirdnoise.com>
Cc:        Matthew Dillon <dillon@apollo.backplane.com>, "Jonathan M. Bresler" <jmb@hub.freebsd.org>, kris@hub.freebsd.org, freebsd-hackers@FreeBSD.ORG
Subject:   Re: PCI DMA lockups in 3.2 (3.3 maybe?) 
Message-ID:  <19991207120139.869F01CC6@overcee.netplex.com.au>
In-Reply-To: Message from Ed Hall <edhall@screech.weirdnoise.com>  of "Mon, 06 Dec 1999 13:26:38 PST." <199912062126.NAA30946@screech.weirdnoise.com> 

Ed Hall wrote:
> : you wrote:
> : : I wrote:
> : :4) Using a different SCSI driver (Peter managed to get a driver from 4.0
> : :   hooked up under 3.3, and it survived two days of torture that would
> : :   have toasted things within an hour using the stock driver; you'll have
> : :   to ask him for details).
> : 
> :     Ed, this is great stuff!
> : 
> :     Are you sure about #4?  Is that the same ncr.c driver or something
> :     else?  There are only a few differences between the 3.x and 4.x
> :     /usr/src/sys/pci/ncr.c drivers.  Which Peter, Peter Wemm?
> 
> It was Peter Wemm.  I may be misunderstanding just what he did--trying
> the 4.0 driver was just one of several experiments he proposed and
> performed.  And saying that it "worked" is provisional; two days of
> testing strongly suggests that it reduced the problem with 3.3 to
> acceptable levels for my application.  Is it truly a "fix?" I don't
> know.
> 
> 		-Ed

I might add that others have found that using sym + fxp on the N440BX
motherboards didn't solve their problems, or moved the problem elsewhere,
e.g. to the sbdrop() etc. routines.  One other interesting variable: an ahc
+ pn driver combination on a 440BX motherboard under -current in late May
99 had the exact same problems we saw a number of times with ncr + fxp (i.e.
sbdrop, sbflush, m_copym etc. panics).  The same motherboard with ahc + de or
fxp did not have the problems.

In all cases the panics were extremely "strange".  The original fxp+ncr
combination changed its crash pattern when we put extra debugging in it to
sanity-check conditions.  The results included registers getting clobbered,
as though an interrupt happened, the interrupt handler changed the trapframe
on the stack, and the interrupted code then resumed with garbage in some
registers.  This is what seems to be happening in the fxp_add_rfabuf()
panics: %esi was loaded earlier on, and by the time it got to the vtophys()
it was zero.  People have printed the contents of "rfa" on the stack and seen
garbage, but in fact it's a register variable under normal circumstances.
Adding debugging caused it to be stored in the local variable on the stack
rather than being left in %esi, and then the panics moved elsewhere (!).

It had the markings of "something trashing something somewhere and then crashing
quite a bit later".  :-(

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au






