Date:      Wed, 13 Aug 1997 12:04:39 +0930 (CST)
From:      Michael Smith <msmith@atrad.adelaide.edu.au>
To:        julian@whistle.com (Julian Elischer)
Cc:        msmith@atrad.adelaide.edu.au, julian@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject:   Re: 2.2.2+ crash.. more info
Message-ID:  <199708130234.MAA11390@genesis.atrad.adelaide.edu.au>
In-Reply-To: <33F114EB.167EB0E7@whistle.com> from Julian Elischer at "Aug 12, 97 06:59:07 pm"

Julian Elischer stands accused of saying:
> Michael Smith wrote:
> > 
> > Julian Elischer stands accused of saying:
> > >
> > > We have several hundred Bsd machines here.. we see this one enough for
> > > me to recognise it..
> > >
> > > the plot thickens..
> > > I have discovered the following:
> > > 1/ the code that crashes:
> > >   scanning the queues in switch:
> > 
> > This looks a lot like the sort of crazy stuff I was seeing when I was
> > doing Verboten things inside a 'fast' ISA interrupt handler.  Do you have
> > RI_FAST set for any of your drivers, particularly ones that you've written
> > yourself?
> > 
> > You could try ripping RI_FAST out of _all_ of the handlers you're using
> > to start with and see if this cures things.
> > 
> > > code examinations will follow with more info..
> > > if this strikes anyone as familiar, do chime in!
> > 
> > Frighteningly.  It took us the best part of a year just to get a stack
> > trace that actually hinted at the problem.
> > 
> > > julian
> 
> this particular machine has no interrupt handlers that were not
> part of standard FreeBSD..
> 
> ed0 and ed1 networks,
> wd0 disk
> sio0 and sio1
> 
> how do I SET RI_FAST? :)
> (does that answer your question?)

You mask it into the id_ri_flags field of the isa_device structure.
Currently only the 'cy' and 'sio' drivers use it.  You could try
removing it from the 'sio' driver and see if it helps, but I expect
that Bruce would insist that this is not the case.
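
As a rough illustration of what "masking it in" (and taking it back
out) amounts to, here is a minimal user-space sketch.  The struct below
is a hypothetical stand-in holding only the one field under discussion,
and the value given to RI_FAST is an assumption; check isa_device.h in
your own tree rather than trusting this.

```c
#include <assert.h>

/* Assumed value, for illustration only; see isa_device.h in your tree. */
#define RI_FAST 1

/* Hypothetical stand-in for the single field of struct isa_device used here. */
struct isa_device_sketch {
    int id_ri_flags;    /* flags consulted when the interrupt is registered */
};

/* Mask RI_FAST into the flags, leaving every other bit untouched. */
void set_fast(struct isa_device_sketch *dev)
{
    dev->id_ri_flags |= RI_FAST;
}

/* Remove RI_FAST again, leaving every other bit untouched. */
void clear_fast(struct isa_device_sketch *dev)
{
    dev->id_ri_flags &= ~RI_FAST;
}

/* Nonzero if the handler is currently marked 'fast'. */
int is_fast(const struct isa_device_sketch *dev)
{
    return (dev->id_ri_flags & RI_FAST) != 0;
}

/* Small self-check: toggling RI_FAST must not disturb an unrelated bit. */
void check_flags(void)
{
    struct isa_device_sketch dev = { 0x4 };

    set_fast(&dev);
    assert(is_fast(&dev) && (dev.id_ri_flags & 0x4));
    clear_fast(&dev);
    assert(!is_fast(&dev) && dev.id_ri_flags == 0x4);
}
```

In a real driver the change would be made wherever the driver fills in
id_ri_flags before the interrupt is registered; the helpers above only
show the bit manipulation.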

> actually it looks like some sort of SPL problem to me but as I said,
> there is very little
> that is non standard on this machine..

The RI_FAST problem _is_ an spl problem, in that a fast interrupt
handler does not honour any spl() protection.
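
To make that concrete, here is a toy user-space model, not kernel
code.  Every name in it (toy_cpu, toy_splraise and so on) is
hypothetical.  It only shows the shape of the problem: a normal
interrupt is held back while the mask is raised, while a fast one runs
straight through the "protected" region.

```c
#include <assert.h>

/* Toy model only: none of these names are kernel symbols. */
struct toy_cpu {
    int masked;         /* nonzero while the toy "spl" is raised */
    int pending;        /* normal interrupts held back by the mask */
    int normal_runs;    /* times a normal handler body has run */
    int fast_runs;      /* times a fast handler body has run */
};

/* Raise the mask and return the previous state, spl-style. */
int toy_splraise(struct toy_cpu *c)
{
    int old = c->masked;

    c->masked = 1;
    return old;
}

/* Restore the previous state and run anything that was held back. */
void toy_splx(struct toy_cpu *c, int old)
{
    c->masked = old;
    while (!c->masked && c->pending > 0) {
        c->pending--;
        c->normal_runs++;
    }
}

/* A normal interrupt honours the mask: it is deferred while masked. */
void toy_normal_intr(struct toy_cpu *c)
{
    if (c->masked)
        c->pending++;
    else
        c->normal_runs++;
}

/* A fast interrupt ignores the mask entirely. */
void toy_fast_intr(struct toy_cpu *c)
{
    c->fast_runs++;
}

/* Returns how many fast handlers ran inside the protected region. */
int toy_demo_spl(void)
{
    struct toy_cpu c = { 0, 0, 0, 0 };
    int s = toy_splraise(&c);
    int fast_inside;

    toy_normal_intr(&c);            /* held back */
    toy_fast_intr(&c);              /* runs anyway */
    fast_inside = c.fast_runs;
    assert(c.normal_runs == 0);     /* the mask protected us from this one */

    toy_splx(&c, s);
    assert(c.normal_runs == 1);     /* deferred handler runs on restore */
    return fast_inside;
}
```

So code that relies on a raised spl for exclusion is only safe against
the handlers that respect it.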

> the fact that the process got put on a sleep queue while it was
> on the runnable queue suggests that maybe an interrupt driver
> ran 'tsleep' while curproc had the value of this process in it..

You get this sort of confusion if you futz with *sleep/wakeup inside a
fast interrupt handler because you can end up re-entering the code
that shuffles processes from one queue to another.
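
A deliberately simplified sketch of that kind of re-entry follows.  It
is not the kernel's queue code; the array queue, the hook and all the
toy_ names are invented for illustration.  The hook stands in for a
fast interrupt that arrives halfway through an update and runs the same
queue code again.

```c
#include <assert.h>
#include <stddef.h>

#define QMAX 8

/* Invented array queue; the real run and sleep queues look nothing like this. */
struct toy_queue {
    int items[QMAX];
    int tail;
};

/* Hook standing in for a fast interrupt arriving mid-update. */
typedef void (*toy_intr_fn)(struct toy_queue *);

/* Unprotected enqueue; 'intr' (may be NULL) fires inside the update window. */
void toy_enqueue(struct toy_queue *q, int pid, toy_intr_fn intr)
{
    int slot = q->tail;         /* read the insertion point */

    if (intr != NULL)
        intr(q);                /* the "interrupt" re-enters the queue code */
    q->items[slot] = pid;       /* stale slot: overwrites the handler's entry */
    q->tail = slot + 1;         /* stale tail: the handler's update is lost */
}

/* A handler that itself queues pid 99, much as a wakeup might. */
void toy_handler(struct toy_queue *q)
{
    toy_enqueue(q, 99, NULL);
}

/* Returns the queue length after one interrupted enqueue. */
int toy_demo_reentry(void)
{
    struct toy_queue q = { {0}, 0 };

    toy_enqueue(&q, 7, toy_handler);
    assert(q.items[0] == 7);    /* pid 99 has been overwritten */
    return q.tail;              /* one entry, although two enqueues ran */
}
```

The real failure modes are messier (a process ending up on two queues,
for instance), but the cause is the same: the interrupted update
finishes using state that the handler has already changed.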

I would be fairly surprised, given your usage, if the sio interrupt
handler was the cause of your trouble; I think I may have given you
a bum steer.

-- 
]] Mike Smith, Software Engineer        msmith@gsoft.com.au             [[
]] Genesis Software                     genesis@gsoft.com.au            [[
]] High-speed data acquisition and      (GSM mobile)     0411-222-496   [[
]] realtime instrument control.         (ph)          +61-8-8267-3493   [[
]] Unix hardware collector.             "Where are your PEZ?" The Tick  [[


