Date:      Mon,  4 Mar 96  0:28:46 +0000
Subject:   Re(2): panic: inconsistent empty queue
Message-ID:  <"359-960304003140-C1B5*/G=Andrew/S=Gordon/O=Net-Tel Computer Systems Ltd/PRMD=Net-Tel/ADMD=Gold 400/C=GB/"@MHS>
In-Reply-To: <"SunOS:21708-960303190858-7A28*/DD.RFC-822=owner-stable(a) 400/C=GB/"@MHS>

> >I have duplicated this crash here running a 2 week old stable
> >during a make world, except I don't get the scsi timeout, I drop
> >right into a panic: getnewbuf: inconsistent AGE queue, qindex = 2,
> >with a similar traceback.
> >
> >Reverting to 2.1-RELEASE on the system eliminated the problem :-(.
> >
> >I do not see this problem on any of the PCI systems I have been
> >building so this may be related to the EISA changes and only affecting
> >EISA systems.
> I've thought this too, but I can't understand how a change in the probe
> code could cause a buffer problem many hours down the line.  I have a
> bt747, aha1742, aha2742, and an aha2842 in my machine here, and as soon as
> my disk for 2.1-stable returns, I'll try to track this down.  Can you
> repro this in a -current environment?  The eisa code is very similar.

I have similar symptoms on a couple of systems here (running -current as of
CTM delta src-2.1.0046.gz):

- This machine (486/66 VLB with 2842 SCSI) fails an hour or so into
  'make world' when running a -stable kernel; it completes OK when
  running kernel.GENERIC from 2.1R.

- Another machine (P166 PCI, 3940 SCSI) worked OK with a corresponding
  -stable kernel.

- I am not always seeing panics; more often partial or complete latch-ups.
  In the most interesting case, I was running X with a make world in
  one xterm that had been running overnight, but appeared to have stalled
  (no disc activity).  The shells in the other xterms were perfectly happy,
  and I could run ps -ax to see what was going on, 'ls' various other
  directories, etc.  However, when I tried to 'ls' in the directory where
  the 'make world' appeared to be working, that process hung up too.

Of course, the fact that it appears to be EISA/VLB versus PCI may be a red
herring, since (at least in my case) the PCI systems tend to have faster CPUs
and so the timing will be substantially different...

