Date:      Mon, 15 Jan 2001 23:17:19 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        John Baldwin <jhb@FreeBSD.ORG>
Cc:        Jordan Hubbard <jkh@winston.osd.bsdi.com>, current@FreeBSD.ORG
Subject:   RE: Anybody else seeing a broken /dev/lpt with SMP on -current?
Message-ID:  <Pine.BSF.4.21.0101152244270.16808-100000@besplex.bde.org>
In-Reply-To: <XFMail.010112185559.jhb@FreeBSD.org>

On Fri, 12 Jan 2001, John Baldwin wrote:

> On 13-Jan-01 Jordan Hubbard wrote:
> > I've actually been seeing this for about 2 months now but only just
> > now got motivated enough to enable crashdumps and get some information
> > on what happens whenever I try to use the printer attached to my (sadly :)
> > -current SMP box:
> > 
> > IdlePTD 3682304
> > initial pcb at 2e70e0
> > panicstr: page fault
> > panic messages:
> > ---
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; lapic.id = 00000000
> > fault virtual address   = 0xffff8640
> > fault code              = supervisor write, page not present
> > instruction pointer     = 0x8:0xc8dc8676
> > stack pointer           = 0x10:0xc8280f88
> > frame pointer           = 0x10:0xc8280f9c
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, def32 1, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 12322 (irq7: lpt0)
> > trap number             = 12
> > panic: page fault
> > cpuid = 0; lapic.id = 00000000
> > boot() called on cpu#0
> > 
> > If anybody wants a fuller traceback then I'll compile up a kernel with
> > debugging symbols, but it's going to be pretty sparse anyway since it
> > basically only shows the trap() from the page fault and the subsequent
> > panic.
> 
> All the other traces show the kernel having returned to an address that is
> beyond the end of the kernel (which causes the page fault) meaning that the
> stack is fubar'd, so the trace isn't meaningful anyways. :(  Knowing how and
> why the lpd interrupt handler trashes the stack is the useful info, and with
> the stack already trashed, I don't know of an easy way to figure that out.
> Suggestions welcome.

This may be caused by the lpt driver (ab)using BUS_SETUP_INTR() on every
write().  The interrupt system can't handle this.  I noticed the following
symptoms:
- stray irq7's from when the driver interrupt isn't attached (BUS_SETUP_INTR()
  for ppbus first tears down any previously set up handler).
- under UP, a slow memory leak from not freeing ih_name in inthand_remove().
  Fixed in the enclosed patch.
- under SMP with 1 cpu, panics in various places due to the process table
  filling up with undead ithreads.  Worked around in the enclosed patch.
  This bug should go away almost automatically when interrupt handling
  actually works.  Use something like "dd if=/dev/zero of=/dev/lpt0 bs=1"
  to see this bug.  Use a small value for kern.maxproc to see it quickly.
- "cp /dev/zero /dev/lpt0 &" caused about 50% interrupt overhead.  Under
  UP, interactive response was not noticeably affected, but under SMP with
  1 cpu, echoing of keystrokes in /bin/sh in single user mode took a few
  hundred msec.
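
The dd command above issues one write() per byte, so every byte takes a
separate trip through the driver's write path and its BUS_SETUP_INTR() call.
A minimal user-space sketch of the same access pattern follows. It assumes
only POSIX open()/write()/close(); the helper name write_tinygrams() and its
parameters are made up for illustration and are not part of dd or the driver.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `count` zero bytes to `path`, using one
 * write() call per byte (the same pattern as dd with bs=1).
 * Returns the number of bytes written, or -1 if the open fails.
 */
long
write_tinygrams(const char *path, long count)
{
	const char zero = 0;
	long done;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return (-1);
	for (done = 0; done < count; done++) {
		/* Stop at the first failed or short write. */
		if (write(fd, &zero, 1) != 1)
			break;
	}
	close(fd);
	return (done);
}
```

Calling write_tinygrams("/dev/lpt0", n) from a small main() should exercise
the driver in roughly the way the dd command does; as with dd, a small
kern.maxproc should make the failure show up sooner.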

Index: dev/ppbus/lpt.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ppbus/lpt.c,v
retrieving revision 1.20
diff -c -2 -r1.20 lpt.c
*** dev/ppbus/lpt.c	2000/12/07 22:33:12	1.20
--- dev/ppbus/lpt.c	2001/01/15 02:44:40
***************
*** 70,73 ****
--- 70,76 ----
  #include <sys/conf.h>
  #include <sys/kernel.h>
+ #include <sys/mutex.h>
+ #include <sys/proc.h>
+ #include <sys/resourcevar.h>
  #include <sys/uio.h>
  #include <sys/syslog.h>
***************
*** 759,762 ****
--- 762,797 ----
  			device_printf(lptdev, "handler registration failed, polled mode.\n");
  			sc->sc_irq &= ~LP_USE_IRQ;
+ 		}
+ 
+ 		/*
+ 		 * XXX setting up interrupts is a very expensive operation and
+ 		 * shouldn't be done here.  Despite its name, BUS_SETUP_INTR()
+ 		 * for this bus both sets up and tears down interrupts (it
+ 		 * first tears down any already-setup interrupt).  This
+ 		 * involves exiting from any existing ithread and starting a
+ 		 * new one.  The exit is done lazily, and at least under SMP,
+ 		 * writing tinygrams resulted in ithreads being created faster
+ 		 * than they were destroyed, resulting in assorted panics
+ 		 * depending on where the resource exhaustion was detected.
+ 		 *
+ 		 * Yield so that the ithreads get a chance to exit.
+ 		 *
+ 		 * XXX following grot cloned from uio_yield().
+ 		 */
+ 		{
+ 		struct proc *p;
+ 		int s;
+ 
+ 		p = curproc;
+ 		s = splhigh();
+ 		mtx_enter(&sched_lock, MTX_SPIN);
+ 		DROP_GIANT_NOSWITCH();
+ 		p->p_priority = p->p_usrpri;
+ 		setrunqueue(p);
+ 		p->p_stats->p_ru.ru_nivcsw++;
+ 		mi_switch();
+ 		mtx_exit(&sched_lock, MTX_SPIN);
+ 		PICKUP_GIANT();
+ 		splx(s);
  		}
  	}
Index: i386/isa/intr_machdep.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/isa/intr_machdep.c,v
retrieving revision 1.42
diff -c -2 -r1.42 intr_machdep.c
*** i386/isa/intr_machdep.c	2000/12/08 21:50:11	1.42
--- i386/isa/intr_machdep.c	2001/01/15 01:27:24
***************
*** 710,713 ****
--- 710,714 ----
  		}
  	}
+ 	free(idesc->ih_name, M_DEVBUF);
  	free(idesc, M_DEVBUF);
  	return (0);

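The intr_machdep.c change follows the usual ownership rule: a descriptor that
holds a separately allocated name must free the name before freeing itself.
A user-space sketch of that pattern follows. struct fake_intrdesc and both
helpers are stand-ins invented for illustration; the kernel code uses its own
descriptor type and free(ptr, M_DEVBUF) rather than plain free().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's interrupt descriptor. */
struct fake_intrdesc {
	char	*ih_name;	/* separately allocated handler name */
	int	 ih_irq;
};

/* Allocate a descriptor and a private copy of its name. */
struct fake_intrdesc *
fake_inthand_add(const char *name, int irq)
{
	struct fake_intrdesc *idesc;

	idesc = malloc(sizeof(*idesc));
	if (idesc == NULL)
		return (NULL);
	idesc->ih_name = malloc(strlen(name) + 1);
	if (idesc->ih_name == NULL) {
		free(idesc);
		return (NULL);
	}
	strcpy(idesc->ih_name, name);
	idesc->ih_irq = irq;
	return (idesc);
}

/* Release the name first, then the descriptor that pointed to it. */
int
fake_inthand_remove(struct fake_intrdesc *idesc)
{
	if (idesc == NULL)
		return (-1);
	free(idesc->ih_name);	/* omitting this leaks the name on every remove */
	free(idesc);
	return (0);
}
```

With a setup and teardown on every write(), a leak of this kind grows by one
name string per write, which is consistent with the slow leak noted above.
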
Bruce






