From owner-freebsd-bugs@FreeBSD.ORG  Wed Mar 30 22:42:39 2005
Return-Path: <owner-freebsd-bugs@FreeBSD.ORG>
Delivered-To: freebsd-bugs@freebsd.org
From: John Baldwin <jhb@FreeBSD.org>
To: Bruce Evans <bde@zeta.org.au>
Date: Wed, 30 Mar 2005 15:52:02 -0500
User-Agent: KMail/1.6.2
References: <815955888.20050323113529@osk.com.ua>
	<1101884216.20050323181742@osk.com.ua> <20050330155502.E16886@delplex.bde.org>
In-Reply-To: <20050330155502.E16886@delplex.bde.org>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain;
  charset="windows-1252"
Content-Transfer-Encoding: 7bit
Message-Id: <200503301552.02472.jhb@FreeBSD.org>
cc: freebsd-bugs@FreeBSD.org
cc: Oleg Tarasov <subscriber@osk.com.ua>
Subject: Re: sio interrupt-level buffer overflows

On Wednesday 30 March 2005 01:06 am, Bruce Evans wrote:
> On Wed, 23 Mar 2005, Oleg Tarasov wrote:
> > About my panics. They persist and when this server panics it somehow
> > overloads my network so it stops functioning until reboot. This is
> > very, very bad.
> >
> > Maybe you could tell me where to write, or you could
> > personally tell me what should I do.
> >
> > Using all my theoretical skills I have come to this data I could
> > obtain from my dump:
> >
> > (kgdb) backtrace
> > #0  doadump () at pcpu.h:159
> > #1  0xc060b063 in boot (howto=260)
> >    at /usr/src/sys/kern/kern_shutdown.c:397
> > #2  0xc060b389 in panic (fmt=0xc080321d "spin lock held too long")
> >    at /usr/src/sys/kern/kern_shutdown.c:553
> > #3  0xc060270c in _mtx_lock_spin (m=0xc08d7800, td=0xc19ca320, opts=0,
> >    file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:613
> > #4  0xc077c165 in siointr (arg=0xc1ab8800)
> >    at /usr/src/sys/dev/sio/sio.c:1710
> > #5  0xc0790ead in intr_execute_handlers (isrc=0xc19b8890, iframe=0xd541ac94)
> >    at /usr/src/sys/i386/i386/intr_machdep.c:203
> > #6  0xc07932be in lapic_handle_intr (frame=
> >      {if_vec = 52, if_fs = -717160424, if_es = -1067384816, if_ds = 16,
> >       if_edi = -1046699232, if_esi = -1064591424, if_ebp = -717116188,
> >       if_ebx = -1046425600, if_edx = -1064566184, if_ecx = 0,
> >       if_eax = -1046425600, if_eip = -1067440569, if_cs = 8,
> >       if_eflags = 582, if_esp = -1045200000, if_ss = 4})
> >    at /usr/src/sys/i386/i386/local_apic.c:490
> > #7  0xc078d753 in Xapic_isr1 () at apic_vector.s:110
> > #8  0x00000034 in ?? ()
> > #9  0xd5410018 in ?? ()
> > #10 0xc0610010 in coredump (td=0xc08b9fc0) at vnode_if.h:1244
> > #11 0xc05f6f46 in ithread_loop (arg=0xc1981c80)
> >    at /usr/src/sys/kern/kern_intr.c:546
> > #12 0xc05f6001 in fork_exit (callout=0xc05f6df8 <ithread_loop>,
> >    arg=0xc1981c80, frame=0xd541ad48) at /usr/src/sys/kern/kern_fork.c:811
> > #13 0xc078d3fc in fork_trampoline ()
> >    at /usr/src/sys/i386/i386/exception.s:209
> > ...
>
> I couldn't figure out the problem from this.  Your later mail says that
> the problem is caused by ppp not being MPSAFE, at least with sio, so I
> won't do much more with this stack trace, but I wonder about some of the
> strange entries in it:
>
> #13 - #11 are normal.
> #10 is weird.  ithread_loop() shouldn't call coredump().
> #8 - #9 seem to be more like stack garbage than module addresses.
> #7 is normal, but it looks like someone broke stack traces for interrupts,
>     giving the garbage in #8 - #10.

This is weird, since we do match on Xapic_isr as being an interrupt frame.
I'm not sure why that didn't work correctly.
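
To illustrate what "match on Xapic_isr" means here, the following is a
minimal sketch of classifying a frame by its symbol name.  It is not the
actual unwinder code; the names frame_kind and classify_frame are made up
for this example, and the only thing taken from the trace is the
Xapic_isr prefix seen in frame #7.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical frame classification, for illustration only. */
enum frame_kind { FRAME_NORMAL, FRAME_INTERRUPT };

/*
 * Treat any symbol that starts with "Xapic_isr" (e.g. Xapic_isr1) as an
 * interrupt entry point.  A real unwinder would then read the saved
 * registers from the interrupt frame instead of following the usual
 * frame-pointer chain.
 */
static enum frame_kind
classify_frame(const char *symbol)
{
	static const char prefix[] = "Xapic_isr";

	assert(symbol != NULL);
	if (strncmp(symbol, prefix, sizeof(prefix) - 1) == 0)
		return FRAME_INTERRUPT;
	return FRAME_NORMAL;
}
```

With this sketch, classify_frame("Xapic_isr1") yields FRAME_INTERRUPT and
classify_frame("ithread_loop") yields FRAME_NORMAL.  If the match succeeds
but the frame is still decoded as an ordinary one, entries like #8 - #10
above are the kind of garbage one would expect to see.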

> #0 - #6 are normal if the spin lock is already held by the same CPU that
>     is handling the interrupt (except this can't happen :-).  I wouldn't
>     have thought that broken locking in ppp could cause this.

It's also normal if another CPU is holding the lock and, for some reason,
keeps spinning without releasing it.
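
Both cases end up in the same place.  Here is a rough user-space model of
a non-recursive spin lock with a spin budget, just to show why.  This is
not the code in kern_mutex.c: struct spin_model, SPIN_LIMIT and the
function names are invented for this sketch, and a real kernel would
panic where the sketch returns SPIN_HELD_TOO_LONG.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; the limit is an arbitrary number for this sketch. */
#define SPIN_UNOWNED	((uintptr_t)0)
#define SPIN_LIMIT	1000000u

struct spin_model {
	atomic_uintptr_t owner;		/* id of the holder, or SPIN_UNOWNED */
};

enum spin_result { SPIN_OK, SPIN_HELD_TOO_LONG };

/*
 * Try to take the lock for 'self'.  The lock is not recursive, so it
 * makes no difference whether the current owner is 'self' (interrupt
 * handler on the CPU that already holds the lock) or someone else that
 * never lets go: the waiter just burns its budget and gives up.
 */
static enum spin_result
spin_model_lock(struct spin_model *m, uintptr_t self)
{
	for (uint32_t spins = 0; spins < SPIN_LIMIT; spins++) {
		uintptr_t expected = SPIN_UNOWNED;

		if (atomic_compare_exchange_strong(&m->owner, &expected, self))
			return SPIN_OK;
	}
	return SPIN_HELD_TOO_LONG;	/* the kernel would panic here */
}

/* Release the lock; only the current owner may do so. */
static void
spin_model_unlock(struct spin_model *m, uintptr_t self)
{
	assert(atomic_load(&m->owner) == self);
	atomic_store(&m->owner, SPIN_UNOWNED);
}
```

If owner 1 holds the lock, a second spin_model_lock() call by 1 and a
call by 2 both come back with SPIN_HELD_TOO_LONG; after
spin_model_unlock() by 1, owner 2 gets SPIN_OK.  So frames #0 - #6 alone
don't tell you which of the two situations you are in.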
>
> Bruce

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org