Date:      Wed, 30 Mar 2005 15:52:02 -0500
From:      John Baldwin <jhb@FreeBSD.org>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Oleg Tarasov <subscriber@osk.com.ua>
Subject:   Re: sio interrupt-level buffer overflows
Message-ID:  <200503301552.02472.jhb@FreeBSD.org>
In-Reply-To: <20050330155502.E16886@delplex.bde.org>
References:  <815955888.20050323113529@osk.com.ua> <1101884216.20050323181742@osk.com.ua> <20050330155502.E16886@delplex.bde.org>

On Wednesday 30 March 2005 01:06 am, Bruce Evans wrote:
> On Wed, 23 Mar 2005, Oleg Tarasov wrote:
> > About my panics. They persist and when this server panics it somehow
> > overloads my network so it stops functioning until reboot. This is
> > very, very bad.
> >
> > Maybe you could tell me where to write, or you could
> > personally tell me what I should do.
> >
> > Using all my theoretical skills, I managed to obtain this data
> > from my dump:
> >
> > (kgdb) backtrace
> > #0  doadump () at pcpu.h:159
> > #1  0xc060b063 in boot (howto=260) at
> > /usr/src/sys/kern/kern_shutdown.c:397
> > #2  0xc060b389 in panic (fmt=0xc080321d "spin lock held too long") at
> > /usr/src/sys/kern/kern_shutdown.c:553
> > #3  0xc060270c in _mtx_lock_spin (m=0xc08d7800, td=0xc19ca320, opts=0,
> >    file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:613
> > #4  0xc077c165 in siointr (arg=0xc1ab8800) at
> > /usr/src/sys/dev/sio/sio.c:1710
> > #5  0xc0790ead in intr_execute_handlers (isrc=0xc19b8890,
> > iframe=0xd541ac94) at /usr/src/sys/i386/i386/intr_machdep.c:203
> > #6  0xc07932be in lapic_handle_intr (frame=
> >      {if_vec = 52, if_fs = -717160424, if_es = -1067384816, if_ds = 16,
> > if_edi = -1046699232, if_esi = -1064591424, if_ebp = -717116188, if_ebx =
> > -1046425600, if_edx = -1064566184, if_ecx = 0, if_eax = -1046425600,
> > if_eip = -1067440569, if_cs = 8, if_eflags = 582, if_esp = -1045200000,
> > if_ss = 4})
> >    at /usr/src/sys/i386/i386/local_apic.c:490
> > #7  0xc078d753 in Xapic_isr1 () at apic_vector.s:110
> > #8  0x00000034 in ?? ()
> > #9  0xd5410018 in ?? ()
> > #10 0xc0610010 in coredump (td=0xc08b9fc0) at vnode_if.h:1244
> > #11 0xc05f6f46 in ithread_loop (arg=0xc1981c80)
> >    at /usr/src/sys/kern/kern_intr.c:546
> > #12 0xc05f6001 in fork_exit (callout=0xc05f6df8 <ithread_loop>,
> >    arg=0xc1981c80, frame=0xd541ad48) at /usr/src/sys/kern/kern_fork.c:811
> > #13 0xc078d3fc in fork_trampoline () at
> > /usr/src/sys/i386/i386/exception.s:209
> > ...
>
> I couldn't figure out the problem from this.  Your later mail says that
> the problem is caused by ppp not being MPSAFE, at least with sio, so I
> won't do much more with this stack trace, but I wonder about some of the
> strange entries in it:
>
> #13 - #11 are normal.
> #10 is weird.  ithread_loop() shouldn't call coredump().
> #8 - #9 seem to be more like stack garbage than module addresses.
> #7 is normal, but it looks like someone broke stack traces for interrupts,
>     giving the garbage in #8 - #10.

This is weird, as we do match on Xapic_isr as being an interrupt frame.  I'm
not sure why that didn't work correctly.

> #0 - #6 are normal if the spin lock is already held by the same CPU that
>     is handling the interrupt (except this can't happen :-).  I wouldn't
>     have thought that broken locking in ppp could cause this.

It's also normal if another CPU is holding the lock and keeps spinning with
it held for some reason.
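
To illustrate why both cases end in the same panic, here is a rough sketch
of a bounded spin-lock acquire.  It is not the code in kern_mutex.c; the
names (toy_spinlock, toy_spin_acquire, max_spins) are made up for this
example, and a "false" return stands in for the point where the kernel
would panic with "spin lock held too long".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel spin mutex; not FreeBSD's struct mtx. */
#define TOY_UNOWNED (-1)

struct toy_spinlock {
    atomic_int owner;   /* TOY_UNOWNED, or the id of the CPU holding the lock */
};

static void
toy_spin_init(struct toy_spinlock *l)
{
    atomic_init(&l->owner, TOY_UNOWNED);
}

/*
 * Try to take the lock for cpu_id, giving up after max_spins failed
 * attempts.  Returns true on success.  A false return models the point
 * where a kernel would give up and panic instead of spinning forever.
 */
static bool
toy_spin_acquire(struct toy_spinlock *l, int cpu_id, long max_spins)
{
    for (long i = 0; i < max_spins; i++) {
        int expected = TOY_UNOWNED;

        if (atomic_compare_exchange_strong(&l->owner, &expected, cpu_id))
            return true;
        /*
         * The lock is owned.  The owner may be another CPU, or it may be
         * cpu_id itself (e.g. an interrupt handler re-entering on the CPU
         * that already holds the lock).  The loop cannot tell the two
         * apart; it just keeps spinning until the limit is reached.
         */
    }
    return false;
}

static void
toy_spin_release(struct toy_spinlock *l, int cpu_id)
{
    /* Only the current owner may release the lock. */
    assert(atomic_load(&l->owner) == cpu_id);
    atomic_store(&l->owner, TOY_UNOWNED);
}
```

With CPU 0 holding the lock, toy_spin_acquire() fails after max_spins
attempts whether it is called for CPU 1 or again for CPU 0, so the timeout
by itself does not say which of the two situations occurred.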
>
> Bruce

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org


