Date:      Thu, 13 Apr 2006 07:55:02 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Konstantin Saurbier <saurbier@math.uni-bielefeld.de>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: FreeBSD 6.0 panics - sbdrop
Message-ID:  <20060413075019.U443@fledge.watson.org>
In-Reply-To: <20060411150759.20a9e9d5.saurbier@math.uni-bielefeld.de>
References:  <20060411150759.20a9e9d5.saurbier@math.uni-bielefeld.de>


On Tue, 11 Apr 2006, Konstantin Saurbier wrote:

> I've encountered a strange problem while using FreeBSD 6.0 for our local
> mirror (mirror.math.uni-bielefeld.de), which provides access via ftp,
> http, rsync and cvsup (all local and remote). The system crashes
> periodically with a kernel panic (panic: sbdrop). The uptime between two
> crashes ranges from a few hours to a few weeks.
>
> The system is an i386, Intel Pentium 4 based, with 512MB RAM and a
> 3ware-7000 (twe) RAID controller containing one RAID 5 set with approx.
> 1.9TB. The kernel is a GENERIC kernel with no changes to the config. These
> are the kernel dumps:

There have been one or more long-term bugs we've been attempting to track down
that result in socket buffer corruption discovered only on socket close (hence
in sbdrop() when we flush the cover).  We've had a lot of trouble tracking it
down, and it's not clear that it's actually a single bug, since the sbdrop()
panic is a sanity check that can detect a number of types of problems.  We
could, for example, be looking at a network interface driver bug.  There are
changes in progress in the 7.x branch to further clean up the socket code on
SMP, and they might fix some outstanding problems.  There are a couple of
instances of this bug report in the PR database, but if you could file the
below details, it would be helpful.  I'm on travel currently, but will take
another stab at this when back.
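
To illustrate why the panic is only a sanity check, here is a minimal
userspace sketch of the kind of loop sbdrop() runs.  It is not the kernel
source: toy_mbuf, toy_sockbuf and toy_sbdrop are hypothetical names, the
model ignores locking and mbuf freeing, and it returns -1 at the point where
the kernel would panic.  The point is that the check fires whenever the
caller asks to drop more bytes than the chain actually holds, whatever
caused the mismatch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct mbuf and struct sockbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;     /* next buffer in the same record */
    struct toy_mbuf *nextpkt;  /* first buffer of the next record */
    int len;                   /* bytes of data held in this buffer */
};

struct toy_sockbuf {
    struct toy_mbuf *head;     /* first buffer of the first record */
    int cc;                    /* byte count the socket buffer claims to hold */
};

/*
 * Drop len bytes from the front of sb.
 * Returns 0 on success, or -1 where a kernel would panic("sbdrop").
 */
int toy_sbdrop(struct toy_sockbuf *sb, int len)
{
    struct toy_mbuf *m = sb->head;
    struct toy_mbuf *next = (m != NULL) ? m->nextpkt : NULL;

    while (len > 0) {
        if (m == NULL) {
            if (next == NULL)
                return -1;      /* chain ran out before len bytes were found */
            m = next;           /* move on to the next record */
            next = m->nextpkt;
            continue;
        }
        if (m->len > len) {     /* partial drop inside this buffer */
            m->len -= len;
            sb->cc -= len;
            break;
        }
        len -= m->len;          /* whole buffer consumed */
        sb->cc -= m->len;
        m = m->next;
    }

    if (m != NULL) {
        m->nextpkt = next;      /* surviving buffer becomes the record head */
        sb->head = m;
    } else {
        sb->head = next;
    }
    return 0;
}
```

In this model, dropping 69 bytes from a chain holding 100 succeeds, while
asking for another 69 when only 31 remain hits the failure path with both
m and next null.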

Robert N M Watson

>
>
> Unread portion of the kernel message buffer:
> panic: sbdrop
> Uptime: 22h22m7s
> Dumping 503 MB (2 chunks)
>  chunk 0: 1MB (159 pages) ... ok
>  chunk 1: 503MB (128752 pages) 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7
>
> #0  doadump () at pcpu.h:165
> 165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
>
>
> (kgdb) backtrace
> #0  doadump () at pcpu.h:165
> #1  0xc068d10e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
> #2  0xc068d680 in panic (fmt=0xc090c16a "sbdrop")
>    at /usr/src/sys/kern/kern_shutdown.c:555
> #3  0xc06d266c in sbdrop_locked (sb=0xc20aca84, len=1)
>    at /usr/src/sys/kern/uipc_socket2.c:1157
> #4  0xc06d3d93 in sbdrop (sb=0xc20aca84, len=0)
>    at /usr/src/sys/kern/uipc_socket2.c:1208
> #5  0xc0748a7d in tcp_input (m=0xc1c09100, off0=-1039845124)
>    at /usr/src/sys/netinet/tcp_input.c:1201
> #6  0xc0740147 in ip_input (m=0xc1c09100)
>    at /usr/src/sys/netinet/ip_input.c:778
> #7  0xc07171ff in netisr_processqueue (ni=0xc09ca4f8)
>    at /usr/src/sys/net/netisr.c:236
> #8  0xc07174be in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
> #9  0xc06740e5 in ithread_loop (arg=0xc19c3280)
>    at /usr/src/sys/kern/kern_intr.c:547
> #10 0xc0673110 in fork_exit (callout=0xc067402c <ithread_loop>, arg=0x0,
>    frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
> #11 0xc0894a1c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
>
>
>
> Now the output of bt full:
>
> (kgdb) bt full
> #0  doadump () at pcpu.h:165
> No locals.
> #1  0xc068d10e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
>        first_buf_printf = 1
> #2  0xc068d680 in panic (fmt=0xc090c16a "sbdrop")
>    at /usr/src/sys/kern/kern_shutdown.c:555
>        bootopt = 260
>        newpanic = 0
>        buf = "sbdrop", '\0' <repeats 249 times>
> #3  0xc06d266c in sbdrop_locked (sb=0xc20aca84, len=1)
>    at /usr/src/sys/kern/uipc_socket2.c:1157
>        m = (struct mbuf *) 0x0
>        next = (struct mbuf *) 0x0
> #4  0xc06d3d93 in sbdrop (sb=0xc20aca84, len=0)
>    at /usr/src/sys/kern/uipc_socket2.c:1208
> No locals.
> #5  0xc0748a7d in tcp_input (m=0xc1c09100, off0=-1039845124)
>    at /usr/src/sys/netinet/tcp_input.c:1201
>        dbuf = "\024\000\000\000\000\273\300\301?\033\b\324\344\254\211\300X\272\231\301\254\033\b\324x\034\b\324\303M\211\300G\000\000\000\b\000\000\000(\000\b\324(\000l\300"
>        sbuf = "\0003\234\301l\033\b\324\200\362\255\301\0003\234\301\224\033\b\324\277\201\211\300\0003\234\301\000\000\000\000\000\000\000\000\000\271\226\301\027\000\000\000\020(\335\301"
>        th = (struct tcphdr *) 0xc1b2f824
>        ip = (struct ip *) 0xc1b2f810
>        inp = (struct inpcb *) 0xc3ec5ca8
>        optp = (u_char *) 0xc1b2f838 "\001\001\b\n:\r\267\027\004\314\236\362#E\262W"
>        optlen = 12
>        len = 69
>        tlen = 0
>        off = 32
>        drop_hdrlen = 52
>        tp = (struct tcpcb *) 0xc20538fc
>        thflags = 16
>        so = (struct socket *) 0xc20ac9bc
>        todrop = 69
>        acked = 69
>        ourfinisacked = 0
>        needoutput = 0
>        tiwin = 5840
>        to = {to_flags = 1, to_tsval = 973977367, to_tsecr = 80518898,
>  to_mss = 0, to_requested_s_scale = 0 '\0', to_nsacks = 0 '\0',
>  to_sacks = 0x0}
>        headlocked = 0
>        rstreason = 69
>        ip6 = (struct ip6_hdr *) 0x0
>        isipv6 = 0
> #6  0xc0740147 in ip_input (m=0xc1c09100)
>    at /usr/src/sys/netinet/ip_input.c:778
>        ip = (struct ip *) 0xc1b2f810
>        ia = (struct in_ifaddr *) 0xc1c0bb00
>        ifa = (struct ifaddr *) 0xc1c0bb00
>        checkif = 0
>        hlen = 20
>        sum = 0
>        dchg = 0
>        odst = {s_addr = 3250633472}
> #7  0xc07171ff in netisr_processqueue (ni=0xc09ca4f8)
>    at /usr/src/sys/net/netisr.c:236
>        m = (struct mbuf *) 0xc1c09100
> #8  0xc07174be in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
>        ni = (struct netisr *) 0xc09ca4f8
>        bits = 0
>        i = 0
> #9  0xc06740e5 in ithread_loop (arg=0xc19c3280)
>    at /usr/src/sys/kern/kern_intr.c:547
>        ih = (struct intrhand *) 0xc19c1080
>        p = (struct proc *) 0xc19b0624
>        count = 0
>        warned = 0
> #10 0xc0673110 in fork_exit (callout=0xc067402c <ithread_loop>, arg=0x0,
>    frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
>        p = (struct proc *) 0xc19b0624
> #11 0xc0894a1c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
> No locals.
>
>
> I hope that helps. If you need further information or if you have some
> hints or directions for me, please send me a mail.
> --
>
> Best regards,
>
> Konstantin Saurbier
>
> ------------------------------------------------------
> Konstantin Saurbier                Tel.: 0521 106 3861
> Computerlabor Mathematik                        U5-138
> Universitaet Bielefeld             Universitaetsstr.25
> 33501 Bielefeld
> email:                  saurbier@math.uni-bielefeld.de
> ------------------------------------------------------
>
>


