Date: Thu, 13 Apr 2006 07:55:02 +0100 (BST) From: Robert Watson <rwatson@FreeBSD.org> To: Konstantin Saurbier <saurbier@math.uni-bielefeld.de> Cc: freebsd-stable@freebsd.org Subject: Re: FreeBSD 6.0 panics - sbdrop Message-ID: <20060413075019.U443@fledge.watson.org> In-Reply-To: <20060411150759.20a9e9d5.saurbier@math.uni-bielefeld.de> References: <20060411150759.20a9e9d5.saurbier@math.uni-bielefeld.de>
On Tue, 11 Apr 2006, Konstantin Saurbier wrote:

> I've encountered a strange problem while using FreeBSD 6.0 for our local
> mirror (mirror.math.uni-bielefeld.de), which provides access via ftp,
> http, rsync and cvsup (all local and remote). The system crashes
> periodically with a kernel panic (panic: sbdrop). The uptimes between two
> crashes range from a few hours to a few weeks.
>
> The system is an i386, Intel Pentium 4 based with 512MB ram and a 3ware-7000
> (twe) raid controller containing 1 raid 5 set with approx. 1.9TB. The kernel
> is a GENERIC kernel without changes of the config. These are the kernel
> dumps:

There have been one or more long-term bugs we've been attempting to track down
that result in socket buffer corruption discovered only on socket close (hence
in sbdrop() when we flush the cover).  We've had a lot of trouble tracking it
down, and it's not clear that it's actually a single bug, since the sbdrop()
panic is a sanity check that can detect a number of types of problems.  We
could, for example, be looking at a network interface driver bug.  There are
changes in progress in the 7.x branch to further clean up the socket code on
SMP, and they might fix some outstanding problems.  There are a couple of
instances of this bug report in the PR database, but if you could file the
below details, it would be helpful.  I'm on travel currently, but will take
another stab at this when back.

Robert N M Watson

>
>
> Unread portion of the kernel message buffer:
> panic: sbdrop
> Uptime: 22h22m7s
> Dumping 503 MB (2 chunks)
>   chunk 0: 1MB (159 pages) ... ok
>   chunk 1: 503MB (128752 pages) 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7
>
> #0  doadump () at pcpu.h:165
> 165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
>
>
> (kgdb) backtrace
> #0  doadump () at pcpu.h:165
> #1  0xc068d10e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
> #2  0xc068d680 in panic (fmt=0xc090c16a "sbdrop")
>     at /usr/src/sys/kern/kern_shutdown.c:555
> #3  0xc06d266c in sbdrop_locked (sb=0xc20aca84, len=1)
>     at /usr/src/sys/kern/uipc_socket2.c:1157
> #4  0xc06d3d93 in sbdrop (sb=0xc20aca84, len=0)
>     at /usr/src/sys/kern/uipc_socket2.c:1208
> #5  0xc0748a7d in tcp_input (m=0xc1c09100, off0=-1039845124)
>     at /usr/src/sys/netinet/tcp_input.c:1201
> #6  0xc0740147 in ip_input (m=0xc1c09100)
>     at /usr/src/sys/netinet/ip_input.c:778
> #7  0xc07171ff in netisr_processqueue (ni=0xc09ca4f8)
>     at /usr/src/sys/net/netisr.c:236
> #8  0xc07174be in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
> #9  0xc06740e5 in ithread_loop (arg=0xc19c3280)
>     at /usr/src/sys/kern/kern_intr.c:547
> #10 0xc0673110 in fork_exit (callout=0xc067402c <ithread_loop>, arg=0x0,
>     frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
> #11 0xc0894a1c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
>
>
>
> Now the output of bt full:
>
> (kgdb) bt full
> #0  doadump () at pcpu.h:165
> No locals.
> #1  0xc068d10e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
>         first_buf_printf = 1
> #2  0xc068d680 in panic (fmt=0xc090c16a "sbdrop")
>     at /usr/src/sys/kern/kern_shutdown.c:555
>         bootopt = 260
>         newpanic = 0
>         buf = "sbdrop", '\0' <repeats 249 times>
> #3  0xc06d266c in sbdrop_locked (sb=0xc20aca84, len=1)
>     at /usr/src/sys/kern/uipc_socket2.c:1157
>         m = (struct mbuf *) 0x0
>         next = (struct mbuf *) 0x0
> #4  0xc06d3d93 in sbdrop (sb=0xc20aca84, len=0)
>     at /usr/src/sys/kern/uipc_socket2.c:1208
> No locals.
> #5  0xc0748a7d in tcp_input (m=0xc1c09100, off0=-1039845124)
>     at /usr/src/sys/netinet/tcp_input.c:1201
>         dbuf = "\024\000\000\000\000»ÀÁ?\033\bÔä¬\211ÀXº\231Á¬\033\bÔx\034\bÔÃM\211ÀG\000\000\000\b\000\000\000(\000\bÔ(\000lÀ"
>         sbuf = "\0003\234Ál\033\bÔ\200ò\255Á\0003\234Á\224\033\bÔ¿\201\211À\0003\234Á\000\000\000\000\000\000\000\000\000¹\226Á\027\000\000\000\020(ÝÁ"
>         th = (struct tcphdr *) 0xc1b2f824
>         ip = (struct ip *) 0xc1b2f810
>         inp = (struct inpcb *) 0xc3ec5ca8
>         optp = (u_char *) 0xc1b2f838 "\001\001\b\n:\r·\027\004Ì\236ò#E²W"
>         optlen = 12
>         len = 69
>         tlen = 0
>         off = 32
>         drop_hdrlen = 52
>         tp = (struct tcpcb *) 0xc20538fc
>         thflags = 16
>         so = (struct socket *) 0xc20ac9bc
>         todrop = 69
>         acked = 69
>         ourfinisacked = 0
>         needoutput = 0
>         tiwin = 5840
>         to = {to_flags = 1, to_tsval = 973977367, to_tsecr = 80518898,
>   to_mss = 0, to_requested_s_scale = 0 '\0', to_nsacks = 0 '\0',
>   to_sacks = 0x0}
>         headlocked = 0
>         rstreason = 69
>         ip6 = (struct ip6_hdr *) 0x0
>         isipv6 = 0
> #6  0xc0740147 in ip_input (m=0xc1c09100)
>     at /usr/src/sys/netinet/ip_input.c:778
>         ip = (struct ip *) 0xc1b2f810
>         ia = (struct in_ifaddr *) 0xc1c0bb00
>         ifa = (struct ifaddr *) 0xc1c0bb00
>         checkif = 0
>         hlen = 20
>         sum = 0
>         dchg = 0
>         odst = {s_addr = 3250633472}
> #7  0xc07171ff in netisr_processqueue (ni=0xc09ca4f8)
>     at /usr/src/sys/net/netisr.c:236
>         m = (struct mbuf *) 0xc1c09100
> #8  0xc07174be in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
>         ni = (struct netisr *) 0xc09ca4f8
>         bits = 0
>         i = 0
> #9  0xc06740e5 in ithread_loop (arg=0xc19c3280)
>     at /usr/src/sys/kern/kern_intr.c:547
>         ih = (struct intrhand *) 0xc19c1080
>         p = (struct proc *) 0xc19b0624
>         count = 0
>         warned = 0
> #10 0xc0673110 in fork_exit (callout=0xc067402c <ithread_loop>, arg=0x0,
>     frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
>         p = (struct proc *) 0xc19b0624
> #11 0xc0894a1c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
> No locals.
>
>
> I hope that helps. If you need further information or if you have some hints
> or directions for me, please send me a mail.
> --
>
> Best regards,
>
> Konstantin Saurbier
>
> ------------------------------------------------------
> Konstantin Saurbier            Tel.: 0521 106 3861
> Computerlabor Mathematik       U5-138
> Universitaet Bielefeld         Universitaetsstr.25
>                                33501 Bielefeld
> email: saurbier@math.uni-bielefeld.de
> ------------------------------------------------------
>
>
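The reply describes the sbdrop() panic as a sanity check that can fire for several different underlying problems. The following is a minimal userspace sketch in C of how such a check can work, not the kernel's code: `toy_mbuf`, `toy_sockbuf` and `toy_sbdrop` are hypothetical stand-ins invented for illustration. The drop loop walks the buffer chain and reports failure if the chain runs out while bytes are still owed, which is the kind of state frame #3 shows (m and next both NULL with len still positive). The real sbdrop_locked() in uipc_socket2.c additionally deals with packet boundaries and locking, and panics instead of returning an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's mbuf and sockbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;   /* next buffer in the chain */
    int len;                 /* bytes held in this buffer */
};

struct toy_sockbuf {
    struct toy_mbuf *head;   /* first buffer in the chain */
    int cc;                  /* byte count the buffer claims to hold */
};

/*
 * Drop len bytes from the front of sb.
 * Returns 0 on success, or -1 in the case where the kernel would
 * panic("sbdrop"): the chain ran out while bytes were still owed.
 */
int toy_sbdrop(struct toy_sockbuf *sb, int len)
{
    struct toy_mbuf *m = sb->head;

    while (len > 0) {
        if (m == NULL)
            return -1;          /* accounting and chain disagree */
        if (m->len > len) {
            m->len -= len;      /* partial drop from this buffer */
            sb->cc -= len;
            break;
        }
        len -= m->len;          /* consume the whole buffer */
        sb->cc -= m->len;
        m = m->next;
    }
    sb->head = m;
    return 0;
}
```

With a two-buffer chain of 40 and 30 bytes, dropping 69 bytes leaves one byte in the second buffer. Asking for even one byte from an empty chain returns -1, the case where the kernel would panic. The check fires only when a drop is attempted, so corruption introduced earlier by some other code path is detected late, which fits the reply's point that the panic alone does not identify the culprit.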
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060413075019.U443>