Date: Tue, 11 Aug 1998 19:54:49 +0200
From: Stefan Bethke
To: Thomas Gellekum, "Jordan K. Hubbard"
cc: freebsd-stable@FreeBSD.ORG
Subject: Re: Huge Bug in FreeBSD not fixed?
Message-ID: <1682190.3111854089@d254.promo.de>
In-Reply-To: <87yasvsqfv.fsf@ghpc6.ihf.rwth-aachen.de>

On Tue, 11 Aug 1998 13:33 +0200, Thomas Gellekum wrote:

> I have run this program five times and it finished once.
> The other four occasions I got
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x18
> fault code              = supervisor write, page not present
> instruction pointer     = 0x8:0xf0126d21
> stack pointer           = 0x10:0xefbffe50
> frame pointer           = 0x10:0xefbffe74
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 395 (crashbsd)
> interrupt mask          =
> kernel: type 12 trap, code=0
> Stopped at      _sosend+0x391:  movl    $0, 0x18(%ebx)
>
> After saving the core dump and recompiling a few object files with -g:
> #9  0xf01c0a37 in trap (frame={tf_es = -2147483632, tf_ds = -272695280,
>       tf_edi = -272630136, tf_esi = -2147483648, tf_ebp = -272630156,
>       tf_isp = -272630212, tf_ebx = 0, tf_edx = 2147483647,
>       tf_ecx = -1073277766, tf_eax = 0, tf_trapno = 12, tf_err = 2,
>       tf_eip = -267227871, tf_cs = 8, tf_eflags = 66198, tf_esp = 0,
>       tf_ss = 1}) at ../../i386/i386/trap.c:324
> #10 0xf0126d21 in sosend (so=0xf0937f00, addr=0x0, uio=0xefbffeb0,
>       top=0x0, control=0xf06fff00, flags=0) at ../../kern/uipc_socket.c:432

Looking at kern/uipc_socket.c:sosend(), one can easily spot the problem
(which IIRC even Stevens mentions?): sosend() uses MGETHDR() to get a
fresh mbuf and expects it to always succeed.

Looking through MGETHDR() (in sys/mbuf.h) and m_mballoc() and
m_retryhdr() (in kern/uipc_mbuf.c), the following can happen: The free
list mmbfree is empty. MGETHDR() calls m_mballoc(), which in turn calls
kmem_malloc(). kmem_malloc() fails because the map mb_map is full (this
is where the message is logged) and returns NULL. MGETHDR() then calls
m_retryhdr(). m_retryhdr() tries to get mbufs back from the protocols by
calling m_reclaim(). If no mbufs can be recovered this way, m_retryhdr()
returns NULL.

Because sosend() expects MGETHDR(m, M_WAIT, MT_DATA) to always succeed,
it page-faults while trying to write to the mbuf that was never
allocated (m->m_pkthdr.len at 0+0x18, which matches the fault virtual
address 0x18 and tf_ebx = 0 in the trap frame above).
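To make the chain of fallbacks easier to follow, here is a rough
userspace model of it. This is only a sketch and not the kernel code:
every name starting with model_ or MODEL_ is made up for illustration,
the fixed-size array merely stands in for mb_map, and the "reclaimable"
list stands in for whatever m_reclaim() can get back from the
protocols.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAP_CAPACITY 4            /* hypothetical size of the "map" */

struct model_mbuf {
    struct model_mbuf *next;            /* free-list link */
    int pkthdr_len;                     /* stands in for m->m_pkthdr.len */
};

struct model_pool {
    struct model_mbuf storage[MODEL_MAP_CAPACITY]; /* stands in for mb_map */
    size_t used;                        /* slots already carved from storage */
    struct model_mbuf *free_list;       /* stands in for mmbfree */
    struct model_mbuf *reclaimable;     /* mbufs the "protocols" can give up */
};

/* Detach and return the head of a list, or NULL if the list is empty. */
static struct model_mbuf *model_pop(struct model_mbuf **list)
{
    struct model_mbuf *m = *list;
    if (m != NULL) {
        *list = m->next;
        m->next = NULL;
    }
    return m;
}

/* Rough analogue of m_mballoc(): grow the free list from the map.
 * Returns 0 when the map is full, 1 otherwise. */
static int model_mballoc(struct model_pool *p)
{
    if (p->used >= MODEL_MAP_CAPACITY)
        return 0;
    struct model_mbuf *m = &p->storage[p->used++];
    m->next = p->free_list;
    p->free_list = m;
    return 1;
}

/* Rough analogue of m_reclaim(): move everything reclaimable to the
 * free list. May recover nothing at all. */
static void model_reclaim(struct model_pool *p)
{
    struct model_mbuf *m;
    while ((m = model_pop(&p->reclaimable)) != NULL) {
        m->next = p->free_list;
        p->free_list = m;
    }
}

/* Rough analogue of MGETHDR() followed by m_retryhdr(): try the free
 * list, then the map, then reclaim. Returns NULL if all three fail. */
struct model_mbuf *model_gethdr(struct model_pool *p)
{
    assert(p != NULL);
    if (p->free_list == NULL)
        (void)model_mballoc(p);
    if (p->free_list == NULL)
        model_reclaim(p);
    return model_pop(&p->free_list);
}
```

With a pool whose used count equals MODEL_MAP_CAPACITY and whose
reclaimable list is empty, model_gethdr() returns NULL. A caller that
then writes to m->pkthdr_len without testing m reproduces the same kind
of NULL-plus-offset write seen in the trap.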

As a workaround, you can try increasing the number of mbufs; however,
this only makes the failure less likely to occur. The solution would be
either to make MGET() and MGETHDR() always succeed (or sleep
indefinitely), or to check the result of every one of those calls (as
many callers already do). This is in both -stable and -current. A patch
might be trivial for someone who understands sosend() fully; I currently
don't :-(

> Anything else I can do?

Fix the bug :-) ?

Stefan

--
Hamburg | Voice:  +49-177-3504009
Germany | e-mail: stb@freebsd.org
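
P.S. For illustration only, the two remedies mentioned above (check the
result, or wait until an mbuf becomes available) might look roughly
like this in a userspace toy. This is not a patch for sosend(): all
model_ names are hypothetical, the allocator owns a single mbuf, and
returning ENOBUFS on failure is my assumption about what a checked
caller would do.

```c
#include <errno.h>
#include <stddef.h>

struct model_mbuf {
    int pkthdr_len;             /* stands in for m->m_pkthdr.len */
};

/* Toy allocator state: one mbuf that is either available or not. */
struct model_ctx {
    struct model_mbuf slot;     /* the only mbuf this allocator owns */
    int available;              /* nonzero when slot may be handed out */
    int waits;                  /* how often a caller had to wait */
};

/* Allocation that can fail, like the path described in this mail. */
static struct model_mbuf *model_alloc(struct model_ctx *c)
{
    if (!c->available)
        return NULL;
    c->available = 0;
    return &c->slot;
}

/* Stand-in for sleeping until someone frees an mbuf. Here we simply
 * pretend that an mbuf is freed during the second wait. */
static void model_wait(struct model_ctx *c)
{
    c->waits++;
    if (c->waits >= 2)
        c->available = 1;
}

/* Remedy 1: check the result and report an error instead of writing
 * through a NULL pointer. */
int model_send_checked(struct model_ctx *c)
{
    struct model_mbuf *m = model_alloc(c);
    if (m == NULL)
        return ENOBUFS;         /* assumed error code for this sketch */
    m->pkthdr_len = 0;          /* safe: m is known to be valid here */
    return 0;
}

/* Remedy 2: an allocation wrapper that never returns NULL because it
 * waits and retries until an mbuf is available. */
struct model_mbuf *model_alloc_wait(struct model_ctx *c)
{
    struct model_mbuf *m;
    while ((m = model_alloc(c)) == NULL)
        model_wait(c);
    return m;
}
```

The first variant pushes the failure back to the caller; the second
hides it but can block for as long as no mbuf is freed, which is the
trade-off between the two options.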