From owner-freebsd-bugs@FreeBSD.ORG Mon Nov 28 09:50:04 2005 Return-Path: X-Original-To: freebsd-bugs@hub.freebsd.org Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1A96C16A41F for ; Mon, 28 Nov 2005 09:50:04 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id DB52043D49 for ; Mon, 28 Nov 2005 09:50:02 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.3/8.13.3) with ESMTP id jAS9o2jA078009 for ; Mon, 28 Nov 2005 09:50:02 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.3/8.13.1/Submit) id jAS9o27a078004; Mon, 28 Nov 2005 09:50:02 GMT (envelope-from gnats) Resent-Date: Mon, 28 Nov 2005 09:50:02 GMT Resent-Message-Id: <200511280950.jAS9o27a078004@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-bugs@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Stijn Hoop Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D367816A420; Mon, 28 Nov 2005 09:41:15 +0000 (GMT) (envelope-from stijn@sandcat.nl) Received: from kweetal.tue.nl (kweetal.tue.nl [131.155.3.6]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1AD8B43D73; Mon, 28 Nov 2005 09:41:13 +0000 (GMT) (envelope-from stijn@sandcat.nl) Received: from localhost (localhost [127.0.0.1]) by kweetal.tue.nl (Postfix) with ESMTP id 0038413B94B; Mon, 28 Nov 2005 10:41:11 +0100 (CET) Received: from kweetal.tue.nl ([127.0.0.1]) by localhost (kweetal.tue.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 60654-01; Mon, 28 Nov 2005 10:41:11 +0100 (CET) Received: from umta.win.tue.nl (umta.win.tue.nl [131.155.71.100]) by kweetal.tue.nl (Postfix) with ESMTP id 25BD713B943; Mon, 28 Nov 2005 10:41:11 +0100 (CET) Received: from sandcat.nl (pcwin165 [131.155.69.121]) by umta.win.tue.nl (Postfix) with ESMTP id 20D9231401C; Mon, 28 Nov 2005 10:41:11 +0100 (CET) Received: by sandcat.nl (Postfix, from userid 1001) id 3D52DAC823; Mon, 28 Nov 2005 10:43:38 +0100 (CET) Message-Id: <20051128094338.3D52DAC823@sandcat.nl> Date: Mon, 28 Nov 2005 10:43:38 +0100 (CET) From: Stijn Hoop To: FreeBSD-gnats-submit@FreeBSD.org X-Send-Pr-Version: 3.113 Cc: Lukas Ertl Subject: kern/89660: panic due to g_malloc returning null in gv_drive_done X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Stijn Hoop List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2005 09:50:04 -0000 >Number: 89660 >Category: kern >Synopsis: panic due to g_malloc returning null in gv_drive_done >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Mon Nov 28 09:50:02 GMT 2005 >Closed-Date: >Last-Modified: >Originator: Stijn Hoop >Release: FreeBSD 6.0-RELEASE i386 >Organization: >Environment: System: FreeBSD 6.0-RELEASE #1: Sun Nov 27 14:48:26 CET 2005 stijn@pcwin002.win.tue.nl:/net/freebsd/6.0-SECURITY/obj/net/freebsd/6.0-SECURITY/src/sys/SANDCAT i386 >Description: - This machine panics every night on the daily maintenance time, with the following backtrace (hand transcribed because of a lack of serial cable): Fatal trap 12: page fault while in kernel mode fault virtual address = 0x0 fault code = supervisor write, page not present instruction pointer = 0x20:0xc0711a41 stack pointer = 0x28:0xe3245ccc frame pointer = 0x28:0xe3245cd8 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 3 (g_up) [thread pid 3 tid 100021 ] Stopped at gv_drive_done+0x29: movl %ebx,0(%eax) db> bt Tracing pid 3 tid 100021 td 0xc1e8f300 gv_drive_done(c6bcc630) at gv_drive_done+0x29 biodone(c6bcc630) at biodone+0x8b g_io_schedule_up(c1e8f300) at g_io_schedule_up+0x86 g_up_procbody(0,e3245d38) at g_up_procbody+0x6e fork_exit(c0482c04,0,e3245d38) at fork_exit+0x70 fork_trampoline() at fork_trampoline+0x8 --- trap 0x1, eip = 0, esp = 0xe3245d6c, ebp = 0 --- db> Luckily I was able to obtain a crash dump, which yielded the following information (after loading the geom_vinum symbols from a debug module): #0 doadump () at pcpu.h:165 #1 0xc0431953 in db_fncall (dummy1=-1066867712, dummy2=0, dummy3=-1067549157, dummy4=0xe3245af8 "$[$ã\210e^À\020[$ã\024[$ã\220\a") at /net/freebsd/6.0-SECURITY/src/sys/ddb/db_command.c:492 #2 0xc0431758 in db_command (last_cmdp=0xc0667c04, cmd_table=0x0, aux_cmd_tablep=0xc0638d58, aux_cmd_tablep_end=0xc0638d5c) at /net/freebsd/6.0-SECURITY/src/sys/ddb/db_command.c:350 #3 0xc0431820 in db_command_loop () at /net/freebsd/6.0-SECURITY/src/sys/ddb/db_command.c:458 #4 0xc043342d in db_trap (type=12, code=0) at /net/freebsd/6.0-SECURITY/src/sys/ddb/db_main.c:221 #5 0xc04d38ff in kdb_trap (type=12, code=0, tf=0xe3245c8c) at /net/freebsd/6.0-SECURITY/src/sys/kern/subr_kdb.c:473 #6 0xc05fea4c in trap_fatal (frame=0xe3245c8c, eva=0) at /net/freebsd/6.0-SECURITY/src/sys/i386/i386/trap.c:822 #7 0xc05fe7bb in trap_pfault (frame=0xe3245c8c, usermode=0, eva=0) at /net/freebsd/6.0-SECURITY/src/sys/i386/i386/trap.c:742 #8 0xc05fe3d5 in trap (frame= {tf_fs = 8, tf_es = 40, tf_ds = -484179928, tf_edi = 0, tf_esi = -1038670080, tf_ebp = -484156200, tf_isp = -484156232, tf_ebx = -960707024, tf_edx = 0, tf_ecx = -1041698048, tf_eax = 0, tf_trapno = 12, tf_err = 2, tf_eip = -1066329535, tf_cs = 32, tf_eflags = 590470, tf_esp = -1066329576, tf_ss = -960707024}) at /net/freebsd/6.0-SECURITY/src/sys/i386/i386/trap.c:432 #9 0xc05f0f8a in calltrap () at /net/freebsd/6.0-SECURITY/src/sys/i386/i386/exception.s:139 #10 0xc0711a41 in gv_drive_done (bp=0xc6bcc630) at geom.h:290 #11 0xc05038a7 in biodone (bp=0xc6bcc630) at /net/freebsd/6.0-SECURITY/src/sys/kern/vfs_bio.c:2893 #12 0xc0482a26 in g_io_schedule_up (tp=0xc1e8f300) at /net/freebsd/6.0-SECURITY/src/sys/geom/geom_io.c:474 #13 0xc0482c72 in g_up_procbody () at /net/freebsd/6.0-SECURITY/src/sys/geom/geom_kern.c:95 #14 0xc04a5820 in fork_exit (callout=0xc0482c04 , arg=0x0, frame=0xe3245d38) at /net/freebsd/6.0-SECURITY/src/sys/kern/kern_fork.c:789 #15 0xc05f0fec in fork_trampoline () at /net/freebsd/6.0-SECURITY/src/sys/i386/i386/exception.s:208 (kgdb) frame 10 #10 0xc0711a41 in gv_drive_done (bp=0xc6bcc630) at geom.h:290 290 geom.h: No such file or directory. in geom.h (kgdb) print/x $eax $3 = 0x0 (kgdb) disassemble gv_drive_done+0x41 Dump of assembler code for function gv_drive_done: 0xc0711a18 : push %ebp 0xc0711a19 : mov %esp,%ebp 0xc0711a1b : push %edi 0xc0711a1c : push %esi 0xc0711a1d : push %ebx 0xc0711a1e : mov 0x8(%ebp),%ebx 0xc0711a21 : mov 0x44(%ebx),%eax 0xc0711a24 : mov (%eax),%eax 0xc0711a26 : mov 0x3c(%eax),%esi 0xc0711a29 : orb $0x1,0x2(%ebx) 0xc0711a2d : push $0x101 0xc0711a32 : push $0xc0643ca0 0xc0711a37 : push $0xc 0xc0711a39 : call 0xc04b0a30 0xc0711a3e : add $0xc,%esp 0xc0711a41 : mov %ebx,(%eax) 0xc0711a43 : push $0xf6 0xc0711a48 : push $0xc071a752 0xc0711a4d : lea 0x7c(%esi),%ebx which on my system corresponds to %%% static void gv_drive_done(struct bio *bp) { struct gv_drive *d; struct gv_bioq *bq; /* Put the BIO on the worker queue again. */ d = bp->bio_from->geom->softc; bp->bio_cflags |= GV_BIO_DONE; bq = g_malloc(sizeof(*bq), M_NOWAIT | M_ZERO); <--- g_malloc returns NULL bq->bp = bp; mtx_lock(&d->bqueue_mtx); TAILQ_INSERT_TAIL(&d->bqueue, bq, queue); wakeup(d); mtx_unlock(&d->bqueue_mtx); } %%% Now, I don't know if this routine is run in an interrupt context, so I really wouldn't know if it is possible to remove the M_NOWAIT. Any thoughts? >How-To-Repeat: I wish I knew; I've tried running all of 'periodic daily', the backup routines (normally run at 4:00 AM), generating disk load using dd if={gvinum drive} of=/dev/null, all at the same time, and the box just copes (albeit slowly). However this is the 5th day in a row that it crashes during daily maintenance, so something else must also be triggering it. I'll update the PR if I can reproduce it more easily but for now it's "only" once a day... >Fix: Would love to have one. >Release-Note: >Audit-Trail: >Unformatted: