From owner-freebsd-hackers Fri Nov 7 22:40:27 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id WAA19894
          for hackers-outgoing; Fri, 7 Nov 1997 22:40:27 -0800 (PST)
          (envelope-from owner-freebsd-hackers)
Received: from biggusdiskus.flyingfox.com (biggusdiskus.flyingfox.com [206.14.52.27])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id WAA19889
          for ; Fri, 7 Nov 1997 22:40:25 -0800 (PST)
          (envelope-from jas@flyingfox.com)
Received: (from jas@localhost)
          by biggusdiskus.flyingfox.com (8.8.5/8.8.5) id WAA02677
          for hackers@freebsd.org; Fri, 7 Nov 1997 22:39:57 -0800 (PST)
Date: Fri, 7 Nov 1997 22:39:57 -0800 (PST)
From: Jim Shankland
Message-Id: <199711080639.WAA02677@biggusdiskus.flyingfox.com>
To: hackers@freebsd.org
Subject: weird crashes in VM system -- advice sought
Sender: owner-freebsd-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Let me first say that I'm running a 2.2.2-RELEASE kernel with fairly
extensive local modifications to the networking stuff. I'm getting
crashes in the VM system, and there's a very good chance they're caused
by some change I've made, though I haven't monkeyed with the VM system
at all. So I'm not hoping for anyone to go find and fix this for me.
But if anyone has a word or two of advice on how to track this down, or
what kind of error might cause it, I'd sure appreciate it.

I have 4 dumps. In all of them, the system crashed while cleaning up
the VM of an exiting process. In one case, it was named (8.1.1) dying
after catching a SEGV, but the others were normal exits.

Two of the four crashes were a
panic("rlist_free: free start overlaps already freed area")
when trying to free up swap space: the same segment was apparently
getting freed twice.

The other two are a trap occurring in pmap_remove_pages, on line 2601
of pmap.c:

    pte = (unsigned *)vtopte(pv->pv_va);

It appears that the trap occurred just from trying to reference the
page table PTmap.
The value of pv->pv_va looks reasonable to me: 8192 in one case, 32768
in the other. The value of PTmap looks right, too: it's 0xefc00000.

Anyway, I fully expect to be digging at this until my fingers are
bloody, but if anyone has a 30-second insight that might save me a few
hours of digging, I'd sure like to hear it. I've appended stack traces
below.

Jim Shankland
Flying Fox Computer Systems, Inc.

Stack trace 1:

#0  boot (howto=256) at ../../kern/kern_shutdown.c:243
#1  0xf0111572 in panic (
    fmt=0xf01186af "rlist_free: free start overlaps already freed area")
    at ../../kern/kern_shutdown.c:367
#2  0xf01187da in rlist_free (rlh=0xf0204778, start=78432, end=78439)
    at ../../kern/subr_rlist.c:161
#3  0xf01ad58b in swap_pager_freeswapspace (object=0xf151a380, from=78432,
    to=78439) at ../../vm/swap_pager.c:409
#4  0xf01ad7f2 in swap_pager_free_swap (object=0xf151a380)
    at ../../vm/swap_pager.c:501
#5  0xf01add84 in swap_pager_dealloc (object=0xf151a380)
    at ../../vm/swap_pager.c:749
#6  0xf01b9126 in vm_pager_deallocate (object=0xf151a380)
    at ../../vm/vm_pager.c:177
#7  0xf01b4ec0 in vm_object_terminate (object=0xf151a380)
    at ../../vm/vm_object.c:416
#8  0xf01b4cfb in vm_object_deallocate (object=0xf151a380)
    at ../../vm/vm_object.c:353
#9  0xf01b30d8 in vm_map_entry_delete (map=0xf14bcb00, entry=0xf15cab40)
    at ../../vm/vm_map.c:1850
#10 0xf01b3254 in vm_map_delete (map=0xf14bcb00, start=0, end=4022329344)
    at ../../vm/vm_map.c:1951
#11 0xf01b32e4 in vm_map_remove (map=0xf14bcb00, start=0, end=4022329344)
    at ../../vm/vm_map.c:1976
#12 0xf010af90 in exit1 (p=0xf1473000, rv=139) at ../../kern/kern_exit.c:188
#13 0xf01127f6 in sigexit (p=0xf1473000, signum=11)
    at ../../kern/kern_sig.c:1218
#14 0xf01125da in postsig (signum=11) at ../../kern/kern_sig.c:1125
#15 0xf01c6538 in trap (frame={tf_es = 39, tf_ds = 39, tf_edi = 13859904,
      tf_esi = 45, tf_ebp = -272640376, tf_isp = -272629788, tf_ebx = 1240,
      tf_edx = 516096, tf_ecx = 0, tf_eax = 9983, tf_trapno = 12,
      tf_err = 1254, tf_eip = 117992, tf_cs = 31, tf_eflags = 66054,
      tf_esp = -272640392, tf_ss = 39}) at ../../i386/i386/trap.c:145
#16 0x1cce8 in ?? ()
Cannot access memory at address 0xefbfd68c.

Stack trace 2:

#0  boot (howto=256) at ../../kern/kern_shutdown.c:243
#1  0xf0111572 in panic (fmt=0xf01c60cf "page fault")
    at ../../kern/kern_shutdown.c:367
#2  0xf01c6c36 in trap_fatal (frame=0xefbffed8)
    at ../../i386/i386/trap.c:742
#3  0xf01c6724 in trap_pfault (frame=0xefbffed8, usermode=0)
    at ../../i386/i386/trap.c:653
#4  0xf01c63ff in trap (frame={tf_es = 16, tf_ds = 16, tf_edi = 0,
      tf_esi = -265431156, tf_ebp = -272629980, tf_isp = -272630016,
      tf_ebx = -265431156, tf_edx = 1073520933, tf_ecx = -272629752,
      tf_eax = 8, tf_trapno = 12, tf_err = 0, tf_eip = -266581769,
      tf_cs = 8, tf_eflags = 66066, tf_esp = -245748736,
      tf_ss = -247014400}) at ../../i386/i386/trap.c:311
#5  0xf01c48f7 in pmap_remove_pages (pmap=0xf15a2c64, sva=0, eva=4022329344)
    at ../../i386/i386/pmap.c:2601
#6  0xf010af83 in exit1 (p=0xf146dc00, rv=0) at ../../kern/kern_exit.c:186
#7  0xf010ae44 in exit (p=0xf146dc00, uap=0xefbfff94, retval=0xefbfff84)
    at ../../kern/kern_exit.c:106
#8  0xf01c6ecf in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 440268,
      tf_esi = 0, tf_ebp = -272638484, tf_isp = -272629788, tf_ebx = -1,
      tf_edx = 3, tf_ecx = 16, tf_eax = 1, tf_trapno = 7, tf_err = 7,
      tf_eip = 83053, tf_cs = 31, tf_eflags = 582, tf_esp = -272638500,
      tf_ss = 39}) at ../../i386/i386/trap.c:890
#9  0x1446d in ?? ()
Cannot access memory at address 0xefbfddf0.
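
A reading aid that is not part of the original message: kgdb prints the
trap frame fields as signed decimals, which hides the fact that they are
32-bit kernel addresses. The sketch below reinterprets such a value as
an unsigned 32-bit address and splits an i386 page-fault error code into
its three architectural bits. The helper names frame_to_addr and
decode_pf_err and the struct pf_info are hypothetical, chosen only for
this illustration. The PGEX_* names mirror the bit names used in the
FreeBSD i386 trap header, but treat them here as local definitions.

```c
#include <assert.h>
#include <stdint.h>

/* i386 page-fault error code bits (defined locally for this sketch). */
#define PGEX_P 0x01 /* set: protection violation; clear: page not present */
#define PGEX_W 0x02 /* set: write access; clear: read access */
#define PGEX_U 0x04 /* set: fault taken in user mode; clear: kernel mode */

/* Hypothetical container for the decoded error code. */
struct pf_info {
    int protection; /* 1 = protection violation, 0 = page not present */
    int write;      /* 1 = write, 0 = read */
    int user;       /* 1 = user mode, 0 = kernel mode */
};

/*
 * Reinterpret a decimal value as printed by kgdb (signed or unsigned)
 * as a 32-bit address. Conversion to uint32_t wraps modulo 2^32.
 */
static uint32_t
frame_to_addr(long long printed)
{
    assert(printed >= INT32_MIN && printed <= UINT32_MAX);
    return (uint32_t)printed;
}

/* Split a page-fault error code (tf_err when tf_trapno is 12). */
static struct pf_info
decode_pf_err(unsigned err)
{
    struct pf_info info;

    info.protection = (err & PGEX_P) != 0;
    info.write = (err & PGEX_W) != 0;
    info.user = (err & PGEX_U) != 0;
    return info;
}
```

Applied to the kernel page-fault frames in this message,
frame_to_addr(-266581769) gives 0xf01c48f7, the same address kgdb shows
for the pmap_remove_pages frame, and frame_to_addr(4022329344) gives
0xefbfe000 for the end/eva argument. decode_pf_err(0) reports a
kernel-mode read of a page that is not present.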

Stack trace 3:

#0  boot (howto=256) at ../../kern/kern_shutdown.c:243
#1  0xf0111572 in panic (fmt=0xf01c60cf "page fault")
    at ../../kern/kern_shutdown.c:367
#2  0xf01c6c36 in trap_fatal (frame=0xefbffed8)
    at ../../i386/i386/trap.c:742
#3  0xf01c6724 in trap_pfault (frame=0xefbffed8, usermode=0)
    at ../../i386/i386/trap.c:653
#4  0xf01c63ff in trap (frame={tf_es = 16, tf_ds = 16, tf_edi = 0,
      tf_esi = -265494360, tf_ebp = -272629980, tf_isp = -272630016,
      tf_ebx = -265494360, tf_edx = 134381568, tf_ecx = -272629728,
      tf_eax = 32, tf_trapno = 12, tf_err = 0, tf_eip = -266581769,
      tf_cs = 8, tf_eflags = 66066, tf_esp = -246243328,
      tf_ss = -247014400}) at ../../i386/i386/trap.c:311
#5  0xf01c48f7 in pmap_remove_pages (pmap=0xf152a064, sva=0, eva=4022329344)
    at ../../i386/i386/pmap.c:2601
#6  0xf010af83 in exit1 (p=0xf146dc00, rv=0) at ../../kern/kern_exit.c:186
#7  0xf010ae44 in exit (p=0xf146dc00, uap=0xefbfff94, retval=0xefbfff84)
    at ../../kern/kern_exit.c:106
#8  0xf01c6ecf in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 0,
      tf_esi = -1, tf_ebp = -272638768, tf_isp = -272629788,
      tf_ebx = 134754400, tf_edx = 0, tf_ecx = 134694524, tf_eax = 1,
      tf_trapno = 7, tf_err = 7, tf_eip = 134704429, tf_cs = 31,
      tf_eflags = 658, tf_esp = -272638788, tf_ss = 39})
    at ../../i386/i386/trap.c:890
#9  0x8076d2d in ?? ()
Cannot access memory at address 0xefbfdcd4.
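
For readers following the pmap_remove_pages traces, here is a sketch
(not from the original message) of the address arithmetic behind
vtopte(pv->pv_va). It assumes the usual i386 layout: 4 KB pages, 4-byte
page table entries, and PTmap treated as a flat array of PTEs indexed by
virtual page number. The base 0xefc00000 is the PTmap value quoted in
the message. The function name pte_addr and the constants are
hypothetical stand-ins for the kernel macro.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12           /* assumed: 4 KB pages */
#define PTE_SIZE   4u           /* assumed: 4-byte page table entries */
#define PTMAP_BASE 0xefc00000u  /* PTmap value quoted in the message */

/*
 * Address of the PTE that maps user virtual address va, assuming
 * PTmap is indexed by virtual page number.
 */
static uint32_t
pte_addr(uint32_t va)
{
    /* User addresses are expected to lie below the PTmap window. */
    assert(va < PTMAP_BASE);
    return PTMAP_BASE + (va >> PAGE_SHIFT) * PTE_SIZE;
}
```

Under those assumptions, pte_addr(8192) is 0xefc00008 and
pte_addr(32768) is 0xefc00020. The byte offsets 8 and 32 happen to match
the tf_eax values in the two page-fault frames, which is consistent with
this arithmetic but does not prove that the fault was on exactly those
addresses.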

Stack trace 4:

#0  boot (howto=256) at ../../kern/kern_shutdown.c:243
#1  0xf0111572 in panic (
    fmt=0xf01186af "rlist_free: free start overlaps already freed area")
    at ../../kern/kern_shutdown.c:367
#2  0xf01187da in rlist_free (rlh=0xf0204778, start=10784, end=10791)
    at ../../kern/subr_rlist.c:161
#3  0xf01ad58b in swap_pager_freeswapspace (object=0xf1535280, from=10784,
    to=10791) at ../../vm/swap_pager.c:409
#4  0xf01adc33 in swap_pager_copy (srcobject=0xf1535280, srcoffset=0,
    dstobject=0xf15a5e80, dstoffset=0, offset=0)
    at ../../vm/swap_pager.c:692
#5  0xf01b5a21 in vm_object_collapse (object=0xf15a5e80)
    at ../../vm/vm_object.c:1022
#6  0xf01b4c7a in vm_object_deallocate (object=0xf1524e80)
    at ../../vm/vm_object.c:307
#7  0xf01b30d8 in vm_map_entry_delete (map=0xf15bc500, entry=0xf152f6c0)
    at ../../vm/vm_map.c:1850
#8  0xf01b3254 in vm_map_delete (map=0xf15bc500, start=0, end=4022329344)
    at ../../vm/vm_map.c:1951
#9  0xf01b32e4 in vm_map_remove (map=0xf15bc500, start=0, end=4022329344)
    at ../../vm/vm_map.c:1976
#10 0xf010af90 in exit1 (p=0xf146d800, rv=256) at ../../kern/kern_exit.c:188
#11 0xf010ae44 in exit (p=0xf146d800, uap=0xefbfff94, retval=0xefbfff84)
    at ../../kern/kern_exit.c:106
#12 0xf01c6ecf in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 0,
      tf_esi = -1, tf_ebp = -272648200, tf_isp = -272629788,
      tf_ebx = 134848608, tf_edx = 0, tf_ecx = 134787612, tf_eax = 1,
      tf_trapno = 7, tf_err = 7, tf_eip = 134789629, tf_cs = 31,
      tf_eflags = 642, tf_esp = -272648220, tf_ss = 39})
    at ../../i386/i386/trap.c:890
#13 0x808b9fd in ?? ()
Cannot access memory at address 0xefbfb7fc.
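
To illustrate the kind of check behind the "free start overlaps already
freed area" panic, here is a deliberately simplified sketch that is not
part of the original message. It is not the kernel's rlist code: it
keeps free ranges in a fixed array, does no coalescing, and returns an
error where the kernel panics. All names (struct freelist,
freelist_free, MAX_RANGES) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANGES 64

/* One free range of swap blocks, both ends inclusive. */
struct range {
    unsigned start;
    unsigned end;
};

/* Simplified free list: an unsorted array of free ranges. */
struct freelist {
    struct range r[MAX_RANGES];
    size_t n;
};

/*
 * Mark [start, end] as free.
 * Returns 0 on success, -1 if any part of the range is already free
 * (the situation the kernel reports with a panic), -2 if the list is
 * full.
 */
static int
freelist_free(struct freelist *fl, unsigned start, unsigned end)
{
    size_t i;

    assert(start <= end);
    for (i = 0; i < fl->n; i++) {
        /* Two inclusive ranges overlap if each starts before the other ends. */
        if (start <= fl->r[i].end && fl->r[i].start <= end)
            return -1;
    }
    if (fl->n == MAX_RANGES)
        return -2;
    fl->r[fl->n].start = start;
    fl->r[fl->n].end = end;
    fl->n++;
    return 0;
}
```

With this sketch, freeing blocks 78432 through 78439 succeeds the first
time and returns -1 the second time, which mirrors the double free
suspected in the message. Note that the check only fires on the second
free, so the caller that shows up in the backtrace is not necessarily
the one at fault.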