From owner-freebsd-hackers  Wed Oct  6 17:23:52 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from dt011n66.san.rr.com (dt011n66.san.rr.com [204.210.13.102])
	by hub.freebsd.org (Postfix) with ESMTP id BFE7414CA8
	for <freebsd-hackers@FreeBSD.ORG>; Wed,  6 Oct 1999 17:23:34 -0700 (PDT)
	(envelope-from Doug@gorean.org)
Received: from localhost (doug@localhost)
	by dt011n66.san.rr.com (8.9.3/8.8.8) with ESMTP id RAA57809;
	Wed, 6 Oct 1999 17:23:32 -0700 (PDT)
	(envelope-from Doug@gorean.org)
Date: Wed, 6 Oct 1999 17:23:32 -0700 (PDT)
From: Doug <Doug@gorean.org>
X-Sender: doug@dt011n66.san.rr.com
To: freebsd-hackers@FreeBSD.ORG
Cc: Luoqi Chen <luoqi@watermarkgroup.com>
Subject: New crash, NFS/multiple frees related. (Was: zalloci/pv_entry problem)
In-Reply-To: <Pine.BSF.4.10.9910051041320.43233-100000@dt011n66.san.rr.com>
Message-ID: <Pine.BSF.4.10.9910061650400.57739-100000@dt011n66.san.rr.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

	The same machine crashed again, this time in a different place. I'm starting
to wonder if there may be a hardware issue with this machine, but the
errors I'm seeing in the logs are similar enough to the show-stopping
errors I had when all the machines were running the same newer -current,
so maybe it's just bad luck. In any case, here is the latest data. Any
input would be appreciated. I can resend the pertinent details to anyone
who needs them.

Thanks,

Doug


Fatal trap 12: page fault while in kernel mode
mp_lock = 00000005; cpuid = 0; lapic.id = 01000000
fault virtual address   = 0x4800c040
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc01e2c40
stack pointer           = 0x10:0xdc838a40
frame pointer           = 0x10:0xdc838a44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 97652 (miva)
interrupt mask          = net tty bio cam  <- SMP: XXX
     kernel: type 12 trap, code=0 

panic: free: multiple frees
mp_lock = 00000001; cpuid = 0; lapic.id = 01000000
Debugger("panic")
Stopped at      Debugger+0x37:  movl    $0,in_Debugger

db> trace
Debugger(c0216d32) at Debugger+0x37
panic(c02162ff,c20e2440,dcef4bc0,dc838c94,4800c040) at panic+0xa8
free(c20e2440,c023b580,dd1c6e40,dc838c70,c0154ee7) at free+0xd3
cache_zap(c20e2440) at cache_zap+0xb3
cache_purge(dd1c6e40,20,dd1e2c60,c1c7a040,dc838c94) at cache_purge+0x37
getnewvnode(2,c1d3ce00,c1c58d00,dc838cc8,20) at getnewvnode+0x27e
nfs_nget(c1d3ce00,c111d860,20,dc838d64,dc838e1c) at nfs_nget+0x107
nfs_create(dc838e1c,0,dc838f80,fffffffb,da07aec0) at nfs_create+0x1673
vn_open(dc838eec,20a,180,da07aec0,c0239d4c) at vn_open+0xfa
open(da07aec0,dc838f80,1,80d5d80,bfbfd5f0) at open+0xbb
syscall(2f,2f,2f,bfbfd5f0,80d5d80) at syscall+0x19e
Xint0x80_syscall() at Xint0x80_syscall+0x31

I actually had better luck with x/s than I did with x/l, so here goes:

db> x/s 0xc02162ff
__set_sysuninit_set_sym_M_FREE_uninit_sys_uninit+0x7b:  free: multiple frees

0xc20e2440:     @$\016\302\264Y\302\301@\026\372\301H\335"\302

db> x/s 0xc023b580
M_CACHE:        \340\265#\300\200'*

db> x/s 0xdd1c6e40
0xdd1c6e40:
db> x/l 0xdd1c6e40

0xdc838c70:     \234\214\203\334\202\203\025\300@n\034\335

db> x/s 0xc0154ee7
cache_purge+0x37:       \203\304\004\203\273\210

0xc20e2440:     @$\016\302\264Y\302\301@\026\372\301H\335"\302

db> x/s 0xdcef4bc0
0xdcef4bc0:
db> x/l 0xdcef4bc0

db> x/s 0xdc838c94
0xdc838c94:
db> x/l 0xdc838c94

0xdd1e2c60:     \300\346\343\335`3\001\335

0xc1c7a040:
\3617\335\300\346\203\335@\233g\335`\014\237\335@+\361\334@k%\335@\242B\335\300V\303\334@[\322\334\2405\032\335\300\266\305\334 

db> x/s 0xc1d3ce00
0xc1d3ce00:
db> x/l 0xc1d3ce00

db> x/s 0xc1c58d00
0xc1c58d00:     \300f\025\300\250f\025\300
\320\032\300\024\323\031\300h\327\031\300\370\327\031\300\300\272\027\300\250f\025\300xh\025\300\334\327\032\300\250\004\032\300\334\273\027\300\334\333\031\300\250f\025\300
\245\027\300\250f\025\300\364\274\027\300\260\327\032\300@\317\032\300\250f\025\300\3745\032\300\014g\032\300\204\362\031\300P\210\032\300\3105\032\300\024\320\032\300T\325\031\300\250f\025\300
h\025\300X\247\032\300\320\004\032\300\004Q\032\300\250X\032\300x\240\032\300\224\317\032\300\220q\032\300\250f\025\300\360\224\027\300H\227\027\300\250f\025\300
\330\032\300\240f\025\300\250f\025\300\250f\025\300\250f\025\300 

0xdc838cc8:     \200\344\032\301\364\215\203\334oL\032\300 
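For anyone who wants to read the x/s output above as pointers, here is a minimal sketch in Python. It assumes ddb prints printable bytes literally and everything else as a three-digit octal escape, and that the bytes form little-endian longwords as on i386. The helper names decode_ddb_string and longwords are made up for illustration, and a literal backslash in memory would be ambiguous in this format.

```python
import re
import struct

# Assumed escape format: backslash followed by three octal digits (max \377).
OCTAL_ESCAPE = re.compile(r"\\([0-3][0-7]{2})")

def decode_ddb_string(text):
    """Turn x/s style output back into the raw bytes it represents."""
    out = bytearray()
    i = 0
    while i < len(text):
        m = OCTAL_ESCAPE.match(text, i)
        if m:
            out.append(int(m.group(1), 8))  # escaped, non-printable byte
            i = m.end()
        else:
            out.append(ord(text[i]))        # printable byte shown as-is
            i += 1
    return bytes(out)

def longwords(data):
    """Group bytes into little-endian 32-bit words, dropping any tail."""
    usable = len(data) - len(data) % 4
    return [hex(w) for (w,) in struct.iter_unpack("<I", data[:usable])]

# The bytes shown above at 0xdc838c70.
raw = r"\234\214\203\334\202\203\025\300@n\034\335"
words = longwords(decode_ddb_string(raw))
print(words)
```

Run on the line for 0xdc838c70, this gives 0xdc838c9c, 0xc0158382 and 0xdd1c6e40. The last of these is the same value that shows up as the first argument to cache_purge in the trace.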


On Tue, 5 Oct 1999, Doug wrote:

> On Mon, 4 Oct 1999, Luoqi Chen wrote:
> 
> > If you have a crash dump, could you look at the 4 longwords starting
> > at address 0xc02698c0? It seemed to be an accounting problem. Do you
> > by any chance use any kld module? zalloc() calls from within a module
> > do not lock the vm_zone data structure, which is fine for UP but
> > dangerous for SMP.
> 
> 	Well the same machine crashed in the same place, so I can look at
> the current crash for you:
> 
> Fatal trap 12: page fault while in kernel mode
> mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
> fault virtual address   = 0x800018
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc01d1107
> stack pointer           = 0x10:0xdc98fe28
> frame pointer           = 0x10:0xdc98fe34
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 41226 (httpd)
> interrupt mask          = net tty bio cam  <- SMP: XXX
> kernel: type 12 trap, code=0
> Stopped at      zalloci+0x33:   movl    0(%edx),%eax
> 
> db> trace
> zalloci(c02698c0,dc98fe58,c01f24c3,da07e7a4,91bb000) at zalloci+0x33
> get_pv_entry(da07e7a4,91bb000,ffc246ec,0,dc98fe90) at get_pv_entry+0x4a
> pmap_insert_entry(da07e7a4,91bb000,c0572b60,8206000) at pmap_insert_entry+0x1f
> pmap_copy(da07e7a4,dc8eeb64,80c6000,1f85000,80c6000) at pmap_copy+0x1a0
> vm_map_copy_entry(dc8eeb00,da07e740,dc8327a8,dc861ed8) at vm_map_copy_entry+0xdf
> vmspace_fork(dc8eeb00,dc8e9ce0,dc8e9ce0,bfbfddbc,dc98ff30) at vmspace_fork+0x1d3
> vm_fork(dc8ea940,dc8e9ce0,14) at vm_fork+0x2f
> fork1(dc8ea940,14,dc98ff48,dc8ea940,9) at fork1+0x621
> fork(dc8ea940,dc98ff80,805b36c,30,bfbfddbc) at fork+0x16
> syscall(c01e002f,2f,2f,bfbfddbc,30) at syscall+0x19e
> Xint0x80_syscall() at Xint0x80_syscall+0x31

> 	One other thing that I thought of related to the SMP issue is that
> these are Intel N440BX motherboards, and the BIOS has an option to set the
> "Multi-Processor Specification" or some such that was set to 1.4, with the
> other option being 1.1. Would it be better to set it to 1.1, or was I
> correct to assume that FreeBSD ignores that setting anyway?
> 
> 	I put more DDB stuff from this crash at
> http://doug.simplenet.com/DDB2.txt, let me know if there is anything else
> I need to do. Remote GDB is an option here if you think that'd be a better
> tool. I'll check to see if I'm getting dumps when the machine comes back.
> 
> Thanks,
> 
> Doug
> 
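Luoqi's point about unlocked zalloc() calls being fine on UP but dangerous on SMP can be sketched in a few lines. This is a toy model in Python, not the kernel's vm_zone code: the Zone class and its method names are hypothetical, and the race is replayed by hand as one fixed interleaving of two CPUs rather than with real threads. Whether this is what is actually happening on this machine is not established.

```python
import threading

class Zone:
    """Toy free-list allocator (hypothetical stand-in for a zone)."""

    def __init__(self, count):
        # next_free[i] is the item that follows i on the free list.
        self.next_free = {i: (i + 1 if i + 1 < count else None)
                          for i in range(count)}
        self.head = 0
        self.lock = threading.Lock()

    def read_head(self):
        # Step 1 of an allocation: look at the first free item.
        return self.head

    def unlink(self, item):
        # Step 2 of an allocation: advance the head past that item.
        self.head = self.next_free[item]
        return item

    def alloc_locked(self):
        # With the lock held, steps 1 and 2 cannot be split by another CPU.
        with self.lock:
            return self.unlink(self.read_head())

def racy_interleaving():
    zone = Zone(4)
    a = zone.read_head()   # CPU 0 reads the head
    b = zone.read_head()   # CPU 1 reads the same head before CPU 0 unlinks
    return zone.unlink(a), zone.unlink(b)

def locked_sequence():
    zone = Zone(4)
    return zone.alloc_locked(), zone.alloc_locked()

print(racy_interleaving())  # both CPUs are handed the same item
print(locked_sequence())    # each CPU gets its own item
```

In the unlocked interleaving both callers end up owning item 0, so the free list and its accounting no longer agree with reality. With the lock held across both steps the two callers get items 0 and 1.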

-- 
"Stop it, I'm gettin' misty." 

    - Mel Gibson as Porter, "Payback"

