From owner-freebsd-current Fri Feb 18 8:27:37 2000
Delivered-To: freebsd-current@freebsd.org
Received: from lor.watermarkgroup.com (lor.watermarkgroup.com [207.202.73.33]) by hub.freebsd.org (Postfix) with ESMTP id BFCE737B989 for ; Fri, 18 Feb 2000 08:27:30 -0800 (PST) (envelope-from luoqi@watermarkgroup.com)
Received: (from luoqi@localhost) by lor.watermarkgroup.com (8.8.8/8.8.8) id LAA15059; Fri, 18 Feb 2000 11:27:27 -0500 (EST) (envelope-from luoqi)
Date: Fri, 18 Feb 2000 11:27:27 -0500 (EST)
From: Luoqi Chen
Message-Id: <200002181627.LAA15059@lor.watermarkgroup.com>
To: freebsd-current@FreeBSD.ORG, tstromberg@rtci.com
Subject: Re: repost of procfs crashes in -CURRENT (no html)..
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> Kernel:
> =======
> FreeBSD karma.afterthought.org 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Mon Feb
> 14 23:00:42 GMT 2000
> chenresig@karma.afterthought.org:/usr/src/sys/compile/KARMA i386
>
> Background:
> ============
> 3 users. One with X running , and two users running breakwidgets
> , which make use of a minimized version of the
> "killall" perl script which reads procfs.
>
> This crash appears to be the old one where when two processes read procfs
> simultaneously, ugly things can happen. mdillon described this in more
> depth to me once but I've since lost the e-mail. reports in late November & early december>. He suggested having my
> programs "lock" procfs reads so only one could do it's killall function at
> a time. Unfortunatly, the binary testing script is very time sensitive and
> this would slow things down paralleled on 4 machines>
>

I don't believe that's the cause.

> The kernel is a GENERIC one with ipv6, softupdates, and pcm added to it.
>
> Crash #1:
> =========
> (kgdb) bt
> #0 boot (howto=256) at ../../kern/kern_shutdown.c:304
> #1 0xc014e194 in poweroff_wait (junk=0xc02b9480, howto=-871862272) at
> ../../kern/kern_shutdown.c:554
> #2 0xc022d064 in vm_fault (map=0xc031ee28, vaddr=3423105024, fault_type=1
> '\001', fault_flags=0) at ../../vm/vm_fault.c:240
> #3 0xc02810d2 in trap_pfault (frame=0xcc136cc4, usermode=0,
> eva=3423108180) at ../../i386/i386/trap.c:788
> #4 0xc0280d37 in trap (frame={tf_fs = -871170032, tf_es = -871170032,
> tf_ds = 16, tf_edi = -871142055, tf_esi = -871142025,
> tf_ebp = -871141804, tf_isp = -871142160, tf_ebx = -872323392,
> tf_edx = 0, tf_ecx = -872323392, tf_eax = -871859336,
> tf_trapno = 12, tf_err = 0, tf_eip = -1072160861, tf_cs = 8,
> tf_eflags = 66118, tf_esp = 0, tf_ss = 0})
> at ../../i386/i386/trap.c:423
> #5 0xc0181fa3 in procfs_dostatus (curp=0xcc145e00, p=0xcc0166c0,
> pfs=0xc14abf60, uio=0xcc136eec)
> at ../../miscfs/procfs/procfs_status.c:115

The fault is taken when trying to access the target process' p_stats,
which resides in the u area. What's interesting here is that the code
checks the P_INMEM flag prior to accessing p_stats, so there shouldn't be
a fault. My guess is that this is an embryonic process whose p_stats field
is inherited from the corpse of another process and points to nowhere.
Would you print out p->p_stat (not p_stats) and check if it is 1 (SIDL)?
That would confirm my theory. If this is indeed the case, the fix should be
to delay setting the P_INMEM flag in fork() until after the u area is
allocated. It may also be a good idea to skip embryonic processes in procfs
altogether.
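To make the suspected race concrete, here is a minimal userspace sketch in C of the check described above. It is not the actual procfs code: the struct, the `MODEL_*` constants, and all function names are hypothetical stand-ins, and the flag value is an arbitrary assumption. The only detail taken from the message is that SIDL is state 1 and that P_INMEM is tested before p_stats is touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the relevant process fields.
 * Names echo the kernel's, but nothing here is kernel code. */
enum { MODEL_SIDL = 1, MODEL_SRUN = 2 };  /* SIDL == 1, as in the message */
#define MODEL_P_INMEM 0x4                 /* arbitrary value (assumption) */

struct model_pstats {
    long start_sec;                       /* stand-in for data kept in the u area */
};

struct model_proc {
    int p_stat;                           /* process state, e.g. MODEL_SIDL */
    int p_flag;                           /* flag bits, e.g. MODEL_P_INMEM */
    struct model_pstats *p_stats;         /* may be stale while embryonic */
};

/* The check as described in the message: only P_INMEM is consulted,
 * so an embryonic process with the flag already set passes. */
int stats_readable_flag_only(const struct model_proc *p)
{
    return (p->p_flag & MODEL_P_INMEM) != 0;
}

/* The suggested hardening: refuse embryonic processes outright,
 * then fall back to the P_INMEM test. */
int stats_readable_skip_embryo(const struct model_proc *p)
{
    if (p->p_stat == MODEL_SIDL)
        return 0;
    return (p->p_flag & MODEL_P_INMEM) != 0;
}

/* Reads the start time only when the hardened check passes.
 * Returns 1 and fills *out on success, 0 otherwise. */
int model_read_start(const struct model_proc *p, long *out)
{
    if (!stats_readable_skip_embryo(p) || p->p_stats == NULL)
        return 0;
    *out = p->p_stats->start_sec;
    return 1;
}
```

In this model, a process in state SIDL with P_INMEM already set passes the flag-only check but is rejected by the hardened one, which is the situation the theory above predicts for the faulting read. The other proposed fix, delaying P_INMEM in fork(), would make the flag-only check sufficient again; the sketch does not model that path.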
> #6 0xc0182590 in procfs_rw (ap=0xcc136ea0) at
> ../../miscfs/procfs/procfs_subr.c:277
> #7 0xc017dc0a in vn_read (fp=0xc14431c0, uio=0xcc136eec, cred=0xc1450700,
> flags=0, p=0xcc145e00) at vnode_if.h:334
> #8 0xc015ac50 in dofileread (p=0xcc145e00, fp=0xc14431c0, fd=6,
> buf=0x8235000, nbyte=4096, offset=-1, flags=0)
> at ../../sys/file.h:140
> #9 0xc015ab57 in read (p=0xcc145e00, uap=0xcc136f80) at
> ../../kern/sys_generic.c:111
> #10 0xc028167e in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
> tf_edi = -1077946820, tf_esi = 672915688,
> tf_ebp = -1077946996, tf_isp = -871141420, tf_ebx = 672858084,
> tf_edx = 672809512, tf_ecx = 136531968, tf_eax = 3,
> tf_trapno = 0, tf_err = 2, tf_eip = 672818732, tf_cs = 31, tf_eflags
> = 659, tf_esp = -1077947040, tf_ss = 47})
> at ../../i386/i386/trap.c:1055
>
>
>
> Crash #2:
> =========
> #0 boot (howto=256) at ../../kern/kern_shutdown.c:304
> #1 0xc014e194 in poweroff_wait (junk=0xc02b9480, howto=-873472000) at
> ../../kern/kern_shutdown.c:554
> #2 0xc022d064 in vm_fault (map=0xc031ee28, vaddr=3421495296, fault_type=1
> '\001', fault_flags=0) at ../../vm/vm_fault.c:240
> #3 0xc02810d2 in trap_pfault (frame=0xcbe0ccc4, usermode=0,
> eva=3421498452) at ../../i386/i386/trap.c:788
> #4 0xc0280d37 in trap (frame={tf_fs = -874512368, tf_es = -874512368,
> tf_ds = 16, tf_edi = -874459817, tf_esi = -874459788,
> tf_ebp = -874459564, tf_isp = -874459920, tf_ebx = -873997056,
> tf_edx = 0, tf_ecx = -873997056, tf_eax = -873469064,
> tf_trapno = 12, tf_err = 0, tf_eip = -1072160861, tf_cs = 8,
> tf_eflags = 66118, tf_esp = 0, tf_ss = 0})
> at ../../i386/i386/trap.c:423
> #5 0xc0181fa3 in procfs_dostatus (curp=0xcbd7df20, p=0xcbe7dd00,
> pfs=0xc154ac20, uio=0xcbe0ceec)
> at ../../miscfs/procfs/procfs_status.c:115
> #6 0xc0182590 in procfs_rw (ap=0xcbe0cea0) at
> ../../miscfs/procfs/procfs_subr.c:277
> #7 0xc017dc0a in vn_read (fp=0xc1469200, uio=0xcbe0ceec, cred=0xc153d180,
> flags=0, p=0xcbd7df20) at vnode_if.h:334
> #8 0xc015ac50 in dofileread (p=0xcbd7df20, fp=0xc1469200, fd=5,
> buf=0x8253000, nbyte=4096, offset=-1, flags=0)
> at ../../sys/file.h:140
> #9 0xc015ab57 in read (p=0xcbd7df20, uap=0xcbe0cf80) at
> ../../kern/sys_generic.c:111
> #10 0xc028167e in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
> tf_edi = -1077945828, tf_esi = 136638564,
> tf_ebp = -1077946004, tf_isp = -874459180, tf_ebx = 672858084,
> tf_edx = 672809512, tf_ecx = 136654848, tf_eax = 3,
> tf_trapno = 0, tf_err = 2, tf_eip = 672818732, tf_cs = 31, tf_eflags
> = 663, tf_esp = -1077946048, tf_ss = 47})
> at ../../i386/i386/trap.c:1055
> #11 0xc0276646 in Xint0x80_syscall ()
>

-lq

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message