From owner-freebsd-stable@FreeBSD.ORG Fri Oct 20 00:22:53 2006 Return-Path: X-Original-To: freebsd-stable@freebsd.org Delivered-To: freebsd-stable@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 78C3916A40F for ; Fri, 20 Oct 2006 00:22:53 +0000 (UTC) (envelope-from kaakun@highway.ne.jp) Received: from mx.highway.ne.jp (pip8.gate01.com [61.122.117.246]) by mx1.FreeBSD.org (Postfix) with ESMTP id E9D4343D45 for ; Fri, 20 Oct 2006 00:22:52 +0000 (GMT) (envelope-from kaakun@highway.ne.jp) Received: from [219.0.96.106] (helo=[192.168.11.21]) by pop11.isp.us-com.jp with esmtp (Mail 4.20) id 1Gai9y-0002yz-FL; Fri, 20 Oct 2006 09:22:50 +0900 Message-ID: <45381689.1000101@highway.ne.jp> Date: Fri, 20 Oct 2006 09:21:29 +0900 From: Kazuaki ODA User-Agent: Thunderbird 1.5.0.7 (X11/20061019) MIME-Version: 1.0 To: Kris Kennaway References: <4538009D.6030400@highway.ne.jp> <20061019225256.GA38237@xor.obsecurity.org> <45380D57.9040302@highway.ne.jp> <20061019234539.GA38985@xor.obsecurity.org> In-Reply-To: <20061019234539.GA38985@xor.obsecurity.org> Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: ps locks up on 6.2-PRERELEASE SMP X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 Oct 2006 00:22:53 -0000 Kris Kennaway wrote: > On Fri, Oct 20, 2006 at 08:42:15AM +0900, Kazuaki ODA wrote: >> Kris Kennaway wrote: >>> On Fri, Oct 20, 2006 at 07:47:57AM +0900, Kazuaki ODA wrote: >>>> Hi, >>>> >>>> I ran the following two scripts simultaneously on 6.2-PRERELEASE SMP box >>>> (cvsup'ed today). >>>> >>>> ---------------------------------------- >>>> #!/bin/sh >>>> >>>> while true; do >>>> ls -lR / > ls_result.txt >>>> done >>>> ---------------------------------------- >>>> >>>> ---------------------------------------- >>>> #!/bin/sh >>>> >>>> while true; do >>>> ls -lR / > ls_result2.txt >>>> done >>>> ---------------------------------------- >>>> >>>> After a while (about 5 minutes), I suspended and killed the both, and >>>> ran "ps". As the result, "ps" locked up. No response and no output. >>>> To make matters worse, I could not login anymore... >>>> >>>> This is 100% reproducable on SMP, but no problem when using Non-SMP kernel. >>> What state is ps in? Press ^T. It's probably a devfs lockup, which >>> should be fixed by changes yet to be merged. >>> >>> Kris >> I tried again. But ^T, ^C and ^Z had no effect, ps was still locking.. >> >> Should I rebuild kernel with DDB option, and go into DDB to get more >> information? > > Yes. Throw in WITNESS, INVARIANTS, INVARIANT_SUPPORT, DEBUG_LOCKS and > DEBUG_VFS_LOCKS while you're there. > > Kris I tried again, and pressed Ctrl + Alt + Esc to go into DDB. Here is the result: KDB: enter: manual escape to debugger [thread pid 17 tid 100012 ] Stopped at kdb_enter+0x2b: nop db> ps pid ppid pgrp uid state wmesg wchan cmd 957 934 957 1001 S+ devfs 0xc4f00058 ps 953 1 945 1001 S+ devfs 0xc799c46c ls 952 1 943 1001 S+ devfsmou 0xc4ebddc0 ls 940 939 940 1001 Ss+ ttyin 0xc4eb6810 tcsh 939 937 937 1001 S select 0xc0a2d384 sshd 937 855 937 0 Ss sbwait 0xc52c2e90 sshd 934 933 934 1001 Ss+ pause 0xc509667c tcsh 933 931 931 1001 S select 0xc0a2d384 sshd 931 855 931 0 Ss sbwait 0xc50cea64 sshd 930 1 930 0 Ss+ ttyin 0xc4d54410 getty 929 1 929 0 Ss+ ttyin 0xc4d79c10 getty 928 1 928 0 Ss+ ttyin 0xc4d60010 getty 927 1 927 0 Ss+ ttyin 0xc4d62010 getty 926 1 926 0 Ss+ ttyin 0xc4d79410 getty 925 1 925 0 Ss+ ttyin 0xc4d79010 getty 924 1 924 0 Ss+ ttyin 0xc4d78010 getty 923 1 923 0 Ss+ ttyin 0xc4d77c10 getty 922 1 922 0 Ss+ ttyin 0xc4d78c10 getty 904 1 904 0 Ss select 0xc0a2d384 inetd 876 1 876 0 Ss nanslp 0xc09e002c cron 870 1 870 25 Ss pause 0xc4d5c24c sendmail 866 1 866 0 Ss select 0xc0a2d384 sendmail 865 795 795 80 S accept 0xc4fe5b5a httpd 864 795 795 80 S accept 0xc4fe5b5a httpd 863 795 795 80 S accept 0xc4fe5b5a httpd 862 795 795 80 S accept 0xc4fe5b5a httpd 861 795 795 80 S accept 0xc4fe5b5a httpd 855 1 855 0 Ss select 0xc0a2d384 sshd 808 1 808 0 Ss select 0xc0a2d384 usbd 795 1 795 0 Ss select 0xc0a2d384 httpd 776 771 771 0 S nfslockd 0xc0a35968 rpc.lockd 771 1 771 0 Ss select 0xc0a2d384 rpc.lockd 766 1 766 0 Ss select 0xc0a2d384 rpc.statd 723 1 723 0 Ss select 0xc0a2d384 rpcbind 708 1 708 0 Ss select 0xc0a2d384 syslogd 656 1 656 0 Ss select 0xc0a2d384 devd 135 1 135 0 Ss pause 0xc4f55034 adjkerntz 48 0 0 0 SL - 0xed1d6d04 [schedcpu] 47 0 0 0 SL sdflush 0xc0a3ad94 [softdepflush] 46 0 0 0 SL vlruwt 0xc4d5b000 [vnlru] 45 0 0 0 SL syncer 0xc09dfd9c [syncer] 44 0 0 0 SL psleep 0xc0a2d8ec [bufdaemon] 43 0 0 0 SL pgzero 0xc0a3bd04 [pagezero] 42 0 0 0 SL psleep 0xc0a3b854 [vmdaemon] 41 0 0 0 SL psleep 0xc0a3b810 [pagedaemon] 40 0 0 0 WL [irq1: atkbd0] 39 0 0 0 WL [irq7: ppc0] 38 0 0 0 WL [swi0: sio] 37 0 0 0 SL - 0xc4c48a3c [fdc0] 36 0 0 0 SL cooling 0xc4d417d4 [acpi_cooling0] 35 0 0 0 SL tzpoll 0xc0b9d838 [acpi_thermal] 34 0 0 0 WL [irq15: ata1] 33 0 0 0 WL [irq14: ata0] 32 0 0 0 WL [irq20: fwohci0] 31 0 0 0 SL - 0xc4c28800 [em0 taskq] 30 0 0 0 SL usbevt 0xc4bfe210 [usb4] 29 0 0 0 WL [irq23: ehci0] 28 0 0 0 SL usbevt 0xc4c4d210 [usb3] 27 0 0 0 SL usbevt 0xc4c39210 [usb2] 26 0 0 0 WL [irq18: rl0 uhci2] 25 0 0 0 SL usbevt 0xc4c3e210 [usb1] 24 0 0 0 WL [irq19: em0 uhci1] 23 0 0 0 SL usbtsk 0xc09dd004 [usbtask] 22 0 0 0 SL usbevt 0xc4c2a210 [usb0] 21 0 0 0 WL [irq16: uhci0 uhci3] 20 0 0 0 WL [irq9: acpi0] 19 0 0 0 WL [swi2: cambio] 18 0 0 0 WL [swi5: +] 9 0 0 0 SL - 0xc4b62380 [acpi_task_2] 8 0 0 0 SL - 0xc4b62380 [acpi_task_1] 7 0 0 0 SL - 0xc4b62380 [acpi_task_0] 6 0 0 0 SL - 0xc4b62400 [thread taskq] 17 0 0 0 RL CPU 0 [swi6: Giant taskq] 16 0 0 0 WL [swi6: task queue] 5 0 0 0 SL - 0xc4b62680 [kqueue taskq] 15 0 0 0 SL - 0xc09dad20 [yarrow] 4 0 0 0 SL - 0xc09dd4e8 [g_down] 3 0 0 0 SL - 0xc09dd4e4 [g_up] 2 0 0 0 SL - 0xc09dd4dc [g_event] 14 0 0 0 WL [swi1: net] 13 0 0 0 WL [swi3: vm] 12 0 0 0 LL *Giant 0xc4af9800 [swi4: clock sio] 11 0 0 0 RL [idle: cpu0] 10 0 0 0 RL CPU 1 [idle: cpu1] 1 0 1 0 SLs wait 0xc4b04000 [init] 0 0 0 0 WLs [swapper] db> show lock db> show alllocks Process 953 (ls) thread 0xc4c58780 (100040) exclusive sx devfsmount r = 0 (0xc4ebdd90) locked @ /usr/src/sys/fs/devfs/devfs_vnops.c:575 Process 17 (swi6: Giant taskq) thread 0xc4b05180 (100012) exclusive sleep mutex Giant r = 0 (0xc09df740) locked @ /usr/src/sys/kern/kern_intr.c:681 db> show lockedvnods Locked vnodes 0xc4f00000: tag devfs, type VDIR usecount 3, writecount 0, refcount 4 mountedhere 0 flags (VV_ROOT) lock type devfs: EXCL (count 1) by thread 0xc4c58780 (pid 953) with 1 pending#0 0xc06695fd at lockmgr+0x4ed #1 0xc06c1b75 at vop_stdlock+0x21 #2 0xc088abd3 at VOP_LOCK_APV+0x87 #3 0xc06d6a90 at vn_lock+0xac #4 0xc06ca8fe at vget+0xc2 #5 0xc062cab3 at devfs_allocv+0x37 #6 0xc062d61e at devfs_lookupx+0x1a6 #7 0xc062d955 at devfs_lookup+0x2d #8 0xc0888c93 at VOP_LOOKUP_APV+0x87 #9 0xc06c3afe at lookup+0x46e #10 0xc06c342e at namei+0x37e #11 0xc06d5a8b at vn_open_cred+0x277 #12 0xc06d5812 at vn_open+0x1e #13 0xc06cecc6 at kern_open+0xb6 #14 0xc06cebda at open+0x1a #15 0xc087844f at syscall+0x22f #16 0xc08642cf at Xint0x80_syscall+0x1f 0xc799c414: tag devfs, type VDIR usecount 5, writecount 0, refcount 5 mountedhere 0 flags () lock type devfs: EXCL (count 1) by thread 0xc4c58900 (pid 952) with 1 pending#0 0xc06695fd at lockmgr+0x4ed #1 0xc06c1b75 at vop_stdlock+0x21 #2 0xc088abd3 at VOP_LOCK_APV+0x87 #3 0xc06d6a90 at vn_lock+0xac #4 0xc06c3762 at lookup+0xd2 #5 0xc06c342e at namei+0x37e #6 0xc06d0e8b at kern_lstat+0x47 #7 0xc06d0e27 at lstat+0x1b #8 0xc087844f at syscall+0x22f #9 0xc08642cf at Xint0x80_syscall+0x1f db> I'm still in DDB, so please let me know if any other information is required. -- Kazuaki ODA