From: John Baldwin
To: freebsd-current@freebsd.org
Cc: alc@freebsd.org, tegge@freebsd.org, Pawel Jakub Dawidek
Date: Tue, 7 Nov 2006 13:24:22 -0500
Subject: Re: VM deadlock?
Message-Id: <200611071324.23525.jhb@freebsd.org>
In-Reply-To: <20061105152417.GM83118@garage.freebsd.pl>

On Sunday 05 November 2006 10:24, Pawel Jakub Dawidek wrote:
> Hi.
>
> We're seeing such a deadlock, seems to be VM related very often. Servers
> are quite loaded, this is FreeBSD 6.2 on SMP Opteron system, but running
> i386 FreeBSD.
>
> Here is some info from one of such machines:
>
> db> ps
>   pid  ppid  pgrp  uid  state  wmesg     wchan       cmd
> 90346  4316  4316  101  L      *Giant    0xc8551ac0  qmail-smtpd
> [...]
> 89997  4316  4316  101  L      *vm page  0xcabc9600  qmail-smtpd
> [...]
>  4316  4311  4316  101  LL     *vm page  0xcabc9600  qmail-smtpd
>  4315  4311  4315  101  RL                           drwebd
> [...]
>    34     0     0    0  LL     *vm page  0xcabc9600  [pagezero]
> [...]
>    16     0     0    0  LL     *vm page  0xcabc9600  [swi1: net]
> [...]
>    14     0     0    0  LL     *tcp      0xcabc82c0  [swi4: clock sio]
> [...]
>     3     0     0    0  LL     *vm page  0xcabc9600  [g_up]
>
> db> alltrace
> Tracing command drwebd pid 4315 tid 100105 td 0xcac19600
> sched_switch(cac19600,0,2) at sched_switch+0x14b
> mi_switch(2,0) at mi_switch+0x1ba
> critical_exit(0,cac19600,ca0b5688,8246000,0,...) at critical_exit+0x9d
> intr_execute_handlers(c844b0c8,f8211b90,34,f8211be0,c06590b3,...) at intr_execute_handlers+0x10c
> lapic_handle_intr(31) at lapic_handle_intr+0x2e
> Xapic_isr1() at Xapic_isr1+0x33
> --- interrupt, eip = 0xc06676c4, esp = 0xf8211bd4, ebp = 0xf8211be0 ---
> pmap_pte_quick(ca0b5688,8246000) at pmap_pte_quick+0x90
> pmap_copy(ca0b5688,c88d1ea0,81a1000,a18000,81a1000) at pmap_copy+0x27c
> vm_map_copy_entry(c88d1de0,ca0b55c8,cabc4374,cac161dc) at vm_map_copy_entry+0x14c
> vmspace_fork(c88d1de0,4,c854b860,c8823d00,f8211cbc,...) at vmspace_fork+0x22a
> vm_forkproc(cac19600,c854b860,c854c900,14) at vm_forkproc+0xb3
> fork1(cac19600,14,0,f8211cd4) at fork1+0x1169
> fork(cac19600,f8211d04) at fork+0x18
> syscall(3b,3b,bfbf003b,4,81bba00,...) at syscall+0x2bf
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (2, FreeBSD ELF32, fork), eip = 0x80e493f, esp = 0xbfbfdcac, ebp = 0xbfbfe148 ---
>
> db> sh allchain
> chain 1:
>  thread 100260 (pid 90346, qmail-smtpd) blocked on lock 0xc07177e0 (sleep mutex) "Giant"
>  thread 100133 (pid 89997, qmail-smtpd) blocked on lock 0xc0722700 (sleep mutex) "vm page queue mutex"
>  thread 100105 (pid 4315, drwebd) is on a run queue
> chain 2:
>  thread 100098 (pid 4316, qmail-smtpd) blocked on lock 0xc0722700 (sleep mutex) "vm page queue mutex"
>  thread 100105 (pid 4315, drwebd) is on a run queue
> chain 3:
>  thread 100031 (pid 34, pagezero) blocked on lock 0xc0722700 (sleep mutex) "vm page queue mutex"
>  thread 100105 (pid 4315, drwebd) is on a run queue
> chain 4:
>  thread 100015 (pid 3, g_up) blocked on lock 0xc0722700 (sleep mutex) "vm page queue mutex"
>  thread 100105 (pid 4315, drwebd) is on a run queue
> chain 5:
>  thread 100002 (pid 14, swi4: clock sio) blocked on lock 0xc072146c (sleep mutex) "tcp"
>  thread 100000 (pid 16, swi1: net) blocked on lock 0xc0722700 (sleep mutex) "vm page queue mutex"
>  thread 100105 (pid 4315, drwebd) is on a run queue
> db> sh locktree
> db> sh turnstile
>
> Process drwebd (pid 4315) doesn't seem to go any further.

How about 'show allpcpu' to see which threads are running?

-- 
John Baldwin