From owner-freebsd-stable Mon Oct 30 10:53:54 2000
Delivered-To: freebsd-stable@freebsd.org
Received: from mmap.nyct.net (mmap.nyct.net [216.44.109.243])
	by hub.freebsd.org (Postfix) with ESMTP id 7268237B4CF
	for ; Mon, 30 Oct 2000 10:53:50 -0800 (PST)
Received: by mmap.nyct.net (Postfix, from userid 1000)
	id EAE22F9A0; Mon, 30 Oct 2000 13:52:47 -0500 (EST)
Date: Mon, 30 Oct 2000 13:52:47 -0500
From: Michael Bacarella
To: freebsd-stable@FreeBSD.ORG
Subject: followup: Re: vm_page_remove() problem..
Message-ID: <20001030135247.A6740@mmap.nyct.net>
References: <20001027111723.A9057@mmap.nyct.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.0.1i
In-Reply-To: <20001027111723.A9057@mmap.nyct.net>; from mbac on Fri, Oct 27, 2000 at 11:17:23AM -0400
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

We compiled a non-SMP version of the same kernel and this problem
has gone away, or at least hasn't happened in 3 days when it used to
happen twice a day or so for 2 weeks.

I realize a crash dump would be far more useful but it would not
be appreciated if I intentionally crashed a production machine.

To keep this short, I'll provide additional information on demand
if anyone is interested.

Thanks

On Fri, Oct 27, 2000 at 11:17:23AM -0400, Michael Bacarella wrote:
> This keeps happening to one of our multiprocessor servers. About twice
> a day.
>
> panic: vm_page_remove(): page not found in hash
> mp_lock = 01000001; cpuid = 1; lapic.id = 01000000
> boot() called on cpu#1
>
> syncing disks... 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68
> giving up on 67 buffers
> Uptime: 21h28m53s
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> cpu_reset called on cpu#1
> cpu_reset: Stopping other CPUs
> cpu_reset: Restarting BSP
> cpu_reset_proxy: Grabbed mp lock for BSP
> cpu_reset_proxy: Stopped CPU 1
>
> FreeBSD bsd10.nyct.net 4.1.1-STABLE FreeBSD 4.1.1-STABLE #0: Fri Oct 20 15:58:40 EDT 2000 myj@bsd6.nyct.net:/usr/obj/usr/src/sys/NYCT i386
>
> We've tweaked some variables in response to this (maxusers?) but it
> doesn't seem to do the trick.
>
> It happens most when I do something memory intensive (like stopping and
> restarting apache, and all several-hundred children), but it really does
> happen quite randomly.
>
> My wild uneducated guess is that both processors are calling vm_page_remove()
> on a page and the one that doesn't happen first ends up panic'ing because
> it can't find the page anymore.
>
> We're in the process of trying it with a non-SMP kernel, but I figure I'd
> put this out early in any case as it's obviously a bug of some kind. If
> I really find myself at the end of a rope, I'm going to look to see if it's
> a hardware problem.

-- 
Michael Bacarella ;finger address for public key
GPG Key Fingerprint: B4E4 82F5 BCAC AB83 E6F7 B5AA 933E 2A75 79A4 A9C1


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message