From: Dan Charrois
Date: Tue, 29 Nov 2005 03:58:21 -0700
To: freebsd-stable@freebsd.org
Subject: Re: FreeBSD unstable on Dell 1750 using SMP?

Thanks everyone for the replies over the past few days about the "unsolicited" rebooting problem.
At first, I thought there was a memory allocation bug, judging by the output of "netstat -m", but apparently it's just a cosmetic statistics-reporting bug and unrelated to the instability itself. Unfortunately, that means I still haven't found a solution to the problem (and apparently I'm not the only one experiencing it). Since the only machine we have that shows the problem is a production machine, and it shows it only infrequently, it's difficult to test and resolve.

It's been suggested that FreeBSD 6.0 may fix the problem, but considering the inevitable bugs that creep into new releases, I'm reluctant to go there until things settle down in 6.0 (plus, I haven't seen any documentation implying that moving to 6.0 would actually fix the problem). If it weren't a production machine that needs to be reliable, stable, and available, I'd have a better chance of testing it under 6.0.

There has been some speculation that it's triggered by possibly buggy ethernet drivers, etc. In my case, though that's possible, I doubt it, since my machine has rebooted itself right when mysqlhotcopy was about to run (and it runs locally without causing any network activity that I'm aware of). My first thought was that it might be caused by faulty memory or something, but Dell's hardware diagnostics reported everything to be perfectly okay.

What I find strange is that it's not that the kernel locks up or anything - the machine just suddenly restarts (caches aren't flushed to disk or anything - it's just like someone literally pulls the power plug midstream and then plugs it back in). The only indication that something weird has happened is that in the server logs everything seems to be crunching away happily, and then suddenly I see the boot messages from when it restarted all by itself.
In any case, if anyone else with a dual processor machine (I have a PowerEdge 2850 myself) has experienced the rebooting problem discussed a few days ago and resolved it, I'd very much like to hear from you.

Dan
--
Syzygy Research & Technology
Box 83, Legal, AB  T0G 1L0  Canada
Phone: 780-961-2213