Date:      Tue, 29 Nov 2005 03:58:21 -0700
From:      Dan Charrois <dan@syz.com>
To:        freebsd-stable@freebsd.org
Subject:   Re: FreeBSD unstable on Dell 1750 using SMP?
Message-ID:  <A64C1C7E-C8F3-4C67-8211-3CD9B51DB06F@syz.com>
In-Reply-To: <20051125033033.D779D16A42D@hub.freebsd.org>
References:  <20051125033033.D779D16A42D@hub.freebsd.org>

Thanks everyone for replies made over the past few days about the  
"unsolicited" rebooting problem.  At first, I thought there was a  
memory allocation bug as judged by the output of "netstat -m", but  
apparently it's just a cosmetic statistics reporting bug and nothing  
related to the instability itself.

Unfortunately, it means that I still haven't been able to find a
solution to the problem (and apparently, I'm not the only one to
experience it).  Only one of our machines experiences the problem
(infrequently at that), and it happens to be a production machine,
so it's difficult to test and resolve.  It's
been suggested that FreeBSD 6.0 may fix the problem, but considering  
some of the inevitable bugs that creep into new releases, I'm  
reluctant to go there until things settle down in 6.0 (plus, I  
haven't seen any documentation that implies that a fix for the  
problem will result from using 6.0 in any case).  If it weren't a  
production machine that needs to be reliable, stable, and available,  
I'd have a better chance at being able to test it under 6.0.

Some speculation has been made about it being triggered by possibly  
buggy ethernet drivers, etc.  In my case, though possible, I doubt it  
- since my machine has rebooted itself right when mysqlhotcopy was  
about to run on the machine (and it runs locally without causing any  
network activity that I'm aware of).  The first thought I had was  
that it may be caused by faulty memory or something, but Dell's
hardware diagnostics all reported everything to be perfectly okay.

What I find strange is that the kernel doesn't lock up or
anything - the machine just suddenly restarts (caches aren't flushed
to disk or anything).  It's just like someone literally pulls the
power plug midstream and then plugs it back in.  The only indication
that something weird is going on is that in the server logs everything
seems to be crunching away happily, and then suddenly I see the boot
messages when it restarts all by itself.

In any case, if anyone else with a dual processor machine (I have a  
PowerEdge 2850 myself) has experienced the rebooting problem  
discussed a few days ago and resolved it, I'd very much like to hear  
from you.

Dan
--
Syzygy Research & Technology
Box 83, Legal, AB  T0G 1L0 Canada
Phone: 780-961-2213



