Date:      Sun, 07 Oct 2007 07:27:21 -0700
From:      Garrett Cooper <youshi10@u.washington.edu>
To:        Grant Peel <gpeel@thenetnow.com>, Gary Kline <kline@thought.org>,  questions@freebsd.org
Subject:   Re: Server Reboot
Message-ID:  <4708ECC9.8070107@u.washington.edu>
In-Reply-To: <008201c808d8$dfd697c0$6501a8c0@GRANT>
References:  <009c01c80810$169e4830$6501a8c0@GRANT> <4707A770.9060804@u.washington.edu> <20071007021558.GB67456@thought.org> <008201c808d8$dfd697c0$6501a8c0@GRANT>

Grant Peel wrote:
> ----- Original Message -----
>
>     *From:* Gary Kline <mailto:kline@tao.thought.org>
>     *To:* Garrett Cooper <mailto:youshi10@u.washington.edu>
>     *Cc:* Grant Peel <mailto:gpeel@thenetnow.com> ; FreeBSD Mailing
>     List <mailto:freebsd-questions@freebsd.org>
>     *Sent:* Saturday, October 06, 2007 10:15 PM
>     *Subject:* Re: Server Reboot
>
>     On Sat, Oct 06, 2007 at 08:19:12AM -0700, Garrett Cooper wrote:
>     > Grant Peel wrote:
>     > >Hi all,
>     > >
>     > >This is the first time in 10 years I have seen this.
>     > >
>     > >I have a Dell PE750 (vintage 2004), running FreeBSD 6.2 that had been
>     > >up and running for about 30 days without any issues.
>     > >
>     > >The server somehow rebooted last night, apparently, all by itself.
>     > >
>     > >The last log file line I can find was about 12:30 AM. The dmesg shows
>     > >it restarted about 1:12 AM. dmesg shows some file errors that were
>     > >fixed upon reboot, other than that, everything is back up and running
>     > >normally.
>     > >
>     > >I was wondering if anyone has seen anything similar and if a cause was
>     > >found.
>     > >
>     > >Here is what I know:
>     > >
>     > >-all servers (there are 5 more) are plugged into the same power bar
>     > >and none of the others were affected
>     > >-none of the standard logs show any intrusion or root log in attempt,
>     > >-dmesg and console log show nothing of note,
>     > >-the DRAC logs and ESM logs show nothing,
>     > >-the sensors (temp,voltage,etc) logs currently show no issues, all
>     > >well within normal parms.
>     > >-my MRTG logs show no abnormal CPU usage or network activity.
>     > >
>     > >
>     > >Any help would be appreciated,
>     > >
>     > >-Grant
>     >
>     > Check the capacitors on the motherboard (in particular near the
>     > memory and processor); they may be going bad (esp with that vintage.
>     > 2004 Dell was a bad year =P..).
>     > You'll be looking for swelled capacitors and possibly some orange
>     > dielectric being emitted.
>     > -Garrett
>
>     Strange. In just the past few, 2 or 3 or even 4 weeks my
>     Dell-8200 has spontaneously rebooted too. I do have a number of
>     things in /var/log/messages, but nothing that I can see that
>     would cause this problem. Before the video-card started flaking
>     out, this puppy ran for weeks/months happily. AFAIW, X (or a
>     heavily-loaded system) shouldn't have anything to do with this
>     problem, [yes/no??]. Any clues, Garrett?
>
>     Ah, wait: dmesg.yesterday says
>
>
>
>     rl0: link state changed to UP
>     pid 729 (Xorg), uid 0: exited on signal 6 (core dumped)
>     pid 4475 (Xorg), uid 0: exited on signal 6 (core dumped)
>     pid 60174 (firefox-bin), uid 1000: exited on signal 11 (core dumped)
>     pid 47564 (as), uid 0: exited on signal 11 (core dumped)
>     pid 47570 (as), uid 0: exited on signal 11 (core dumped)
>     pid 79051 (as), uid 0: exited on signal 11 (core dumped)
>     pid 79057 (as), uid 0: exited on signal 11 (core dumped)
>     pid 3625 (as), uid 0: exited on signal 11 (core dumped)
>     pid 3631 (as), uid 0: exited on signal 11 (core dumped)
>     pid 74013 (conftest), uid 0: exited on signal 12 (core dumped)
>
>
>     This file is timestamped 03 Oct 07 at 03:17
>
>     Anybody know why firefox would core dump? I have no clue what
>     "conftest" is... .
>
>     Grant, how often has your system failed?
>
>
>     gary
>     Gary,
>
>     I have owned this server since new (in 2004), and this is the
>     first time it has done this. I also have another PE750 that was
>     bought and deployed the same time as this one and it has never
>     done this.
>      
>     I am not running anything graphical on this, so I am guessing it's
>     not the built in video card. It is running as a server only.
>     Apache 2, MySQL, PHP4, Perl5, Exim4, vm-pop3d, ipa,
>     Openwebmail, and a number of add in modules for all the above.
>      
>     One thing I may have neglected in my original post, is that it
>     appears the system may have been locked for a while since the last
>     log entry I can find before the reboot was at about 12:20 am, the
>     system then shows the reboot at about 1:20 AM.
>      
>     -Grant
>
>
>
>
>
>
>     -- 
>     Gary Kline kline@thought.org <mailto:kline@thought.org>
>     www.thought.org Public Service Unix
>     http://jottings.thought.org http://transfinite.thought.org
>
>     ------------------------------------------------------------------------
>

Gary,

    Depending on the web pages, the amount of memory in use, and other
things, Firefox did tend to crash from time to time when I used it.
Most of the time that pointed to bugs caused by over-optimized binaries
or rogue plugins / add-ons / extensions.

    conftest is the test program that autoconf-generated configure
scripts build and run, and (IIRC) it should only die on a signal if a
test fails.

    Not sure about the signal 6 (SIGABRT) and signal 11 (SIGSEGV) crashes, though.
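
    For reading lines like the ones in your dmesg.yesterday, here is a
rough sketch (not a tool that ships with FreeBSD) that maps the signal
numbers to names. The table is typed in by hand from FreeBSD's signal
numbering, where 12 is SIGSYS (bad system call); on Linux some of these
numbers mean something else. The names FREEBSD_SIGNALS and
summarize_crashes are made up for this example.

```python
import re

# FreeBSD signal numbers 1-15, entered by hand (assumption: FreeBSD
# numbering; Linux differs for 7, 10 and 12).
FREEBSD_SIGNALS = {
    1: "SIGHUP", 2: "SIGINT", 3: "SIGQUIT", 4: "SIGILL", 5: "SIGTRAP",
    6: "SIGABRT", 7: "SIGEMT", 8: "SIGFPE", 9: "SIGKILL", 10: "SIGBUS",
    11: "SIGSEGV", 12: "SIGSYS", 13: "SIGPIPE", 14: "SIGALRM", 15: "SIGTERM",
}

# Matches kernel messages such as:
#   pid 729 (Xorg), uid 0: exited on signal 6 (core dumped)
CRASH_RE = re.compile(r"pid (\d+) \(([^)]+)\), uid (\d+): exited on signal (\d+)")


def summarize_crashes(dmesg_text):
    """Return a list of (program, pid, signal name) for each crash line found."""
    crashes = []
    for match in CRASH_RE.finditer(dmesg_text):
        pid, program, _uid, signo = match.groups()
        signo = int(signo)
        name = FREEBSD_SIGNALS.get(signo, f"signal {signo}")
        crashes.append((program, int(pid), name))
    return crashes


if __name__ == "__main__":
    sample = """\
pid 729 (Xorg), uid 0: exited on signal 6 (core dumped)
pid 47564 (as), uid 0: exited on signal 11 (core dumped)
pid 74013 (conftest), uid 0: exited on signal 12 (core dumped)
"""
    for program, pid, name in summarize_crashes(sample):
        print(f"{program} (pid {pid}): {name}")
```

    Repeated SIGSEGV in something as mundane as the assembler (as) is
the kind of pattern worth tallying this way, since it shows up across
unrelated builds.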

    About the X11 comment: a heavily loaded system will actually expose
more issues than a lightly loaded one, if any exist. So the more you
run at one time, the more problems you may see.

Grant,

    I'd check your thermals then (both on your drives and in your
case). What might be happening is that the machine heats up after
extended periods of intense computation or disk use, reaches its
threshold operating temperature, and reboots.
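
    Since you mentioned the logs went quiet for a while before the
reboot, one way to confirm a hang is to look for unusually long gaps
between consecutive log timestamps. This is only a sketch: it assumes
the classic syslog prefix ("Oct  6 00:20:01 ...") as found in
/var/log/messages, an English month name, and a year you supply
yourself, since that format does not record one. find_log_gaps is a
made-up name.

```python
from datetime import datetime, timedelta


def find_log_gaps(lines, min_gap=timedelta(minutes=30), year=2007):
    """Return (previous, current, gap) for consecutive entries at least min_gap apart."""
    gaps = []
    previous = None
    for line in lines:
        stamp = line[:15]  # classic syslog prefix, e.g. "Oct  6 00:20:01"
        try:
            current = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None and current - previous >= min_gap:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


if __name__ == "__main__":
    # Assumption: the default FreeBSD log location; adjust as needed.
    with open("/var/log/messages", errors="replace") as log:
        for before, after, gap in find_log_gaps(log):
            print(f"silent for {gap} between {before} and {after}")
```

    A long silence on a box that normally logs every few minutes (cron,
mail, and so on) suggests it was wedged rather than cleanly restarted.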

HTH,
-Garrett


