Date:      Wed, 22 Jun 2005 09:20:36 -0600
From:      "Chad Leigh -- Shire.Net LLC" <chad@shire.net>
To:        Matt Juszczak <matt@atopia.net>
Cc:        freebsd-questions questions <freebsd-questions@freebsd.org>
Subject:   Re: FreeBSD Machines dieing, we've tried so much....
Message-ID:  <823B638C-830E-45E8-82D5-4E2EC5E00534@shire.net>
In-Reply-To: <LOBBIFDAGNMAMLGJJCKNGEMKFBAA.tedm@toybox.placo.com>
References:  <LOBBIFDAGNMAMLGJJCKNGEMKFBAA.tedm@toybox.placo.com>

On Jun 22, 2005, at 3:07 AM, Ted Mittelstaedt wrote:

>
>
>
>> -----Original Message-----
>> From: Matt Juszczak [mailto:matt@atopia.net]
>> Sent: Monday, June 20, 2005 10:49 AM
>> To: Ted Mittelstaedt
>> Cc: freebsd-questions@freebsd.org
>> Subject: RE: FreeBSD Machines dieing, we've tried so much....
>>
>>
>>
>>
>> On Mon, 20 Jun 2005, Ted Mittelstaedt wrote:
>>
>>
>>
>>
>>> Please post dmesg output from both systems.
>>>
>>
>> The systems end up crashing so I can't do a dmesg.... or do you mean a
>> general dmesg when they are stable?
>>
>>
>
> Yes. Matt, please slow down and quit panicking for just a second here
> - you haven't even told us what processor these are on, let alone what
> the hardware manufacturer is.  It's like you're calling to schedule a
> doctor's appointment and you aren't even telling them if the patient is
> a man, woman, child, or for that matter, family dog!
>
> The vast majority of panics are hardware-related.  It is rare nowadays
> for a usermode program to make the system panic.  In particular you said
> the problem happens more under load.  That really points even more to a
> hardware problem - bad CPU cache RAM, bad RAM, SCSI termination, that
> sort of thing.
>
> Ted

Just as an example of what Ted is saying: about 3 or 4 years ago I
installed some new "server" main boards for AMD CPUs.  The
"chipset" was a split chipset that had a "northbridge" by one vendor
and a "southbridge" by another vendor.  One was an AMD chip and one
was a VIA chip.  (The AMD chip supported ECC etc., unlike all the other
brands of chip with that same functionality.)  Under load (using Adaptec
RAID controllers) the machine would freeze up.  Finally, after much
testing and ridiculous amounts of cooling (on the assumption that it was
a heat problem), I replaced the main boards with new ones that used AMD
chips for both the northbridge and the southbridge.  The problem went away.

These same boards work fine, including under load, with Windows, for
example, and a test Linux install also did not have problems (though
the Linux install was not tested very thoroughly).

My point is that you can have some sort of HW problem that shows up
under load, and it may not be an obvious one.

Test your RAM first, using something like memtest86, and think about
what other HW is in your machine(s) and whether you can swap it out
for test purposes, etc.

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad@shire.net




