Date:      Fri, 19 Aug 2005 01:09:17 +0200
From:      Bernd Walter <ticso@cicely12.cicely.de>
To:        harrisb@rcisd.org
Cc:        ticso@cicely.de, freebsd-alpha@freebsd.org
Subject:   Re: machine check on 4100 5.4-RELEASE
Message-ID:  <20050818230916.GD90999@cicely12.cicely.de>
In-Reply-To: <OFEB0B9C6C.361AA7F5-ON86257061.005478FD-86257061.005467B3@rcisd.org>
References:  <20050818111551.GP77387@cicely12.cicely.de> <OFEB0B9C6C.361AA7F5-ON86257061.005478FD-86257061.005467B3@rcisd.org>

On Thu, Aug 18, 2005 at 10:24:42AM -0500, harrisb@rcisd.org wrote:
> Here's the panic info:
> 
> --------------------------------------------
> Mounting root from ufs:/dev/da0a
> 
> unexpected machine check:
> 
>     mces    = 0x1
>     vector  = 0x670
>     param   = 0xfffffc0000004e10
>     pc      = 0xfffffc000072faa8
>     ra      = 0xfffffc000072bba4
>     curproc = 0xfffffc002e7cc000
>         pid = 693, comm = perl5.8.6
> 

I could be completely wrong, but vector 0x670 (on AS4100) should mean
CPU-detected errors, such as a cache failure, either in the B-Cache or
in the internal caches, which points to a CPU module.

Are the machine checks always the same, or do they happen at random times?

In any case you should check the SRM error log; there might be
something interesting in there.
If the machine checks are logged, they should appear as MCHK 670 events.
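
To make the vector reading above easier to repeat on later panics, here is
a rough Python sketch that pulls the fields out of a pasted machine check
dump and names the vector. The vector names are the ones I recall from the
BSD Alpha headers (alpha_cpu.h), so treat them as an assumption and check
them against your source tree; parse_mcheck and describe_vector are
made-up helper names, not part of any FreeBSD tool.

```python
import re

# SCB vector meanings as recalled from the BSD Alpha headers (alpha_cpu.h).
# Assumption: verify against your own source tree before relying on them.
MCHK_VECTORS = {
    0x620: "system correctable error",
    0x630: "processor correctable error",
    0x660: "system machine check",
    0x670: "processor machine check",
}

# Matches lines such as ">     vector  = 0x670", with or without quote marks.
FIELD_RE = re.compile(
    r"^\W*(mces|vector|param|pc|ra|curproc)\s*=\s*(0x[0-9a-fA-F]+)",
    re.MULTILINE,
)


def parse_mcheck(text):
    """Return the hex fields of a machine check dump as a dict of ints."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(text)}


def describe_vector(vector):
    """Map an SCB vector number to a short description (hypothetical helper)."""
    return MCHK_VECTORS.get(vector, "unknown vector")


if __name__ == "__main__":
    import sys

    fields = parse_mcheck(sys.stdin.read())
    if "vector" in fields:
        print(hex(fields["vector"]), describe_vector(fields["vector"]))
    else:
        print("no machine check fields found")
```

Feeding it the panic text quoted above on stdin should classify 0x670 as a
processor machine check, which matches the suspicion of a CPU module.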

> On Wed, Aug 17, 2005 at 04:53:28PM -0500, harrisb@rcisd.org wrote:
> > All of a sudden, I'm getting regular crashes with machine checks.
> > 
> > I've pulled one of the 533 CPUs, which didn't help, and now am
> > wondering if it's possible that my instance of MySQL with all its
> > unaligned errors could cause it to crash?  I've stopped the mysql
> > daemon for a while just to see if it stabilizes.  Anyone have any
> > ideas?
> > 
> > It will crash after it's been up for days, and then immediately after 
> > reboot.
> 
> Details about the machine checks would be interesting.
> 
> Unaligned errors in userland are corrected or the application is
> terminated, depending on configuration.
> Only unaligned faults inside the kernel are fatal.
> 
> > I keep thinking hardware, but all the SRM tests pass.
> 
> Either hardware or software is possible, but without further details
> it is hard to say.

-- 
B.Walter                   BWCT                http://www.bwct.de
bernd@bwct.de                                  info@bwct.de



