Date:      Thu, 11 Aug 2011 02:28:58 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        Attilio Rao <attilio@freebsd.org>, freebsd-stable@freebsd.org, Andriy Gapon <avg@freebsd.org>
Subject:   Re: debugging frequent kernel panics on 8.2-RELEASE
Message-ID:  <20110811092858.GA94514@icarus.home.lan>
In-Reply-To: <44DD20E1CFA949E8A1B15B3847769DCB@multiplay.co.uk>
References:  <47F0D04ADF034695BC8B0AC166553371@multiplay.co.uk> <A71C3ACF01EC4D36871E49805C1A5321@multiplay.co.uk> <4E4380C0.7070908@FreeBSD.org> <CAJ-FndAq2ASHzg_%2B9S__x=vTAgzHowMrv1DFSbXwroX27PF36A@mail.gmail.com> <44DD20E1CFA949E8A1B15B3847769DCB@multiplay.co.uk>

On Thu, Aug 11, 2011 at 09:59:36AM +0100, Steven Hartland wrote:
> That's not the issue as its happening across board over 130 machines :(

Agreed, bad hardware sounds unlikely here.  I could believe some strange
incompatibility (e.g. a BIOS quirk or the like[1]) causing problems en
masse across many servers, but not failing hardware on that scale.

[1]: I mention this because we had something similar happen at my
workplace.  For months we used a specific model of system from our
vendor which worked reliably, zero issues.  Then we got a new shipment
of boxes (same model as prior) which started acting very odd (often AHCI
timeout issues or MCEs which when decoded would usually turn out to be
nonsensical).  It took weeks to determine the cause given how slow the
vendor was to respond: root cause turned out to be that the vendor
decided, on a whim, to start shipping a newer BIOS version which wasn't
"as compatible" with Solaris as previous BIOSes.  Downgrading all the
systems to the older BIOS fixed the problem.

In Steve's case this is unlikely to be the situation, but I thought I'd
share the story anyway.  "SKU ABCXYZ-1" from August 2009 is not
necessarily the same thing as "SKU ABCXYZ-1" from May 2010.  ;-)  This
is also why I prefer to buy/build my own systems, since I cannot trust
vendors to not mess about with settings w/out changing SKUs, P/Ns, or
revision numbers.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
