Date:      Thu, 7 Oct 1999 10:09:23 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Adrian Penisoara <ady@warpnet.ro>
Cc:        freebsd-current@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Subject:   Re: [Patches avail?] Re: MMAP() in STABLE/CURRENT ... 
Message-ID:  <199910071709.KAA95541@apollo.backplane.com>
References:   <Pine.BSF.4.10.9910071843290.16490-100000@ady.warpnet.ro>

:Hi again,
:
: Whoops: a few hours after downgrading to 3.1-STABLE I had a double fault
:error (strange, it didn't look like a normal panic screen, just the
:message and the content of three registers, then the syncing disks
:message). It seems that I might be wrong about hardware not being the
:problem.
:
: I've changed the motherboard, CPU, memory and the video card and I'm
:waiting to see how much it's going to stay up (I have 1day 1hour uptime so
:far)...
:
: Thanks,
: Ady (@warpnet.ro)

    One thing I do on all 'server' class machines that I buy (and this is
    also something that BEST instituted as policy in 1998) is to only buy
    motherboards with ECC support and only buy ECC memory to go along with
    that support.  If you are using a non-ECC motherboard or non-ECC memory
    I would heartily recommend that you adopt the same policy.  Not that your
    problem is necessarily memory related, but I've found that memory-related
    problems account for at least 80% of the 'difficult to locate' hardware 
    problems that normally occur with PC technology.

    ECC protects you not only against hardware faults but also against
    re-marked dynamic RAM chips and processors: it catches the timing
    errors that usually occur with such chips relatively soon after
    purchase rather than weeks or months down the line.  Being the
    commodity it is, memory is the most likely item on the motherboard
    to be out of spec.

    Intel's ECC implementation is not perfect (1), but it's good enough to 
    catch these sorts of problems.

    note 1: Intel doesn't implement memory scrubbing properly outside of the
    Xeon line and FreeBSD does not scrub memory either.  Scrubbing is a
    method of preventing bit errors from building up in memory by regenerating
    the ECC bits with a memory read followed by a memory write of the same
    data.  Outside of the Xeon chipsets the OS must issue a read followed by
    a write.  With the Xeon chipsets the OS need only issue a read and hardware
    will automatically rewrite a correction if it finds a bit error.  This
    information is 6 months old so the situation may have changed.
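
    The read-followed-by-write scrub described in note 1 can be sketched
    roughly as below.  This is only an illustration under stated
    assumptions: scrub_region() is a hypothetical name, not a FreeBSD or
    chipset interface, and a real scrubber would run in the kernel over
    physical memory, would have to bypass or flush the CPU caches so the
    accesses actually reach DRAM, and would have to avoid racing with
    other writers to the same words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing FreeBSD API): software-scrub a
 * region by reading every word and writing the same value back.
 * Returns the number of words touched.
 */
static size_t
scrub_region(volatile uint64_t *base, size_t nwords)
{
	size_t i;

	assert(base != NULL);
	for (i = 0; i < nwords; i++) {
		/* Read: ECC hardware can correct a single-bit error here. */
		uint64_t v = base[i];

		/* Write: stores the same data, so the ECC bits are regenerated. */
		base[i] = v;
	}
	return (i);
}
```

    The volatile qualifier keeps the compiler from discarding the
    apparently redundant load/store pair.  On hardware that rewrites the
    correction by itself, the store would be unnecessary and the read
    alone would do.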

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

