Date:      Sat, 29 Nov 2003 08:40:17 -0800 (PST)
From:      Don Bowman <don@sandvine.com>
To:        freebsd-bugs@FreeBSD.org
Subject:   RE: kern/59719 Re: 4.9 Stable Crashes on SuperMicro with SMP
Message-ID:  <200311291640.hATGeHpe068807@freefall.freebsd.org>

The following reply was made to PR kern/59719; it has been noted by GNATS.

From: Don Bowman <don@sandvine.com>
To: 'Uwe Doering' <gemini@geminix.org>,
	freebsd-gnats-submit@FreeBSD.org
Cc: freebsd-bugs@freebsd.org, freebsd-stable@freebsd.org
Subject: RE: kern/59719 Re: 4.9 Stable Crashes on SuperMicro with SMP
Date: Sat, 29 Nov 2003 11:33:58 -0500

 From: Uwe Doering [mailto:gemini@geminix.org]
 > Jonathan Gilpin wrote:
 > > I've run memtest (memtest86.com) kindly provided by Don and it passed all
 > > the tests. I've installed a kernel module to test for memory
 > > errors and found that again no memory errors are found... So this means it's
 > > either a problem with the CPUs or a genuine bug in the kernel. (bugger!)
 > 
 > No, that's unfortunately not what it means.  If a memory test fails you
 > can draw the conclusion that you have bad memory, but this doesn't work
 > the other way round.  If a memory test passes there is still a
 > possibility that a memory chip is the culprit since memory test software
 > cannot find all errors.
 > 
 > Also, there is the chip set on the mainboard that coordinates bus access
 > etc. for the two CPUs.  Mainboard and chip set developers are known to
 > make errors, too.  In this case you would have to swap the entire
 > mainboard, possibly with one from a different manufacturer.  I can tell
 > you from my own experience that it is really hard to find reliable PC
 > hardware these days, in light of ever shorter and faster product release
 > cycles.
 
 I have several hundred of the motherboards the poster is using,
 and they work reliably in MP operation with 4.X.
 The memtest86 that I sent him understands the ECC registers
 on the E7501 MCH, so it should find all correctable and uncorrectable
 errors.
 
 --don


