Date: Sat, 8 Apr 2000 00:42:36 +0200
From: Wilko Bulte <wkb@chello.nl>
To: Brooks Davis <brooks@one-eyed-alien.net>
Cc: Warner Losh <imp@village.org>, Bob.Gorichanaz@midata.com, hackers@FreeBSD.ORG
Subject: Re: bad memory patch?
Message-ID: <20000408004236.A29300@yedi.wbnet>
In-Reply-To: <20000407151907.A1185@orion.ac.hmc.edu>; from brooks@one-eyed-alien.net on Fri, Apr 07, 2000 at 03:19:07PM -0700
References: <OF2F5C4FC5.C68B571C-ON862568BA.0045E942@midata.com> <200004072204.QAA02457@harmony.village.org> <20000407151907.A1185@orion.ac.hmc.edu>
On Fri, Apr 07, 2000 at 03:19:07PM -0700, Brooks Davis wrote:
> On Fri, Apr 07, 2000 at 04:04:23PM -0600, Warner Losh wrote:
> > In message <OF2F5C4FC5.C68B571C-ON862568BA.0045E942@midata.com> Bob.Gorichanaz@midata.com writes:
> > : Maybe I'm misunderstanding something, but isn't this situation
> > : analogous to bad sectors on a hard drive? Isn't this similar, at
> > : least in theory, to remapping dead sectors and continuing to use the
> > : drive? (except that the disk's onboard controller handles the
> > : mapping instead of the OS)
> >
> > It is not analogous to the bad sectors on the hard drive. First, it
> > is not always possible to detect a bad memory cell. In today's world,
> > these cells are often bad only some of the time. They work unless
> > pushed really hard in strange patterns. They are just barely outside
> > of spec, and usually work. This makes their detection hard.
>
> This can be truly evil. For instance, I was at a Myricom BOF at SC99
> and they said they had shipped a batch of cards (which they were
> replacing at their expense) that had bad static RAM chips with one bit
> (the exact same one on most of them) which would sometimes flip under
> just the right stress. I believe they finally built a test case that
> could trigger the error within a couple of days, knowing exactly where it
> was and having some idea what caused it.
>
> The key to remember with memory is that DRAM is not the nice little
> digital gate we like to think it is. It's a big ugly analog mess
> and has all sorts of boundary conditions an ideal digital system
> wouldn't have.

Right. In a former life I was part of a team that spent a couple of
months tracking down mysterious DRAM errors. In our case we had parity
checking on the machine. In the end our dear memory vendor said: "Well,
you know, we might have found it. We had some mask alignment problems in
manufacturing". Until then they had always denied it was a chip problem.
By then we knew that already: week code 37 from Hitachi was crap.
Hitachi DRAM still gives me a weird feeling when I see it ;-)

-- 
Wilko Bulte    Powered by FreeBSD    http://www.freebsd.org
http://www.tcja.nl

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message