From owner-freebsd-stable@FreeBSD.ORG Mon Jun 26 22:40:04 2006
Message-ID: <44A06233.1090704@hirsch.it>
Date: Tue, 27 Jun 2006 00:39:47 +0200
From: "M.Hirsch"
To: Dmitry Pryanishnikov
Cc: freebsd-stable@freebsd.org
Subject: Re: FreeBSD 6.x CVSUP today crashes with zero load ...
References: <20060626100949.G24406@fledge.watson.org>
	<20060626081029.L1114@ganymede.hub.org>
	<20060626140333.M38418@fledge.watson.org>
	<20060626235355.Q95667@atlantis.atlantis.dp.ua>
	<44A04FD2.1030001@hirsch.it>
	<20060627011512.N95667@atlantis.atlantis.dp.ua>
In-Reply-To: <20060627011512.N95667@atlantis.atlantis.dp.ua>
List-Id: Production branch of FreeBSD source code

Dmitry Pryanishnikov wrote:

> Hello!
>
> On Mon, 26 Jun 2006, M.Hirsch wrote:
>
>> ECC is a way to mask broken hardware. I rather have my hardware fail
>> directly when it does first, so I can replace it _immediately_
>
> You got it backwards. If your data has any value to you, then you don't
> want to miss any single-error bit in it, do you? If you're running
> hardware w/o ECC, your single-bit error in your data will go to the disk
> unnoticed, and you'll lose your data. With ECC, hardware will correct it.
> In (rare) case of multiple-bit error ECC logic will generate NMI for you,
> so you'll notice and "replace it _immediately_" instead of two weeks
> ago when your archive wont extract.

Nope, I am right on track. I do not want to lose any data.
So I'd prefer an ECC error to raise a panic so I can replace the hardware ASAP.
Don't get me wrong, but tracking bugs in FreeBSD is quite a bit more effort than "just" acquiring a new box...

>> What's your hardware good for if it passes a "test", but fails in
>> production?
>
> It's the way in what RAM will manifest single-bit errors: you run
> memory test - it won't catch them, later in production you'll miss
> this error because nothing will provide extra sanity check of your data.

Ok... Does the standard fs, UFS2, do "extra sanity checks", then?

M.
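
The behaviour described in the quoted text (a single-bit error is corrected silently, a multiple-bit error is only detected and reported) can be illustrated with a toy SECDED code. This is a minimal sketch, not how any particular memory controller works: it protects 4 data bits with a Hamming(7,4) code plus one overall parity bit, whereas real ECC memory typically protects 64 data bits with 8 check bits. The names `encode` and `decode` are hypothetical, and the "uncorrectable" result only stands in for the point where hardware would raise an NMI.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    word = [0] * 8                    # index 0: overall parity; 1..7: Hamming(7,4)
    word[3], word[5], word[6], word[7] = d
    word[1] = word[3] ^ word[5] ^ word[7]      # parity over positions 1,3,5,7
    word[2] = word[3] ^ word[6] ^ word[7]      # parity over positions 2,3,6,7
    word[4] = word[5] ^ word[6] ^ word[7]      # parity over positions 4,5,6,7
    word[0] = sum(word[1:]) % 2                # makes the total number of 1s even
    return word


def decode(word):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos           # XOR of the positions of all set bits
    parity_odd = sum(word) % 2 == 1

    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Odd number of flips: assume exactly one and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        # Real hardware would report this (e.g. via an NMI) rather than guess.
        return "uncorrectable", None

    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return status, nibble
```

Flipping any one bit of `encode(n)` still decodes to `n` with status `corrected`; flipping any two bits yields `uncorrectable`. Whether a corrected error should also be logged or escalated, which is the point under debate in this thread, is a policy decision outside the code itself.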