Date: Thu, 7 Feb 2008 17:21:21 +0100
From: Bernd Walter
To: ticso@cicely.de, freebsd-alpha@freebsd.org
Subject: Re: DS10L - "processor correctable error"
Message-ID: <20080207162120.GG24583@cicely12.cicely.de>
In-Reply-To: <20080207154024.GA9605@mech-aslap33.men.bris.ac.uk>
References: <20080206121738.GA91825@mech-aslap33.men.bris.ac.uk> <20080207145311.GF24583@cicely12.cicely.de> <20080207154024.GA9605@mech-aslap33.men.bris.ac.uk>
List-Id: Porting FreeBSD to the Alpha

On Thu, Feb 07, 2008 at 03:40:24PM +0000, Anton Shterenlikht wrote:
> On Thu, Feb 07, 2008 at 03:53:12PM +0100, Bernd Walter wrote:
> > On Wed, Feb 06, 2008 at 12:17:38PM +0000, Anton Shterenlikht wrote:
> > >
> > > "Warning: received processor correctable error."
> > >
> > > What is the meaning of this warning? Something wrong with hardware?
> >
> > This is an ECC memory correction.
> > It is OK to see it once in a while, since even 100% working DRAM has
> > failures from time to time (called the soft error rate) - therefore the
> > need to have ECC in important systems.
> > If however you see a lot of them it is time to replace the faulty
> > memory.
>
> Bernd, thank you.
> Can I know which DIMM (DS10L has 2 DIMMs) is faulty?

Unfortunately not.
IIRC Tru64 and VMS have support for this, but we never had enough
information to handle it, and it is board specific as well.

> If I run SRM memexer I get:
>
> >>>show_status
> ID       Program      Device       Pass   Hard/Soft Bytes Written Bytes Read
> -------- ------------ ------------ ------ --------- ------------- -------------
> 00000001 idle         system            0    0    0             0             0
> 000003ab memtest      memory            6    0    0    5586812928    5586812928
> >>>
> Processor correctable error through vector 630.
>
> Machine Check Logout Frame @ 0x6000 Code = 0x86
>
> Alpha 21264 IPRs (CPU 0):
> I_STAT: 0000000000000000 DC_STAT: 000000000000000C
> C_ADDR: 00000000296287C0 DC1_SYNDROME: 0000000000000000
> DC0_SYNDROME: 000000000000008F C_STAT: 0000000000000003
> C_STS: 000000000000000A MM_STAT: 0000000000000000
>
> >>>
>
> The message appears approx. once every other pass.
> The address is always the same.

Don't worry too much about this.
Alphas use the memory in pairs and can correct multiple faulty
bits in a single data word.
However, you could try removing and reseating the modules, since a
contact can go bad after that many years.

-- 
B.Walter http://www.bwct.de http://www.fizon.de
bernd@bwct.de info@bwct.de support@fizon.de
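
Editorial sketch (not part of the original message): to check the
"address is always the same" observation across many memexer passes, the
logout frames captured from the serial console could be tallied by C_ADDR.
This is only a rough illustration. It assumes each frame is saved as plain
text in the layout quoted above; the function names and the regular
expression are hypothetical and not part of SRM or FreeBSD. A recurring
address is usually read as one weak location or contact rather than random
soft errors, but as noted above it does not identify the DIMM.

```python
import re
from collections import Counter

# Assumed format: "NAME: <16 hex digits>" pairs, as in the quoted SRM output.
IPR_PATTERN = re.compile(r"([A-Z0-9_]+):\s*([0-9A-Fa-f]{16})")

# The logout frame quoted in this thread, used as sample input.
SAMPLE_FRAME = """\
Machine Check Logout Frame @ 0x6000 Code = 0x86

Alpha 21264 IPRs (CPU 0):
I_STAT: 0000000000000000 DC_STAT: 000000000000000C
C_ADDR: 00000000296287C0 DC1_SYNDROME: 0000000000000000
DC0_SYNDROME: 000000000000008F C_STAT: 0000000000000003
C_STS: 000000000000000A MM_STAT: 0000000000000000
"""


def parse_logout_frame(text):
    """Return a dict mapping each register name to its integer value."""
    return {name: int(value, 16) for name, value in IPR_PATTERN.findall(text)}


def tally_error_addresses(frames):
    """Count how often each C_ADDR value occurs across captured frames."""
    counts = Counter()
    for frame in frames:
        registers = parse_logout_frame(frame)
        if "C_ADDR" in registers:
            counts[registers["C_ADDR"]] += 1
    return counts


if __name__ == "__main__":
    # Pretend the same frame was captured on three separate passes.
    for address, count in tally_error_addresses([SAMPLE_FRAME] * 3).items():
        print(f"C_ADDR {address:#018x} seen {count} time(s)")
```

If the tally shows a single address repeating, reseating or swapping the
modules and rerunning the test is a cheap way to see whether the address
follows the hardware.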