Message-ID: <49804F0C.3000400@icyb.net.ua>
Date: Wed, 28 Jan 2009 14:26:52 +0200
From: Andriy Gapon
To: FreeBSD Stable, freebsd-hardware@freebsd.org
In-Reply-To: <497AF4C7.3080309@icyb.net.ua>
Subject: problem with "cold" hardware? [Was: panic in callout_reset: bad link in callwheel]

on 24/01/2009 13:00 Andriy Gapon said the following:
[snip]
> Additional info:
> I recently added some new memory to this system.
> The memory survived several passes of memtest86 before booting to
> FreeBSD. It also survived one pass after the incident.
> Still I wouldn't exclude a possibility of it being bad.

I think I have established that the crash was caused by a hardware issue. I had another panic at a different place but with similar diagnostics: a bad pointer passed to a call.
Fortunately, the second time the pointer was to a well-known long-lived object, so I was able to compare the bad pointer to the actual address. It turned out that a single bit was flipped.

Then I realized that in both cases I saw panics after "very cold" boots, i.e. the system had been powered down for more than 1 hour before the boot. So I ran memtest86 again, this time also after a long power-off, and it reported lots of errors. I restarted memtest86 10 minutes later and then it could not find any errors in any tests.

I had previously heard about problems with hardware running hot, but not with it being "cold". I put the word in quotes because the system is in a room at normal room temperature.

Any guesses as to which hardware part might be acting up like this?

-- 
Andriy Gapon
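
For anyone who wants to repeat the pointer comparison described above, here is a minimal sketch in C. It is not the procedure actually used on the crashed system. The function name `single_flipped_bit` is hypothetical, and the addresses in the usage comment are made-up illustration values, not the ones from the panic. It XORs the expected and observed values and reports the bit position only when exactly one bit differs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compare an expected address with the observed (bad) one.
 * Returns the index of the differing bit (0 = least significant)
 * when exactly one bit differs, or -1 when the values are equal
 * or differ in more than one bit.
 *
 * Hypothetical usage (illustrative values only):
 *   single_flipped_bit(0xc0a81234, 0xc0a81a34) yields 11
 */
int
single_flipped_bit(uint64_t expected, uint64_t observed)
{
	uint64_t diff = expected ^ observed;	/* set bits mark the differences */
	int index = 0;

	/* diff & (diff - 1) clears the lowest set bit; nonzero means 2+ bits set */
	if (diff == 0 || (diff & (diff - 1)) != 0)
		return (-1);

	/* Shift right until the single set bit reaches position 0 */
	while ((diff & 1) == 0) {
		diff >>= 1;
		index++;
	}
	return (index);
}
```

A result of -1 for two unequal values would point away from a simple single-bit memory error and toward some other kind of corruption.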