Date:      Wed, 28 Jan 2009 14:26:52 +0200
From:      Andriy Gapon <avg@icyb.net.ua>
To:        FreeBSD Stable <freebsd-stable@freebsd.org>, freebsd-hardware@freebsd.org
Subject:   problem with "cold" hardware? [Was: panic in callout_reset: bad link in callwheel]
Message-ID:  <49804F0C.3000400@icyb.net.ua>
In-Reply-To: <497AF4C7.3080309@icyb.net.ua>
References:  <497AF4C7.3080309@icyb.net.ua>

on 24/01/2009 13:00 Andriy Gapon said the following:
[snip]
> Additional info:
> I recently added some new memory to this system.
> The memory survived several passes of memtest86 before booting to
> FreeBSD. It also survived one pass after the incident.
> Still I wouldn't exclude a possibility of it being bad.

I think I have established that the crash was caused by a hardware issue.
I had another panic in a different place but with similar
diagnostics: a bad pointer passed to a call. Fortunately, the second time
the pointer was to a well-known long-lived object. So I was able to
compare the bad pointer to an actual address. It turned out that a
single bit was flipped.
Then I realized that in both cases I saw panics after "very cold" boots,
i.e. the system was powered down for more than 1 hour before the boot.
So I ran memtest86 again, this time also after a long
power-off, and it reported lots of errors.
I restarted memtest86 10 minutes later and then it found no errors
in any of the tests.
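
To illustrate the pointer comparison described above, here is a minimal
sketch. The addresses are made up for the example (the actual values are
not given in this message), and the helper names are hypothetical. XOR-ing
the expected address with the bad pointer leaves only the differing bits
set; a single-bit flip leaves exactly one.

```python
def flipped_bits(expected, observed):
    """Return positions (0 = least significant) of the bits that differ."""
    diff = expected ^ observed
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def is_single_bit_flip(expected, observed):
    """True if the two values differ in exactly one bit."""
    diff = expected ^ observed
    # A non-zero value with exactly one bit set is a power of two,
    # so clearing its lowest set bit yields zero.
    return diff != 0 and (diff & (diff - 1)) == 0


# Hypothetical values: a known object address and a corrupted copy of it.
expected = 0xC0A1B2C0
observed = expected ^ (1 << 21)  # simulate one flipped bit

print(hex(expected), hex(observed), flipped_bits(expected, observed))
```

If the two values differ in exactly one bit, corrupted memory is a more
plausible explanation than a software bug, which would more typically
produce an unrelated or NULL pointer.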

I had previously heard about problems with hardware running hot, but not
with it being "cold". I put the word in quotes, because the system is in
a room with normal room temperature.

Any guesses what hardware part might be acting up like this?


-- 
Andriy Gapon
