From owner-freebsd-hardware@FreeBSD.ORG Thu Nov 12 12:59:05 2009
Return-Path:
Delivered-To: hardware@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2B7161065672
	for ; Thu, 12 Nov 2009 12:59:05 +0000 (UTC)
	(envelope-from david@catwhisker.org)
Received: from albert.catwhisker.org
	(adsl-63-193-123-122.dsl.snfc21.pacbell.net [63.193.123.122])
	by mx1.freebsd.org (Postfix) with ESMTP id E82598FC1E
	for ; Thu, 12 Nov 2009 12:59:04 +0000 (UTC)
Received: from albert.catwhisker.org (localhost [127.0.0.1])
	by albert.catwhisker.org (8.14.3/8.14.3) with ESMTP id nACCx4bA002328;
	Thu, 12 Nov 2009 04:59:04 -0800 (PST)
	(envelope-from david@albert.catwhisker.org)
Received: (from david@localhost)
	by albert.catwhisker.org (8.14.3/8.14.3/Submit) id nACCx3at002327;
	Thu, 12 Nov 2009 04:59:03 -0800 (PST) (envelope-from david)
Date: Thu, 12 Nov 2009 04:59:03 -0800
From: David Wolfskill
To: Peter Jeremy
Message-ID: <20091112125903.GA1631@albert.catwhisker.org>
References: <20091111173747.GA1150@albert.catwhisker.org>
	<20091112062708.GA16648@server.vk2pj.dyndns.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="jI8keyz6grp/JLjh"
Content-Disposition: inline
In-Reply-To: <20091112062708.GA16648@server.vk2pj.dyndns.org>
User-Agent: Mutt/1.4.2.3i
Cc: hardware@freebsd.org
Subject: Re: 7.2-STABLE i386 box crashing -- clues?
X-BeenThere: freebsd-hardware@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: hardware@freebsd.org, David Wolfskill
List-Id: General discussion of FreeBSD hardware
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 12 Nov 2009 12:59:05 -0000

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 12, 2009 at 05:27:09PM +1100, Peter Jeremy wrote:
> I can't offer any solutions but I have some more questions...

I appreciate the help!

> ...
> >Every once in a while, it just crashes -- hard.  It loses video output
> >at that point; Ctl+Alt+Esc doesn't appear to change anything; entering
> >(say) "reset" blindly at that point has no apparent effect.
>
> Roughly how often?

For the current month:

albert(7.2-S)[8] last reboot shutdown
reboot    ~   Thu Nov 12 03:04
reboot    ~   Wed Nov 11 20:06
reboot    ~   Wed Nov 11 14:42
shutdown  ~   Wed Nov 11 14:40
reboot    ~   Wed Nov 11 14:35
reboot    ~   Wed Nov 11 10:05
reboot    ~   Wed Nov 11 09:09
reboot    ~   Wed Nov 11 04:25
reboot    ~   Tue Nov 10 12:49
reboot    ~   Mon Nov  9 14:52
reboot    ~   Sun Nov  8 17:42
reboot    ~   Sat Nov  7 04:22
reboot    ~   Fri Nov  6 21:43
reboot    ~   Fri Nov  6 19:00
reboot    ~   Fri Nov  6 16:20
shutdown  ~   Fri Nov  6 16:17
reboot    ~   Fri Nov  6 16:03
reboot    ~   Fri Nov  6 13:07
reboot    ~   Fri Nov  6 09:46
reboot    ~   Thu Nov  5 16:41
reboot    ~   Thu Nov  5 13:32
reboot    ~   Thu Nov  5 12:59
reboot    ~   Thu Nov  5 10:17
reboot    ~   Thu Nov  5 04:26
reboot    ~   Wed Nov  4 20:32
reboot    ~   Wed Nov  4 15:48
reboot    ~   Wed Nov  4 10:37
reboot    ~   Tue Nov  3 13:15
reboot    ~   Tue Nov  3 10:55
reboot    ~   Tue Nov  3 04:16
reboot    ~   Mon Nov  2 18:13
reboot    ~   Sun Nov  1 20:03
shutdown  ~   Sun Nov  1 20:01
reboot    ~   Sun Nov  1 17:10
reboot    ~   Sun Nov  1 13:51
shutdown  ~   Sun Nov  1 13:48

wtmp begins Sun Nov  1 05:08:18 PST 2009
albert(7.2-S)[9]

The "solo reboots" are crashes; those paired with "shutdown" entries are
controlled.

> Has anything unusual happened lately?
> Brownout, blackout, power surge, lightning, heatwave, ...

Nothing linked to the crashes.  I pulled the UPS out of service some
weeks ago because it needs new batteries; I need to get those ordered.
But the crashes were happening before that, in any case.

> >accordingly, had attached a SCSI host adaptor via PCI riser card.  Since
> >I had nothing actually connected to the card, I pulled it out of the
> >machine before bringing it back up.
>
> Did you also pull the riser card?  Riser cards don't have a spectacularly
> high reputation.

That's actually what I pulled.  The SCSI card itself is still physically
in the chassis, merely with an air gap between itself and the system
board (because the riser card is now in a closet).

> > (I also felt around for
> >excessively warm spots; nothing.  All fans spin up, as well.)
>
> I don't suppose you also studied the capacitors on the motherboard.
> Are any showing any signs of bulges?

I'll take another look for those; I recall that electrolytics exhibit
that as a sign of failure -- thanks for the reminder.

> Have you tried reseating everything?

The memory, yeah (even before replacing it); also swapped the DIMMs.
The only other thing that can be re-seated (desktop system board, so most
everything is built-in) would be the CPU, and I'm not quite sure how
that heat sink works.

I did re-seat some power connectors.

> >Flaky CPU?  Flaky power supply?  How might I tell?
>
> CPU shouldn't go flaky unless it's been overheated.  In my experience,
> PSUs are the least reliable part of consumer-grade hardware but about
> the only way to check is to swap it.

:-}

> If you've got a DMM, you could check all the rails but there are
> lots of failure modes that won't show up that way.

Yeah, I kinda figured that.  I do have a DMM (used to have a VTVM), but
figured the meter wouldn't show transient dips or whatever too well.

> Have you checked the voltage/temperature screen in the BIOS?  Does
> anything look abnormal?
Did a couple of reality checks that way as detours during some of the
reboots.  Nothing interesting there at all.  (And I have seen a case in
the past, though with a 1U box, where that test definitely showed
something wrong: CPU temp climbing about 1C every 30 seconds, IIRC.)

> Are you using a PS/2 or USB keyboard?

PS/2 via KVM.  I don't have any USB keyboards. :-}

> Are you running X?

Yes; the machine is configured to start xdm on transition to multi-user,
as my spouse used to use it as a desktop.  (She's gone back to using
its predecessor, a 4.11-STABLE machine, in frustration.)

> At this stage, my suggestion would be to try swapping the PSU.

Thanks.  I'll discuss it with the "family CFO."

Peace,
david
--
David H. Wolfskill				david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

--jI8keyz6grp/JLjh
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.13 (FreeBSD)

iEYEARECAAYFAkr8BpUACgkQmprOCmdXAD0yeQCfZmK6zwOTfDdQ2TIdjf9Df8QU
G1MAnR81BXl85TGJIbjQ21LZqBHoFOin
=QGTk
-----END PGP SIGNATURE-----

--jI8keyz6grp/JLjh--