Date: Fri, 18 Jan 2002 08:39:11 -0600
From: mikea
To: freebsd-stable@FreeBSD.ORG
Subject: Re: random crashes on 4.4-S - ASUS CUSL2-M mobo
Message-ID: <20020118083911.B92349@mikea.ath.cx>
References: <20020117155224.B2190@outreachnetworks.com> <20020117151216.A90572@mikea.ath.cx> <20020117235209.D2293@outreachnetworks.com>
In-Reply-To: <20020117235209.D2293@outreachnetworks.com>; from elh@outreachnetworks.com on Thu, Jan 17, 2002 at 11:52:09PM -0500

On Thu, Jan 17, 2002 at 11:52:09PM -0500, Eric L. Howard wrote:
> Mike Andrews wrote:
> > Temperature? Can you run a temperature/fan-rpm/whatnot monitor,
> > such as healthd?
>
> Yes...if only I could compile...
>
> [root@www etc]# cd /usr/ports/sysutils/healthd/
> [root@www healthd]# make all
> >> healthd-0.6.5.tar.gz doesn't seem to exist in /usr/ports/distfiles/.
> >> Attempting to fetch from http://healthd.thehousleys.net/.
> Receiving healthd-0.6.5.tar.gz (64789 bytes): 100%
> --------8<--snip--------
> updating cache ./config.cache
> creating ./config.status
> creating Makefile
> creating config.h
> *** Signal 11
>
> Stop in /usr/ports/sysutils/healthd.
> *** Error code 1
>
> Stop in /usr/ports/sysutils/healthd.
> *** Error code 1
>
> Stop in /usr/ports/sysutils/healthd.
> [root@www healthd]# ls
> Makefile pkg-comment pkg-plist work
> distinfo pkg-descr touch.core

That's really bad.

> > Totally random w.r.t. system load, or just totally random w.r.t.
> > clock time? Or something else?
>
> Completely random period...there was zero load on the machine when I tried
> to compile the above - the room and cabinet the server sits in are temp
> controlled...and the box soon died after the above attempt.
>
> I've moved everything off the server so I've got a little playing room/time.
>
> I did catch this just before the box died earlier:
>
> vm_page_remove: page not found in hash

There's about one chance in 10^googolplex of this _not_ being a hardware
problem. Do you have a different mobo that you can slide under the CPU(s),
RAM, cards, etc.? Can you put the box out on a table and do a fingertip
temperature check of things during operations? Have you tried dis- and
re-connecting/inserting _EVERYTHING_, on the premise that it might be
an intermittent connection triggered by vibration? Can you swap in a
new power supply?

Got any cooling spray? (A can of compressed air for cleaning things will
work.) If so, try cooling down specific components, one at a time. Do the
same thing with a heat gun or hair dryer. Tap the board gently to try to
isolate bad solder joints.

At this point, it's grab-at-straws time.

And, of course, the serial console idea is a good one.

Keep us posted.

-- 
Mike Andrews
mikea@mikea.ath.cx
Tired old sysadmin since 1964