Date:      Fri, 18 Jan 2002 08:39:11 -0600
From:      mikea <mikea@mikea.ath.cx>
To:        freebsd-stable@FreeBSD.ORG
Subject:   Re: random crashes on 4.4-S - ASUS CUSL2-M mobo
Message-ID:  <20020118083911.B92349@mikea.ath.cx>
In-Reply-To: <20020117235209.D2293@outreachnetworks.com>; from elh@outreachnetworks.com on Thu, Jan 17, 2002 at 11:52:09PM -0500
References:  <20020117155224.B2190@outreachnetworks.com> <20020117151216.A90572@mikea.ath.cx> <20020117235209.D2293@outreachnetworks.com>

On Thu, Jan 17, 2002 at 11:52:09PM -0500, Eric L. Howard wrote:
> Mike Andrews wrote: 

> > Temperature? Can you run a temperature/fan-rpm/whatnot monitor,
> > such as healthd?
> 
> Yes...if only I could compile...
> 
> [root@www etc]# cd /usr/ports/sysutils/healthd/
> [root@www healthd]# make all
> >> healthd-0.6.5.tar.gz doesn't seem to exist in /usr/ports/distfiles/.
> >> Attempting to fetch from http://healthd.thehousleys.net/.
> Receiving healthd-0.6.5.tar.gz (64789 bytes): 100%
> --------8<--snip--------
> updating cache ./config.cache
> creating ./config.status
> creating Makefile
> creating config.h
> *** Signal 11
> 
> Stop in /usr/ports/sysutils/healthd.
> *** Error code 1
> 
> Stop in /usr/ports/sysutils/healthd.
> *** Error code 1
> 
> Stop in /usr/ports/sysutils/healthd.
> [root@www healthd]# ls
> Makefile        pkg-comment     pkg-plist       work
> distinfo        pkg-descr       touch.core

That's really bad. 

> > Totally random w.r.t. system load, or just totally random w.r.t.
> > clock time? Or something else? 
> 
> Completely random period...there was zero load on the machine when I tried
> to compile the above - the room and cabinet the server sits in are temp
> controlled...and <sigh> the box soon died after the above attempt.
> 
> I've moved everything off the server so I've got a little playing room/time.
> 
> I did catch this just before the box died earlier:
> 
> vm_page_remove: page not found in hash

There's about one chance in 10^googolplex that this is _not_ a hardware
problem. Do you have a different mobo that you can slide under
the CPU(s), RAM, cards, etc.? Can you put the box out on a table
and do a fingertip temperature check of things during operations?
Have you tried dis- and re-connecting/inserting _EVERYTHING_, on
the premise that it might be an intermittent connection triggered
by vibration? Can you swap in a new power supply? Got any cooling
spray? If yes (a can of compressed air for cleaning things will 
work), try cooling down specific components, one at a time. Do the
same thing with a heat gun or hair dryer. Tap the board gently to
try to isolate bad solder joints. 
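
One cheap check to go with the above, sketched in Python purely as an
illustration: run the same command several times on the otherwise idle
box and compare the outcomes. If identical input sometimes succeeds and
sometimes fails, that points away from the software. The helper names
run_repeatedly and summarize are made up for this example, and it
assumes a Python interpreter is available on the machine.

```python
import subprocess
import sys
from collections import Counter


def run_repeatedly(cmd, runs=5):
    """Run the same command `runs` times and collect its return codes."""
    codes = []
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes.append(proc.returncode)
    return codes


def summarize(codes):
    """Summarize outcomes of repeated runs.

    On POSIX, a negative return code -N means the command itself was
    killed by signal N (11 is SIGSEGV). A wrapper such as make usually
    reports a child's signal as an ordinary nonzero exit code instead,
    so the more useful hint is whether the outcomes differ at all.
    """
    counts = Counter(codes)
    signals = sorted({-c for c in codes if c < 0})
    return {
        "counts": dict(counts),
        "signals": signals,
        "consistent": len(counts) == 1,
    }


# Example (hypothetical choice of command and run count):
#   print(summarize(run_repeatedly(["make", "all"], runs=10)))
```

A mix of passes and failures from the same command would fit the
hardware theory; a failure that happens every time in the same way
would be worth a second look at the software.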

At this point, it's grab-at-straws time. 

And, of course, the serial console idea is a good one. 

Keep us posted.

-- 
Mike Andrews
mikea@mikea.ath.cx
Tired old sysadmin since 1964