Date:      Wed, 9 Mar 2005 23:46:20 -0800 (PST)
From:      Doug White <dwhite@gumbysoft.com>
To:        Tony Arcieri <tarcieri@atmos.colostate.edu>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Continued instability with 5.3-STABLE
Message-ID:  <20050309234350.W53915@carver.gumbysoft.com>
In-Reply-To: <20050309184838.GA64546@flash.atmos.colostate.edu>
References:  <20050309184838.GA64546@flash.atmos.colostate.edu>

On Wed, 9 Mar 2005, Tony Arcieri wrote:

> I have a dual Opteron which seems to stay up only approximately two
> weeks at a time then spontaneously reboots.  It's colocated so I can't ever
> see panic messages, and I don't have another system colocated at the same
> place I can use to gather debugging info.

You may want to consider finding a small system with a free serial port to
serve as a temporary serial console.  Without output from the crash it's
impossible to tell what went wrong.
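
For reference, a minimal sketch of what that setup might look like.  The
loader.conf setting and the ttys entry are the standard ones; the device
name /dev/cuaa0 on the monitoring box and the log file name console.log
are assumptions, so adjust them to whichever port and path you really use.
The default console speed of 9600 is assumed on both ends.

```sh
# /boot/loader.conf on the colocated machine: send the kernel console
# (including panic messages) to the first serial port.
console="comconsole"

# /etc/ttys on the same machine (optional): enable a login on that port.
# The stock entry is the same line with "dialup off" instead of "vt100 on".
#   ttyd0  "/usr/libexec/getty std.9600"  vt100  on  secure

# On the small monitoring box, attached with a null-modem cable.
# /dev/cuaa0 is an assumed device name; check which port is actually free.
#   script -a console.log cu -l /dev/cuaa0 -s 9600
```

Running cu under script(1) keeps a log on disk, so a panic that happens
while nobody is watching is still captured.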

> I've never managed to get the system to generate a crash dump either.  It
> has a 1GB swap partition and 2GB of physical RAM but through the last
> few reboots I've been setting hw.physmem to 896M as the only custom parameter
> in loader.conf.  The swap partition is labeled as follows:
>
> twed0s1b  swap         1024MB SWAP
>
> And dumpdev is set in rc.conf as follows:
>
> dumpdev="/dev/twed0s1b"
>
> /var/crash/minfree is set to 2048
>
> Lately I built a kernel from GENERIC using the latest RELENG_5 sources and
> without SMP support and experienced a reboot after approximately 16 days uptime,
> roughly equivalent to how long it took the system to crash with SMP enabled.
> No core file was generated.
>
> The kernel was built using source checked out from RELENG_5 on February 18th.
> I'm not sure if any Opteron specific fixes have been applied to the branch
> since then.

Make sure you're actually running this kernel, since crashdump support for
twe was only added on 2/12, in rev 1.22.2.1 of src/sys/dev/twe/twe.c.
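
A rough sketch of how one might check both points.  dump_fits is a
hypothetical helper, not an existing tool; it only encodes the assumption
that a 5.x crash dump is a full image of physical memory and so has to fit
on the dump device.  The FreeBSD-only part assumes the twe driver is
compiled into the kernel (as in GENERIC) so that its $FreeBSD$ ident
string is present in the booted kernel file.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): succeeds when a full memory
# dump of physmem_mb megabytes fits on a dump device of swap_mb megabytes.
dump_fits() {
    swap_mb=$1
    physmem_mb=$2
    [ "$swap_mb" -ge "$physmem_mb" ]
}

# Figures quoted in the thread: 1024MB swap, hw.physmem=896M, 2GB installed.
dump_fits 1024 896  && echo "896MB capped: dump fits in swap"
dump_fits 1024 2048 || echo "2048MB uncapped: dump too large for swap"

# The remaining checks only make sense on the FreeBSD box itself.
if [ "$(uname -s)" = "FreeBSD" ]; then
    # Build date and ident of the kernel that is running right now.
    uname -v
    # Revision of twe.c in the kernel file that was actually booted.
    ident "$(sysctl -n kern.bootfile)" | grep 'twe\.c'
fi
```

If the revision printed for twe.c is older than 1.22.2.1, the running
kernel predates the dump support regardless of what is in /usr/src.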

> Are there any other means of gathering debugging data that would work in
> my situation?  As is I'm still unsure if my problems are hardware or
> software related as I've still never seen a panic message from the
> system (hardware is a Tyan K8S motherboard in a Tyan Transport system)

You really, really want a serial console.

> Should I look into using KTR ALQ to log KTR data to the swap partition, and
> if it fills up will it wrap over to the beginning?  I've never used that
> feature before...

If you have neither a serial console from which to drive ddb nor working
crashdumps, then there is no way to retrieve the KTR data.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org
