Date:      Wed, 22 Oct 2008 15:31:51 +0200
From:      Willem Jan Withagen <wjw@withagen.nl>
Cc:        current@freebsd.org
Subject:   Re: SMP opteron system freezes
Message-ID:  <48FF2B47.7010804@withagen.nl>
In-Reply-To: <48FCA4E4.4070508@withagen.nl>
References:  <48F90FC1.3040503@digiware.nl>	<20081018002133.GA36113@troutmask.apl.washington.edu>	<48FBB431.4090102@digiware.nl> <48FCA4E4.4070508@withagen.nl>

Willem Jan Withagen wrote:
 > Willem Jan Withagen wrote:
 >> Steve Kargl wrote:
 >>> On Sat, Oct 18, 2008 at 12:20:49AM +0200, Willem Jan Withagen wrote:
 >>>> I'm sort of assuming that the bge0: timeouts and coalesced links are due to the freezing.
 >>>
 >>> Does the following help?
 >>
 >> Just a little...
 >> It now takes a little longer for the system to freeze, but eventually it will.
 >> The coalesced messages did not return.
 >>
 >> Just out of curiosity I also plugged in an fxp card.
 >> And there it takes even longer for the system to freeze, but in the end it does freeze.
 >>
 >> The "funny" part is that once in a while it is revivable by going into the kernel debugger and then just continuing.
 >> Sometimes a long wait (10 sec) will suffice, during which there is no keyboard response whatsoever.
 >> But in other instances the system is dead in the water, and only a hardware reset gets it back.
 >>
 >> Something I'm still wondering is whether this happens only with NFS traffic, or with all other types of network traffic. But I haven't tested this.
 >
 > Well I tested something different.
 >
 > This is an (older) dual Opteron 244 system. So each chip has only one core.
 > And I removed one of the processors...
 >
 > Guess what:
 >     It just runs without any problems as far as I could test.
 >
 > With 2 processors it is enough to let init start all the NFS-related stuff in /etc/rc.d to lock up the system.
 >
 > So I guess we need to look at totally different things.
 > Given enough time, I'll check and see whether 7.x runs without trouble.
 >
 > If somebody thinks this thread should go to amd64, just say so.
 > But I am running the i386 stuff.

Tested 7.1-PRERELEASE, and that seems to run with mount problems.
So my guess is that either something in my hardware is really weirdly
broken, or there is some other problem that is really bothering me.

So I'm in the process of getting the serial console working to capture 
some of the traceback and stuff.

People wanting to compare dmesg.8 and dmesg.7, have a look at
www.tegenbosch28.nl:/FreeBSD/Toy

--WjW



