Date:      Mon, 9 Oct 2006 21:20:12 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-smp@freebsd.org
Cc:        Charles Ulrich <charles@idealso.com>
Subject:   Re: FreeBSD 6.1 Instability
Message-ID:  <200610092120.12570.jhb@freebsd.org>
In-Reply-To: <200610051544.03861.charles@idealso.com>
References:  <200610051544.03861.charles@idealso.com>

On Thursday 05 October 2006 15:44, Charles Ulrich wrote:
> Greetings,
>
> We have been running FreeBSD on our mail servers for about as long as I can
> remember. Recently, we decided to go SMP to handle increased mail load. After
> assembling the hardware, installing the OS and software, and restoring all of
> our data, we noticed in testing that our first machine began hanging
> semi-regularly when it began processing lots of mail. Disabling SMP
> eliminated the hangs completely. We tried it all again on completely
> different hardware with exactly the same result. Our conclusion: something's
> buggy in SMP.
>
> Here are the symptoms. The machine hangs, and becomes completely
> unresponsive. It looks like a deadlock. It will sometimes respond to the
> power button and shut down (without being able to first sync and unmount
> filesystems), and sometimes the power button event gets caught in the
> deadlock. Since it's not actually a crash, there is no core dump or other
> debugging information. In the most recent situation, it hung at different
> points every time I tried to compile ezm3, after successfully compiling other
> packages.
>
> We're system administrators, not kernel hackers, so this is a plea for help. I
> wouldn't know where to start, but I'm hoping someone can point me in the
> right direction. We're also willing to give a (trustworthy) FreeBSD developer
> root access to the test machine since it's just sitting idle right now. If
> you need to crash it, that's fine. We'll have people during normal business
> hours who know how to push a reset button.
>
> Thanks for your time.

Compile a debug kernel and include 'DDB' in the kernel.  When it hangs, break into
the debugger and type 'panic' to have it panic the machine and write out a crash
dump.  Once you have the crash dump, download http://www.FreeBSD.org/~jhb/gdb/gdb6
and do this:

$ kgdb /usr/obj/usr/src/sys/FOO/kernel.debug /var/crash/vmcore.X

(where FOO is your kernel config file and X is the right vmcore file)

Then do this:

(gdb) source /path/to/gdb6
(gdb) ps
...

And reply with the output from the 'ps' command.
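
For the first step, the following is a minimal sketch of the pieces involved,
assuming a 6.x-style kernel config named FOO.  The swap device name is a
placeholder and not taken from the report above, so substitute your own:

```
# Kernel config file FOO: additions for a debug kernel
makeoptions     DEBUG=-g        # build kernel.debug with debugging symbols
options         KDB             # kernel debugger framework
options         DDB             # interactive in-kernel debugger

# /etc/rc.conf: where to write and save the crash dump
dumpdev="/dev/ad0s1b"           # ASSUMPTION: hypothetical swap partition
dumpdir="/var/crash"            # where savecore(8) puts vmcore.X on the next boot
```

On a syscons console, Ctrl+Alt+Esc is the usual key sequence for dropping
into DDB once the machine has hung.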

-- 
John Baldwin


