Date:      Thu, 27 Feb 2003 11:08:52 -0400 (AST)
From:      "Marc G. Fournier" <scrappy@hub.org>
To:        David Schultz <das@FreeBSD.ORG>
Cc:        freebsd-stable@FreeBSD.ORG
Subject:   Re: 4.8-PRERELEASE 'hangs' nightly like clockwork ...
Message-ID:  <20030227110726.J17399@hub.org>
In-Reply-To: <20030226060854.GA6637@HAL9000.homeunix.com>
References:  <20030225125414.P90059@hub.org> <20030226060854.GA6637@HAL9000.homeunix.com>

On Tue, 25 Feb 2003, David Schultz wrote:

> Thus spake Marc G. Fournier <scrappy@hub.org>:
> > For the past few nights, since I "fixed" the KVA_PAGES issue, the server
> > seems to be hanging almost like clockwork ... plus or minus a bit, but is
> > around 23hrs or so since the last hang (or, around 9pm CST, not sure which
> > one is the 'trigger') ...
> >
> > top, from last night, shows:
> >
> > last pid: 44187;  load averages:  0.29, 11.36, 19.195   up 1+00:11:55  22:04:00
> > 3173 processes:1 running, 3150 sleeping, 22 zombie
> > CPU states:  0.0% user,  0.0% nice,  8.6% system,  0.6% interrupt, 90.8% idle
> > Mem: 2335M Active, 426M Inact, 595M Wired, 205M Cache, 199M Buf, 5860K Free
> > Swap: 2048M Total, 495M Used, 1553M Free, 24% Inuse
> >
> > now, I got the folks down at Rackspace to do a ctl-alt-esc and 'panic',
> > and it dumps core, if that helps any ... a gdb on the core file just tells
> > me that a panic was issued from the keyboard ... the top session above
> > continued to run up until they issued the ctl-alt-esc, as does a ping to
> > the server, so it looks like those processes resident in memory do continue
> > to run ...
>
> It sounds like processes are blocking forever on I/O.  Once you
> have a crash dump, you can run ps(1) on the image to see what
> state processes were in when the dump was taken.  I think you want
> something like
> 	ps -alxww -M/path/to/core -N/path/to/kernel
> If you notice a bunch of them stuck in a suspicious state, load
> the dump into kgdb and type

'K, first question is ... what would I consider a "suspicious state":

jupiter# awk '{print $9}' ps.1 | sort | uniq -c
 978 -
   1 FFS
   1 WCHAN
 239 accept
 324 ffsvgt
 382 inode
 558 lockf
   4 nfsd
  26 pause
 236 piperd
   1 pipewr
   3 poll
   1 ppwait
   1 psleep
  97 sbwait
  32 select
  14 ttyin
 283 wait
jupiter# wc -l ps.1
    3181 ps.1
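
For anyone who wants the same tally sorted by count, here is a minimal sketch in Python. It assumes, as the awk one-liner above does, that the wait channel (WCHAN) is the ninth whitespace-separated field of each line of saved ps output. The function name tally_wchan and the sample rows are hypothetical, made up for illustration and not taken from the real dump. Unlike awk, it skips lines with fewer than nine fields instead of counting an empty value.

```python
from collections import Counter


def tally_wchan(lines, column=9):
    """Count the values found in a 1-based whitespace-separated column.

    Lines with fewer than `column` fields are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= column:
            counts[fields[column - 1]] += 1
    return counts


# Hypothetical sample rows in "ps -alx" layout; not from the real ps.1.
sample = [
    "  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND",
    "    0     1     0   0  10  0   552  128 wait   ILs   ??    0:00.05 /sbin/init",
    "   80   312     1   0  -6  0  1204  640 inode  D     ??    0:01.20 httpd",
    "   80   313     1   0  -6  0  1204  644 inode  D     ??    0:01.11 httpd",
    "   70   400     1   0  -6  0  3020  900 lockf  I     ??    0:00.40 postgres",
]

# Print the most common wait channels first, like "sort | uniq -c | sort -rn".
for wchan, n in tally_wchan(sample).most_common():
    print(f"{n:5d} {wchan}")
```

To run it against the real file, pass an open file handle instead of the sample list, e.g. tally_wchan(open("ps.1")). As with the awk version, the header line is counted too, which is where the single WCHAN entry in the output above comes from.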

