Date:      Sat, 17 Feb 1996 00:38:01 -0600 (CST)
From:      Joe Greco <jgreco@brasil.moneng.mei.com>
To:        taob@io.org (Brian Tao)
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Web server locks up... but not quite. (?)
Message-ID:  <199602170638.AAA06538@brasil.moneng.mei.com>
In-Reply-To: <Pine.BSF.3.91.960216213633.12191H-100000@zip.io.org> from "Brian Tao" at Feb 16, 96 09:52:00 pm

>     This sort of thing has happened before with other 2.1.0-R machines
> here, but tonight was the first time I was able to get to the console
> of one before someone else rebooted it.
> 
>     Our web server is a P90 with 64 megabytes of RAM, running Apache
> 1.0.2.  For no discernable reason, it stopped working tonight.
> "Stopped working" in that no TCP services were available, NFS clients
> that mounted a filesystem served from it hung in disk wait and no
> rwhod packets were being broadcast.
> 
>     You could telnet to various ports on it (indicating that inetd was
> still bound to those ports), but none of the services normally
> attached to those ports would run, including internal ones like
> chargen or daytime (indicating that inetd was blocked in some way).
> It wasn't fielding RPC requests either.  The login prompt was still
> displayed on all the virtual consoles (I was still able to switch
> between them), but there was no response from the keyboard, as if the
> getty's had died off.  The only sign of life was that it was returning
> pings from another machine.
> 
>     There were no telltale messages on the console, nor in the syslog.
> This server gets 250,000 to 300,000 hits per day.  While it is
> running, it does not appear to be under any excessive load.  There are
> typically 40 to 60 httpd's running.  It exports a 4-gigabyte
> filesystem containing access logs to client machines so our customers
> can produce statistical reports.  It also mounts 26 gigabytes of home
> directories from a central NFS server.
> 
>     Since there is no indication as to the source of the hang, is
> there anything I can run periodically from cron to help track down the
> problem?  I can start tracking load averages, swap space usage, the
> output of vmstat, netstat, iostat and nfsstat if that will help.  Any
> suggestions?

I've seen similar hangs occasionally under both 2.0.5R and 2.1.0R. One
additional "thing" I've noticed is that processes that are completely
in-core appear to keep running. For example, I had a "vmstat 1" running for
a few weeks, and when the box I am thinking of locked up, the vmstat 1 was
still scrolling output and the box was ping-able, but any services that were
not entirely in-core or that required other disk accesses were not
available. There is something to the "in-core" business, because I have seen
the same box both continue to broadcast rwho and NOT broadcast rwho,
presumably depending on whether or not rwhod was in-core.
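
One way to record that distinction the next time a box wedges might be a
small heartbeat that stays resident and, once per interval, tries a tiny
write plus fsync with a timeout. This is only a sketch under assumptions:
the names probe_disk and heartbeat, the probe file path, and the timeout
values are all hypothetical, and nothing here is known to reproduce or
diagnose the actual hang. The idea is simply that the main loop keeps
printing (like the "vmstat 1" above) while the disk probe reports whether
real disk I/O still completes.

```python
import os
import threading
import time


def probe_disk(path, timeout=5.0):
    """Return True if a small write+fsync to `path` finishes within `timeout` seconds."""
    done = threading.Event()

    def worker():
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
            try:
                # Write then fsync so the probe needs real disk I/O,
                # not just the buffer cache.
                os.write(fd, str(time.time()).encode())
                os.fsync(fd)
            finally:
                os.close(fd)
        except OSError:
            # An I/O error leaves `done` unset, so the probe is
            # reported as failed once the timeout expires.
            return
        done.set()

    # Daemon thread: if the I/O never returns, it must not keep the
    # process from exiting.
    threading.Thread(target=worker, daemon=True).start()
    return done.wait(timeout)


def heartbeat(path, interval=1.0, count=None, out=print):
    """Emit one status line per probe; count=None loops forever."""
    n = 0
    while count is None or n < count:
        ok = probe_disk(path)
        stamp = time.strftime("%H:%M:%S")
        out("%s disk=%s" % (stamp, "ok" if ok else "STALLED"))
        n += 1
        time.sleep(interval)
```

Left running on a console (for instance heartbeat("/var/tmp/hb.probe"),
where the path is again just an assumption), the last few lines on screen
after a lockup would at least show whether disk access stalled before the
rest of the services went away. Note that a stalled probe leaves its worker
thread blocked, so a long outage accumulates one stuck thread per interval.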

... Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator			      jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI			   414/546-7968


