Date:      Thu, 4 Jul 1996 09:18:58 -0400 (EDT)
From:      Brian Tao <taob@io.org>
To:        FREEBSD-CURRENT-L <freebsd-current@freebsd.org>
Subject:   File descriptor exhaustion == processes hung in disk wait?
Message-ID:  <Pine.NEB.3.92.960704085608.20017D-100000@zap.io.org>

    Subject pretty much says it all.  Our main Web server (63 virtual
domains, a few thousand other personal/business pages, 0.5M hits/day)
had been crashing or hanging under 2.1.0R once every few days, so I
upgraded to 2.2-960501-SNAP.  After the initial twiddling, it has now
been up for 17 days and counting.  :)

    Just yesterday though, I started noticing httpd processes stuck in
disk wait.  We did have a problem with one of our secondary NFS
servers that morning, and I had to kill off a bunch of D-state
processes, but the server was able to recover nicely from that.  Even
after restarting Apache, the number of D-state processes would slowly
increase.

    I was paged this morning and found that the Web server was
extremely sluggish.  Apache had hit the 150-process limit I had
specified in httpd.conf, with most of them in disk wait.

    fstat showed that each httpd had 133 file descriptors open (stdin,
stdout, stderr, 2*(# of domains), /tmp/htstatus, two sockets, plus
HTML file).  The output of fstat was 20931 lines long, with 20264 of
them belonging to userid "nobody".  Isn't this supposed to be limited
by kern.maxfiles and kern.maxfilesperproc?  Both are set to 4136 in
the kernel.  Should I recompile?
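
    As a sanity check on the numbers above, the per-process count and
the worst-case total can be reproduced with a few lines of Python.
This is only a back-of-the-envelope sketch: the names are made up for
illustration, and it says nothing about how the kernel actually counts
descriptors against kern.maxfiles.

```python
# Figures quoted in the message; all names here are illustrative only.
DOMAINS = 63           # virtual domains served
MAX_HTTPD = 150        # process limit set in httpd.conf
KERN_MAXFILES = 4136   # value of kern.maxfiles and kern.maxfilesperproc


def descriptors_per_httpd(domains):
    """Add up the descriptors listed in the message for one httpd."""
    std_streams = 3                  # stdin, stdout, stderr
    per_domain_files = 2 * domains   # "2*(# of domains)"
    status_file = 1                  # /tmp/htstatus
    sockets = 2                      # two sockets
    html_file = 1                    # the HTML file
    return std_streams + per_domain_files + status_file + sockets + html_file


per_process = descriptors_per_httpd(DOMAINS)
total = per_process * MAX_HTTPD

print(per_process)             # 133
print(total)                   # 19950
print(total > KERN_MAXFILES)   # True
```

    The 19950 total is in the same ballpark as the 20264 fstat lines
owned by "nobody", and well above 4136.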

    Thinking the NFS server may have been dropping off for a few
seconds here and there, I checked syslog, but I don't see any
NFS-related errors or warnings.  As a last resort, I could reboot the
server to see if the problem goes away, but that doesn't really help
solve it.  :-/
--
Brian Tao (BT300, taob@io.org, taob@ican.net)
Systems and Network Administrator, Internet Canada Corp.
"Though this be madness, yet there is method in't"