From owner-freebsd-questions Sun Jul 14 20:33:31 1996
Date: Sun, 14 Jul 1996 20:31:11 -0700 (PDT)
From: Thor Clark
To: questions@freebsd.org
Subject: system hangs? after resetting rtq_reallyold

Running 2.1 Release, straight off the CD.

The system doesn't actually hang; it just becomes unresponsive while continuing to run some processes. E.g., it will respond to a ping and keep running some background processes, but will not respond to telnet, ftp, http, or console input.

I can (and did several times ;) reliably reproduce this by mailing ~1500 messages, with ~150 of them going to non-responsive or nonexistent servers.

From the log:

830:Jul 14 15:21:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 2400
836:Jul 14 15:31:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 1600
846:Jul 14 15:41:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 1066
853:Jul 14 15:51:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 710
863:Jul 14 16:01:57 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 473
875:Jul 14 16:11:58 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 315
883:Jul 14 16:21:58 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 210

I tried a few things:

  - reducing net.inet.ip.rtexpire to 500 at boot (little or no effect; I haven't tried reducing this to 0)
  - increasing net.inet.ip.maxcache to 256 (little or no effect)
  - removing all the queued mail messages from /var/mqueue (the ~150 with bad addresses): this worked!

So I can probably avoid this by tweaking sendmail parameters, but I'd like to figure out what's happening. The only way to regain control is a physical reboot, which is not a great solution ;/

From top (which continues to run through telnet, though I can't log into the console...):

load averages: 5.70, 3.08, 1.64    16:56:53
51 processes: 1 running, 45 sleeping, 1 stopped, 4 zombie
Cpu states: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
Memory: 7552K Active, 2668K Inact, 2708K Wired, 1104K Cache, 64K Free
Swap: 82M Total, 63M Free, 24% Inuse

This is about 4 minutes after I lose access; the load is normally < 1.

Sorry about the long post. Any help or pointers greatly appreciated.

-Thor Clark
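
A note on the numbers in the log: each logged rtq_reallyold value looks like two-thirds of the previous one, rounded down, logged roughly every ten minutes. The small C sketch below reproduces that arithmetic only. The starting value of 3600 and the floor of 10 are assumptions inferred from the logged sequence, not something stated in this post, and the function names are made up for illustration; this is not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed constants, inferred from the logged values (not stated in the post). */
enum {
    ASSUMED_START = 3600, /* hypothetical initial rtq_reallyold, in seconds */
    ASSUMED_FLOOR = 10    /* hypothetical lower bound, in seconds */
};

/*
 * One adjustment step: two-thirds of the current value using integer
 * division, never going below the assumed floor.
 */
int next_reallyold(int current)
{
    int next = 2 * current / 3;
    return next < ASSUMED_FLOOR ? ASSUMED_FLOOR : next;
}

/* Fill out[0..steps-1] with the value after each successive adjustment. */
void decay_schedule(int start, size_t steps, int *out)
{
    assert(out != NULL || steps == 0);
    int value = start;
    for (size_t i = 0; i < steps; i++) {
        value = next_reallyold(value);
        out[i] = value;
    }
}
```

Calling decay_schedule(ASSUMED_START, 7, out) yields 2400, 1600, 1066, 710, 473, 315, 210, the same seven values as in the log. If the assumed model is right, the expiry time keeps shrinking for as long as whatever triggers the adjustment persists.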