From owner-freebsd-questions  Sun Jul 14 20:33:31 1996
Return-Path: owner-questions
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id UAA14270
          for questions-outgoing; Sun, 14 Jul 1996 20:33:31 -0700 (PDT)
Received: from fw.tabula.com (fw.tabula.com [204.160.137.2])
          by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id UAA14261
          for <questions@freebsd.org>; Sun, 14 Jul 1996 20:33:28 -0700 (PDT)
Received: by fw.tabula.com (4.1/SMI-4.1)
	id AA09094; Sun, 14 Jul 96 20:33:28 PDT
Received: from tab012.tabula.com(204.119.64.12) by fw.tabula.com via smap (V1.3)
	id sma009092; Sun Jul 14 20:32:47 1996
Received: by tabula.com  (5.x/SMI-SVR4)
	id AA27528; Sun, 14 Jul 1996 20:31:19 -0700
Date: Sun, 14 Jul 1996 20:31:11 -0700 (PDT)
From: Thor Clark <thor@tab012.tabula.com>
To: questions@freebsd.org
Subject: system hangs? after resetting rtq_reallyold
Message-Id: <Pine.SOL.3.91.960714195300.27434A-100000@tab012.tabula.com>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-questions@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I'm running 2.1-RELEASE, straight off the CD.

The system doesn't actually hang - it just becomes unresponsive while 
continuing to run some processes: e.g. it will respond to a ping and 
continue to run some background processes, but will not respond to 
telnet, ftp, http, or console input.

I can (and did several times ;) reliably reproduce this by mailing ~1500 
messages, with ~150 of them going to non-responsive or nonexistent 
servers.

From the log:
830:Jul 14 15:21:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 2400
836:Jul 14 15:31:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 1600
846:Jul 14 15:41:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 1066
853:Jul 14 15:51:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 710
863:Jul 14 16:01:57 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 473
875:Jul 14 16:11:58 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 315
883:Jul 14 16:21:58 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 210
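
For what it's worth, the logged values look like the previous value scaled by 2/3 with integer truncation, roughly once every ten minutes. Here is a minimal Python sketch of that arithmetic. The 2/3 rule is inferred from the numbers in the log, not taken from the kernel source, and the function name `decay_steps` is made up for illustration.

```python
def decay_steps(start, steps):
    """Repeatedly scale a value by 2/3 using integer (truncating) division."""
    values = [start]
    for _ in range(steps):
        values.append(2 * values[-1] // 3)
    return values


# Values reported by in_rtqtimo in the log excerpt.
logged = [2400, 1600, 1066, 710, 473, 315, 210]

# Starting from the first logged value, apply six 2/3 steps and compare.
predicted = decay_steps(logged[0], len(logged) - 1)
print(predicted == logged)
```

If the comparison holds, the kernel is just shortening the expiry time step by step; the sketch says nothing about why the machine then stops responding.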

I tried a few things: 
reducing   net.inet.ip.rtexpire to 500 at boot (little or no effect)
(I haven't tried reducing this to 0) 
increasing net.inet.ip.maxcache to 256         (little or no effect)

removing all the queued mail messages from /var/mqueue (the ~150 with 
bad addresses - this worked!)

So I can probably avoid this by tweaking sendmail parameters, but I'd 
like to figure out what's happening.  The only way to regain control is a 
physical reboot, which is not a great solution ;/

From top (which continues to run through telnet, though I can't log into 
the console...):

load averages:   5.70,  3.08,  1.64                       16:56:53
51 processes:  1 running, 45 sleeping, 1 stopped, 4 zombie
Cpu states:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
Memory: 7552K Active, 2668K Inact, 2708K Wired, 1104K Cache, 64K Free
Swap:   82M Total, 63M Free, 24% Inuse  

This is about 4 minutes after I lost access - the load is generally < 1.
Sorry about the long post.
Any help or pointers greatly appreciated.

-Thor Clark