From owner-freebsd-questions Mon Jul 15 14:17:14 1996
Return-Path: owner-questions
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id OAA16764 for questions-outgoing; Mon, 15 Jul 1996 14:17:14 -0700 (PDT)
Received: from fw.tabula.com (fw.tabula.com [204.160.137.2]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id OAA16757 for ; Mon, 15 Jul 1996 14:17:10 -0700 (PDT)
Received: by fw.tabula.com (4.1/SMI-4.1) id AA18030; Mon, 15 Jul 96 14:17:06 PDT
Received: from tab012.tabula.com(204.119.64.12) by fw.tabula.com via smap (V1.3) id sma018018; Mon Jul 15 14:16:11 1996
Received: by tabula.com (5.x/SMI-SVR4) id AA08661; Mon, 15 Jul 1996 14:14:43 -0700
Date: Mon, 15 Jul 1996 14:14:42 -0700 (PDT)
From: Thor Clark
To: David Greenman
Cc: questions@freebsd.org
Subject: Re: system hangs? after resetting rtq_reallyold
In-Reply-To: <199607150521.WAA02093@root.com>
Message-Id:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-questions@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

On Sun, 14 Jul 1996, David Greenman wrote:

> >running 2.1 Release, straight off the cd
> >
> >The system doesn't actually hang - it just becomes unresponsive, while
> >continuing to run some processes: eg will respond to a ping, and continue
> >to run some background processes, but will not respond to
> >telnet,ftp,http, or console input.
> >
> >I can (and did several times ;) reliably reproduce this by mailing ~1500
> >messages, with ~150 of them going to non-responsive or non-existent
> >servers.
> >
> >from the log
> >830:Jul 14 15:21:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 2400
> >836:Jul 14 15:31:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 1600
> >846:Jul 14 15:41:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 1066
> >853:Jul 14 15:51:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 710
> >863:Jul 14 16:01:57 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 473
> >875:Jul 14 16:11:58 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 315
> >883:Jul 14 16:21:58 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 210
>
> The above messages can be completely ignored and aren't part of the
> problem. They indicate that the system is adjusting the route expiration
> time for "clone" routes.
> Look via dmesg or in /var/log/messages for a kernel message of the form
> "Out of mbuf clusters - increase maxusers", or for older releases of FreeBSD,
> "mb_map full". The message will only occur once. If this is happening (likely),
> then you need to increase the number of mbuf clusters either by increasing
> maxusers or by adding:
>
> options "NMBCLUSTERS="
>
> ...to your kernel config file. Where is some number in the area of 2000
> and not more than 4000 (unless you know what you're doing).
> I believe this is covered in the FAQ. The question gets asked about once
> a week, so it should be.
>
> -DG
>
> David Greenman
> Core-team/Principal Architect, The FreeBSD Project
>

I've just rebuilt and installed the kernel (with options "NMBCLUSTERS=2000"),
and I'm seeing the same behavior: sending a significant number (~40 last time)
of mail messages in quick succession (1 every 2 seconds) will reliably cause
the system to stop responding, with no errors logged.

Below is the complete /var/log/messages for the last boot. Neither of the
above-mentioned messages was ever logged (this has happened ~7 times). The
only kernel messages logged between boots were the ones I listed (even those
have stopped occurring between boots now).

Jul 15 13:41:02 trex /kernel: npx0 on motherboard
Jul 15 13:41:02 trex /kernel: npx0: INT 16 interface
Jul 15 13:41:02 trex /kernel: Probing for devices on the PCI bus:
Jul 15 13:41:02 trex /kernel: chip0 rev 2 on pci0:0
Jul 15 13:41:02 trex /kernel: chip1 rev 2 on pci0:7
Jul 15 13:41:02 trex /kernel: WARNING: / was not properly dismounted.
Jul 15 13:40:51 trex named[64]: starting. named LOCAL-960513.114936 Mon May 13\
11:49:36 PDT 1996 root@trex.investools.com:/usr/src/usr.sbin/named
Jul 15 13:40:52 trex named[65]: Ready to answer queries.
Jul 15 13:40:53 trex lpd[91]: restarted
Jul 15 13:41:09 trex login: login on ttyv1 as invest
(1 named message(about lame server), 7 login,su messages deleted)
Jul 15 13:47:45 trex su: invest to root on /dev/ttyp1
(lost access here - no response to http, etc, had to physically reboot)
Jul 15 13:54:32 trex /kernel: FreeBSD 2.1.0-RELEASE #0: Sun Jul 14 23:27:20 PDT 1996
Jul 15 13:54:32 trex /kernel: invest@trex.investools.com:/usr/src/sys/compile/TCKERNEL

Is there anything else that would be interesting to look at, or do? Any
other information I can gather that would shed light on this?

TIA
-Thor Clark
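
For anyone repeating the log check described in the quoted reply, here is a
minimal sketch in Python. It looks for the two message texts quoted in that
reply and, separately, collects the rtq_reallyold values so they can be set
aside as harmless. The function name scan_log and the sample lines are
illustrative assumptions, not part of any FreeBSD tool, and the exact message
wording may differ between releases.

```python
import re

# Message texts quoted in the reply above; wording may vary by release.
EXHAUSTION_PATTERNS = (
    "Out of mbuf clusters - increase maxusers",
    "mb_map full",
)

# Route-expiry notices, which the reply says can be ignored.
RTQ_RE = re.compile(r"in_rtqtimo: adjusted rtq_reallyold to (\d+)")


def scan_log(lines):
    """Return (exhaustion_lines, rtq_values) for an iterable of log lines.

    scan_log is a hypothetical helper written for this example.
    """
    exhaustion = []
    rtq_values = []
    for line in lines:
        # Collect any line carrying one of the mbuf exhaustion messages.
        if any(pattern in line for pattern in EXHAUSTION_PATTERNS):
            exhaustion.append(line.rstrip("\n"))
        # Record the adjusted route-expiry value, if this is such a notice.
        match = RTQ_RE.search(line)
        if match:
            rtq_values.append(int(match.group(1)))
    return exhaustion, rtq_values


if __name__ == "__main__":
    # Sample lines copied from the logs in this message.
    sample = [
        "Jul 14 15:21:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 2400",
        "Jul 14 15:31:50 trex /kernel: in_rtqtimo: adjusted rtq_reallyold to 1600",
        "Jul 15 13:41:02 trex /kernel: npx0 on motherboard",
    ]
    hits, rtq = scan_log(sample)
    print("mbuf exhaustion messages:", hits)
    print("rtq_reallyold adjustments:", rtq)
```

To run it against a real log, pass an open file object to scan_log in place
of the sample list. An empty first result only means neither message text
appears in that file; it does not rule out mbuf exhaustion by itself.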