From owner-freebsd-stable Wed Apr 17 8:55:26 2002
Date: Wed, 17 Apr 2002 17:55:37 +0200
From: "Karsten W. Rohrbach"
To: "Marc G. Fournier"
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: STABLE kernel panicking all too often ...
Message-ID: <20020417175537.A32675@mail.webmonster.de>
References: <20020417034229.D1D82BA05@i8k.babbleon.org> <20020417093534.O99298-100000@mail1.hub.org>
In-Reply-To: <20020417093534.O99298-100000@mail1.hub.org>; from scrappy@hub.org on Wed, Apr 17, 2002 at 09:43:26AM -0300

Marc G. Fournier(scrappy@hub.org)@2002.04.17 09:43:26 +0000:
> > Also, I'm sure that this just shows my ignorance, but how can it be the case
> > that the load averages are 67-46 when the CPU is 70% idle?  Those two figures
> > seem to be at odds with each other based on my experience.
>
> Its relatively consistent:
>
> last pid: 13191;  load averages: 34.75, 41.81, 42.68    up 1+04:42:37  07:36:27
> 2904 processes:4 running, 2900 sleeping
> CPU states:  3.2% user,  0.0% nice, 29.3% system,  0.2% interrupt, 67.3% idle
> Mem: 2376M Active, 235M Inact, 285M Wired, 117M Cache, 199M Buf, 4348K Free
> Swap: 3072M Total, 1089M Used, 1982M Free, 35% Inuse, 60K In

just a wild guess, judging from the _massive_ amount of idle time of the
box, a scenario:
- you've got plenty of processes running there
- they might be i/o intensive
- your box swaps a lot of memory to/from the disk(s)
- you might have misbehaving storage devices (just an assumption)
- the vm subsystem in this scenario barfs on the number of swapped out
  pages

things i'd try:
- if possible, limit the max. number of processes that consume all of
  your ram, so the box does not swap excessively to disk; this also
  gives you more inactive pages used for read cache, IIRC
- put in one or two more disks and distribute the swap load over the
  spindles. this would make the box more responsive, anyway
- check your dmesg output/syslog for scsi bus resets or other symptoms
  of bad cabling or broken disk hardware
- try to spread the load over several considerably smaller boxes, if
  possible
- compile a non-SMP kernel and see what happens. you appear to have
  enough cpu time to spare to try that. there might be a driver that SMP
  stumbles over.

those points are from the perspective of operations, intended as a quick
fix that works around the actual problem, not from the kernel hacker's
point of view to "make things right".

as another wild guess i'd say there's some limit in the vm subsystem that
the kernel hits in a kind of race condition, due to misbehaviour of
hardware in conjunction with the vm subsystem in conjunction with large
amounts of ram and swap on a SMP platform.

that's quite a big box you got there.

regards,
/k

-- 
> I'm not as think as you stoned I am.
KR433/KR11-RIPE -- WebMonster Community Founder -- nGENn GmbH Senior Techie
http://www.webmonster.de/ -- ftp://ftp.webmonster.de/ -- http://www.ngenn.net/
GnuPG 0x2964BF46 2001-03-15 42F9 9FFF 50D4 2F38 DBEE  DF22 3340 4F4E 2964 BF46
My mail is GnuPG signed -- Unsigned ones are bogus -- http://www.gnupg.org/
Please do not remove my address from To: and Cc: fields in mailing lists. 10x
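
a rough way to put numbers on the guess above: the sketch below parses the top(1) summary quoted in this message and flags the "high load, mostly idle cpu, swap in use" pattern. the function names and both thresholds are made up for illustration and are not taken from any tool; the flag only says the figures are consistent with an i/o- or swap-bound box, it proves nothing about the panic itself.

```python
import re

# The top(1) summary exactly as quoted in the message above.
TOP_SUMMARY = """\
last pid: 13191;  load averages: 34.75, 41.81, 42.68    up 1+04:42:37  07:36:27
2904 processes:4 running, 2900 sleeping
CPU states:  3.2% user,  0.0% nice, 29.3% system,  0.2% interrupt, 67.3% idle
Mem: 2376M Active, 235M Inact, 285M Wired, 117M Cache, 199M Buf, 4348K Free
Swap: 3072M Total, 1089M Used, 1982M Free, 35% Inuse, 60K In
"""


def parse_top_summary(text):
    """Extract a few figures from a FreeBSD-style top(1) header.

    Hypothetical helper: it only understands the layout shown above.
    """
    load = re.search(r"load averages:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    idle = re.search(r"([\d.]+)%\s+idle", text)
    procs = re.search(r"(\d+) processes", text)
    running = re.search(r"(\d+) running", text)
    swap = re.search(r"Swap:\s*(\d+)M Total,\s*(\d+)M Used", text)
    swap_total, swap_used = int(swap.group(1)), int(swap.group(2))
    return {
        "load_1min": float(load.group(1)),
        "cpu_idle_pct": float(idle.group(1)),
        "processes": int(procs.group(1)),
        "running": int(running.group(1)),
        "swap_total_mb": swap_total,
        "swap_used_mb": swap_used,
        "swap_used_pct": round(100 * swap_used / swap_total),
    }


def looks_io_bound(stats, load_threshold=4.0, idle_threshold=50.0):
    """Heuristic only: high load while the cpu is mostly idle and swap is
    in use. Both thresholds are arbitrary assumptions."""
    return (
        stats["load_1min"] > load_threshold
        and stats["cpu_idle_pct"] > idle_threshold
        and stats["swap_used_mb"] > 0
    )


if __name__ == "__main__":
    stats = parse_top_summary(TOP_SUMMARY)
    for key, value in stats.items():
        print(f"{key}: {value}")
    print("consistent with i/o or swap pressure:", looks_io_bound(stats))
```

to try it on a live box, feed the first lines of your own top output into parse_top_summary() instead of TOP_SUMMARY; the regexes will need adjusting if your top prints a different header layout.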