Date: Mon, 16 Feb 1998 12:41:10 -0700 (MST)
From: Steve Grandi
To: freebsd-stable@FreeBSD.ORG
Subject: I need a strategy for making my STABLE installation stable

My STABLE system hasn't been very stable: I've been averaging one system crash a day for the past week or so. The frequency of crashes is increasing; averaged over the past 3 months it was perhaps one crash a week. I need some help in devising a strategy to make things stable...

The hardware: PentiumPro-200 (Venus motherboard), 128 MB of RAM, Adaptec 2940 Ultra-Wide SCSI controller, two Seagate ST32155W 2GB disks, a Micropolis 3391WS 9GB disk, Plextor SCSI CD-ROM, Intel EtherExpress Pro 10/100B Ethernet card.

The system: FreeBSD 2.2.5-STABLE kept up-to-date via CVSUP.

What the system is doing: DNS server, Sendmail server, FTP server, Net News server.

Ever since I upgraded to 2.2.5-RELEASE in late November, I've seen far too many system crashes. About half the time, the crash would be followed by a reboot. The other half of the time the system would just hang with no response from the console keyboard or active rlogin sessions (but sometimes the system would still answer PINGs).
Crashes seemed to follow heavy disk I/O and/or paging (usually soon after an INN expire with a 200MB+ history file).

Despite nearly 15 years' experience with BSD (going back to 4.1BSD on VAX 11/750s), I am sometimes not a very bright lad, and it took me a LONG time to realize that system panics are not noted in any system logs after a reboot. I finally wised up and started playing some games. I compiled DDB into the kernel and had several crashes that caused a drop into the debugger with the result:

    Fatal Trap 12 page fault while in kernel mode
    ...
    supervisor read, page not present
    ...
    current process = 4 (update)
    ...

I still haven't managed to capture a core file, so I won't attempt to type in the traceback. I think I have dumpon configured properly through the dumpdev variable in /etc/rc.conf, but today's perusal of the man pages seems to indicate that savecore won't save a crash dump if /kernel isn't the same as the kernel running at the time of the crash. So I need to stop tweaking things.

So what strategy should I follow to make the system stable and make the users happy again? Thoughts that I have had:

1) Capture a crash dump and see where that leads me.

2) Start swapping hardware (I have some new memory -- parity memory this time! -- on order, and I do have a spare Adaptec board I can lay my hands on).

3) Keep tweaking the kernel config file. So far, I have increased the values of MAXDSIZ, DFLDSIZ and NMBCLUSTERS and deleted the options MFS and AHC_ALLOW_MEMIO. The next item on my hit list would be deleting AHC_TAGENABLE.

Any advice out there?

Steve Grandi, National Optical Astronomy Observatories, Tucson, Arizona USA
Internet: grandi@noao.edu    Voice: +1 520 318-8228