From: "Crist J. Clark"
Reply-To: cjclark@home.com
To: freebsd-questions@FreeBSD.ORG (FreeBSD Questions)
Subject: Hung NFS Mount
Date: Thu, 6 Jan 2000 15:31:40 -0500 (EST)
Message-Id: <200001062031.PAA20493@cc942873-a.ewndsr1.nj.home.com>

A machine of mine had some SCSI hardware problems yesterday. The
machine does NFS serving to several others, and the exported
filesystems are on the drives that were experiencing problems. This
caused hung processes locally on the server as well as on the NFS
clients. Eventually, I was forced to reboot the machine with the
hardware problems. Now the NFS exports are clean.

Most of the clients that had problems noticed the server go down and
come back up. They responded with 'stale NFS handle' messages at
access attempts, and a simple umount/mount of the filesystem fixed
this.

However, one machine is still having problems. It tried to access
files on the failing server while the NFS daemon was alive but unable
to get at the files because of the hardware problems. Those processes
are still hanging. Even though the server has since gone down, come
back up, and is now alive and well, I cannot get the processes to
"unhang."
Here are some of them:

root     15083  0.0  0.1   288   16  p0- D    11:51AM    0:00.04 umount /usr/ports
postman  15288  0.0  2.2   740  488  p1  Ds   12:08PM    0:00.40 -tcsh (tcsh)
root     15312  0.0  0.1   288   16  p2- D    12:09PM    0:00.03 umount /usr/ports
root     15820  0.0  0.1   224   16  p2- D    12:42PM    0:00.02 mount /usr/ports
root     16223  0.0  1.3   240  288  p2- D     1:05PM    0:00.43 / (find)
root     17693  0.0  0.2   288   36  p0- D     2:53PM    0:00.03 umount -f /usr/ports

I would really rather not reboot the machine this is happening on
(and I wonder whether the shutdown would even be clean). However,
these are just a few of the hung processes. I've already had 'file
table full' errors, which I believe are caused by all of the hung
processes keeping files open.

I know that hard NFS errors like this are very tough, if not
impossible, to clear, but I'd try just about anything. I'd build raw
packets to throw from the NFS server if I thought it would spoof the
client out of the hangs.

Any ideas would be great. (But I really think I'll need to reboot...
after 160 days up too... *sigh*)
-- 
Crist J. Clark                           cjclark@home.com