From: John Polstra
To: stable@freebsd.org
Subject: Strange transient filesystem failures
Date: Sat, 18 Dec 1999 12:24:09 -0800 (PST)
Organization: Polstra & Co., Inc.

I have been seeing some baffling errors on freefall, and I'd like to
know if anybody has a clue about what could be causing them.

There is a CVSup server running on freefall, and it constantly mirrors
the entire repository over to cvsup-master.freebsd.org. A new update
is run every 6 minutes, or as fast as it can go when it takes longer
than that. Occasionally, I am seeing updates fail because opendir()
failed with ENOENT on the server (freefall). For example:

    Cannot read directory "/usr/local/etc/cvsup/prefixes/FreeBSD.cvs/src/sys/nwfs": No such file or directory

But from looking at the timestamps of the directory and its parent
directory, it's clear that the directory _did_ exist and was not
being modified at that time. Also, the commit logs don't show any
commits going on around that time. Nothing unusual appears in the
system logs, either.

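One way to gather evidence at the moment of such a failure is to check each leading component of the path as soon as the open fails, since ENOENT can come from any component along the way, including a symbolic link. The following is only a rough sketch of that idea, not anything CVSup does: the function names `probe_components` and `list_dir_with_diagnosis` are hypothetical, and Python's `os.listdir` stands in for the opendir() call.

```python
import errno
import os
import stat
import time


def probe_components(path):
    """lstat() each cumulative prefix of path; stop at the first failure.

    Hypothetical helper: returns a list of (prefix, status) pairs, where
    status is "ok", "symlink -> TARGET", or an errno name such as "ENOENT".
    """
    results = []
    prefix = os.sep
    for part in os.path.abspath(path).split(os.sep):
        if not part:
            continue
        prefix = os.path.join(prefix, part)
        try:
            st = os.lstat(prefix)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            results.append((prefix, name))
            break
        if stat.S_ISLNK(st.st_mode):
            results.append((prefix, "symlink -> " + os.readlink(prefix)))
        else:
            results.append((prefix, "ok"))
    return results


def list_dir_with_diagnosis(path, retries=1, delay=0.1):
    """List a directory; on ENOENT, record a component probe, then retry.

    Returns (entries, diagnosis). diagnosis is None if the first attempt
    succeeded, else (timestamp, probe) captured at the first failure.
    entries is None if every attempt failed.
    """
    diagnosis = None
    for attempt in range(retries + 1):
        try:
            return os.listdir(path), diagnosis
        except FileNotFoundError:
            if diagnosis is None:
                diagnosis = (time.time(), probe_components(path))
            if attempt < retries:
                time.sleep(delay)
    return None, diagnosis
```

If the probe reports every component as present even though the listing just failed, that would suggest the lookup failure was transient rather than a real missing directory. The probe runs slightly after the failed call, so it can miss a very short-lived condition.
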
These failures are transient and short-lived; the next update always
works fine. The failures also seem to be clustered. In one case there
happened to be three updates running at once (two from committers),
and all three failed within a 5-second interval on different
directories:

    /usr/local/etc/cvsup/prefixes/FreeBSD.cvs/ports/x11-toolkits/v
    /usr/local/etc/cvsup/prefixes/FreeBSD.cvs/src/sys/nwfs
    /usr/local/etc/cvsup/prefixes/FreeBSD.cvs/ports/sysutils/afio/pkg

At first I suspected that the file table was full, as there are two
messages in the dmesg output to that effect. But now I don't think
that's it. I have seen several of these failures and have immediately
checked the dmesg output, but no new "file: table was full" messages
have appeared.

Soft-updates are not in use. FWIW, the paths in question involve a
few symlinks:

    /usr/local -> /d/usr.local
    /d/usr.local/etc/cvsup/prefixes/FreeBSD.cvs -> /home/ncvs
    /home/ncvs -> /x/ncvs

The system is running -stable from October 30.

John
--
  John Polstra                                   jdp@polstra.com
  John D. Polstra & Co., Inc.            Seattle, Washington USA
  "No matter how cynical I get, I just can't keep up." -- Nora Ephron