From: Frank Leonhardt
Date: Sun, 28 Jul 2013 11:05:49 +0100
To: Polytropon
Cc: freebsd-questions@freebsd.org
Subject: Re: Delete a directory, crash the system
Message-ID: <51F4ECFD.5090502@fjl.co.uk>
In-Reply-To: <20130728075447.4d6e0468.freebsd@edvax.de>
References: <51F3F290.9020004@cordula.ws> <51F420ED.1050402@fjl.co.uk> <20130728075447.4d6e0468.freebsd@edvax.de>

On 28/07/2013 06:54, Polytropon wrote:
> And here, kids, you can see the strength of open source
> operating system: You can see _why_ something happens. :-)

Too true!

> On Sat, 27 Jul 2013 20:35:09 +0100, Frank Leonhardt wrote:
>> On 27/07/2013 19:57, David Noel wrote:
>>>> So the system panics in ufs_rmdir(). Maybe the filesystem is
>>>> corrupt? Have you tried to fsck(8) it manually?
>>> fsck worked, though I had to boot from a USB image because I couldn't
>>> get into single user.. for some odd reason.
>>>
>>>> Even if the filesystem is corrupt, ufs_rmdir() shouldn't
>>>> panic(), IMHO, but fail gracefully. Hmmm...
>>> Yeah, I was pretty surprised. I think I tried it like 3 times to be
>>> sure... and yeah, each time... kaboom! Who'd have thought. Do I just
>>> post this to the mailing list and hope some benevolent developer
>>> stumbles upon it and takes it upon him/herself to "fix" this, or where
>>> do I find the FreeBSD Suggestion Box? I guess I should file a Problem
>>> Report and see what happens from there.
>>>
>> I was going to raise an issue when the discussion had died down to a
>> consensus. I also don't think it's reasonable for the kernel to bomb
>> when it encounters corruption on a disk.
>>
>> If you want to patch it yourself, edit sys/ufs/ufs/ufs_vnops.c at around
>> line 2791 change:
>>
>>     if (dp->i_effnlink < 3)
>>         panic("ufs_dirrem: Bad link count %d on parent",
>>             dp->i_effnlink);
>>
>> To
>>
>>     if (dp->i_effnlink < 3) {
>>         error = EINVAL;
>>         goto out;
>>     }
>>
>> The ufs_link() call has a similar issue.
>>
>> I can't see why my mod will break anything, but there's always
>> unintended consequences.
> One of the core policies usually is to stop _any_ action that
> had failed due to a "reason that cannot be" and make sure it
> won't get worse. This can be seen for example in fsck's behaviour:
> If there is a massive file system error that cannot be repaired
> without further intervention that _could_ destroy data or make
> its retrieval harder or impossible, the operator will be requested
> to make the decision.
> There are options to automate this process,
> but on the other hand, "always assume 'yes'" can then be a risk,
> as it could prevent recovery. My assumption is that the developers
> chose a similar approach here: "We found a situation that should
> not be possible, so we stop the system from messing up the file
> system even more." This carries the attitude of not "hiding a
> problem for the sake of convenience" by "being silent and going
> back to the usual work". Of course it is debatable if this is the
> right decision in _this_ particular case.
>

The problem I have with this is the assumption that the inode was at
fault. I said this was the most likely cause, but it's not the only
possible one. At the risk of repeating myself, it's the /effective/
link count (in the vnode) that's out of line here, not the inode count.
If the inode was wrong it could be down to minor FS corruption; an
interrupted directory creation or deletion would do the trick. The
vnode could go wrong for all sorts of reasons, probably associated with
a race during the directory removal, which is not an atomic operation
by any means. See "The Design of the UNIX Operating System", section
5.16.1, Bach, Prentice-Hall, 1986.

My guess is that we're looking at an old debugging pragma here, put in
to cope with a race going wrong if the code wasn't quite right (note
that the function has since been renamed but the message not updated).

You're right about stopping on internal errors (corruption to the
kernel data structures in this case) but this case is indeed debatable.
On the one hand, now that the system is stable (i.e. we can probably
trust the rmdir code after all this time), the most likely cause is
inode corruption polluting the vnode. On the other hand, the pragma may
be useful if people are tinkering with the kernel and you get even more
opportunities for a race with (say) SMP.

I don't expect the kernel to panic on a user-land I/O error, or
anything else that's expected or recoverable - and a wonky FS meets
these criteria in my book. David was lucky to find this - I tend to run
FreeBSD on servers, not laptops, and I'd never have seen this server
panic "live" and therefore would not have been able to discover the
cause very easily. That's worrying.

So it boils down to:

a) Leave it as is, as it can detect when the kernel has trashed its
   vnode table; or

b) It's probably caused by "expected" FS corruption, so handle it
   gracefully.

Incidentally, if you look at the code you'll see this is only a
heuristic check, and a weak one at that. Most of the time it WILL NOT
pick up the case where the parent directory's link is missing. As far
as I can tell it will go on to unlink the target successfully, with no
ill effects. If this situation really did lead to catastrophe (as
suggested by the use of a panic) then the check used ought to be a lot
more reliable! As it is, removing it entirely, except for debug
kernels, is a third option.

Regards, Frank.
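
For illustration only, the three options discussed above can be sketched in user-space C. This is not the kernel code: struct fake_inode and the check_parent_* function names are hypothetical, abort() stands in for panic(), and the standard assert() macro stands in for a check that exists only in debug builds. Only the i_effnlink field name and the "< 3" threshold come from the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the in-memory inode; only the field discussed here. */
struct fake_inode {
    int i_effnlink; /* effective link count */
};

/* Option (a): treat a bad parent link count as fatal (abort() mimics panic()). */
int check_parent_panic(const struct fake_inode *dp)
{
    if (dp->i_effnlink < 3) {
        fprintf(stderr, "Bad link count %d on parent\n", dp->i_effnlink);
        abort();
    }
    return 0;
}

/* Option (b): report the inconsistency to the caller and keep running. */
int check_parent_graceful(const struct fake_inode *dp)
{
    if (dp->i_effnlink < 3)
        return EINVAL;
    return 0;
}

/* Option (c): check only in debug builds; assert() is compiled out under NDEBUG. */
int check_parent_debug_only(const struct fake_inode *dp)
{
    assert(dp->i_effnlink >= 3);
    (void)dp; /* avoids an unused-parameter warning when NDEBUG is defined */
    return 0;
}
```

With a link count of 2, check_parent_graceful() returns EINVAL, check_parent_panic() aborts the process, and check_parent_debug_only() aborts only when the file is built without -DNDEBUG. FreeBSD's KASSERT() under the INVARIANTS kernel option works in a similar spirit to the third variant, though whether it suits this particular check is a separate question.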