Date: Wed, 6 Mar 1996 12:57:28 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: bde@zeta.org.au (Bruce Evans)
Cc: jhay@mikom.csir.co.za, freebsd-current@FreeBSD.ORG
Subject: Re: fixes for rename panic (round 1)
Message-ID: <199603061957.MAA11679@phaeton.artisoft.com>
In-Reply-To: <199603060750.SAA09666@godzilla.zeta.org.au> from "Bruce Evans" at Mar 6, 96 06:50:15 pm

> The main cause of the panic is broken reference counting in certain
> error cases. relookup() calls vput(fdvp) when it fails, so callers
> must increment the reference count before calling relookup() and
> decrement it if relookup() doesn't fail. Not doing this caused
> v_usecount for the test directory to eventually become negative.
>
> relookup() fails when the `from' file went away. Different bad things
> happen if it went away and came back. Then relookup doesn't fail, but
> the wrong file or directory is removed. If a regular file went away
> and came back as a directory, then the file system is corrupted.

[ ... patches ... ]

Have you tested these changes with MSDOSFS (especially) and EXT2FS
(less especially)?

I suspect a potential deadlock on identical path prefixes one path
component off, and on "." and ".." references for some particular
cases.

The existence of the race on the rename is intentional, based on the
need to hold the directory exclusively when inserting an entry. This
is, I think, an architectural side-effect of making rename an FS
atomic operation.

Not that I don't agree that it shouldn't cause a panic. 8-) 8-).

My personal "dream solution" to this problem is to issue reader/writer
locks on the vnode and remove the ladder-chain release race on a path
traversal entirely. The magic is that "R" and "IX" are not conflicting
for the same thread (context/process ID/whatever); it's the promotion
from "IX" to "X" that generates the conflict. This causes the vnode
reference to exist, but doesn't screw the traversal because of the
VLOCK recursion restriction.

Alternately, in implementing the UFS under Win95's IFS framework, we
set the IN_RECURSE bit, which has similar effect to the patches you
have, without adding unduly to the complexity.
This is a kludge, plain and simple, because the IN_RECURSE itself is a
kludge (it's predicated on the possibility of a fault on the copyout
in the uiomove with a swap file on the same FS as is performing the
currently requested operation).

Really, this goes to my suggestion before that the VOP_LOCK code go to
common higher level code and become purely advisory in all FS's except
UNIONFS (needs to hold underlying vnodes locked) and QUOTAFS (needs to
hold the quota file vnode locked on the underlying FS to which quotas
are being applied). That, in combination with the conversion of the
lock to a counting semaphore (a la SVR4, SunOS, Solaris, SCO, Unix
SVR4 ES/MP, and AIX) will resolve the recursion case cleanly as well,
without involving the underlying FS's inodes.

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
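
For readers following the reference-count rule in the quoted text (take
an extra reference before relookup(), and drop it again only if
relookup() does not fail), here is a minimal userland sketch in C. It
is a toy model, not the kernel code or the patches under discussion:
struct toy_vnode, toy_vref(), toy_vrele(), toy_relookup() and
balanced_relookup() are all hypothetical names invented for
illustration, and the only behavior modeled is that the lookup drops
one reference on the directory when it fails.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a vnode: only the use count is modeled. */
struct toy_vnode {
	int usecount;
};

static void toy_vref(struct toy_vnode *vp)  { vp->usecount++; }
static void toy_vrele(struct toy_vnode *vp) { vp->usecount--; }

/*
 * Hypothetical stand-in for relookup().  On failure it drops one
 * reference on the directory (modeling the vput(fdvp) described in
 * the quoted text); on success it leaves the count alone.
 * Returns 0 on success, nonzero on failure.
 */
static int toy_relookup(struct toy_vnode *dvp, int from_still_exists)
{
	assert(dvp != NULL);
	if (!from_still_exists) {
		toy_vrele(dvp);		/* reference dropped on the error path */
		return 1;
	}
	return 0;
}

/*
 * Caller that follows the rule: take an extra reference first, and
 * give it back only when the lookup did not fail.  Either way the
 * caller's own reference survives.
 */
static int balanced_relookup(struct toy_vnode *dvp, int from_still_exists)
{
	int error;

	toy_vref(dvp);
	error = toy_relookup(dvp, from_still_exists);
	if (error == 0)
		toy_vrele(dvp);		/* success: drop the extra reference */
	return error;
}
```

Starting from a count of 1, balanced_relookup() leaves the count at 1
whether the lookup fails or succeeds. Calling toy_relookup() directly
on two failing lookups takes the count from 1 to -1, which is the kind
of negative v_usecount the quoted text describes.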