Date:      Wed, 25 Aug 1999 01:52:38 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        grog@lemis.com (Greg Lehey)
Cc:        a.reilly@lake.com.au, phk@critter.freebsd.dk, dillon@apollo.backplane.com, hackers@FreeBSD.ORG, cvs-committers@FreeBSD.ORG, wollman@khavrinen.lcs.mit.edu
Subject:   Re: Locking in Vinum (was: Mandatory locking?)
Message-ID:  <199908250152.SAA16323@usr01.primenet.com>
In-Reply-To: <19990825083036.Q83273@freebie.lemis.com> from "Greg Lehey" at Aug 25, 99 08:30:36 am

> > I don't want to express an opinion about the need or otherwise
> > for mandatory locking, but I would appreciate a teensy
> > clarification of the problem domain:
> >
> > On Mon, Aug 23, 1999 at 05:43:45PM +0930, Greg Lehey wrote:
> >>   To write a block to a RAID-5 device, you need to:
> >>
> >>   1.  Read the old data into a temporary buffer.
> >>   2.  Read the old parity data corresponding to the data into a
> >>       temporary buffer.
> >>   3.  XOR the two, storing the result in one of the temporary buffers.
> >>   4.  XOR the result with the data which is to be written.
> >>   5.  Write the data block.
> >>   6.  Write the parity block.
> >
> > Are you suggesting that random user processes have to do all of
> > this every time that they access a vinum drive?  
> 
> Yes.
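
The six quoted steps can be sketched as follows.  This is only a
minimal illustration of the parity arithmetic, not Vinum's code: the
function names are hypothetical, the blocks are plain byte strings
held in memory, and no actual disk I/O or locking is done.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Return (data_to_write, parity_to_write) for one block update.

    old_data and old_parity stand in for steps 1 and 2 (the reads).
    """
    tmp = xor_blocks(old_data, old_parity)   # step 3
    new_parity = xor_blocks(tmp, new_data)   # step 4
    return new_data, new_parity              # steps 5 and 6 write these


# Two data blocks and their parity, as they would sit on three subdisks.
d0 = bytes([0x0F] * 4)
d1 = bytes([0xF0] * 4)
parity = xor_blocks(d0, d1)

# Overwrite d0 without reading d1.
new_d0 = bytes([0xAA] * 4)
written_data, written_parity = raid5_small_write(d0, parity, new_d0)
```

The window between step 1 and step 6 is where a second writer to the
same stripe would corrupt the parity, which is why the sequence
needs some form of lock around it.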


This could also be accomplished with a volume access lock at the
CAM level.


I think what people are missing here is that Vinum, when doing
software RAID, is implementing a type of namespace escape, only
it isn't a standard namespace escape.


For example, consider a QUOTAFS that accesses the file "/.quota",
and then lies during VOP_READDIR and other name lookup operations
in order to hide the "/.quota" file from prying eyes.


Because this escapes the whole file, it is _not_ like the Vinum
usage, which needs to escape parity bits on a block device.  The
Vinum usage needs to prevent access to the file range covered by
the parity bits, rather than merely protecting the parity bits.


Use of a mandatory lock mechanism has a significantly finer
granularity than a logical volume lock (which is what would
have to be used, instead of a physical volume lock, unless we
can be guaranteed that the parity bits plus the bits over which
the parity is being calculated do not span more than a single
physical volume).
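
To make the granularity point concrete, here is a small in-memory
sketch of a range lock table.  Everything in it is an assumption
made for illustration: the 64K stripe size, the names, and the
simplification that locking a stripe's logical byte range stands in
for protecting both its data and its parity.  It is not how Vinum or
the kernel lock code is structured.

```python
STRIPE_BYTES = 64 * 1024  # hypothetical stripe size


def stripe_range(offset: int, stripe_bytes: int = STRIPE_BYTES):
    """Return (start, length) of the stripe containing a byte offset."""
    start = (offset // stripe_bytes) * stripe_bytes
    return start, stripe_bytes


class RangeLockTable:
    """Exclusive byte-range locks; non-overlapping ranges do not conflict."""

    def __init__(self):
        self._held = []  # list of half-open (start, end) ranges

    def try_lock(self, start: int, length: int) -> bool:
        end = start + length
        for held_start, held_end in self._held:
            if start < held_end and held_start < end:
                return False  # overlaps a range that is already locked
        self._held.append((start, end))
        return True

    def unlock(self, start: int, length: int) -> None:
        self._held.remove((start, start + length))


table = RangeLockTable()
first = table.try_lock(*stripe_range(4096))    # stripe 0
second = table.try_lock(*stripe_range(70000))  # stripe 1, no conflict
third = table.try_lock(*stripe_range(8192))    # stripe 0 again, refused
```

With a single logical volume lock, the second request would have had
to wait for the first even though the two writes touch different
stripes.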


It seems to me that this is a proper application of mandatory
locks.



There also seems to be a general misconception about mandatory
locking implementation in SVR4 (or in general).

The point of mandatory locks is to allow you to _prevent_ access
to files where the locking applies, except when such access is
encapsulated with a lock.

This means that the "race scenarios", where a badly behaved
process is able to thwart the locking, don't exist.

For the other "deadlock" scenarios, mandatory locks are no worse
than a chflags'ed ld.so file that you can't replace unless you
chflag it back, and certainly no worse than the FreeBSD behaviour
that prevented updating a running executable, in the past (EBUSY).

At worst, you can always kill the process that has the lock, and
allow resource tracking to clean up after it.


For the use proposed by Vinum, however, fcntl() based mandatory
locks are probably not the proper tool.

This is because you can only apply locks to devices that have a
VOP_ADVLOCK on their backing store, and which use the VFSOPS
based fileops structures.

For the same reason that Linux users lament the inability to
place advisory range locks on special files in FreeBSD, so, too,
would Vinum be unable to place mandatory locks through that
same mechanism against special files in FreeBSD.
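
For reference, this is roughly what an fcntl() style range lock looks
like from user space.  It is a sketch under stated assumptions: the
lock is advisory, not mandatory, and the target is an ordinary
temporary file, where the lock call is expected to work.  The helper
names are hypothetical.  Pointing the same call at a device node is
the case described above as failing.

```python
import fcntl
import os
import tempfile


def write_locked(path: str, offset: int, data: bytes) -> None:
    """Write data at offset while holding an exclusive advisory range lock."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Lock only the byte range about to be written.
        fcntl.lockf(fd, fcntl.LOCK_EX, len(data), offset, os.SEEK_SET)
        try:
            os.pwrite(fd, data, offset)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN, len(data), offset, os.SEEK_SET)
    finally:
        os.close(fd)


def demo() -> bytes:
    """Create a 16-byte scratch file, update 4 bytes under a lock, read back."""
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(b"\x00" * 16)
        path = scratch.name
    try:
        write_locked(path, 4, b"DATA")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

Because the lock is advisory, a process that never asks for the lock
can still write to the range; closing that hole is the whole point of
the mandatory variant.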

Correcting this would require hanging the locks off of the
vnode, instead of hanging them off the backing object (I have
been suggesting -- and providing patches for -- this for
literally years).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

