Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 1 Mar 1999 09:11:05 -0800
From:      John Plevyak <jplevyak@inktomi.com>
To:        Terry Lambert <tlambert@primenet.com>, John Plevyak <jplevyak@inktomi.com>
Cc:        hackers@FreeBSD.ORG
Subject:   Re: lockf and kernel threads
Message-ID:  <19990301091105.B21935@tsdev.inktomi.com>
In-Reply-To: <199902272043.NAA14007@usr09.primenet.com>; from Terry Lambert on Sat, Feb 27, 1999 at 08:43:33PM %2B0000
References:  <19990224165840.A19033@proxydev.inktomi.com> <199902272043.NAA14007@usr09.primenet.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Feb 27, 1999 at 08:43:33PM +0000, Terry Lambert wrote:
> > Ok, I found the problem with using kernel threads and fcntl( .. F_SETLK ..).
> > The issue is two fold.  
> > 
> >  1. the locks are keyed off pid, and if the first thread to exit and close
> >     them is not the one that took out the lock the close will occur without
> >     the lock being removed.
> 
> Probably the correct way to deal with this is to put the locks into
> seperate collision domains for the keying.
> 
> A very simple way to do this would be to take the NFS server locking
> kernel patches from www.freebsd.org/~terry/, and then make rpid the
> same as pid for unthreaded process, the same as the thread ID for
> threaded processes, and then make the rpid value significant for
> both hash bucket selection and locally asserted locks.
> 
> This would result in the locks locking against each other.

This is a good idea, and one I considered for the patch.
Two problems with this:

  1. the obvious choice of rpid is p_leader->p_pid.  However,
     it is possible for the leader to terminate a short time
     (time to service a 'kill') before the peers.  Depending
     on how overloaded the machine, NFS troubles etc, I might
     worry about PID recycling.

  2. It requires a more substantial change to the locking and
     a modification of the proc struct.  The reason is that currently
     file locking uses the proc pointer as the 'id' or uniqueness
     token.  I wanted to see if something simpler could be done.

Alternatively, if one can ensure that the p_leader does not 
die before the peers, then one can preserve the locking code
and refraim from changing the proc strucure since one can
simply pass p->p_leader down to the locking code.  Since for
unthreaded programs p == p->p_leader this yields very simple
code for both cases.

Moreover, this change enables other such threading problems to be fixed
because logically 'process-type' state can be stored in
p->p_leader->foo and logically 'thread-type' state can be stored
in p->foo.

(this is the change I chose)
     
> 
> 
> >  2. the P_ADVLOCK flag is set in p_flags on a per processes basis, so
> >     we don't even check to see if locks need to be removed if the
> >     closing (struct proc*) is not the locking (struct proc*).
> 
> POSIX is very explicit, that the first close on a file within a single
> process results in the locks being deasserted.
> 
> I think that your patches fail to address this issue, since in reality,
> I believe that you want to treat each thread as if it were a seperate
> process for the purpose of specifying POSIX close semantics.

The patch does in fact handle this case.  The first closing
thread with pass p->p_leader into the VOP_ADVLOCK and cause the lock
to be released. 

In the current system, only if the original thread which obtained
the lock was the first to close it would the lock be released.

> 
> The patches also have the coeslescence problem that made me seperate
> the VOP_ADVLOCK code into an assert-veto/affirm-coelesce, instead
> of what they do now.
> 
> Basically, the problem is that when you assert a byte range lock on
> a file (e.g., a read lock), and an overlapping lock is granted to
> another thread in your process (e.g. a read lock with some overlap),
> and one or the other of the locks (but not both) are deasserted, then
> the overlapping region is no longer read locked for the thread that
> did not release it's lock.
> 
> In effect, the locks cast "shadows" onto a linear contention space
> for each file, and shadows with the same indices are coelesced.
> 
> The purpose in dealing with this for NFS server locking is that a
> single process in user space proxies the locks for a large number
> of clients, and locks from one client machine that overlap another
> lock from the same client machine results in a condition similar
> to multiple threads within the same process asserting locks, and
> expecting them to be enforced as if they weren't in the same
> contention domain (it happens to clean up VFS advisory locking on
> union FS stacking at the same time, but that's just a nice side
> effect).
> 

I understand the problem for NFS locking, but for threaded programs
it would seem that lock shadowing would be the desired behavior
The program is logically one process, and the lock ranges are
shared state, not thread-specific state.  

> 
> 
> As a general comment on your original problem, the correct way to
> handle signals (according to the pthreads model) is to disable
> disnals in all threads but one, which is designated as the signal
> handling thread.
> 
> One problem here is that FreeBSD kernel threads using rfork get
> different PID's, so the signal handling will never work correctly
> anyway.
> 
> For user space threads and the Linux crap, the signal handling
> should be possible, if you follow the disallow-all-but-one-thread
> rule.

I considered that also.  There are a few problems with this:

  1. some signals cannot be blocked.  In particular, if the non-locking
     thread is 'KILL'ed, then the lock will never be released in the
     current code.

  2. the abort(3) code explicitly unblocks the calling threads signal
     handler so that strategy will not work for asserts, aborts.

  3. signals are used for other purposes, some of which are not
     amenable to the 'disallow-all-but-on-thread' technique.

> 
> 
> 					Terry Lambert
> 					terry@lambert.org
> ---
> Any opinions in this posting are my own and not those of my present
> or previous employers.
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

john

-- 
John Bradley Plevyak,    PhD,    jplevyak@inktomi.com,     PGP KeyID: 051130BD
Inktomi Corporation,  1900 S. Norfolk Street,  Suite 110,  San Mateo, CA 94403
W:(415)653-2830 F:(415)653-2801 P:(888)491-1332/5103192436.4911332@pagenet.net


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?19990301091105.B21935>