Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 30 Sep 1996 11:16:40 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        julian@whistle.com (Julian Elischer)
Cc:        eng@alpo.whistle.com, hackers@freebsd.org
Subject:   Re: flock/sendmail stuffup
Message-ID:  <199609301816.LAA06288@phaeton.artisoft.com>
In-Reply-To: <324F31C6.41C67EA6@whistle.com> from "Julian Elischer" at Sep 29, 96 07:34:46 pm

next in thread | previous in thread | raw e-mail | index | archive | help
> It's taken a couple of days to track down, but in the
> implimentation we're using here of sendmail there appears to be some
> strangeness.
> (this is as of a few weeks ago)
> I understand that this may not be relevant with the sendmail,
> new version (I'll check it tomorrow or tonight)
> but it appears to point to a weakness.
> 
> Sendmail was openning /var/tmp/dead.letter and locking the file
> it then re-openned it or whatever, and got a second file
> structure for that file which it attempted to lock
> and immediatly went into deadlock with itself.
> 
> all following sendmails (sendmail -q5m) all lined up obediently waiting
> for that file to come free (as it already existed)
> I eventually had 85 sendmails all stopped in "lockf"
> 
> Unfortunatly it seems that whole file locks are 
> feferenced to the struct file, rather than to the process itself.
> so children can inherrit the  file-locks, however
> it ALSO bypasses the deadlock detection if
> it's not a "POSIX" type lock. this means that
> it's trivial to deadlock oneself
> with flock.
> 
> e.g.
> #include <sys/file.h>
> 
> main()
> {
>         int fd1, fd2;
> 
>         fd1 = open("/tmp/xx",O_RDWR,0666);
>         fd2 = open("/tmp/xx",O_RDWR,0666);
>         flock(fd1,LOCK_EX);
>         flock(fd2,LOCK_EX);
> }
> 
> 
> of course the quick answer is 
> "Don't Do That"
> but it shouldn't be possible to deadlock yourself so easily..
> and if something like sendmail can be configured in such a way that it
> happens then maybe it should be thought out a bit better.

The correct place to hang the locks are off the vnode.  The vnode should
be the same.  The key is that an fd is


per proc
open file (struct file)
table
,------.
| fd 1 |
+------+
| fd 2 |----------------------.
+------+                      |
|      |                      |
  ...                        ...
|      |                      v
+------+                      ,-------.   ,-------.  ,-----------.
| fd x |--------------------> | vnode |-->| inode |->| lock list |
+------+                      `-------'   `-------'  +-----------+
| fd y |                                             |           |
`------'                                                  ...


Call graph:

flock
   VOP_ADVLOCK
      ufs_advlock
	 lf_advlock ( ..., &(ip->i_lockf), ip->i_size)
	    lf_setlock
    
Here we can see the lock is actually hung off a pointer in the FS
specific in core inode.  This greatly complicates processing,
actually.  One would expect locks to operate on top level vnodes
and be hung off the in core vnode instead.  This would greatly
simplify the FS specific code by moving the common code up to
the VFS/VOP (syscalls) layer.

The deadlock detection should be functioning.  I suspect the "error"
is that you expect non-interference for the same process?  What about
threads?

No, in order to get an EWOULDBLOCK return, you must ammend your code:

#include <sys/file.h>

main()
{
        int fd1, fd2;

        fd1 = open("/tmp/xx",O_RDWR,0666);
        fd2 = open("/tmp/xx",O_RDWR,0666);
        flock(fd1,LOCK_EX);
        flock(fd2,LOCK_EX | LOCK_NB);
}


That is, you only get deadlock avoidance if you spcifically ask for it.


For what it's worth, flock() is probably implemented incorrectly
internally anyway -- but that's not what is causing you grief.

>From the flock man page:

     Requesting a lock on an object that is already locked normally causes the
     caller to be blocked until the lock may be acquired.  If LOCK_NB is in-
     cluded in operation, then this will not happen; instead the call will
     fail and the error EWOULDBLOCK will be returned.


> (I'm not sure WHY sendmail was doing this... it olny seemed to do it
> if /var/tmp/dead.letter already existed. Once I deleted it and killed
> the deadlocked sendmail, all the rest went to completion,
> but if I didn't delete it, then each one would dead-lock, as I killed
> the one before it..

Because sendmail is in error.

> ggest a way of keeping track of what processes
> have a particular 'file' structure referenced, so that the 
> deadlock detection in the POSIX case can be extended to the
> 'flock' case.


8-).  Already there.  flock uses an advisory range lock on the entire
file -- that's how it operates: it's a simplified special case of fcntl()
locking.



					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199609301816.LAA06288>