Date:      Sat, 28 Aug 1999 12:52:42 +0930
From:      Greg Lehey <grog@lemis.com>
To:        FreeBSD Committers <cvs-committers@FreeBSD.org>, FreeBSD Hackers <hackers@FreeBSD.org>
Subject:   locking revisited
Message-ID:  <19990828125241.G13904@freebie.lemis.com>

After all the stuff that has been said on the last locking thread, I
think it's better to restate the case than follow up.

It's obvious from the messages in the last thread that a number of
otherwise clever people have little understanding or knowledge of the
concepts of file locking.  I'm appending a hastily worked-over version
of the section about locking from "Porting UNIX Software".

Here's a summary of what I've been trying to say:

All systems which do more than one thing at a time need file locking
at some time or another.  Since it involves cooperation between
potentially unrelated processes, it's an obvious kernel function.  Any
"solution" requiring cooperation between processes isn't really a
solution.  As a result, I don't consider advisory locking to be real
locking: it's just a kludge.

FreeBSD is one of the few operating systems which doesn't have
kernel-level locking.  If we want to emulate other systems correctly,
we *must* have mandatory locking.  This includes SCO UNIX, System V.4
and Linux.  I suspect it also includes Microsoft.

All this doesn't leave too much room for arguments about whether
locking works or not: it works on all platforms except FreeBSD, and
that's only because FreeBSD doesn't implement locking.

As a result, I argue that we should implement locking.  The question
is: how?  I'd suggest three methods which can be individually enabled
via sysctls:

 - System V style.  We need this for compatibility with System V.  The
   choice of mandatory or advisory locking depends on the file
   permissions.

 - Only mandatory locking.  fcntl works as before, but locks are
   always mandatory, not advisory.  I'm sure that this won't be
   popular, at least initially, but if you don't like it, you don't
   have to use it.

 - Via separate calls to fcntl.  fcntl currently has the following
   command values:

     #define	F_DUPFD		0		/* duplicate file descriptor */
     #define	F_GETFD		1		/* get file descriptor flags */
     #define	F_SETFD		2		/* set file descriptor flags */
     #define	F_GETFL		3		/* get file status flags */
     #define	F_SETFL		4		/* set file status flags */
     #define	F_GETOWN	5		/* get SIGIO/SIGURG proc/pgrp */
     #define	F_SETOWN	6		/* set SIGIO/SIGURG proc/pgrp */
     #define	F_GETLK		7		/* get record locking information */
     #define	F_SETLK		8		/* set record locking information */
     #define	F_SETLKW	9		/* F_SETLK; wait if blocked */

   We could add an F_SETMANDLOCK or some such.

Any thoughts?

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

Attachment: locking.txt

                                 File locking





File locking
____________

The Seventh Edition did not originally allow programs to coordinate
concurrent access to a file.  If two users both had a file open for
modification at the same time, it was almost impossible to prevent
disaster.  This is an obvious disadvantage, and all modern versions of
UNIX supply some form of file locking.

Before we look at the functions that are available, it's a good idea to
consider the various kinds of lock.  There seem to be two of everything.
First, the granularity is of interest:

o file locking applies to the whole file.

o range locking applies only to a range of byte offsets.  This is
  sometimes misleadingly called record locking.

With file locking, no other process can access the file when a lock is
applied.  With range locking, multiple locks can coexist as long as their
ranges don't overlap.  Secondly, there are two types of lock:

o advisory locks do not actually prevent access to the file.  They work
  only if every participating process ensures that it locks the file
  before accessing it.  If the file is already locked, the process blocks
  until it gains the lock.

o mandatory locks prevent (block) read and write access to the file, but
  do not stop it from being removed or renamed.  Many editors do just
  this, so even mandatory locking has its limitations.

Finally, there are also two ways in which locks cooperate with each other:

o exclusive locks allow no other locks that overlap the range.  This is
  the only way to perform file locking, and it implies that only a single
  process can access the file at a time.  These locks are also called
  write locks.

o shared locks allow other shared locks to coexist with them.  Their main
  purpose is to prevent an exclusive lock from being applied.  In
  combination with mandatory range locking, a write is not permitted to a
  range covered by a shared lock.  These locks are also called read locks.

There are five different kinds of file or record locking in common use:

o Lock files, also called dot locking, are a primitive workaround used by
  communication programs such as uucp and getty.  They are independent of
  the system platform, but since they are frequently used we'll look at
  them briefly.  They implement advisory file locking.

o After the initial release of the Seventh Edition, a file locking
  package using the system call locking was introduced.  It is still in
  use today on XENIX systems.  It implements mandatory range locking.

o BSD systems have the system call flock.  It implements advisory file
  locking.

o System V, POSIX.1, and more recent versions of BSD support range locking
  via the fcntl system call.  BSD and POSIX.1 systems provide only
  advisory locking.  System V supplies a choice of advisory or mandatory
  locking, depending on the file permissions.  If you need to rewrite
  locking code, this is the method you should use.

o System V also supplies range locking via the lockf library call.  Again,
  it supplies a choice of advisory or mandatory locking, depending on the
  file permissions.

The decision between advisory and mandatory locking in System V depends on
the file permissions and not on the call to fcntl or lockf.  The setgid
bit is used for this purpose.  Normally, in executables, the setgid bit
specifies that the executable should assume the effective group ID of its
owner group when execed.  On files that do not have group execute
permission, it specifies mandatory locking if it is set, and advisory
locking if it is not set.  For example,

o A file with permissions 0764 (rwxrw-r--) will be locked with advisory
  locking, since its permissions include neither group execute nor setgid.

o A file with permissions 0774 (rwxrwxr--) will be locked with advisory
  locking, since its permissions don't include setgid.

o A file with permissions 02774 (rwxrwsr--) will be locked with advisory
  locking, since its permissions include both group execute and setgid.


o A file with permissions 02764 will be locked with mandatory locking,
  since it has the setgid bit set, but group execute is not set.  If you
  list the permissions of this file with ls -l, you get rwxrwlr-- on a
  System V system, but many versions of ls, including BSD and GNU
  versions, will list rwxrwSr--.
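
As a rough illustration of the permission rule in the examples above, the
following C sketch classifies a mode word the way the text describes
System V doing it.  The function name sysv_lock_is_mandatory is made up
for this example; it is not part of any system interface.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not a system call): return 1 if a file with the
 * given permission bits would get mandatory locking under the System V
 * rule described above, 0 if it would get advisory locking.
 */
int
sysv_lock_is_mandatory (mode_t mode)
{
  /* setgid set and group execute clear selects mandatory locking */
  return (mode & S_ISGID) != 0 && (mode & S_IXGRP) == 0;
}
```

Of the four example modes, only 02764 should be reported as mandatory.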


Lock files
__________

Lock files are the traditional method that uucp uses for locking serial
lines.  Serial lines are typically used either for dialing out, for
example with uucp, or dialing in, which is handled by a program of the
getty family.  Some kind of synchronization is needed to ensure that both
of these programs don't try to access the line at the same time.  The
other forms of locking we describe only apply to disk files, so we can't
use them.  Instead, uucp and getty create lock files.  A typical lock file
will have a name like /var/spool/uucp/LCK..ttyb, and for some reason these
double periods in the name have led to the term dot locking.

The locking algorithm is straightforward: if a process wants to access a
serial line /dev/ttyb, it looks for a file /var/spool/uucp/LCK..ttyb.  If
it finds it, it checks the contents, which specify the process ID of the
owner, and checks if the owner still exists.  If it does, the file is
locked, and the process can't access the serial line.  If the file doesn't
exist, or if the owner no longer exists, the process creates the file if
necessary and puts its own process ID in the file.
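
The algorithm just described might look roughly like this in C.  This is
only a sketch: the function name dotlock_acquire is invented for the
example, the PID is assumed to be stored in ASCII, and the existence check
uses kill with signal 0, which is one common way to do it but not
necessarily what any particular uucp or getty does.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch of the dot locking algorithm described above (hypothetical
 * function, ASCII PID format assumed).  Returns 1 if we now own the
 * lock, 0 if a live process owns it, -1 on error.
 */
int
dotlock_acquire (const char *lockfile)
{
  FILE *f = fopen (lockfile, "r");

  if (f != NULL)
    {
      long owner = 0;
      int got = fscanf (f, "%ld", &owner);

      fclose (f);
      /* kill with signal 0 only checks whether the process exists */
      if (got == 1 && owner > 0
          && (kill ((pid_t) owner, 0) == 0 || errno == EPERM))
        return 0;                       /* owner still exists: locked */
    }
  else if (errno != ENOENT)
    return -1;

  /* no lock file, or the owner is gone: (re)write it with our PID */
  f = fopen (lockfile, "w");
  if (f == NULL)
    return -1;
  fprintf (f, "%10ld\n", (long) getpid ());
  return fclose (f) == 0 ? 1 : -1;
}
```

Note that the check and the creation are separate steps, so this sketch
has exactly the window between "look" and "create" that the algorithm
itself has.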

Although the algorithm is straightforward, the naming conventions are
anything but standardized.  When porting software from other platforms, it
is absolutely essential that all programs using dot locking agree on the
lock file name and its format.  Let's look at the lock file names for the
device /dev/ttyb, which is major device number 29, minor device number 1.
The ls -l listing looks like:

$ ls -l /dev/ttyb
crw-rw-rw-   1 root     sys       29,   1 Feb 25  1995 /dev/ttyb









The following table describes common conventions:
           |                                |
System     | Name                           | PID format
-----------+--------------------------------+-----------------
4.3BSD     | /usr/spool/uucp/LCK..ttyb      | binary, 4 bytes
4.4BSD     | /var/spool/uucp/LCK..ttyb      | binary, 4 bytes
System V.3 | /usr/spool/uucp/LCK..ttyb      | ASCII, 10 bytes
System V.4 | /var/spool/uucp/LK.032.029.001 | ASCII, 10 bytes

A couple of points to note are:

o The digits in the lock file name for System V.4 are the major device
  number of the disk on which /dev is located (32), the major device
  number of the serial device (29), and the minor device number of the
  serial device (1).

o Some systems, such as SCO, have multiple names for terminal lines,
  depending on the characteristics which the line should exhibit.  For
  example, /dev/tty1a refers to a line when running without modem control
  signals, and /dev/tty1A refers to the same line when running with modem
  control signals.  Clearly only one of these devices can be used at a
  time: by convention, the lock file name for both devices is
  /usr/spool/uucp/LCK..tty1a.

o The locations of the lock files vary considerably.  Apart from those in
  the table, other possibilities are /etc/locks/LCK..ttyb,
  /usr/spool/locks/LCK..ttyb, and /usr/spool/uucp/LCK/LCK..ttyb.

o Still other methods exist.  See the file policy.h in the Taylor uucp
  distribution for further discussion.
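
To make the System V.4 naming scheme from the first point concrete, here
is a small sketch that builds the file name part from the three device
numbers.  The helper name svr4_lock_name is hypothetical, and the three
numbers are simply passed in rather than looked up.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a System V.4 style lock file name from the
 * three device numbers described above.  Returns the snprintf result.
 */
int
svr4_lock_name (char *buf, size_t len, unsigned dev_major,
                unsigned tty_major, unsigned tty_minor)
{
  return snprintf (buf, len, "LK.%03u.%03u.%03u",
                   dev_major, tty_major, tty_minor);
}
```

For the example device, svr4_lock_name (buf, sizeof buf, 32, 29, 1) gives
LK.032.029.001.  In a real program the numbers would probably come from
stat: on many systems macros called major and minor extract them from
st_dev and st_rdev, though the header that declares them varies.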

Lock files are unreliable.  It is quite possible for two processes to go
through this algorithm at the same time, both find that the lock file
doesn't exist, both create it, and both put their process ID in it.  The
result is not what you want.  Lock files should only be used when there is
really no alternative.
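
One widely used way to narrow this race, which the text above does not
describe, is to create the lock file with open and the flags
O_CREAT | O_EXCL, so that the creation fails if the file already exists.
The following is a minimal sketch with a made-up function name.  It does
not deal with stale lock files at all, and O_EXCL has historically not
been dependable on some network file systems.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create the lock file only if it does not exist.
 * Returns 1 if the file was created, 0 if it already exists, -1 on error.
 */
int
dotlock_create_excl (const char *lockfile)
{
  char pidbuf[32];
  int len;
  int fd = open (lockfile, O_WRONLY | O_CREAT | O_EXCL, 0644);

  if (fd < 0)
    return errno == EEXIST ? 0 : -1;
  /* ASCII PID format is an assumption, as in the table above */
  len = snprintf (pidbuf, sizeof pidbuf, "%10ld\n", (long) getpid ());
  if (write (fd, pidbuf, (size_t) len) != len)
    {
      close (fd);
      unlink (lockfile);
      return -1;
    }
  return close (fd) == 0 ? 1 : -1;
}
```

With this approach the test for existence and the creation happen in a
single system call, so two processes cannot both succeed.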

locking system call
___________________

locking comes from the original implementation introduced during the
Seventh Edition.  It is still available in XENIX.  It implements mandatory
range locking.





int locking (int fd, int mode, long size);

locking locks a block of data of length size bytes, starting at the
current position in the file.  mode can have one of the following values:
          |
Parameter | Meaning
----------+--------------------------------------------------------------
LK_LOCK   | Obtain an exclusive lock for the specified block.  If any
          | part is not available, sleep until it becomes available.
LK_NBLCK  | Obtain an exclusive lock for the specified block.  If any
          | part is not available, the request fails, and errno is set
          | to EACCES.
LK_NBRLCK | Obtain a shared lock for the specified block.  If any part
          | is not available, the request fails, and errno is set to
          | EACCES.
LK_RLCK   | Obtain a shared lock for the specified block.  If any part
          | is not available, sleep until it becomes available.
LK_UNLCK  | Unlock a previously locked block of data.
                       Figure 1: locking operation codes


flock
_____

flock is the weakest of all the lock functions.  It provides only advisory
file locking.

#include <sys/file.h>
(defined in sys/file.h)
#define   LOCK_SH   1          /* shared lock */
#define   LOCK_EX   2          /* exclusive lock */
#define   LOCK_NB   4          /* don't block when locking */
#define   LOCK_UN   8          /* unlock */

int flock (int fd, int operation);

flock applies or removes a lock on fd.  By default, if a lock cannot be
granted, the process blocks until the lock is available.  If you set the
flag LOCK_NB, flock returns immediately with errno set to EWOULDBLOCK if
the lock cannot be granted.
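
A short sketch of the non-blocking case may help.  The wrapper name
try_exclusive_flock is invented for this example; only flock itself and
the LOCK_ flags come from the system.

```c
#include <errno.h>
#include <sys/file.h>

/*
 * Hypothetical helper: try to take an exclusive flock lock on fd without
 * blocking.  Returns 1 if the lock was granted, 0 if a conflicting lock
 * is held through another open of the file, -1 on any other error.
 */
int
try_exclusive_flock (int fd)
{
  if (flock (fd, LOCK_EX | LOCK_NB) == 0)
    return 1;
  return errno == EWOULDBLOCK ? 0 : -1;
}
```

A caller that gets 0 back can do something else and retry later, or call
flock again without LOCK_NB and wait.  The lock is released with
flock (fd, LOCK_UN).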





fcntl locking
_____________

fcntl is a function that can perform various functions on open files.  A
number of these functions perform advisory record locking, and System V
also offers the option of mandatory locking.  All locking functions
operate on a struct flock:

struct flock
{
  short l_type;                /* lock type: read/write, etc. */
  short l_whence;              /* type of l_start */
  off_t l_start;               /* starting offset */
  off_t l_len;                 /* len = 0 means until end of file */
  long  l_sysid;               /* Only SVR4 */
  pid_t l_pid;                 /* lock owner */
};

In this structure,

o l_type specifies the type of the lock, listed below:
          |
  value   | Function
  --------+-------------------------------------
  F_RDLCK | Acquire a read or shared lock.
  F_WRLCK | Acquire a write or exclusive  lock.
  F_UNLCK | Clear the lock.
                          Figure 2: flock.l_type values


o The offset is specified in the same way as a file offset is specified to
  lseek: flock->l_whence may be set to SEEK_SET (offset is from the
  beginning of the file), SEEK_CUR (offset is relative to the current
  position) or SEEK_END (offset is relative to the current end of file
  position).

All fcntl lock operations use this struct, which is passed to fcntl as the
arg parameter.  For example, to perform the operation F_FOOLK, you would
write:

struct flock flock;
error = fcntl (myfile, F_FOOLK, &flock);

The following fcntl operations relate to locking:



o F_GETLK gets information on any current lock on the file.  When calling,
  you set the fields flock->l_type, flock->l_whence, flock->l_start, and
  flock->l_len to the values of a lock that you want to set.  If a lock
  that would cause a lock request to block already exists, flock is
  overwritten with information about the lock.  The field flock->l_whence
  is set to SEEK_SET, and flock->l_start is set to the offset in the file.
  flock->l_pid is set to the pid of the process that owns the lock.  If
  the lock can be granted, flock->l_type is set to F_UNLCK and the rest of
  the structure is left unchanged.

o F_SETLK tries to set a lock (flock->l_type set to F_RDLCK or F_WRLCK) or
  to reset a lock (flock->l_type set to F_UNLCK).  If a lock cannot be
  obtained, fcntl returns with errno set to EACCES (System V) or EAGAIN
  (BSD and POSIX).

o F_SETLKW works like F_SETLK, except that if the lock cannot be obtained,
  the process blocks until it can be obtained.

o System V.4 has a further function, F_FREESP, which uses the struct
  flock, but in fact has nothing to do with file locking: it frees the
  space defined by flock->l_whence, flock->l_start, and flock->l_len.  The
  data in this part of the file is physically removed, a read access
  returns EOF, and a write access writes new data.  The only reason this
  operation uses the struct flock (and the reason we discuss it here) is
  that struct flock has suitable members to describe the area that needs
  to be freed.  Many file systems allow data to be freed only if the end
  of the region corresponds with the end of file, in which case the call
  can be replaced with ftruncate.
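
The F_SETLK and F_GETLK operations above can be sketched as two small
wrappers.  The names set_range_lock and probe_write_lock are hypothetical,
and the code assumes a POSIX-style struct flock; it clears the structure
first so that any system-specific members such as l_sysid are zero.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: apply l_type (F_RDLCK, F_WRLCK or F_UNLCK) to the
 * len bytes starting at offset start from the beginning of the file,
 * without blocking.  Returns the fcntl result: 0 on success, -1 with
 * errno set on failure.
 */
int
set_range_lock (int fd, short l_type, off_t start, off_t len)
{
  struct flock fl;

  memset (&fl, 0, sizeof fl);   /* clear system-specific members */
  fl.l_type = l_type;
  fl.l_whence = SEEK_SET;
  fl.l_start = start;
  fl.l_len = len;               /* 0 would mean "to end of file" */
  return fcntl (fd, F_SETLK, &fl);
}

/*
 * Hypothetical helper: ask with F_GETLK whether a write lock on the range
 * could be granted.  Returns the resulting l_type (F_UNLCK if it could),
 * or -1 if fcntl itself failed.
 */
int
probe_write_lock (int fd, off_t start, off_t len)
{
  struct flock fl;

  memset (&fl, 0, sizeof fl);
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = start;
  fl.l_len = len;
  if (fcntl (fd, F_GETLK, &fl) < 0)
    return -1;
  return fl.l_type;
}
```

Locks held by the calling process never conflict with its own requests,
so probing a range that the same process has locked reports F_UNLCK.  To
see a conflict, the probe has to come from a different process.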


lockf
_____

lockf is a library function supplied only with System V.  Like fcntl, it
implements advisory or mandatory range locking based on the file
permissions.  In some systems, it is implemented in terms of fcntl.  It
supports only exclusive locks:

#include <unistd.h>

int lockf (int fd, int function, long size);

The functions are similar to those supplied by fcntl.  function specifies
the operation to perform, as shown below.
        |
value   | Function
--------+--------------------------------------------
F_ULOCK | Unlock the range.
F_LOCK  | Acquire exclusive lock.
F_TLOCK | Lock if possible, otherwise return status.
F_TEST  | Check range for other locks.
                           Figure 3: lockf functions

lockf does not specify a start offset for the range to be locked.  This is
always the current position in the file--you need to use lseek to get
there if you are not there already.  The following code fragments are
roughly equivalent:

flock.l_type = F_WRLCK;         /* lockf only supports write locks */
flock.l_whence = SEEK_SET;
flock.l_start = filepos;        /* this was set elsewhere */
flock.l_len = reclen;           /* the length to lock */
error = fcntl (myfile, F_SETLKW, &flock);  /* wait for the lock, like F_LOCK */

...and

lseek (myfile, filepos, SEEK_SET); /* Seek the correct place in the file */
error = lockf (myfile, F_LOCK, reclen);


Which locking scheme?
_____________________

As we've seen, file locking is a can of worms.  Many portable software
packages offer you a choice of locking mechanisms, and your system may
supply a number of them.  Which do you take?  Here are some rules of
thumb:

o fcntl locking is the best choice, as long as your system and the package
  agree on what it means.  On System V.3 and V.4, fcntl locking offers the
  choice of mandatory or advisory locking, whereas on other systems it
  only offers advisory locking.  If your package expects to be able to set
  mandatory locking, and you're running, say, 4.4BSD, the package may not
  work correctly.  If this happens, you may have to choose flock locking
  instead.

o If your system doesn't have fcntl locking, you will almost certainly
  have either flock or lockf locking instead.  If the package supports it,
  use it.  Pure BSD systems don't support lockf, but some versions
  simulate it.  Since lockf can also be used to require mandatory locking,
  it's better to use flock on BSD systems and lockf on System V systems.

o You'll probably not come across any packages which support locking.  If
  you do, and your system supports it, it's not a bad choice.

o If all else fails, use lock files.  This is a very poor option,
  though--it's probably a better idea to consider a more modern kernel.