Date:      Wed, 8 Mar 2006 19:57:22 -0500
From:      Kris Kennaway <kris@obsecurity.org>
To:        Miguel Lopes Santos Ramos <miguel@anjos.strangled.net>
Cc:        kuriyama@imgsrc.co.jp, freebsd-stable@freebsd.org, kris@obsecurity.org
Subject:   Re: rpc.lockd brokenness (2)
Message-ID:  <20060309005722.GA55432@xor.obsecurity.org>
In-Reply-To: <200603090026.k290Qihj002701@compaq.anjos.strangled.net>
References:  <20060308224531.GA53611@xor.obsecurity.org> <200603090026.k290Qihj002701@compaq.anjos.strangled.net>

On Thu, Mar 09, 2006 at 12:26:44AM +0000, Miguel Lopes Santos Ramos wrote:
> > From: Kris Kennaway <kris@obsecurity.org>
> > Subject: Re: rpc.lockd brokenness (2)
> >
> > This is intentional.  It's how pidfile_*() tests whether the process
> > is still running.  The intention is that if someone tries to open the
> > pidfile again while the first process is still running, the lock
> > acquisition will fail and we'll know the other process is still alive,
> > and therefore avoid starting a second instance.
>
> No, no, you got me wrong. The pidfile is left locked after cron stopped
> running (with /etc/rc.d/cron stop). This behaviour must be wrong.

OK, I misunderstood.  The rc.d script signals cron to kill it, which
should close its file descriptors and cause rpc.lockd to release the
lock.  Perhaps this part is broken.  OK, I tested this with daemon -p,
and it does indeed seem to be broken:

haessal# daemon -p pid_file sleep 100000
haessal# kill -KILL `cat pid_file`
haessal# ps -p `cat pid_file`
  PID  TT  STAT      TIME COMMAND
haessal# lockf -t 0 pid_file echo Yay
lockf: pid_file: already locked
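
For anyone who wants to check the same property without daemon(8) and
lockf(1), here is a minimal Python sketch.  It is not the pidfile(3)
code: it uses fcntl.flock() as a stand-in for the exclusive,
non-blocking lock, and the helper names try_lock and
lock_released_after_kill are made up for illustration.  A child takes
the lock and is killed with SIGKILL; the parent then retries the lock.
On a healthy filesystem the second value should be True; the "already
locked" result above would correspond to False.  Whether flock() locks
go through rpc.lockd on a given NFS mount is an assumption to verify.

```python
import fcntl
import os
import signal


def try_lock(path):
    """Return True if a non-blocking exclusive lock on path can be taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Someone else holds the lock; we did not wait for it.
        return False
    finally:
        os.close(fd)  # closing the descriptor drops any lock we took


def lock_released_after_kill(path):
    """Hold a lock in a child, SIGKILL it, return (held_before, free_after)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: take the lock, tell the parent, then wait to be killed.
        os.close(r)
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        fcntl.flock(fd, fcntl.LOCK_EX)
        os.write(w, b"x")
        signal.pause()
        os._exit(0)
    # Parent: wait until the child reports that it holds the lock.
    os.close(w)
    os.read(r, 1)
    os.close(r)
    held_before = not try_lock(path)
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)  # the child is gone, so its descriptors are closed
    free_after = try_lock(path)
    return held_before, free_after
```

Pointing path at a file on the affected NFS mount would be the rough
analogue of the test above.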

> > There is a (known) lockd bug here though, which you isolated:
> >
>
> So, this really is bin/80389?

No, I don't think so.  The missing ability to cancel locking requests
(i.e. unkillable process while blocked on a lock) has never been
implemented in FreeBSD's rpc.lockd (I'm not aware of a PR about it, so
I filed my own earlier tonight), and the problem above might be a
separate regression.

> I am a bit disappointed. First, this problem didn't cause me trouble before
> I went to 6-STABLE, now I must either disable cron or disable locking (which
> I can't).
> And I'm still not completely convinced. That problem, if I understand correctly,
> existed before January...

The pidfile_*() functions are new; before that, the pidfile handling
was done differently.

> There are two things...
> - cron.pid shouldn't be locked after cron terminated. (this interaction was
> fully saved as http://mega.ist.utl.pt/~mlsr/nfs-nofile.bin)

Actually the locking isn't traced here; I misunderstood how it works,
and the lock transactions are done on another UDP port.  You have to
use rpcinfo to figure out which one it is, since it varies.  Anyway,
the above sequence reproduces it.
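
As a rough illustration of the rpcinfo step, the sketch below picks the
lock manager's ports out of `rpcinfo -p` output.  It assumes the usual
five-column listing (program, vers, proto, port, service) and matches
on RPC program number 100021 (nlockmgr).  The function name and the
sample text are invented for the example, not taken from a real host.

```python
# Hypothetical sample in the shape of `rpcinfo -p` output; the port
# numbers are made up and will differ on a real machine.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  rpcbind
    100000    2   udp    111  rpcbind
    100021    1   udp    953  nlockmgr
    100021    4   udp    953  nlockmgr
    100021    4   tcp    627  nlockmgr
"""


def nlockmgr_ports(rpcinfo_output, proto="udp"):
    """Return the sorted ports registered for program 100021 on proto."""
    ports = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Expected columns: program, vers, proto, port, service.
        if len(fields) >= 5 and fields[0] == "100021" and fields[2] == proto:
            ports.add(int(fields[3]))
    return sorted(ports)
```

The resulting port is the one to capture on when tracing the lock
traffic, rather than the NFS port alone.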

> - cron shouldn't hang on startup just because the file is locked, since
> pidfile_open opens it with O_NONBLOCK (unlike lockf).

I haven't been able to reproduce this, e.g. lockf -t 0 does O_NONBLOCK
locking and works correctly when the file is already locked.  Perhaps
it's another locked file (not the pidfile) that was also leaked in the
same way, and is being opened without O_NONBLOCK.

> - cron shouldn't hang in such a way that it is not killable... (and should
> not also the open system call in lockf be interruptible?)

This is the bug (really: missing feature) that I described in my
previous mail.

Kris


