Date:      Mon, 4 Oct 2010 20:17:08 -0700
From:      Garrett Cooper <gcooper@FreeBSD.org>
To:        Daichi GOTO <daichi@ongs.co.jp>
Cc:        freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject:   Re: fcntl always fails to delete lock file, and PID is always -6464
Message-ID:  <AANLkTi=w5ZAfRymSYbL6X37uyYX17J2dW8LHVcPXZ_%2Bb@mail.gmail.com>
In-Reply-To: <20101005093826.17432b1e.daichi@ongs.co.jp>
References:  <20101004123725.65d09b9e.daichi@ongs.co.jp> <AANLkTinZg3n3wDUzQFPv_Gq1o2hswGL3%2B4o0brmTi0-h@mail.gmail.com> <20101004144927.36822f07.daichi@ongs.co.jp> <AANLkTimVcLVdULyAAJD-_TaC5OLj%2BaZVNa=%2BSaiN6PKv@mail.gmail.com> <20101005093826.17432b1e.daichi@ongs.co.jp>

On Mon, Oct 4, 2010 at 5:38 PM, Daichi GOTO <daichi@ongs.co.jp> wrote:
> On Mon, 4 Oct 2010 07:19:45 -0700
> Garrett Cooper <gcooper@FreeBSD.org> wrote:
>> >> issues that might be occurring with the software, as per my copy of
>> >> SUSv4 (see the ERRORS section of fcntl). I would print out the
>> >> strerror for that case.
>> >>     Providing a backtrace of the application's execution and the
>> >> architecture and what version of FreeBSD you're using would be
>> >> helpful.
>>
>>     I'm not even getting that far. Logs attached from both runs
>> (WITH_DEBUG_CODE and WITHOUT_DEBUG_CODE).
>
> Yeah, it looks like the same situation.
>
>  1) mozc_server was killed
>       the lock file remains (even though it should be removed)
>  2) mozc_server tries to start
>    1. it checks whether the lock file exists
>    2. the lock file exists, so it cannot get the lock via fcntl
>    3. a lock file means another mozc_server is running,
>       so mozc_server aborts startup and exits

    OK, weird. fstat on the file didn't show anything nasty when I
ran the app, and deleting the file in /tmp let the server run for a
while before dying, rather than dying immediately as it did on the
second try.

> The cause of the problem is that the kernel does not remove the lock
> file after mozc_server is killed. The Mozc developer explained to me
> that fcntl will remove the lock file after the process is killed. But
> it looks like fcntl() does not remove the lock file itself. According
> to the FreeBSD fcntl(2) manual:
>
>     All locks associated with a file for a given process are removed when the
>     process terminates.
>
> It says nothing about removing the lock file. Does FreeBSD's fcntl(2)
> not remove the lock file after the process is killed?  According to
> the Mozc developer, the Linux kernel removes lock files after the
> process is killed.

    On Linux (RHEL 4.8):

Window 1:
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory
$ ./test_fcntl

Window 2:

$ ls -l /tmp/lockfile
--wxr-s--T  1 garrcoop eng 0 Oct  4 19:49 /tmp/lockfile
$ ./test_fcntl
test_fcntl: fcntl: Resource temporarily unavailable

Ok. This (EAGAIN) matches the Linux requirements specified in the
manpage [1] I found, as well as the POSIX manpage [2]. The author is
wrong about fcntl removing the file at exit though:

$ ls -l /tmp/lockfile
--wxr-s--T  1 garrcoop eng 0 Oct  4 19:49 /tmp/lockfile

The file descriptor is closed though, so I can remove it at will:

$ rm /tmp/lockfile
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory

Following through the same process on FreeBSD...

Window 1:
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory
$ ./test_fcntl

Window 2:

$ ls -l /tmp/lockfile
-rwsr-x---  1 garrcoop  wheel  0 Oct  4 20:14 /tmp/lockfile
$ ./test_fcntl
test_fcntl: fcntl: Resource temporarily unavailable

Well, lookie here! It locked as expected :).

$ ls -l /tmp/lockfile
-rwsr-x---  1 garrcoop  wheel  0 Oct  4 20:14 /tmp/lockfile
$ rm /tmp/lockfile
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory

So something else is going on with the application that needs to be
resolved in that area.
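
For anyone who wants to reproduce the observation above in a single program, here is a minimal sketch. It assumes POSIX fcntl(2) record locks on a local filesystem; the helper names try_lock and lock_released_but_file_kept are made up for illustration and are not taken from mozc or the attached test app. A child takes the lock and is killed, then the parent checks that the file is still present and that the lock can be acquired again.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking write lock on the whole file. */
static int
try_lock(int fd)
{
	struct flock fl = { 0 };

	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "through end of file" */
	return (fcntl(fd, F_SETLK, &fl));
}

/*
 * Hypothetical helper: returns 1 if, after the lock holder is killed,
 * the file still exists and the lock can be taken again; 0 if not;
 * -1 on a setup error.
 */
static int
lock_released_but_file_kept(const char *path)
{
	int fd, pfd[2], result = 0;
	pid_t pid;
	char c;

	if (pipe(pfd) == -1)
		return (-1);
	if ((pid = fork()) == -1)
		return (-1);
	if (pid == 0) {
		/* Child: create and lock the file, notify parent, wait. */
		close(pfd[0]);
		fd = open(path, O_CREAT | O_WRONLY, 0600);
		if (fd == -1 || try_lock(fd) == -1)
			_exit(1);
		if (write(pfd[1], "x", 1) != 1)
			_exit(1);
		for (;;)
			pause();
	}
	close(pfd[1]);
	if (read(pfd[0], &c, 1) != 1)
		result = -1;	/* child failed before taking the lock */
	close(pfd[0]);
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	if (result == -1)
		return (-1);

	/* The kernel dropped the lock, but nothing unlinked the file. */
	if (access(path, F_OK) == 0 && (fd = open(path, O_WRONLY)) != -1) {
		if (try_lock(fd) == 0)
			result = 1;
		close(fd);
	}
	return (result);
}
```

Calling lock_released_but_file_kept("/tmp/lockfile") should return 1 on a system that behaves like the runs shown above. An application that treats the mere existence of the file as "another instance is running" would therefore need to unlink it itself, or rely on the lock alone.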

With that aside though, after modifying the test app a bit, I'm
confused at the value of l_pid...

Window 1:
$ ./test_fcntl
My pid: 5372

Window 2:
$ ./test_fcntl
My pid: 5373
test_fcntl: fcntl: Resource temporarily unavailable
PID=1 has the lock

    Huh...? init has the file locked...? WTF?!
    So assuming Occam's Razor, I did a bit more reading and it turns
out that l_pid is only populated when you call with F_GETLK:

     negative, l_start means end edge of the region.  >>> The l_pid and l_sysid
     fields are only used with F_GETLK to return the process ID of the process
     holding a blocking lock and the system ID of the system that owns that
     process.  Locks created by the local system will have a system ID of
     zero.  <<< After a successful F_GETLK request, the value of l_whence is
     SEEK_SET.

    Thus, after fixing the test app I'm getting a sensible value:

Window 1:
$ ./test_fcntl
My pid: 5394

Window 2:
$ ./test_fcntl
My pid: 5395
test_fcntl: fcntl[1]: Resource temporarily unavailable
PID=5394 has the lock

Linux operates in the same manner:

Window 1:
$ ./test_fcntl
My pid: 17861

Window 2:
$ ./test_fcntl
My pid: 17963
test_fcntl: fcntl[1]: Resource temporarily unavailable
PID=17861 has the lock

    Which I would expect because I'm not using anything exotic with
fcntl(2) / open(2).
    I suspect mozc isn't properly initializing / calling fcntl(2), or
the author used a non-POSIX extension that is implementation dependent
and doesn't realize it (the Linux manpage has a pretty fat set of
warnings about POSIX compatibility up at the top of the manpage). The
developer might also want to pass O_EXCL in the flags to open(2),
unless they want to lock specific sections in the file.
    Verified on UFS2 with SUJ. Test app attached.
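
As a rough illustration of the O_EXCL suggestion (a sketch only; create_lockfile_excl is a made-up name and the 0600 mode is an arbitrary choice, neither comes from mozc): with O_CREAT | O_EXCL, open(2) fails with EEXIST if the file is already there, so creation itself acts as the test.

```c
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create path exclusively. Returns an open
 * descriptor, or -1 with errno set (EEXIST if the file already exists).
 */
static int
create_lockfile_excl(const char *path)
{
	/* O_CREAT takes a third, mode argument; 0600 is an assumption. */
	return (open(path, O_CREAT | O_EXCL | O_WRONLY, 0600));
}
```

Two caveats. A file created this way is not removed when the process is killed, so the application has to handle a stale file itself. Also, open(2) with O_CREAT expects the mode argument; leaving it out gives unspecified permission bits, which would explain modes like --wxr-s--T in the listings above.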

>> $ uname -a
>> FreeBSD bayonetta.local 9.0-CURRENT FreeBSD 9.0-CURRENT #9 r211309M:
>> Thu Aug 19 22:50:36 PDT 2010
>> root@bayonetta.local:/usr/obj/usr/src/sys/BAYONETTA  amd64
>>
>>     I completely blasted past the part of your reply above where you
>> said your home directory is served up via NFS. It might be a problem
>> if you don't have lockd running (/etc/rc.d/lockd onestatus ? It isn't
>> enabled by default, and definitely isn't running on my machine) or the
>> mount isn't set up with lockd on the client side (nolockd will do this
>> on the initial mount, according to the manpage). There might be
>> `dragons' in the nfsd code that fail to do locking properly, but I
>> think that Rick (rmacklem@) or someone else on the list might be
>> better at answering whether or not things work from an NFS
>> perspective.
>
> server side:
>  FreeBSD 7.3-PRERELEASE #0: Mon Mar  1 15:10:07 JST 2010 i386
>  rc.conf
>    nfs_server_enable="YES"
>    mountd_enable="YES"
>    nfs_reserved_port_only="YES"
>    rpc_lockd_enable="YES"
>    rpc_statd_enable="YES"
>
> client side:
>  FreeBSD 9.0-CURRENT #6 r213257: Thu Sep 30 10:30:06 JST 2010 amd64
>  rc.conf:
>    nfs_client_enable="YES"
>    nfs_reserved_port_only="YES"
>    rpc_lockd_enable="YES"
>    rpc_statd_enable="YES"
>
>>     I'd definitely divulge which version of NFS you're using as well
>> as what your NFS server and client are running if enabling lockd both
>> client and server side doesn't solve your problems right away.

[...]

> I also tested with ZFS because I doubted that NFS was working
> correctly, but the result was the same. (I didn't test with UFS.)
>
> Thanks for the truss output!

No problem :).

Cheers,
-Garrett

[1] http://linux.die.net/man/2/fcntl
[2] http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html

Attachment: test_fcntl.c

#include <sys/types.h>
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	struct flock flock;
	int error, fd;

	fd = open("/tmp/lockfile", O_CREAT|O_WRONLY);
	if (fd == -1)
		err(1, "open");

	printf("My pid: %d\n", getpid());

	flock.l_type = F_WRLCK;
	flock.l_start = 0;
	flock.l_whence = SEEK_SET;
	flock.l_len = 0;

	if (fcntl(fd, F_SETLK, &flock) == -1) {
		error = errno;
		warn("fcntl[1]");
		if (error == EAGAIN) {
			if (fcntl(fd, F_GETLK, &flock) == -1)
				err(1, "fcntl[2]");
			printf("PID=%d has the lock\n", flock.l_pid);
		}
		return (1);
	}

	while (1)
		sleep(1);

	return (0);

}


