Date:      Wed, 5 Jul 2006 16:20:05 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Robert Watson <rwatson@freebsd.org>
Cc:        freebsd-stable@freebsd.org, Michel Talon <talon@lpthe.jussieu.fr>
Subject:   Re: NFS Locking Issue
Message-ID:  <20060705132005.GP37822@deviant.kiev.zoral.com.ua>
In-Reply-To: <20060705140225.X18236@fledge.watson.org>
References:  <E1FxzUU-000MMw-5m@cs1.cs.huji.ac.il> <20060705100403.Y80381@fledge.watson.org> <20060705113822.GM37822@deviant.kiev.zoral.com.ua> <20060705122040.GN37822@deviant.kiev.zoral.com.ua> <20060705140225.X18236@fledge.watson.org>


On Wed, Jul 05, 2006 at 02:04:59PM +0100, Robert Watson wrote:
>
> On Wed, 5 Jul 2006, Kostik Belousov wrote:
>
> >>Also, both lockd processes now put identification information in the
> >>proctitle (srv and kern). SIGUSR1 should be sent to the srv process.
> >
> >Hmm, after looking at the dump there and reading some code, I have noted
> >the following:
> >
> >1. The NLM lock request contains the field caller_name. It is filled in by
> >(let's call it) the kernel rpc.lockd with the result of hostname(3).
> >
> >2. This caller_name is used by the server rpc.lockd to send a host
> >monitoring request to rpc.statd (see send_granted). The request is made
> >with clnt_call, which is a blocking RPC call.
> >
> >3. rpc.statd calls getaddrinfo on caller_name to determine the address of
> >the host to monitor.
> >
> >If the getaddrinfo in step 3 waits for the resolver, then the locking
> >process on your client machine will be stuck in the "lockd" state.
> >
> >Could people experiencing the rpc.lockd mystery at least report whether the
> >_server_ machine successfully resolves the hostname of the clients, as
> >reported by hostname? And, if yes, to which IP protocol family?
>
> It's not impossible.  It would be interesting to see if ps axl reports that
> rpc.lockd is in the kqread state, which would suggest it was blocked in the
^^^^^^^^^^^^  rpc.statd :).
> resolver.  We probably ought to review rpc.statd and make sure it's
> generally sensible.  I've noticed that its notification process on start is
> a bit poorly structured in terms of how it notifies hosts of its state
> change -- if one host is down, it may take a very long time to notify other
> hosts.
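
For anyone who wants to answer the resolution question above, here is a
minimal sketch of a check to run on the _server_ machine. It is not taken
from rpc.statd; it only performs the same kind of getaddrinfo lookup on a
client's hostname and reports which address families come back. The
function name resolved_families and the resolver parameter are made up for
this illustration. Note that the lookup has no timeout: if it hangs here,
that is a hint the resolver could stall rpc.statd in the same way.

```python
import socket
import sys

# Readable names for the address families we expect to see.
FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def resolved_families(hostname, resolver=socket.getaddrinfo):
    """Return {family name: sorted addresses} for hostname.

    An empty dict means the name did not resolve. The resolver argument
    is a hypothetical hook so the lookup can be replaced in tests.
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        return {}
    families = {}
    for family, _socktype, _proto, _canonname, sockaddr in results:
        name = FAMILY_NAMES.get(family, str(family))
        # sockaddr[0] is the address string for both IPv4 and IPv6.
        families.setdefault(name, set()).add(sockaddr[0])
    return {name: sorted(addrs) for name, addrs in families.items()}


if __name__ == "__main__":
    # Pass the client hostnames (as printed by hostname on each client).
    for host in sys.argv[1:]:
        found = resolved_families(host)
        if not found:
            print(f"{host}: does not resolve")
        for name, addrs in sorted(found.items()):
            print(f"{host}: {name} {', '.join(addrs)}")
```

Saved under any file name and run on the server with the client hostnames
as arguments, it prints one line per address family, or "does not resolve"
if the lookup fails.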



