Date:      Fri, 29 Sep 2006 09:07:19 -0500
From:      Astrodog <astrodog@gmail.com>
To:        "Robert Watson" <rwatson@freebsd.org>
Cc:        current@freebsd.org
Subject:   Re: lockf in installworld -- not a good idea
Message-ID:  <2fd864e0609290707t7e7d6e17g61a09ff5aa10ff3f@mail.gmail.com>
In-Reply-To: <20060929144738.W70454@fledge.watson.org>
References:  <20060929141709.E70454@fledge.watson.org> <20060929134332.GD4776@rambler-co.ru> <20060929144738.W70454@fledge.watson.org>

On 9/29/06, Robert Watson <rwatson@freebsd.org> wrote:
>
>
> On Fri, 29 Sep 2006, Ruslan Ermilov wrote:
>
> > The necessity to run rpc.lockd is documented in the build(7) manpage.
> > Quote:
>
> I think this is a bad idea.  rpc.lockd is one of the most fragile and largely
> broken pieces of the operating system.  Arguably we shouldn't even be shipping
> it.  Making it required for installworld is asking for trouble.
>
> > : installworld     Install everything built by a preceding buildworld step
> > :                  into the directory hierarchy pointed to by make(1) variable
> > :                  DESTDIR.
> > :
> > :                  If installing onto an NFS file system, make sure that
> > :                  rpc.lockd(8) is running on both client and server.  See
> > :                  rc.conf(5) on how to make it start at boot time.
> >
> >> I've noticed an increasing intolerance in our tools for system install and
> >> maintenance to locking not being implemented over the past few years.  I no
> >> longer get working cron on boxes with neither rpc.lockd nor local locking
> >> enabled, for example.  Installworld represents a bigger problem, since I
> >> don't want to have to depend on a completely working rpc chain in order to
> >> installworld, nor depend on running in what would effectively be multiuser
> >> mode.  Surely there's a better fix for this than adding lockf use?
> >>
> > I don't know of a better fix.  Another approach is that mentioned in the
> > commit log and used by NetBSD.  I tried it, and it was very fragile -- it
> > could easily leave the temporary file around, and that would stuck forever
> > another instances.
> >
> > The problem at hand is that multiple instances of install-info(1) are
> > attempting to write to the ${DESTDIR}/usr/share/info/dir file. If you have a
> > better idea, don't hesitate to let me know.  I'd very much like to get rid
> > of that as well.
>
> The basic problem here is that install-info doesn't support parallelism.
> Sounds like we just need to accept that and therefore accept that we don't
> support doing installworld with the -j argument.  A middle-ground solution
> would be to only use lockf if -j is used.


I'd be concerned that with this approach we could see some rather
hard-to-diagnose problems if rpc.lockd broke silently during the install
but the install continued.
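
To make the middle-ground idea quoted above concrete, here is a rough sketch
in Python of "only take the lock when running in parallel". It is not the
actual Makefile change or the lockf(1) utility: add_dir_entry() and its
"parallel" flag are hypothetical names standing in for one install-info(1)
run and for "-j was given", and Python's fcntl.lockf() (a POSIX advisory
lock) stands in for whatever lock the real tool takes.

```python
import fcntl
import os


def add_dir_entry(dir_path, entry, parallel):
    """Append `entry` to the shared index file unless it is already there.

    Hypothetical helper: it stands in for a single install-info(1) run.
    Returns True if the entry was written, False if it was already present.
    """
    # Open read/write, creating the index file if it does not exist yet.
    fd = os.open(dir_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        if parallel:
            # Exclusive advisory lock; blocks until other holders release it.
            # Assumption: on NFS this is the step that depends on a working
            # lock daemon, which is the concern raised in this thread.
            fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            lines = f.read().splitlines()
            if entry in lines:
                return False
            f.seek(0, os.SEEK_END)
            f.write(entry + "\n")
            # Flush before the lock is released so others see the new entry.
            f.flush()
            return True
        finally:
            if parallel:
                fcntl.lockf(f, fcntl.LOCK_UN)
```

With parallel set to False no lock call is made at all, so a serial install
never touches the locking machinery. With it set to True, every writer
queues on the same file, and a lock service that fails partway through is
exactly the hard-to-diagnose case I mean.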

Personally, I find it much, much more important to be able to do an
installworld from a "real" single user mode via NFS than to support -j.
I don't think I've ever had a circumstance where I really needed make
installworld to finish quickly. Besides, if installworld makes significant
use of locks, -j doesn't buy much performance anyway.

Another thought: in my experience, installworld is disk or network I/O
bound, not CPU bound. Under those circumstances we may find that -j actually
reduces performance, in which case I can see no reason to support it.

For what it's worth, I'd go with just dropping support for -j in
installworld, even if things are CPU bound.

--- Harrison Grundy


