Date:      Sun, 17 Feb 2002 14:58:35 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        arch@FreeBSD.ORG, jhb@FreeBSD.ORG
Subject:   Re: revised buildworld comparison stable vs current
Message-ID:  <3C70359B.54867FBC@mindspring.com>
References:  <200202170818.g1H8ID067573@apollo.backplane.com> <200202171824.g1HIOnw71118@apollo.backplane.com>

Matthew Dillon wrote:
> 
>     (note: source is NFS mounted.  /usr/obj is local disk.  Witness is
>     turned off on all -current builds).
> 
>     stable:                     1800 seconds
>     current/invariants:         2219 seconds
>     current/no-invariants       2142 seconds


Basically, you aren't going to win back the lock overhead
unless you also get the increased concurrency.

When SVR4 ES/MP -- SVR4 4.2, also known as UnixWare 2.x --
went to SMP compiled by default for the UP systems, there
was a performance improvement of 30% overall over the
previous UP-only kernel.  This was even with the network
speed losses from going to the ODI drivers (these losses
were not inconsiderable; if I owned SCO like Caldera does
now, that's the first thing I'd revert, for another 15%).

The reason this happened was because of the increased
concurrency, even in the UP case, from work to make
UFS (FFS) reentrant, the TCP stack reentrant, etc..

For this particular "benchmark", the places where there
are natural stall points are:

1)	FS reentrancy on a directory.

2)	Stalling once a buffer is scheduled for write,
	and goes read-only, but stays in the pipeline
	for a long time.

The first one is pretty obvious; basically, moving to
the FS owning the vnodes, rather than getting them from
a system pool, and getting rid of the VOP_LOCK that's
required by the second chance cache and the vnode
reclaimer in the system, are enough to increase the
concurrency there.

The second one is subtler, but easier to address.  Intention
mode locking would reduce the stall, but typical usage in a
compile situation means that the win there would not be as
large as you might naively expect.  The problem is that
blocking operations on the buffers *only* once they have
gone to the driver, rather than once they've entered the
BIO system, doesn't stop the final stall on the object file
creation.  The way to deal with that (if you are unwilling
to change the directory layout management) is to lock
directory blocks individually, and to add additional
directory blocks, up to a small number, rather than blocking
writes to the entire directory while a single block write
is in progress.  By treating directory blocks as atomic,
and tolerating a little bloat by allowing creates in a new
block while creates in the prior block are concurrently
prohibited (reading the prior block to verify that the file
doesn't already exist is still OK), you should see a 60% or
so reduction in stalls, at least going by my simple
statistics gathering on a buildworld, counting the stalls
on this basis.

Lots of "low hanging fruit" there...

-- Terry
