Date: Fri, 18 Jan 2002 14:25:43 -0800
From: Terry Lambert
To: Michal Mertl
Cc: arch@FreeBSD.ORG
Subject: Re: 64 bit counters again

Michal Mertl wrote:
> That's explaining only CPU overhead which I knew there is some.

Yes, the question is whether or not it will impact anything, and to
know that for an arbitrary application, you need to quantify "some".

> > The additional locks required for i386 64 bit atomicity will,
> > if the counter is accessed by more than one CPU, result in
> > bus contention for inter-CPU coherency.
>
> What additional locks? The lock prefix for cmpxchg8b? It's required
> for 32 bit too, and it increases the time spent on the operation from
> 3 to 21 clocks, making the difference between 32 and 64 bit "only"
> 29 clocks instead of 47.

The additional locks on PPC, SPARC, and Alpha.

The lock is also a barrier instruction.  You need to read the Intel
programming guide on barrier instructions.  On a P4, it will
effectively stall two of the three pipelines.

On standard SMP systems, it will cause a coherency cycle if the data
being changed is in a cache line on another CPU.  With ithreads, this
is very likely the case, since the interrupt is not necessarily
likely to maintain affinity for a single CPU (at least until the
scheduler gets fixed).

Invalidation cycles happen at the memory bus speed.  Most fast memory
busses these days are 133MHz (I've seen 233MHz, but those parts are
hard to find), and for a 2GHz CPU clock that's a factor of 15 for the
133MHz parts and a factor of 8 for the 233MHz parts.

> > > What do you mean by that? Zero-copy operation? Like sendfile?
> > > Is Apache 1.x zero-copy?
> >
> > Yes, zero copy.  Sendfile isn't ideal, but works.  Apache is
> > not zero copy.  The idea is to not include a lot of CPU work
> > on copies between the user space and the kernel, which aren't
> > going to happen in an extremely optimized application.
>
> An "extremely optimized" application is a thing which would have
> an administrator who doesn't enable costly counters.

No.  If we are talking about a BSD-based embedded system, then it's
just one written by someone who was not playing at being an engineer
(assuming the performance requirements were there; otherwise, they're
just an engineer who went after the low hanging fruit, and it's a
legitimate design decision).

It's incredibly easy to get zero copy, if you are willing to make
minor kernel changes to get it.  It's a little harder otherwise.
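For the unmodified-kernel case, the user space side is just
sendfile(2).  A minimal, untested sketch of what that looks like
(send_whole_file() is a made-up name, and real code needs much
better error handling):

/*
 * Untested sketch: push an entire file down a connected socket using
 * FreeBSD's sendfile(2), so the data never passes through user space.
 * send_whole_file() is invented for this example.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/stat.h>
#include <errno.h>

static int
send_whole_file(int filefd, int sockfd)
{
	struct stat sb;
	off_t off = 0;
	off_t sent;

	if (fstat(filefd, &sb) == -1)
		return (-1);

	while (off < sb.st_size) {
		sent = 0;
		/* nbytes == 0 means "send to the end of the file". */
		if (sendfile(filefd, sockfd, off, 0, NULL, &sent, 0) == -1) {
			if (errno == EAGAIN || errno == EINTR) {
				off += sent;	/* partial progress; retry */
				continue;
			}
			return (-1);
		}
		off += sent;
	}
	return (0);
}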
Note that the NFS code is zero copy (I don't know if the patches have
been rolled into -current yet; I haven't been watching), and sendfile
is technically zero copy, although it has drawbacks you have to worry
about (there are several DOS attacks you can use if you know the
server is written to use sendfile; it practically requires that you
store the files to be sent with CRLF termination, so you might as
well be running MS-DOS, and converting programs to expect that as the
line terminator is annoying, to say the least; and it has a high
relative system call overhead, compared to other approaches).

> > Well, you probably should collect *all* statistics you can,
> > in the most "this is the only thing I'm doing with the box"
> > way you can, before and after the code change, and then plot
> > the ones that get worse (or better) as a result of the change.
>
> Will do eventually, but unfortunately don't have the time to devote
> to it at the moment.

I think it's a requirement to advocate this change.

[ ... ]

> > I think the answer is "yes, we need atomic counters".  Whether
> > they need to be 64 bit or just 32 bit is really application
> > dependent (we have all agreed to that, I think).
>
> Thanks.  Do you think it's always true (STABLE/CURRENT, network
> device ISRs, /sys/netinet routines)?

I think it's true of all open-ended counters, where there is a risk
of overflow if they are 32 bit, and some application could be bitten
by the overflow and still be considered "well written"... in other
words, anywhere overflow is *expected*.

> > See Bruce's posting about atomicity; I think it speaks very
> > eloquently on the issue (much more briefly than what I'd write
> > to say the same thing ;^)).
>
> If you mean the email where he talks about atomic_t ("atomic_t would
> be "int" if anything"), it doesn't fully apply.  I am not inventing
> atomic_t anymore anyway :-).  Isn't there a platform which works
> better with 64 bit ints than with 32 bit (a la 32/16 bit on modern
> i386)?

Yes.  IA64.

SPARC V9 (SPARC64) and Alpha, which are 64 bit, require locks, since
they don't have the ability to do an atomic "lock; cmpxchg8b".

An atomic_t is a moderately good idea in this regard; for it to be
anything other than a simple "int", though, I think it would require
that the value be read and written through accessors/mutators in
macro form, so that the overhead on the platforms that need it
(everything but IA64, effectively -- I don't think this point has
been communicated strongly enough!) could be masked on the platforms
that don't have nearly as high an overhead for the operations.  A
rough sketch of what I mean is appended below.

-- Terry
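Roughly what I have in mind -- an untested sketch, not a patch.  The
names cmpset64(), counter64_add() and COUNTER64_ADD() are made up for
illustration; only the i386 leg is spelled out, and the other leg
just assumes whatever 64 bit atomic op the platform's
<machine/atomic.h> provides:

/*
 * Untested sketch: a 64 bit counter hidden behind accessor/mutator
 * macros.  All names are invented for this example; the i386 path
 * uses "lock; cmpxchg8b" and assumes the counter is naturally
 * aligned.
 */
#include <sys/types.h>
#include <machine/atomic.h>

#ifdef __i386__
/* Atomically set *dst = src iff *dst == expect; nonzero on success. */
static __inline int
cmpset64(volatile uint64_t *dst, uint64_t expect, uint64_t src)
{
	u_char res;

	__asm __volatile(
	    "lock; cmpxchg8b %1 ; sete %0"
	    : "=q" (res), "+m" (*dst), "+A" (expect)
	    : "b" ((uint32_t)src), "c" ((uint32_t)(src >> 32))
	    : "memory", "cc");
	return (res);
}

static __inline void
counter64_add(volatile uint64_t *p, uint32_t v)
{
	uint64_t old;

	/*
	 * The plain 64 bit read may tear on i386, but that's harmless:
	 * cmpxchg8b compares all 64 bits atomically, so a stale value
	 * just fails the compare and we go around again.
	 */
	do {
		old = *p;
	} while (cmpset64(p, old, old + v) == 0);
}

#define	COUNTER64_ADD(p, v)	counter64_add((p), (v))
#else
/*
 * 64 bit platforms: the native 64 bit atomic op (or a plain add, if
 * the counter is only ever touched from one CPU or under an existing
 * lock).
 */
#define	COUNTER64_ADD(p, v)	atomic_add_64((p), (v))
#endif

Callers would only ever say COUNTER64_ADD(&counter, 1) and never see
the per-platform difference; reads on i386 need the same cmpxchg8b
trick if you want a tear-free snapshot of the value.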