Date:      Tue, 3 Apr 2001 19:07:29 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
Cc:        Alfred Perlstein <alfred@FreeBSD.ORG>, cvs-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject:   Re: cvs commit: src/sys/sys mbuf.h src/sys/kern uipc_mbuf.c
Message-ID:  <200104040207.f3427Tw80262@earth.backplane.com>
References:  <200104030315.f333FCX69312@freefall.freebsd.org> <20010403140457.B2952@electricjellyfish.net> <200104031813.f33ID4b58965@earth.backplane.com> <20010403194004.A15434@technokratis.com> <200104040020.f340Kgi74269@earth.backplane.com> <20010403173529.O12164@fw.wintelcom.net> <200104040106.VAA26103@khavrinen.lcs.mit.edu>


:<<On Tue, 3 Apr 2001 17:35:29 -0700, Alfred Perlstein <alfred@FreeBSD.org> said:
:
:> While this is a good idea, it doesn't give us a consistent view of
:> the stats without additional atomic ops or critical regions.
:
:Atomic operations are likely to be cheaper on modern platforms than
:locking.  In any case, you can simply keep per-CPU stats and then
:summarize when they are requested, which is even cheaper, since the
:stats are updated FAR more frequently than they are inspected.
:
:-GAWollman

    A per-cpu variable that is only manipulated by that cpu does not even
    need to be bus-locked (i.e. not even a 'lock' prefix is required for
    i386).   For a counter, a simple 'incl' or 'addl' type of instruction
    is sufficient.  In order of expense, for an i386:

	VERY FAST	normal (per-cpu) read-modify-write instruction,
			no cache contention.  (incl, addl, etc...)

	SLOW		bus-locked instruction, no cache contention.
			(lock; addl ...)

	EXTREMELY SLOW	bus-locked instruction, cache contention.
			(lock; addl ...)

    In any case, people should keep in mind that the whole point of using
    a per-cpu variable in this case is to avoid *ALL* locking requirements...
    avoid the mutexes, AND avoid any bus locking.  The moment you do either
    you might as well throw in the towel and not bother. 
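
    A minimal user-space sketch of the per-cpu counter idea, not the actual
    kernel code.  All names here (NCPU, CACHE_LINE, pcpu_stat, mb_allocs,
    stat_inc, stat_sum) are hypothetical, and the sketch assumes each slot
    has exactly one writer that cannot migrate to another cpu in the middle
    of an update.  The increment is a relaxed load followed by a relaxed
    store rather than an atomic read-modify-write, so an x86 compiler has
    no reason to emit a 'lock' prefix for it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NCPU       4    /* hypothetical cpu count */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One counter per cpu, aligned so that no two cpus share a cache line. */
struct pcpu_stat {
    _Alignas(CACHE_LINE) _Atomic uint64_t count;
};

static struct pcpu_stat mb_allocs[NCPU];

/*
 * Fast path: only the owning cpu ever writes its slot, so a plain
 * load/add/store is enough.  No mutex and no atomic read-modify-write.
 */
static void
stat_inc(unsigned cpu)
{
    assert(cpu < NCPU);
    uint64_t v = atomic_load_explicit(&mb_allocs[cpu].count,
        memory_order_relaxed);
    atomic_store_explicit(&mb_allocs[cpu].count, v + 1,
        memory_order_relaxed);
}

/*
 * Rare path: summarize on request.  The total is a snapshot that may
 * lag updates in flight on other cpus.
 */
static uint64_t
stat_sum(void)
{
    uint64_t total = 0;

    for (unsigned i = 0; i < NCPU; i++)
        total += atomic_load_explicit(&mb_allocs[i].count,
            memory_order_relaxed);
    return total;
}
```

    The reader pays for the loop over all cpus, which matches the point
    made in the quoted text: the stats are updated far more often than
    they are inspected.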

    But if you do it right, then whatever 'slow' cases remain (e.g. no mbufs
    on the per-cpu free list) can be implemented with simple global mutexes
    and no special optimizations.  In fact, any serious effort towards 
    optimizing the slow case when you have a fast case fronting it is nothing
    but a waste of time.  Don't add complex optimizations where they aren't
    needed.
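
    The fast-case/slow-case split might be sketched as follows.  This is a
    user-space illustration under stated assumptions, not the mbuf
    allocator itself: struct buf, pcpu_free, global_free, buf_alloc and
    buf_free are hypothetical names, a pthread mutex stands in for a
    kernel mutex, and each per-cpu list is assumed to be touched only by
    its owning cpu.  The slow case is deliberately left unoptimized.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define NCPU 4                          /* hypothetical cpu count */

struct buf {
    struct buf *next;
};

static struct buf *pcpu_free[NCPU];     /* each list owned by one cpu */
static struct buf *global_free;         /* shared, guarded by global_mtx */
static pthread_mutex_t global_mtx = PTHREAD_MUTEX_INITIALIZER;

static struct buf *
buf_alloc(unsigned cpu)
{
    assert(cpu < NCPU);

    /* Fast case: pop from the per-cpu list, no locking of any kind. */
    struct buf *b = pcpu_free[cpu];
    if (b != NULL) {
        pcpu_free[cpu] = b->next;
        return b;
    }

    /* Slow case: per-cpu list is empty, take the simple global mutex. */
    pthread_mutex_lock(&global_mtx);
    b = global_free;
    if (b != NULL)
        global_free = b->next;
    pthread_mutex_unlock(&global_mtx);

    /* Global list empty too: fall back to the system allocator. */
    if (b == NULL)
        b = malloc(sizeof(*b));
    return b;
}

/* Freeing always goes to the caller's own list, so it is lock-free too. */
static void
buf_free(unsigned cpu, struct buf *b)
{
    assert(cpu < NCPU);
    b->next = pcpu_free[cpu];
    pcpu_free[cpu] = b;
}
```

    A real allocator would also cap the per-cpu list and return the excess
    to the global list under the mutex; that is omitted here to keep the
    sketch short.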

						-Matt

