Date:      Thu, 6 May 2004 10:15:44 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Don Bowman <don@sandvine.com>
Cc:        Gerrit Nagelhout <gnagelhout@sandvine.com>
Subject:   RE: 4.7 vs 5.2.1 SMP/UP bridging performance
Message-ID:  <16538.18576.320694.79356@grasshopper.cs.duke.edu>
In-Reply-To: <FE045D4D9F7AED4CBFF1B3B813C85337045D8CB5@mail.sandvine.com>
References:  <FE045D4D9F7AED4CBFF1B3B813C85337045D8CB5@mail.sandvine.com>


Don Bowman writes:
 > 
 > On the P4, there are mfence,lfence,sfence instructions to enforce
 > memory ordering. These are cheaper than "lock; andl" or "cpuid",
 > which are the traditional 'sync' instructions.

For what it's worth, using those operations yields these results
on my 2.53GHz P4 (for UP):

Mutex (atomic_store_rel_int) cycles per iteration: 208 
Mutex (sfence) cycles per iteration: 85 
Mutex (lfence) cycles per iteration: 63 
Mutex (mfence) cycles per iteration: 169 
Mutex (none) cycles per iteration: 18 

lfence looks like a winner..
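
For anyone who wants to try this kind of comparison, here is a rough
sketch of a timing loop, not the actual benchmark behind the numbers
above. Everything in it is an assumption for illustration: the names
(lock_word, release_lock, cycles_per_iteration), the uncontended
single-threaded "take the lock with a plain store" step, and the use of
a locked xchg as a stand-in for atomic_store_rel_int. It assumes GCC or
Clang inline asm on an x86 CPU that has the SSE2 fence instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Which barrier to issue before the store that releases the lock. */
enum barrier {
    BARRIER_NONE,
    BARRIER_LFENCE,
    BARRIER_SFENCE,
    BARRIER_MFENCE,
    BARRIER_XCHG    /* assumed stand-in for atomic_store_rel_int */
};

/* Hypothetical lock word; volatile so the stores are not optimized away. */
static volatile int lock_word;

/* Read the CPU time-stamp counter. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Release the lock, preceded by the selected barrier. */
static inline void release_lock(enum barrier b)
{
    int zero = 0;

    switch (b) {
    case BARRIER_LFENCE:
        __asm__ __volatile__("lfence" ::: "memory");
        lock_word = 0;
        break;
    case BARRIER_SFENCE:
        __asm__ __volatile__("sfence" ::: "memory");
        lock_word = 0;
        break;
    case BARRIER_MFENCE:
        __asm__ __volatile__("mfence" ::: "memory");
        lock_word = 0;
        break;
    case BARRIER_XCHG:
        /* xchg with a memory operand is implicitly locked. */
        __asm__ __volatile__("xchgl %1, %0"
            : "+m"(lock_word), "+r"(zero) : : "memory");
        break;
    case BARRIER_NONE:
        lock_word = 0;
        break;
    }
}

/* Average TSC cycles for one uncontended acquire/release pair. */
double cycles_per_iteration(enum barrier b, unsigned iterations)
{
    uint64_t start, end;
    unsigned i;

    assert(iterations > 0);
    start = read_tsc();
    for (i = 0; i < iterations; i++) {
        lock_word = 1;      /* "acquire": plain store, no contention */
        release_lock(b);
    }
    end = read_tsc();
    return (double)(end - start) / iterations;
}
```

Calling cycles_per_iteration(BARRIER_LFENCE, 1000000) and the other
variants in turn gives one figure per barrier. The absolute values
include loop overhead and depend heavily on the CPU, so only the
relative ordering on a single machine is worth reading anything into.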

Drew


