From: Peter Jeremy
To: "David G. Lawrence"
Cc: freebsd-stable@freebsd.org
Date: Sun, 25 May 2003 07:54:27 +1000
Subject: Re: 4GB limit with netstat
Message-ID: <20030524215427.GA27340@cirb503493.alcatel.com.au>
In-Reply-To: <20030522094306.GD60352@nexus.dglawrence.com>

On Thu, May 22, 2003 at 02:43:06AM -0700, David G. Lawrence wrote:
> I've forgotten the original discussion last year - just how expensive
> is it again to do a locked 64bit update on x86?
> If it is less than, say,
> 8x the time to do a 32bit increment, then we should probably just bite
> the bullet and do it for the few counters where it makes sense (input
> and output bytes and packets).

On a 386 or 486, you can't do it (there is no cmpxchg8b) - but since
we don't support SMP on those CPUs, "addl %eax,counter; adcl %edx,counter+4"
would work as long as the counter was not updated or referenced at
interrupt level (and a cli/sti pair around the two instructions would
fix the interrupt problem without spoiling interrupt latency too much).

On Pentium-and-above, you need to use a locked cmpxchg8b in a loop:

        movl    update,%esi         # %esi = 32-bit increment
1:      movl    %esi,%ebx           # %ecx:%ebx = increment, zero-extended
        xorl    %ecx,%ecx
        movl    counter,%eax        # %edx:%eax = current (possibly torn) value
        movl    counter+4,%edx
        addl    %eax,%ebx           # %ecx:%ebx = current + increment
        adcl    %edx,%ecx
        lock cmpxchg8b counter      # store only if counter still == %edx:%eax
        jnz     1b                  # counter changed underneath us: retry

I'm not sure how much slower this is than

        lock addl %esi,counter

but it's definitely a lot more code.  Ignoring the 'lock', 8x slower
would seem very optimistic, but I think Matt once pointed out that the
lock prefix is incredibly expensive on SMP, so the overall performance
degradation might not be so bad.

Of course, there's the second-order effect of needing 4 additional
work registers - which leaves only a single register unused by the
operation.  This means the compiler has to spill virtually all
temporaries, further degrading performance.

Both approaches also suffer from the problem that there's no easy way
to atomically load a 64-bit value.  This means you need to lock reads
as well as writes - which will significantly increase overall
complexity and reduce performance.

Peter
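
For readers who find the retry loop easier to follow in C, here is a
minimal sketch of the same compare-and-swap idea using C11 <stdatomic.h>.
C11 postdates this thread, and the names counter_add, counter_read and
if_ibytes are hypothetical, chosen only for illustration.  On 32-bit x86
built for Pentium or later, compilers will typically lower the loop to a
lock cmpxchg8b much like the assembly above, but that is an assumption
about the toolchain, not something the sketch guarantees.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit byte counter; stands in for "counter" above. */
_Atomic uint64_t if_ibytes;

/* Add a 32-bit increment to a 64-bit counter with a CAS retry loop. */
void counter_add(_Atomic uint64_t *counter, uint32_t update)
{
    /* Initial snapshot; it may be stale by the time we try to store. */
    uint64_t old = atomic_load_explicit(counter, memory_order_relaxed);

    /*
     * Try to replace `old` with `old + update`.  On failure the call
     * writes the value it actually found back into `old` (much as
     * cmpxchg8b reloads %edx:%eax), so the sum is recomputed on retry.
     */
    while (!atomic_compare_exchange_weak_explicit(
               counter, &old, old + update,
               memory_order_relaxed, memory_order_relaxed))
        ;
}

/* Readers go through the atomic API too, so they never see a torn value. */
uint64_t counter_read(_Atomic uint64_t *counter)
{
    return atomic_load_explicit(counter, memory_order_relaxed);
}
```

Calling counter_add(&if_ibytes, nbytes) from the packet path and
counter_read(&if_ibytes) from the statistics path would mirror the
split between writers and readers discussed above; the cost of the
read side is hidden inside atomic_load rather than eliminated.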