Date: Tue, 13 Jul 1999 16:33:11 +1000
From: Peter Jeremy
Subject: Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
In-reply-to: <199907130528.WAA74299@apollo.backplane.com>
To: dillon@apollo.backplane.com
Cc: freebsd-current@FreeBSD.ORG
Message-Id: <99Jul13.161525est.40326@border.alcanet.com.au>

Matthew Dillon wrote:
>:[1] A locked instruction implies a synchronous RMW cycle.  In order
>:    to meet write-ordering guarantees (without which, a locked RMW
>:    cycle would be useless as a semaphore primitive), it implies a
>:    complete write serialization, and probably some level of
>:    instruction serialisation.  Since write-back pipelines will get
>
>    A locked instruction only implies cache coherency across the
>    instruction.  It does not imply serialization.  Intel blows it
>    big time, but that's intel for you.

Ooops, looks like foot-in-mouth time for me :-(.  Maybe I should have
said that "without any other cache coherency protocol, you need
serialisation" :-).

Given this correction, the lock degradation is much less than I
suggested.  I suspect there _will_ be gradual degradation though.

>:    longer and parallel execution units more numerous, the cost of
>:    a serialisation operation will get relatively higher.  Also,
>
>    It is not a large number of execution units that implies a higher
>    cost of serialization but instead data interdependancies.

I was thinking more that a locked instruction is inconsistent with
parallel execution, but that's probably not true either.
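To make the "locked RMW cycle as a semaphore primitive" point concrete, here is a minimal sketch of a test-and-set lock built on C11 `atomic_flag`. It is an illustration only, not the FreeBSD kernel's locking code: the names `demo_spinlock`, `DEMO_SPINLOCK_INIT`, `demo_try_lock` and `demo_unlock` are hypothetical, and the assumption that the compiler turns the test-and-set into a locked exchange holds for typical x86 compilers but is not guaranteed by the C standard.

```c
#include <stdatomic.h>

/* Hypothetical names for illustration; this is not FreeBSD kernel API. */
typedef struct {
    atomic_flag held;   /* set while some CPU owns the lock */
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/*
 * One atomic read-modify-write: read the old flag value and set the flag
 * in a single indivisible step.  Returns 1 if the caller acquired the
 * lock, 0 if it was already held.
 */
static inline int demo_try_lock(demo_spinlock *l)
{
    /* test_and_set returns the PREVIOUS value, so "was clear" means "we won" */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Release the lock; release ordering publishes the protected writes. */
static inline void demo_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A caller would declare `demo_spinlock l = DEMO_SPINLOCK_INIT;` and loop on `demo_try_lock(&l)` until it returns 1. The acquire and release orderings are the write-ordering guarantee the quoted text refers to: without them, the flag could change hands before the protected data is visible to the next owner.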

>    Modern cache coherency protocols do not have a problem with
>    a large number of caches in a parallel processing subsystem.

I thought we were talking about Intel :-).

>    So, using the above rules as an example, a locked instruction can cost
>    as little as 0 extra cycles no matter how many cpu's you have running
>    in parallel.  There is no need to serialize or synchronize anything.

Assuming a non-contested access.  If you've got two CPU's fighting over
a lock, then you'll have a bus cycle - and CPU core speeds are
increasing faster than bus speeds.  (486's were normally 1 or 2 times
the bus speed, a PIII-450 is 4.5 times bus speed).

And as you pointed out elsewhere, call/return sequences can't get too
much faster - which suggests that the relative costs should stay fairly
similar.  At least for a well-designed architecture...

Peter
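As a rough worked example of the core-versus-bus argument above: if a contested lock costs some number of bus cycles, the core sits idle for that many bus cycles times the clock multiplier. The helper below is a hypothetical back-of-envelope sketch, assuming the stall is purely bus-bound and ignoring everything else the hardware does; the name `core_cycles_stalled` is made up for illustration.

```c
/*
 * Core clock cycles that elapse while waiting for `bus_cycles` bus cycles,
 * where the core clock runs at `multiplier` times the bus clock.
 * Assumption: the stall is entirely bus-bound.
 */
static inline double core_cycles_stalled(double multiplier, double bus_cycles)
{
    return multiplier * bus_cycles;
}
```

With the figures from the message, one bus cycle costs a 2x 486 two core cycles (`core_cycles_stalled(2.0, 1.0)`), and costs a PIII-450 four and a half (`core_cycles_stalled(4.5, 1.0)`). The same contested access gets relatively more expensive as the multiplier grows, even if the bus itself is no slower.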