From owner-freebsd-hackers Fri Jan 31 09:40:30 1997
Return-Path:
Received: (from root@localhost)
	by freefall.freebsd.org (8.8.5/8.8.5) id JAA03084
	for hackers-outgoing; Fri, 31 Jan 1997 09:40:30 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
	by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA03073
	for ; Fri, 31 Jan 1997 09:40:18 -0800 (PST)
Received: (from terry@localhost)
	by phaeton.artisoft.com (8.6.11/8.6.9) id KAA03009;
	Fri, 31 Jan 1997 10:38:37 -0700
From: Terry Lambert
Message-Id: <199701311738.KAA03009@phaeton.artisoft.com>
Subject: Re: performance puzzler
To: ajones@ctron.com (Alexander Seth Jones)
Date: Fri, 31 Jan 1997 10:38:37 -0700 (MST)
Cc: hackers@freefall.freebsd.org
In-Reply-To: <32F20D0B.6385@ctron.com> from "Alexander Seth Jones" at Jan 31, 97 10:17:31 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> The code consists of protocol messages being encoded into a byte
> stream, and then decoded back into a C++ object. One of the messages is
> 200 bytes in length, and I successfully decode it, do all error
> checking, etc., in about 0.5 milliseconds. This is without -O, and with
> -m486 and -ggdb. The hardware is an Intel 486-66, running
> 2.1.5-RELEASE.
>
> The puzzling thing comes when I try to run the test at home on my AMD
> 486-120, running 2.1.0-RELEASE. It runs the test in 0.6 milliseconds!!

Divide each clock speed by increasing integer values, starting with 1,
until the result is less than or equal to 33. This is the maximum bus
speed possible for the system.

An easy way to do this is magnitude-based arithmetic (yes, I own a
slide rule):

	120 / ceil(120 / 33) = 120 / 4 = 30
	 66 / ceil( 66 / 33) =  66 / 2 = 33

Your bus on the 120 is 3 MHz slower than the bus on the 66.

What you are doing is not I/O bound, it is CPU bound.
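
The divide-by-increasing-integers rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated: the function name max_bus_mhz and its parameters are made up for this example, the 33 MHz ceiling is taken from the text, and a real board may be jumpered to a different bus speed or multiplier.

```python
def max_bus_mhz(cpu_mhz, bus_limit_mhz=33.0):
    """Divide the CPU clock by 1, 2, 3, ... until the result fits the bus limit.

    Returns (divisor, bus_mhz). The 33 MHz default is the ceiling assumed
    in the text; it is not read from any hardware.
    """
    if cpu_mhz <= 0 or bus_limit_mhz <= 0:
        raise ValueError("clock speeds must be positive")
    divisor = 1
    while cpu_mhz / divisor > bus_limit_mhz:
        divisor += 1
    return divisor, cpu_mhz / divisor


if __name__ == "__main__":
    for cpu in (66, 120):
        divisor, bus = max_bus_mhz(cpu)
        print(f"{cpu} MHz CPU: clock divisor {divisor}, bus at most {bus:g} MHz")
```

For the two machines in question this gives a divisor of 2 for the 486-66 and 4 for the 486-120, matching the 33 and 30 figures worked out by hand.
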
It is a common mistake to believe that a clock-multiplied CPU will make
everything faster, and people frequently trade down bus speed to trade
up CPU speed. In point of fact, access to everything but L1 is done at
bus speed, not CPU speed, and access to anything outside L1 and L2
potentially causes I/O wait states.

These are the results you should expect on I/O bound operations, even
on CPUs from the same chip mask. There may be AMD-specific instruction
speed differences on top of this.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.