From owner-freebsd-hackers Sun Aug 25 13:21:13 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id NAA01839 for hackers-outgoing; Sun, 25 Aug 1996 13:21:13 -0700 (PDT)
Received: from who.cdrom.com (who.cdrom.com [204.216.27.3]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id NAA01821 for ; Sun, 25 Aug 1996 13:21:08 -0700 (PDT)
Received: from dympna (dympna.lgc.com [134.132.73.254]) by who.cdrom.com (8.7.5/8.6.11) with SMTP id LAA25683 for ; Sun, 25 Aug 1996 11:11:51 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by dympna (950413.SGI.8.6.12/950213.SGI.AUTOCF) via SMTP id NAA16842 for ; Sun, 25 Aug 1996 13:10:02 -0500
Date: Sun, 25 Aug 1996 13:10:02 -0500 (CDT)
From: Rob Snow
X-Sender: rsnow@dympna
To: freebsd-hackers@FreeBSD.ORG
Subject: [Fwd: Fastvid,NT,Pentium Pro Performance: Expanation] (fwd)
Message-ID:
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY=------------1CFBAE3959E2B60015FB7483
Content-ID:
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info.

--------------1CFBAE3959E2B60015FB7483
Content-Type: TEXT/PLAIN; CHARSET=us-ascii
Content-ID:

Here is some more information on the post I made a couple of weeks ago about performance features for PPro systems and the PCI bus. This doesn't make a ton of sense to me. However, running the DOS FASTVID program on my Natoma increases video output from ~25MPixel/sec to ~75MPixel/sec. I thought this might be some interesting info for you hacker dudes.
BTW, it's from comp.sys.intel

-Rob

--------------1CFBAE3959E2B60015FB7483
Content-Type: MESSAGE/NEWS
Content-ID:
Content-Description:

Path: dildog.lgc.com!news.sesqui.net!newsfeed.rice.edu!uw-beaver!news.u.washington.edu!news.uoregon.edu!hunter.premier.net!news-res.gsl.net!news.gsl.net!news.mathworks.com!newsfeed.internetmci.com!news.sprintlink.net!news-stk-200.sprintlink.net!news.sprintlink.net!news-chi-13.sprintlink.net!itnews.sc.intel.com!news.fm.intel.com!ornews.intel.com!news.jf.intel.com!usenet
From: Jeff Sponaugle
Newsgroups: comp.sys.intel,comp.sys.ibm.pc.hardware.chips,alt.comp.periphs.mainboard.asus,alt.comp.periphs.mainboard.tyan,comp.os.linux.misc
Subject: Fastvid,NT,Pentium Pro Performance: Expanation
Date: Thu, 08 Aug 1996 09:50:10 -0700
Organization: Intel Corporation
Message-ID: <320A1AC2.3588@ccm.jf.intel.com>
Reply-To: Jeff_Sponaugle@ccm.jf.intel.com
NNTP-Posting-Host: jsponau.jf.intel.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 3.0b4 (Win95; I)
Xref: dildog.lgc.com comp.sys.intel:96071 comp.sys.ibm.pc.hardware.chips:93295 alt.comp.periphs.mainboard.asus:23893 alt.comp.periphs.mainboard.tyan:3373 comp.os.linux.misc:130758

Subject: Fastvid,NT,Pentium Pro Performance: Explanation

Over the past few weeks, there has been a great deal of discussion about the speed of a Pentium Pro system when compared to an equivalent Pentium system. In particular, the discussion has focused on the program "fastvid" and certain "PCI features". After reading many such threads, I felt it would be helpful to clarify these issues and put some concerns to rest.

First, a disclaimer: I'm an engineer at Intel; however, I do not speak for Intel, nor do I represent Intel on this forum in any way. The opinions expressed here are mine and mine alone.

I believe the first clarification should be what fastvid does. Fastvid (the DOS version), in its current form, does three things.
The first is to enable write posting. This is done by toggling bit #1 in PCI config register offset 0x53. That bit enables "P6.0 to PCI Write Posting". (IMPORTANT: This bit is specific to the Intel 82450Kx/Gx Orion PCIset. The bit is slightly different in the 440FX PCIset.) This feature was not enabled in some of the first Pentium Pro systems shipped, due to a bug in the 82450Kx/Gx (stepping a3, I believe). Enabling this feature in a system with this older rev of the chipset can cause data loss and random lockups. (Of course, many people have enabled this without problems, and the performance of PCI writes is greatly improved...) If you have an older 82450 system, use this feature with caution. If you have a new 82450, or a Natoma, the feature should already be enabled.

The second and third things fastvid does have nothing to do with the chipset, only with the processor itself. The Pentium Pro processor has a set of MSRs (Model Specific Registers) that control the kind of caching performed on different ranges of physical system memory (i.e., they apply to the real physical address, not the virtual address). These MSRs are called MTRRs (Memory Type Range Registers). There are three different kinds of MTRRs: ones for fixed memory ranges, ones for variable ranges, and a single register for all other ranges not covered by the other MTRRs (i.e., the default).

This is where fastvid comes in. By default, the caching applied to the physical address space used by the video adapter is set to none. There are two MTRRs that must be adjusted in order to change this. Most video adapters have two distinct physical address ranges: one in the more standard location (0xa0000), and another flat linear (large) buffer at some other location (0xfc000000 for the Matrox Millennium). Fastvid changes both the fixed and variable MTRRs for these ranges to enable a caching protocol called "Write Combining".
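To make the register values above concrete, here is a minimal sketch of the bit arithmetic only. It does not touch PCI configuration space or any MSR. Several details are assumptions, not statements from the post: that "toggling" bit #1 means setting it to 1, that the variable-range registers take the base/mask layout documented for the P6 family (type in the low byte of the base, valid bit 11 in the mask, 36-bit physical addresses), and that the linear frame buffer is 8 MB. All function names are made up for illustration.

```python
# Pure bit arithmetic: nothing here reads or writes real hardware registers.

WRITE_POSTING_BIT = 1 << 1   # bit #1 of PCI config register 0x53 (82450 only, per the post)
MTRR_TYPE_WC = 0x01          # memory type encoding for Write Combining
MTRR_MASK_VALID = 1 << 11    # valid bit in a variable-range mask register
PHYS_ADDR_BITS = 36          # assumed physical address width (Pentium Pro)


def enable_write_posting(reg_0x53):
    """Return the byte for config register 0x53 with bit #1 set.

    Assumption: a set bit means write posting is enabled.
    """
    return (reg_0x53 | WRITE_POSTING_BIT) & 0xFF


def wc_variable_mtrr(base, size):
    """Return (base_register, mask_register) marking [base, base + size) as WC."""
    if size < 0x1000 or size & (size - 1):
        raise ValueError("size must be a power of two of at least 4 KB")
    if base & (size - 1):
        raise ValueError("base must be aligned to size")
    addr_mask = (1 << PHYS_ADDR_BITS) - 1
    base_register = base | MTRR_TYPE_WC
    mask_register = (addr_mask & ~(size - 1)) | MTRR_MASK_VALID
    return base_register, mask_register


def wc_fixed_mtrr_a0000():
    """Return the 64-bit fixed-range value for 0xA0000-0xBFFFF.

    That register holds eight one-byte type fields, one per 16 KB chunk;
    here every field is set to WC.
    """
    value = 0
    for chunk in range(8):
        value |= MTRR_TYPE_WC << (8 * chunk)
    return value


if __name__ == "__main__":
    # 0xFC000000 is the Matrox Millennium address from the post; 8 MB is a guess.
    base_register, mask_register = wc_variable_mtrr(0xFC000000, 8 * 1024 * 1024)
    print(hex(base_register), hex(mask_register), hex(wc_fixed_mtrr_a0000()))
```

Actually loading these values would require ring-0 code using WRMSR (and a PCI config write for the chipset bit), with caches handled as the processor manual prescribes; that part is deliberately left out.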
For details, see pages 11-12 through 11-26 of the Pentium Pro Family Developer's Manual, Volume 3: Operating System Writer's Manual.

"Write Combining" is defined as follows (per the Intel docs):

    "System memory locations are not cached and coherency is not enforced by the processor's bus coherency protocol. Speculative reads are allowed. Writes may be delayed and combined in the write buffer to reduce memory accesses. This type of cache-control is appropriate for frame buffers, where the order of the writes is unimportant as long as the writes update memory so they can be seen on the graphics display."

This kind of caching is IDEAL for frame buffers. The speedup is large because of the reduction in small PCI write cycles.

THE IMPORTANT THING TO NOTE HERE IS THAT "Write Combining" IS NOT A FEATURE OF THE CHIPSET, BUT OF THE PENTIUM PRO PROCESSOR ITSELF. When you run fastvid on a newer Natoma system (which has write posting, as well as several other performance features, enabled already), the only enhancement you get is the addition of the Write Combining feature.

Also NOTE: Running fastvid on a newer Orion or Natoma system (that already has write posting enabled) will not directly speed up other PCI devices. The MTRRs are specific to the physical addresses of the video device, and have no direct effect outside of that range. The only noticeable effect on other devices might come from the reduced bus traffic, which leaves more bandwidth for other devices.

Lastly, for those of you running NT (which I hope for Pro users you are!), I do have an alpha NT driver that programs the "Write Combining" feature and can also toggle the write posting bit in the Orion chipset. I'll post later this week with information about where to get a copy. (If you run Linux and are interested in a driver to turn this feature on, let me know.)

I hope this clarifies what fastvid does. Many people still ask why Intel has not put code to do this into the BIOS. The answer should be evident from the above explanation.
First, the MTRRs have to be mapped to a particular physical address range. Not all video cards map the linear physical frame buffer to the same place. Second, even if they did, some cards might rely on certain memory ordering restrictions, which would be broken by the weak ordering of the "Write Combining" method.

IN MY OPINION, it is the responsibility of the Video BIOS and/or Video Driver to detect the presence of MTRRs (from the CPUID instruction), and to enable the features that could speed up screen access.

Jeff Sponaugle (jeff_sponaugle@ccm.jf.intel.com)

--------------1CFBAE3959E2B60015FB7483--
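The detection step suggested at the end of the post can be sketched as follows. This is only an illustration of the flag checks, assuming the caller has already obtained the EDX value from CPUID function 1 and the contents of the MTRR capability register by some other means (both need native code). The bit positions are those documented for the P6 family as far as I know; the function names and the sample values are hypothetical.

```python
# Flag decoding only: obtaining the raw values requires CPUID / RDMSR in native code.

CPUID_EDX_MTRR = 1 << 12     # CPUID function 1, EDX bit 12: MTRRs present
MTRRCAP_VCNT_MASK = 0xFF     # capability register bits 7:0: number of variable ranges
MTRRCAP_FIXED = 1 << 8       # fixed-range registers supported
MTRRCAP_WC = 1 << 10         # Write Combining memory type supported


def has_mtrr(cpuid1_edx):
    """True if the CPUID feature flags report MTRR support."""
    return bool(cpuid1_edx & CPUID_EDX_MTRR)


def decode_mtrrcap(value):
    """Split a raw MTRR capability value into the fields a driver would check."""
    return {
        "variable_ranges": value & MTRRCAP_VCNT_MASK,
        "fixed": bool(value & MTRRCAP_FIXED),
        "write_combining": bool(value & MTRRCAP_WC),
    }


def can_use_write_combining(cpuid1_edx, mtrrcap):
    """A driver should only program WC ranges when both checks pass."""
    return has_mtrr(cpuid1_edx) and decode_mtrrcap(mtrrcap)["write_combining"]


if __name__ == "__main__":
    # Sample inputs, not values read from a real machine.
    print(can_use_write_combining(0x1000, 0x508))
```

Even when both checks pass, a driver would still need to know where its card maps the frame buffer and whether the card tolerates weakly ordered writes, which are the two points the post raises.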