From owner-freebsd-current Sun Mar 10 11:26:44 1996
Return-Path: owner-current
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id LAA15147 for current-outgoing; Sun, 10 Mar 1996 11:26:44 -0800 (PST)
Received: from who.cdrom.com (who.cdrom.com [192.216.222.3]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id LAA15142 for ; Sun, 10 Mar 1996 11:26:43 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by who.cdrom.com (8.6.12/8.6.11) with ESMTP id LAA22667 for ; Sun, 10 Mar 1996 11:26:41 -0800
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id MAA01621; Sun, 10 Mar 1996 12:21:10 -0700
From: Terry Lambert
Message-Id: <199603101921.MAA01621@phaeton.artisoft.com>
Subject: Re: AMD doesn't like SNAP! (panic: unwire: page not in pmap)
To: imb@scgt.oz.au (michael butler)
Date: Sun, 10 Mar 1996 12:21:10 -0700 (MST)
Cc: rgrimes@GndRsh.aac.dev.com, current@FreeBSD.org
In-Reply-To: <199603101425.BAA02483@asstdc.scgt.oz.au> from "michael butler" at Mar 11, 96 01:25:33 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-current@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

> I note that through a number of drivers there is mention of
> cache-invalidation instructions (software-style) but none of them seem to
> implement anything of this nature. Is there a problem with doing this ..
> that is .. to invalidate the page(s) into which data has just been
> transferred prior to the application being told that the transfer completed?
> No other cases need to be considered do they ?

OK. This is an interesting topic. 8-(.

Before, when I suggested that a DMA be triggered as part of a probe
process on a per controller basis to determine if bounce buffers were
necessary, I neglected the non-working cache case.
The cache cases need to be considered at the same time because they
need to trigger similar controller/memory events in order to be
detectable.

When a bus master DMA occurs, there is supposed to be a cache
notification, and the L1 and L2 cache lines are supposed to be
invalidated or written back for the memory range in which the DMA
took place.

It's possible to fail the L1 cache invalidate/write back, the L2, or
both.

The old Cyrix/TI 386 processors using the Cyrix chip mask (the newer
Cyrix parts -- not sure about TI -- use a licensed version of the IBM
"Blue Lightning" masks) had an L1 cache without implementing a cache
notification mechanism. Because of this, if the L1 cache is enabled
on these chips (it is disabled by default and must be explicitly
enabled by software -- usually BIOS POST based on CMOS settings), they
will potentially have stale data hanging around after a bus mastering
DMA.

These chips are detectable ("The Undocumented PC"), and the cache can
be disabled in software (contravening the user preferences -- some
might argue that it is a driver problem, and the driver should
explicitly WBINVD instead).

For the L2, the Saturn I chipset (mask date pre-April 1994) had a
flaw, where the DMA notification from PCI was simply not internally
connected to anything. This is most often seen in Gateway and Dell
systems with 60MHz Pentiums, but they aren't the only ones who used
Saturn I's, so they aren't the only machines with problems.

Finally, VLB systems frequently do not identify "master" slots. A
"master" slot is one where cache notification occurs after a bus
master DMA takes place.

Now, these aren't the *only* cases, *but* they are the majority of
cases where "turn off the cache" will fix the problem.

Detecting a failed cache update is tricky -- mostly because a small
amount of cache is involved, and you can't tell the difference between
data that was correctly invalidated, and data that was invalidated to
load in your test code, data, etc.
Basically, you have to fully set up a DMA into an area, but not
trigger it, and then modify a small part of the area to force it into
cache with a value other than the one that will result from the DMA.
Then you trigger the DMA on a small enough operation that it doesn't
cause the cache line for the invalid data to be flushed (you test this
by doing the DMA to an area other than the cache area and use
instruction timing to determine if the data is still in cache or if
the operation blew it out -- a tricky operation).

To avoid cache effects in determining the data that will result, you
have to do WBINVD's (the software cache flush) during the setup up to
the point that you do the cache load... not least because you won't
be able to load the test if there is a cache problem and you use a
DMA-using driver.

It's arguable whether WBINVD'ing all your I/O is worthwhile -- you may
not gain any significant benefit from the cache otherwise to
compensate for the overhead; probably there will be *some* gain, but
it will be marginal.

One very real problem is that the people hacking the code areas that
would need to change simply don't buy the cheap hardware necessary to
reproduce the problem and allow them to test.

Finally, for the less likely cases of a flaw in the motherboard L2
cache implementation (which may include you not being able to
successfully WBINVD the area and work around the bug in software),
there is no fix except disabling the L2 cache. 8-(.

Because of the need for a DMA card and a driver to use it, it's no
wonder that this type of testing has not made it into a consumer
"hardware test" application that you can use in a store before buying
the hardware.

It isn't worthwhile to WBINVD in the majority of cases, because the
majority is people with functional cache hardware. Admittedly, we
have self-selected this by not "fixing" the problem in software for
people without the good hardware.
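
The prime/trigger/compare sequence described above can be sketched as
a toy user-space model. This is only an illustration under stated
assumptions, not driver code: the "machine" is one simulated cache
line over a small array, the cache is write-through for simplicity,
and every name (struct machine, struct controller, probe_controller,
the snoops and needs_flush fields) is hypothetical. It also leaves out
the instruction-timing check that a real probe would need to confirm
the primed line was not evicted by something else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_SIZE 16   /* bytes per simulated cache line */
#define MEM_SIZE  64   /* bytes of simulated main memory */

/* Toy machine: main memory plus a single write-through cache line. */
struct machine {
    unsigned char mem[MEM_SIZE];
    unsigned char line[LINE_SIZE];
    size_t line_base;  /* memory offset the line currently caches */
    int line_valid;
};

/* Hypothetical per-controller record (names are assumptions). */
struct controller {
    int snoops;        /* simulated hardware: does DMA notify the cache? */
    int needs_flush;   /* probe result: driver must flush after DMA */
};

/* CPU read: served from the cache line, filling it on a miss. */
static unsigned char cpu_read(struct machine *m, size_t addr)
{
    size_t base = addr - addr % LINE_SIZE;

    assert(addr < MEM_SIZE);
    if (!m->line_valid || m->line_base != base) {
        memcpy(m->line, m->mem + base, LINE_SIZE);
        m->line_base = base;
        m->line_valid = 1;
    }
    return m->line[addr - base];
}

/* CPU write: allocate the line, then update both cache and memory. */
static void cpu_write(struct machine *m, size_t addr, unsigned char v)
{
    (void)cpu_read(m, addr);
    m->line[addr - m->line_base] = v;
    m->mem[addr] = v;
}

/* Stand-in for the software cache flush. */
static void cache_flush(struct machine *m)
{
    m->line_valid = 0;
}

/* Bus-master DMA: writes memory behind the CPU's back. The cached
 * line is invalidated only if the controller's slot snoops. */
static void dma_write(struct machine *m, const struct controller *c,
                      size_t addr, const unsigned char *src, size_t len)
{
    assert(addr + len <= MEM_SIZE);
    memcpy(m->mem + addr, src, len);
    if (c->snoops && m->line_valid &&
        addr < m->line_base + LINE_SIZE && addr + len > m->line_base)
        m->line_valid = 0;
}

/* Probe: prime the cache with a value the DMA will not produce,
 * run a small DMA over it, and see which value the CPU reads back.
 * Returns 1 (and records it) if stale data was observed. */
static int probe_controller(struct controller *c)
{
    struct machine m;
    const unsigned char pattern[4] = { 0xA5, 0xA5, 0xA5, 0xA5 };

    memset(&m, 0, sizeof m);
    cache_flush(&m);                 /* start from a known cache state */
    cpu_write(&m, 0, 0x5A);          /* force the target into cache */
    dma_write(&m, c, 0, pattern, sizeof pattern);
    c->needs_flush = (cpu_read(&m, 0) != pattern[0]);
    return c->needs_flush;
}
```

In this model, a controller with snoops set to 0 reads back the stale
0x5A and ends up with needs_flush set, while one with snoops set to 1
sees the DMA pattern and is left alone. Keeping the result in the
per-controller record is one way to get the per controller granularity
the probe idea calls for.
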
A decent fix would require a lot of effort and a lot of hooks to make
it work in all cases (for instance, I could have two VLB controllers,
one in a "master" slot, one not; the fix has to have a per controller
granularity, etc.). 8-(.

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.