From owner-freebsd-hackers Mon May 20 20:42:00 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3)
        id UAA18191 for hackers-outgoing; Mon, 20 May 1996 20:42:00 -0700 (PDT)
Received: from paloalto.access.hp.com (daemon@paloalto.access.hp.com [15.254.56.2])
        by freefall.freebsd.org (8.7.3/8.7.3) with ESMTP id UAA18186
        for ; Mon, 20 May 1996 20:41:55 -0700 (PDT)
Received: from fakir.india.hp.com by paloalto.access.hp.com with ESMTP
        (1.37.109.16/15.5+ECS 3.3) id AA134330105; Mon, 20 May 1996 20:41:51 -0700
Received: from localhost by fakir.india.hp.com with SMTP
        (1.37.109.16/15.5+ECS 3.3) id AA215120311; Tue, 21 May 1996 09:15:12 +0530
Message-Id: <199605210345.AA215120311@fakir.india.hp.com>
To: hackers@freebsd.org
Subject: I-/D- cache coherency issues
Date: Tue, 21 May 1996 09:15:10 +0530
From: A JOSEPH KOSHY
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I'm looking at generating machine code on the fly and executing it
later. Since many of the newer microprocessors have separate I- and D-
caches with no consistency checking between the two, this requires a
way to ensure that the instructions executed from a virtual address
range are the ones the code generator wrote out to memory. Most I-cache
implementations are simple and don't snoop the bus, so the
responsibility for maintaining coherency rests with the OS.

Before the list jumps on me about the horrors of self-modifying code,
I'd like to point out that rolling your own machine code is useful in:

  o Direct threaded interpreters: FORTH comes to mind.

  o Reiser raster ops: it turns out that this is one of the ways you get
    decent performance out of torturous hardware like the IBM (vanilla)
    VGA --- you generate machine code for your graphics operation,
    special-cased for the operation desired, and then let it rip.

  o I would hazard a guess that a native mode compiler for the Java
    virtual machine would need similar facilities too.
  o Then of course there is the amusement value :-).

The magic incantation for correctly executing freshly generated code
varies from a simple "jmp $+2" on a '386 to more arcane calls to
PALcode or its equivalent on the newer RISCs. That is, it's pretty much
dependent on the processor architecture and the memory architecture.

So my questions are:

(a) Do we have a means in userland of ensuring that a particular range
    of virtual addresses is flushed from the I- or D- cache? Something
    that would work across {Free,Net,*}BSD, hopefully?

(b) Is there a non-machine-specific way this can be done from within
    the kernel? That is, are there suitable kernel VM primitives that
    one could invoke?

Thanks,
Koshy
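
A minimal sketch of the userland step that question (a) asks about,
assuming a present-day GCC or Clang toolchain: the __builtin___clear_cache
builtin used here postdates this message and is not something the message
itself mentions. The helper name run_generated and the hand-assembled byte
sequences (a routine that returns 42, for x86-64 and AArch64 only) are
illustrative assumptions. The buffer is written through a read/write
mapping, switched to read/execute, and the range is synchronized before
the first call.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANON on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: emit a tiny "return 42" routine and call it.
 * Returns -1 on failure or on an architecture this sketch does not cover. */
int run_generated(void)
{
#if defined(__x86_64__)
    /* mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
#elif defined(__aarch64__)
    /* mov w0, #42 ; ret  (little-endian instruction words) */
    static const unsigned char code[] = { 0x40, 0x05, 0x80, 0x52,
                                          0xC0, 0x03, 0x5F, 0xD6 };
#else
    return -1;
#endif
#if defined(__x86_64__) || defined(__aarch64__)
    const size_t len = 4096;   /* mmap rounds this up to whole pages */
    assert(sizeof code <= len);

    /* 1. Get writable memory and store the instructions (D-side writes). */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANON, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memcpy(buf, code, sizeof code);

    /* 2. Make the region executable instead of writable. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* 3. Ask the compiler runtime to make instruction fetches see the
     *    freshly written bytes for this address range. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    /* 4. Only now is it safe to jump into the generated code. */
    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();

    munmap(buf, len);
    return result;
#endif
}
```

On the two covered architectures a call to run_generated() is expected to
return 42. The builtin is a per-target hook: on x86, where the hardware
keeps the instruction stream coherent with stores, it typically costs
nothing, while on targets with non-coherent split caches it expands to the
required cache maintenance. That keeps the architecture-specific
incantation out of the code generator itself.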