Date:      Tue, 28 Apr 2009 16:48:35 -0500
From:      Kevin Day <toasty@dragondata.com>
To:        "Julian Bangert" <julidaoc@online.de>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Question about adding flags to mmap system call / NVIDIA amd64 driver implementation
Message-ID:  <EC226ED2-3EF6-4102-8186-6F7B68AFC809@dragondata.com>
In-Reply-To: <op.us35euemeer2kn@server2go>
References:  <op.us35euemeer2kn@server2go>


On Apr 28, 2009, at 3:19 PM, Julian Bangert wrote:

> Hello,
>
> I am currently trying to work a bit on the remaining "missing
> feature" that NVIDIA requires (http://wiki.freebsd.org/NvidiaFeatureRequests
> or a back post in this ML): the improved mmap system call.
> For now, I am trying to extend the current system call and
> implementation to add cache control (the type of memory caching
> used). This feature is inherently very architecture-specific, but
> it can lead to enormous performance improvements for memory-mapped
> devices (useful for drivers, etc.). I would do this on the user side
> by adding 3 flags to the mmap system call (MEM_CACHE__ATTR1 to
> MEM_CACHE__ATTR3), which together form a single octal digit
> corresponding to the various caching options (Uncacheable, Write
> Combining, etc.), with the same numbers as the PAT_* macros from
> i386/include/specialreg.h, except that the value 0 (PAT_UNCACHEABLE)
> is replaced with the value 2 (undefined), whereas the value 0 (all 3
> flags cleared) is assigned the meaning "feature not used, use default
> cache control".
> For each cache behaviour there would of course also be a macro
> expanding to the right combination of these flags for enhanced
> usability.
>
> The mmap system call would, if any of these flags are set, decode  
> them and get a corresponding PAT_* value, perform the mapping and  
> then call into the pmap module to modify the cache attributes for  
> every page.
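
For reference, the decoding step described above could look roughly like the sketch below. The PAT_* values are the ones from specialreg.h; everything else is an assumption made for illustration: the bit position of the three flags (MEM_CACHE_SHIFT), the MAP_CACHE_* convenience macros, and the function name mem_cache_decode() do not come from the proposal or from FreeBSD.

```c
#include <assert.h>
#include <stddef.h>

/* PAT memory-type encodings, as in FreeBSD's specialreg.h. */
#define PAT_UNCACHEABLE     0x00
#define PAT_WRITE_COMBINING 0x01
#define PAT_WRITE_THROUGH   0x04
#define PAT_WRITE_PROTECTED 0x05
#define PAT_WRITE_BACK      0x06
#define PAT_UNCACHED        0x07

/*
 * HYPOTHETICAL: where the three proposed flags sit in the mmap() flags
 * word. The proposal does not name the bits; 20 is only a placeholder.
 */
#define MEM_CACHE_SHIFT 20
#define MEM_CACHE_MASK  (0x7 << MEM_CACHE_SHIFT)

/* HYPOTHETICAL convenience macros, one per cache behaviour. */
#define MAP_CACHE_UNCACHEABLE  (2 << MEM_CACHE_SHIFT) /* 0 is taken by "default" */
#define MAP_CACHE_WRITECOMBINE (PAT_WRITE_COMBINING << MEM_CACHE_SHIFT)
#define MAP_CACHE_WRITETHROUGH (PAT_WRITE_THROUGH << MEM_CACHE_SHIFT)
#define MAP_CACHE_WRITEPROTECT (PAT_WRITE_PROTECTED << MEM_CACHE_SHIFT)
#define MAP_CACHE_WRITEBACK    (PAT_WRITE_BACK << MEM_CACHE_SHIFT)
#define MAP_CACHE_UNCACHED     (PAT_UNCACHED << MEM_CACHE_SHIFT)

/*
 * Decode the 3-bit cache field from an mmap() flags word.
 * Returns  1 and stores a PAT_* value in *pat_out if a mode was requested,
 *          0 if the field is zero (use the default cache control),
 *         -1 if the field holds a value with no PAT meaning.
 */
int
mem_cache_decode(int flags, int *pat_out)
{
	int field;

	assert(pat_out != NULL);
	field = (flags & MEM_CACHE_MASK) >> MEM_CACHE_SHIFT;

	switch (field) {
	case 0:
		return (0);             /* feature not used */
	case 2:
		*pat_out = PAT_UNCACHEABLE; /* 2 stands in for PAT value 0 */
		return (1);
	case PAT_WRITE_COMBINING:
	case PAT_WRITE_THROUGH:
	case PAT_WRITE_PROTECTED:
	case PAT_WRITE_BACK:
	case PAT_UNCACHED:
		*pat_out = field;       /* same number as the PAT_* macro */
		return (1);
	default:
		return (-1);            /* 3 is undefined in both schemes */
	}
}
```

A caller would pass the mmap() flags word and only touch the page attributes when the function returns 1.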

Have you looked at mem(4) yet?

      Several architectures allow attributes to be associated with ranges
      of physical memory.  These attributes can be manipulated via ioctl()
      calls performed on /dev/mem.  Declarations and data types are to be
      found in <sys/memrange.h>.

      The specific attributes, and number of programmable ranges may vary
      between architectures.  The full set of supported attributes is:

      MDF_UNCACHEABLE
              The region is not cached.

      MDF_WRITECOMBINE
              Writes to the region may be combined or performed out of
              order.

      MDF_WRITETHROUGH
              Writes to the region are committed synchronously.

      MDF_WRITEBACK
              Writes to the region are committed asynchronously.

      MDF_WRITEPROTECT
              The region cannot be written to.

This requires knowledge of the physical addresses, but I believe  
that's probably already necessary for what it sounds like you're  
trying to accomplish.
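
As a rough sketch of what that looks like from userland, the code below marks a physical range write-combining through the MEMRANGE_SET ioctl declared in <sys/memrange.h>. The function names, the owner string, and the alignment pre-check are assumptions added for illustration: the pre-check reflects the usual x86 variable-range MTRR rule (power-of-two length of at least 4 KB, base aligned to the length), and other architectures may differ. Opening /dev/mem read-only is assumed to be enough for the ioctl. The FreeBSD-specific part is guarded so the file still compiles elsewhere, where it simply fails with ENOSYS.

```c
#include <sys/types.h>
#include <sys/ioctl.h>
#ifdef __FreeBSD__
#include <sys/memrange.h>
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define MTRR_MIN_LEN 4096ULL    /* assumed minimum range size (one 4 KB page) */

/* Returns 1 if len is a power of two >= 4 KB and base is aligned to len. */
int
range_is_mtrr_friendly(uint64_t base, uint64_t len)
{
	if (len < MTRR_MIN_LEN || (len & (len - 1)) != 0)
		return (0);
	return ((base & (len - 1)) == 0);
}

/*
 * Ask the kernel to treat [base, base + len) as write-combining.
 * Returns 0 on success, -1 with errno set on failure. Needs root.
 */
int
set_write_combining(uint64_t base, uint64_t len, const char *owner)
{
	assert(owner != NULL);
	if (!range_is_mtrr_friendly(base, len)) {
		errno = EINVAL;
		return (-1);
	}
#ifdef __FreeBSD__
	struct mem_range_desc mrd;
	struct mem_range_op mro;
	int error, fd, saved_errno;

	memset(&mrd, 0, sizeof(mrd));
	mrd.mr_base = base;
	mrd.mr_len = len;
	mrd.mr_flags = MDF_WRITECOMBINE;
	/* mr_owner is a short fixed-size tag; keep it NUL-terminated. */
	strncpy(mrd.mr_owner, owner, sizeof(mrd.mr_owner) - 1);

	memset(&mro, 0, sizeof(mro));
	mro.mo_desc = &mrd;
	mro.mo_arg[0] = MEMRANGE_SET_UPDATE;

	fd = open("/dev/mem", O_RDONLY);
	if (fd == -1)
		return (-1);
	error = ioctl(fd, MEMRANGE_SET, &mro);
	saved_errno = errno;    /* close() may overwrite errno */
	close(fd);
	errno = saved_errno;
	return (error);
#else
	errno = ENOSYS;         /* mem(4) memory ranges are FreeBSD-specific */
	return (-1);
#endif
}
```

A driver author would call something like set_write_combining(fb_base, fb_len, "mydrv") once, with the physical base and length of the exposed buffer; both values and the owner tag here are placeholders.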

Back in the FreeBSD-3.0 days, I was writing a custom driver for an AGP  
graphics controller, and setting the MTRR flags for the exposed buffer  
was a definite improvement (200-1200% faster in most cases).

-- Kevin



