Date:      Thu, 20 Jun 2002 11:45:11 -0400
From:      Bosko Milekic <bmilekic@unixdaemons.com>
To:        Andrew Gallatin <gallatin@cs.duke.edu>
Cc:        "Kenneth D. Merry" <ken@kdm.org>, current@FreeBSD.ORG, net@FreeBSD.ORG
Subject:   Re: new zero copy sockets snapshot
Message-ID:  <20020620114511.A22413@unixdaemons.com>
In-Reply-To: <15633.62357.79381.405511@grasshopper.cs.duke.edu>; from gallatin@cs.duke.edu on Thu, Jun 20, 2002 at 11:24:05AM -0400
References:  <20020618223635.A98350@panzer.kdm.org> <xzpelf3ida1.fsf@flood.ping.uio.no> <20020619090046.A2063@panzer.kdm.org> <20020619120641.A18434@unixdaemons.com> <15633.17238.109126.952673@grasshopper.cs.duke.edu> <20020619233721.A30669@unixdaemons.com> <15633.62357.79381.405511@grasshopper.cs.duke.edu>

On Thu, Jun 20, 2002 at 11:24:05AM -0400, Andrew Gallatin wrote:
> 
> Bosko Milekic writes:
>  > 
>  > On Wed, Jun 19, 2002 at 10:52:06PM -0400, Andrew Gallatin wrote:
> 
> <...>
> 
>  >   Yes, I know that that's what it does.  What I meant was that it would
>  > be convenient to have UMA handle the allocation bit.  It should be
>  > possible to have UMA do this sort of allocation and keep the free pages
>  > in per-CPU caches.  The problem right now, as you know, is that neither
>  > UMA nor mb_alloc will allow you to play those vm-tricks
>  > you play on the allocated pages.  I was just pointing out that it is an
>  > unfortunate thing and that, hopefully, UMA will allow for this sort of
>  > thing at some point in the future.
> 
> Ah, OK, point taken.   I'm sorry if I gave offense.

  Hey, I'm sorry if I accidentally suggested that I took offense!  Really,
I wasn't offended at all. :-)

>  >   By the way, my other two comments have been deleted, but reading the
>  > page that Ken maintains I noticed that Alfred already pointed them out.
> 
> <...>
> 
> Ken has been maintaining the patchset on his own for quite some time.
> I must admit that I've not looked closely at these issues, so I didn't
> feel it was appropriate for me to comment on them.  I didn't mean to
> discount your other comments.

  Ken has already taken care of it.  I'm really impressed that Ken has
maintained the patchset for so long.  It's really good that the
zero-copy stuff has not been dropped over the years.

>  > [...]
>  > > This is orthogonal to the zero-copy patch, but it _would_ be nice to
>  > > have a general purpose mbuf allocator which could allocate 9K mbuf
>  > > clusters that are physically contiguous for dumber nics.  There are a whole slew
>  > > of drivers (unpatched ti, bge, nge, lge, etc) which roll their own for
>  > > no better reason than the system doesn't offer this feature.  That's
>  > > what needs fixing.  Heck, if such an allocator was available, we could
>  > > use it for copyin's of large chunks of data.   Tru64 has 8K and 2K
>  > > clusters and does this. (based on empirical evidence garnered at the
>  > > driver level).
>  > 
>  >   Right.  It's very hard to do > PAGE_SIZE allocations that are backed
>  > by physically contiguous memory in FreeBSD right now.  I agree that this
>  > would be very useful, though.
>  >
> 
> Years ago, I used Wollman's MCLBYTES > PAGE_SIZE support (introduced
> in rev 1.20 of uipc_mbuf.c) and it seemed to work OK then.  But having
> 16K clusters is a huge waste of space. ;).

  Since then, the mbuf allocator in -CURRENT has totally changed.  It is
still possible to provide allocations of > PAGE_SIZE buffers, however
they will likely not map physically contiguous memory.  If you happen to
have a device that doesn't support scatter/gather for DMA, then these
buffers will be broken for it (I know that if_ti is not a problem).
  The other issue is that both the old mbuf allocator and the new one
use the kmem_malloc() interface, which was also used by malloc(), to
perform allocations of wired-down pages.  I am not sure if
you'll be able to play those tricks where you unmap and remap the page
that is allocated for you once it comes out of the mbuf allocator.  Do
you think it would work?

> Do you think it would be feasible to glue in a new jumbo (10K?)
> allocator on top of the existing mbuf and mcl allocators using the
> existing mechanisms and the existing MCLBYTES > PAGE_SIZE support
> (but broken out into separate functions and macros)?

  Assuming that you can still play those VM tricks with the pages spit
out by mb_alloc (kern/subr_mbuf.c in -CURRENT), then this wouldn't be a
problem at all.  It's easy to add a new fixed-size type allocation to
mb_alloc.  In fact, it would be beneficial.  mb_alloc uses per-CPU
caches and also makes mbuf and cluster allocations share the same
per-CPU lock.  The jumbo buffer allocations could share that same lock
as well (since they will usually be allocated right after an mbuf
is).  This would give us jumbo-cluster
support, but it would only be useful for devices clued enough to break
up the cluster into PAGE_SIZE chunks and do scatter/gather.  For most
worthy gigE devices, I don't think this should be a problem.
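
  To make the "break up the cluster into PAGE_SIZE chunks" part concrete,
here is a minimal user-space sketch of the splitting step.  Everything in
it is hypothetical and for illustration only: the names
(sketch_split_pages, struct sketch_seg), the 4K page size, and the
segment limit are assumptions, not code from mb_alloc or any driver.  It
only cuts a virtually contiguous range at page boundaries; a real driver
would also have to translate each chunk to a physical address before
handing it to the device, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4K pages */
#define SKETCH_MAX_SEGS  4      /* a 9K buffer spans at most 4 pages */

/* One scatter/gather element: a run that never crosses a page boundary. */
struct sketch_seg {
	uintptr_t addr;   /* start of the chunk (virtual address here) */
	size_t    len;    /* length of the chunk in bytes */
};

/*
 * Split [addr, addr + len) at page boundaries.
 * Returns the number of segments written, or -1 if more than
 * maxsegs segments would be needed.
 */
static int
sketch_split_pages(uintptr_t addr, size_t len, struct sketch_seg *segs,
    int maxsegs)
{
	int n = 0;

	assert(segs != NULL);
	while (len > 0) {
		/* Bytes left before the next page boundary. */
		size_t room = SKETCH_PAGE_SIZE -
		    (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
		size_t chunk = len < room ? len : room;

		if (n == maxsegs)
			return (-1);
		segs[n].addr = addr;
		segs[n].len = chunk;
		n++;
		addr += chunk;
		len -= chunk;
	}
	return (n);
}
```

  With 4K pages, a page-aligned 9K (9216-byte) buffer comes out as three
chunks, and a badly aligned one needs four, so a device would need at
least four descriptors per jumbo buffer under these assumptions.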

> Drew

Regards,
-- 
Bosko Milekic
bmilekic@unixdaemons.com
bmilekic@FreeBSD.org

