Date:      Mon, 10 Oct 2016 12:21:14 -0700
From:      John Baldwin <jhb@freebsd.org>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject:   Re: svn commit: r306661 - in stable/11/sys/dev/cxgbe: . tom
Message-ID:  <4497031.RrIpxcHyXF@ralph.baldwin.cx>
In-Reply-To: <20161010184357.GB54003@zxy.spb.ru>
References:  <201610032315.u93NFiHE057529@repo.freebsd.org> <5243602.cilUCEM5cP@ralph.baldwin.cx> <20161010184357.GB54003@zxy.spb.ru>

On Monday, October 10, 2016 09:43:57 PM Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 11:39:24AM -0700, John Baldwin wrote:
> 
> > On Monday, October 10, 2016 09:28:21 PM Slawa Olhovchenkov wrote:
> > > On Mon, Oct 10, 2016 at 10:46:27AM -0700, John Baldwin wrote:
> > > 
> > > > On Monday, October 10, 2016 02:09:01 PM Slawa Olhovchenkov wrote:
> > > > > On Mon, Oct 03, 2016 at 11:15:44PM +0000, John Baldwin wrote:
> > > > > 
> > > > > > Author: jhb
> > > > > > Date: Mon Oct  3 23:15:44 2016
> > > > > > New Revision: 306661
> > > > > > URL: https://svnweb.freebsd.org/changeset/base/306661
> > > > > > 
> > > > > > Log:
> > > > > >   MFC 303405: Add support for zero-copy aio_write() on TOE sockets.
> > > > > >   
> > > > > >   AIO write requests for a TOE socket on a Chelsio T4+ adapter can now
> > > > > >   DMA directly from the user-supplied buffer.  This is implemented by
> > > > > >   wiring the pages backing the user-supplied buffer and queueing special
> > > > > >   mbufs backed by raw VM pages to the socket buffer.  The TOE code
> > > > > >   recognizes these special mbufs and builds a sglist from the VM page
> > > > > >   array associated with the mbuf when queueing a work request to the TOE.
> > > > > >   
> > > > > >   Because these mbufs do not have an associated virtual address, m_data
> > > > > >   is not valid.  Thus, the AIO handler does not invoke sosend() directly
> > > > > >   for these mbufs but instead inlines portions of sosend_generic() and
> > > > > >   tcp_usr_send().
> > > > > >   
> > > > > >   An aiotx_buffer structure is used to describe the user buffer (e.g.
> > > > > >   it holds the array of VM pages and a reference to the AIO job).  The
> > > > > >   special mbufs reference this structure via m_ext.  Note that a single
> > > > > >   job might be split across multiple mbufs (e.g. if it is larger than
> > > > > >   the socket buffer size).  The 'ext_arg2' member of each mbuf gives an
> > > > > >   offset relative to the backing aiotx_buffer.  The AIO job associated
> > > > > >   with an aiotx_buffer structure is completed when the last reference to
> > > > > >   the structure is released.
> > > > > >   
> > > > > >   Zero-copy aio_write()'s for connections associated with a given
> > > > > >   adapter can be enabled/disabled at runtime via the
> > > > > >   'dev.t[45]nex.N.toe.tx_zcopy' sysctl.
> > > > > >   
> > > > > >   Sponsored by:	Chelsio Communications
> > > > > 
> > > > > > Do you have any publicly available application patches to support this?
> > > > > > Maybe for nginx?
> > > > 
> > > > Applications need to use aio_read(), ideally with at least 2 buffers (so
> > > > queue two reads, then when a read completes, consume the data and do the
> > > > > next read).  I'm not sure nginx will find this all that useful, as web servers
> > > > tend to send a lot more data than they receive.  The only software I have
> > > > patched explicitly for this is netperf.
> > > 
> > > > Hm, so it seems that only aio_read() on sockets gives a performance boost, not
> > > > aio_write()?
> > 
> > > Sorry, I was confused about the commit; this does affect aio_write() (earlier
> > > changes also permit zero-copy for receive via aio_read()).  However, as you
> > > noted in the reply to Navdeep, it seems that nginx only supports using
> > > AIO on the backing files for static content.  It would need changes
> > > to support using aio_write() on sockets (similar to using sendfile).
> 
> Thanks.
> Do you plan to do this?

After talking with Navdeep, it looks like I will take a stab at adding support
for this and seeing what effect it has.  For static content workloads using
sendfile I'm not sure it will be a win, but for dynamically-generated content
it might save some CPU cycles by avoiding the data copies.

-- 
John Baldwin


