Date:      Mon, 10 Oct 2016 11:39:24 -0700
From:      John Baldwin <jhb@freebsd.org>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject:   Re: svn commit: r306661 - in stable/11/sys/dev/cxgbe: . tom
Message-ID:  <5243602.cilUCEM5cP@ralph.baldwin.cx>
In-Reply-To: <20161010182821.GZ54003@zxy.spb.ru>
References:  <201610032315.u93NFiHE057529@repo.freebsd.org> <1660024.uzJn2AtV1k@ralph.baldwin.cx> <20161010182821.GZ54003@zxy.spb.ru>

On Monday, October 10, 2016 09:28:21 PM Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 10:46:27AM -0700, John Baldwin wrote:
> 
> > On Monday, October 10, 2016 02:09:01 PM Slawa Olhovchenkov wrote:
> > > On Mon, Oct 03, 2016 at 11:15:44PM +0000, John Baldwin wrote:
> > > 
> > > > Author: jhb
> > > > Date: Mon Oct  3 23:15:44 2016
> > > > New Revision: 306661
> > > > URL: https://svnweb.freebsd.org/changeset/base/306661
> > > > 
> > > > Log:
> > > >   MFC 303405: Add support for zero-copy aio_write() on TOE sockets.
> > > >   
> > > >   AIO write requests for a TOE socket on a Chelsio T4+ adapter can now
> > > >   DMA directly from the user-supplied buffer.  This is implemented by
> > > >   wiring the pages backing the user-supplied buffer and queueing special
> > > >   mbufs backed by raw VM pages to the socket buffer.  The TOE code
> > > >   recognizes these special mbufs and builds a sglist from the VM page
> > > >   array associated with the mbuf when queueing a work request to the TOE.
> > > >   
> > > >   Because these mbufs do not have an associated virtual address, m_data
> > > >   is not valid.  Thus, the AIO handler does not invoke sosend() directly
> > > >   for these mbufs but instead inlines portions of sosend_generic() and
> > > >   tcp_usr_send().
> > > >   
> > > >   An aiotx_buffer structure is used to describe the user buffer (e.g.
> > > >   it holds the array of VM pages and a reference to the AIO job).  The
> > > >   special mbufs reference this structure via m_ext.  Note that a single
> > > >   job might be split across multiple mbufs (e.g. if it is larger than
> > > >   the socket buffer size).  The 'ext_arg2' member of each mbuf gives an
> > > >   offset relative to the backing aiotx_buffer.  The AIO job associated
> > > >   with an aiotx_buffer structure is completed when the last reference to
> > > >   the structure is released.
> > > >   
> > > >   Zero-copy aio_write()'s for connections associated with a given
> > > >   adapter can be enabled/disabled at runtime via the
> > > >   'dev.t[45]nex.N.toe.tx_zcopy' sysctl.
> > > >   
> > > >   Sponsored by:	Chelsio Communications
> > > 
> > > Do you have any publicly available application patches that support this?
> > > Maybe for nginx?
> > 
> > Applications need to use aio_read(), ideally with at least 2 buffers (so
> > queue two reads, then when a read completes, consume the data and do the
> > next read).  I'm not sure nginx will find this all that useful, as web servers
> > tend to send a lot more data than they receive.  The only software I have
> > patched explicitly for this is netperf.
> 
> Hm, so it looks like only aio_read() on sockets gives a performance boost,
> not aio_write()?

Sorry, I was confused about the commit; this does affect aio_write() (earlier
changes also permit zero-copy for receive via aio_read()).  However, as you
noted in the reply to Navdeep, it seems that nginx only supports using
AIO on the backing files for static content.  It would need changes
to support using aio_write() on sockets (similar to using sendfile()).

-- 
John Baldwin


