Date:      Sun, 29 Jun 2014 13:57:45 -0400
From:      Patrick Kelsey <kelsey@ieee.org>
To:        Adrian Chadd <adrian@freebsd.org>, "freebsd-arm@freebsd.org" <arm@freebsd.org>
Subject:   Re: arm alignment faults...
Message-ID:  <CAD44qMWM1J7n9sG%2B5kjkPESSYF%2BthBORkf_GxVbtCcr4N_sA-A@mail.gmail.com>
In-Reply-To: <20140629040150.GO1560@funkthat.com>
References:  <20140629033823.GN1560@funkthat.com> <CAJ-VmokxO5vfOOSvPrTfnda6gSKOPpJQF3kto3AdgUhvbFgNYg@mail.gmail.com> <20140629040150.GO1560@funkthat.com>

On Sun, Jun 29, 2014 at 12:01 AM, John-Mark Gurney <jmg@funkthat.com> wrote:

> Adrian Chadd wrote this message on Sat, Jun 28, 2014 at 20:44 -0700:
> > On 28 June 2014 20:38, John-Mark Gurney <jmg@funkthat.com> wrote:
> > > So, one of the little projects I'd like to see is the removal of
> > > ETHER_ALIGN from the tree..  This bogosity can (and does) cause the use
> > > of bouncing during DMA ops on all ethernet frames...
>
> Now that I think about it, total removal may not be necessary, just
> the requirement to use it...  If the ethernet dma engine can do half
> word aligned dma, then there would be benefit on those to keep
> ETHER_ALIGN...
>
> > Well, as long as you're not doing it by forcing the various CPUs to
> > handle unaligned accesses.
>
> Hard to do on armv4 which I don't believe supports unaligned access...
>
> > The cost of those unaligned accesses on some CPUs that support them is
> > not trivial. We benchmarked some of the ARM cores at Qualcomm back
> > when looking to migrate stuff to ARM and it wasn't very quick.
>
> I plan on fixing the TCP/IP stack to copy data to an aligned buffer
> (maybe only if the original buffer isn't aligned) on the stack when
> __NO_STRICT_ALIGNMENT is not defined...  I can't see how copying the
> entire packet is cheaper than copying 20 bytes or so...
>
>
I like the idea of getting away from the concept of ETHER_ALIGN.

This will also be nice when feeding the TCP/IP stack using netmap.  There's
no facility in netmap for landing receive data at an offset in the
word-aligned ring buffer, so using netmap to receive frames for the TCP/IP
stack relies on unaligned access support from the platform.  On platforms
that lack unaligned access support, or where it carries a noticeable
penalty, dropping that dependency would make a netmap + TCP/IP stack
approach to some problems easier to consider.
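
To make the alignment issue concrete, here is a minimal sketch of the
arithmetic, assuming untagged Ethernet frames and a receive buffer that
starts on a 4-byte boundary.  The constants are defined locally with their
usual values, and ip_hdr_misalignment() is a hypothetical helper for
illustration only, not something from the tree.

```c
#include <assert.h>

#define ETHER_HDR_LEN	14	/* dst MAC (6) + src MAC (6) + ethertype (2) */
#define ETHER_ALIGN	2	/* pad that puts the IP header on a 4-byte boundary */

/* The pad only works because 2 + 14 is a multiple of 4. */
static_assert((ETHER_ALIGN + ETHER_HDR_LEN) % 4 == 0,
    "ETHER_ALIGN must word-align the payload");

/*
 * Hypothetical helper: byte offset (mod 4) of the IP header when the
 * frame starts `pad` bytes into a 4-byte-aligned buffer.
 */
static unsigned
ip_hdr_misalignment(unsigned pad)
{
	return ((pad + ETHER_HDR_LEN) % 4);
}
```

With pad 0, as in a ring buffer that cannot land data at an offset, the IP
header ends up only half-word aligned; with a pad of ETHER_ALIGN it is word
aligned.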

I wonder if it's really worth making the copy conditional on an alignment
check - it seems to me the cycle savings would be modest at best for the
half-word-aligned-capable Ethernet DMA engine crowd, at the cost of having
two different header access paths in the code.  Also, getting away from the
current business of modifying the mbuf via in-place header byteswaps will
make the stack friendlier for building things that inspect packets up
through TCP headers/connection lookup and then possibly decide to bridge
them out some interface as a result (in such cases, one would never have to
unwind the changes to the mbuf contents).
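
A rough sketch of the unconditional-copy idea, in userland C rather than
kernel code: the fixed 20-byte IPv4 header is always copied into an aligned
local structure, and byte order is converted on the copy, so the packet
buffer itself is never modified.  struct ip_hdr_min, ip_hdr_copy() and
ip_total_len() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* ntohs() */

/* Hypothetical stand-in for the fixed part of an IPv4 header (20 bytes). */
struct ip_hdr_min {
	uint8_t		ver_ihl;
	uint8_t		tos;
	uint16_t	len;	/* total length, network byte order */
	uint16_t	id;
	uint16_t	off;
	uint8_t		ttl;
	uint8_t		proto;
	uint16_t	sum;
	uint32_t	src;
	uint32_t	dst;
};

/* Copy the header out of a possibly unaligned packet buffer. */
static struct ip_hdr_min
ip_hdr_copy(const uint8_t *pkt, size_t pktlen)
{
	struct ip_hdr_min hdr;

	assert(pktlen >= sizeof(hdr));
	memcpy(&hdr, pkt, sizeof(hdr));	/* byte copy: no alignment requirement */
	return (hdr);
}

/* Read the total length in host order; pkt is only read, never written. */
static uint16_t
ip_total_len(const uint8_t *pkt, size_t pktlen)
{
	struct ip_hdr_min hdr = ip_hdr_copy(pkt, pktlen);

	return (ntohs(hdr.len));
}
```

There is a single access path regardless of how the buffer is aligned, and
because the original bytes are untouched, the packet can still be bridged
out unchanged afterwards.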

If I'm thinking about this right, pr_input_t will have to grow a new
parameter to support passing the on-stack IP header pointer along.  I
suppose this could be considered a generic 'encapsulation header' pointer,
so as not to taint the protosw interface with something IP-specific.
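
One way such an interface might look, purely as a sketch: the input hook
takes an extra opaque pointer to the already-copied encapsulation header.
Everything here is hypothetical (pr_input_encap_t, struct protosw_sketch,
struct encap_ip4, demo_input); it is loosely modeled on an input hook that
takes an mbuf and an offset, and is not the tree's actual signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mbuf;	/* opaque stand-in for the kernel's struct mbuf */

/* Hypothetical input hook with a generic encapsulation-header pointer. */
typedef void pr_input_encap_t(struct mbuf *m, int off, const void *encap_hdr);

/* Hypothetical, cut-down protocol switch entry. */
struct protosw_sketch {
	pr_input_encap_t	*pr_input;
};

/* Hypothetical aligned copy of the fields an upper layer needs from IPv4. */
struct encap_ip4 {
	uint8_t		proto;
	uint16_t	payload_len;	/* host byte order */
};

static int last_proto = -1;	/* records what the demo handler saw */

/* Example handler: reads the header copy, never the mbuf contents. */
static void
demo_input(struct mbuf *m, int off, const void *encap_hdr)
{
	const struct encap_ip4 *ip = encap_hdr;

	(void)m;
	(void)off;
	assert(ip != NULL);
	last_proto = ip->proto;
}
```

Because the pointer is a const void *, the hook itself says nothing about
IP; each protocol casts it to whatever header type its caller is known to
pass.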

-Patrick


