Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 14 Dec 2013 16:06:37 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        Ryan Stone <rysto32@gmail.com>
Cc:        freebsd-net <freebsd-net@freebsd.org>
Subject:   Re: buf_ring in HEAD is racy
Message-ID:  <CAJ-VmonJG-M_f_m36f-z3ArumBKNdt5%2B6muwFJzWABRxRQWJaw@mail.gmail.com>
In-Reply-To: <CAFMmRNyJpvZ0AewWr62w16=qKer%2BFNXUJJy0Qc=EBqMnUV3OyQ@mail.gmail.com>
References:  <CAFMmRNyJpvZ0AewWr62w16=qKer%2BFNXUJJy0Qc=EBqMnUV3OyQ@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
oh cool, you just did the output-drops thing I was about to code up.
We're missing those counters at work and the ops guys poked me about
it.

I'll also give that a whirl locally and see about working with jack to
get it into -HEAD / MFC'ed to 10.

Thanks!


-a


On 13 December 2013 21:04, Ryan Stone <rysto32@gmail.com> wrote:
> I am seeing spurious output packet drops that appear to be due to
> insufficient memory barriers in buf_ring.  I believe that this is the
> scenario that I am seeing:
>
> 1) The buf_ring is empty, br_prod_head = br_cons_head = 0
> 2) Thread 1 attempts to enqueue an mbuf on the buf_ring.  It fetches
> br_prod_head (0) into a local variable called prod_head
> 3) Thread 2 enqueues an mbuf on the buf_ring.  The sequence of events
> is essentially:
>
> Thread 2 claims an index in the ring and atomically sets br_prod_head (say to 1)
> Thread 2 sets br_ring[1] = mbuf;
> Thread 2 does a full memory barrier
> Thread 2 updates br_prod_tail to 1
>
> 4) Thread 2 dequeues the packet from the buf_ring using the
> single-consumer interface.  The sequence of events is essentialy:
>
> Thread 2 checks whether queue is empty (br_cons_head == br_prod_tail),
> this is false
> Thread 2 sets br_cons_head to 1
> Thread 2 grabs the mbuf from br_ring[1]
> Thread 2 sets br_cons_tail to 1
>
> 5) Thread 1, which is still attempting to enqueue an mbuf on the ring.
> fetches br_cons_tail (1) into a local variable called cons_tail.  It
> sees cons_tail == 1 but prod_head == 0 and concludes that the ring is
> full and drops the packet (incrementing br_drops unatomically, I might
> add)
>
>
> I can reproduce several drops per minute by configuring the ixgbe
> driver to use only 1 queue and then sending traffic from concurrent 8
> iperf processes.  (You will need this hacky patch to even see the
> drops with netstat, though:
> http://people.freebsd.org/~rstone/patches/ixgbe_br_drops.diff)
>
> I am investigating fixing buf_ring by using acquire/release semantics
> rather than load/store barriers.  However, I note that this will
> apparently be the second attempt to fix buf_ring, and I'm seriously
> questioning whether this is worth the effort compared to the
> simplicity of using a mutex.  I'm not even convinced that a correct
> lockless implementation will even be a performance win, given the
> number of memory barriers that will apparently be necessary.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-VmonJG-M_f_m36f-z3ArumBKNdt5%2B6muwFJzWABRxRQWJaw>