Date:      Wed, 12 Feb 2014 23:56:51 -0800
From:      John-Mark Gurney <jmg@funkthat.com>
To:        Garrett Wollman <wollman@bimajority.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, John Baldwin <jhb@freebsd.org>
Subject:   Re: Use of contiguous physical memory in cxgbe driver
Message-ID:  <20140213075651.GY34851@funkthat.com>
In-Reply-To: <21244.20212.423983.960018@hergotha.csail.mit.edu>
References:  <21216.22944.314697.179039@hergotha.csail.mit.edu> <201402111348.52135.jhb@freebsd.org> <CAJ-VmonCdNQPUCQwm0OhqQ3Kt_7x6-g-JwGVZQfzWTgrDYfmqw@mail.gmail.com> <201402121446.19278.jhb@freebsd.org> <21244.20212.423983.960018@hergotha.csail.mit.edu>

Garrett Wollman wrote this message on Wed, Feb 12, 2014 at 23:49 -0500:
> <<On Wed, 12 Feb 2014 14:46:19 -0500, John Baldwin <jhb@freebsd.org> said:
> 
> > Is this because UMA keeps lots of mbufs cached in your workload?
> > The physmem buddy allocator certainly seeks to minimize
> > fragmentation.  However, it can't go yank memory out of UMA caches
> > to do so.
> 
> It's not just UMA caches: there are TCP queues, interface queues, the
> NFS request "cache", and elsewhere.  I first discovered this problem
> in the NFS context: what happens is that you build up very large TCP
> send buffers (NFS forces the socket buffers to 2M) for many clients
> (easy if the server is dedicated 10G and the clients are all on shared
> 1G links).  The NIC is eventually unable to replenish its receive
> ring, and everything just stops.  Eventually, the TCP connections time
> out, the buffers are freed, and the server mysteriously starts working
> again.  (Actually, the last bit never happens in production.  It's
> more like: Eventually, the users start filing trouble tickets, then
> Nagios starts paging the sysadmins, then someone does a hard reset
> because that's the fastest way to recover.  And then they blame me.)

This is an issue that most ethernet drivers have, in that they require
the ability to fetch a new mbuf to replace the received one instead
of deferring the replacement until later...  If the driver allowed the
receive ring to be "missing" a few buffers, to be refilled on a later
RX pass, the machine could make forward progress and possibly
free up a ton of mbufs... Maybe that dropped frame is an ACK that would
free up 10 or more mbufs, but we'll never know, since we just drop it
on the floor...

Though we might want to keep a few mbufs reserved for receive now that
you mention it...  We should never get to the point where we can't
allocate even one frame for receive...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
