FreeBSD Mail Archives

Date:      Sun, 20 Feb 2011 00:03:57 -0500
From:      Arnaud Lacombe <lacombar@gmail.com>
To:        Karim Fodil-Lemelin <fodillemlinkarim@gmail.com>, Jack Vogel <jfvogel@gmail.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: igb driver RX (was TX) hangs when out of mbuf clusters
Message-ID:  <AANLkTik3hzgiLB_d5B0=ieaE6NmXed3f2Xydie-X5DAN@mail.gmail.com>
In-Reply-To: <AANLkTinSFycBZx31A-QQoweEVAD-tsEBnuZW5%2BpZgP2Z@mail.gmail.com>
References:  <AANLkTikrjkHDaBq%2Bx6MTZhzOeqWA=xtFpqQPsthFGmuf@mail.gmail.com> <D70A2DA6-23B7-442D-856C-4267359D66A5@lurchi.franken.de> <AANLkTinLg6QZz67e3Hhda-bzTX69XWNcdEkr3EZHFmSZ@mail.gmail.com> <AANLkTikMuFRY=W0%2BVtGKdWkJcOFVbdy=OOZNe_xFUC3R@mail.gmail.com> <AANLkTin5DZBnr_VcXRyUmpcH2Gsr3GuaW4EsBtKJ6omd@mail.gmail.com> <AANLkTinaftP09MxxpXQwhLaO3dybSep2q4SWZRP4ycHB@mail.gmail.com> <AANLkTikaFRh-3OK0xjO8a%2BnY5aoPnMVFGPCnR1CGDVPk@mail.gmail.com> <F06CCA42-610F-41CA-897F-7029CCAE991B@freebsd.org> <AANLkTinMHSTMqskxTz2d3ysooadF5AwjTOGHnAbOhAj-@mail.gmail.com> <12838373-FE96-443E-8979-AF5408705BF0@freebsd.org> <AANLkTinSFycBZx31A-QQoweEVAD-tsEBnuZW5%2BpZgP2Z@mail.gmail.com>

Hi Jack,

It would seem I've just encountered this issue on an `em'
interface as well (chip ID 0x10d38086). The system has been up for a
bit more than a day. netstat(1) lists about 2500 denied cluster
allocations. The interface in question was unable to receive traffic;
however, it continued to transmit ARP requests. Comparing the output of
sysctl's statistics showed an increase in "missed packets":

Over a 10s time frame:
-dev.em.5.mac_stats.missed_packets: 288412
+dev.em.5.mac_stats.missed_packets: 288423

TX accounting and INTR count went up as I'd expect. Doing an `ifconfig
down && ifconfig up' restored connectivity.

 - Arnaud

On Fri, Feb 11, 2011 at 2:53 PM, Karim Fodil-Lemelin
<fodillemlinkarim@gmail.com> wrote:
> Hi,
>
> I see a commit was made in current (r218530 | jfv | 2011-02-10 20:00:26
> -0500 (Thu, 10 Feb 2011)). Is that commit done to address this issue?
>
> And if so, is there any MFC planned for 7.4 for this?
>
> Thanks,
>
> Karim.
>
> 2011/2/9 Michael Tuexen <tuexen@freebsd.org>
>
>> On Feb 9, 2011, at 6:35 PM, Jack Vogel wrote:
>>
>> > OK, but the question is why does the ring get totally consumed this way,
>> > the ring has 1024 descriptors, it seems unintuitive that that whole
>> > quantity can be used without some being recharged. Do you see the system
>> > mbuf pool being depleted at the same time?
>> That was the test case I created: I set up a server accepting connections
>> but not reading anything. So the driver passes the mbufs to the transport
>> stack and they are not consumed. Then the problem occurs. Then I kill the
>> server. Now there are mbufs available again, but the driver doesn't know.
>>
>> I had the impression that these were the circumstances in which the problem
>> showed up (mbuf allocations failing).
>> >
>> > Since you can reproduce it, do me a favor, in rxeof, change the processed
>> > value from 8 to 4 and then 1, effectively call refresh every descriptor,
>> > see if that eliminates the issue.
>> I will do. Need to see if I can do it remotely, since I'm not in my lab
>> right now. Can do it tomorrow for sure.
>>
>> But I do not think that this solves the problem, since I did the things
>> very slowly and you call it at least when you are leaving rxeof.
>>
>> Best regards
>> Michael
>> >
>> > Thanks for your help,
>> >
>> > Jack
>> >
>> >
>> > On Wed, Feb 9, 2011 at 2:36 AM, Michael Tuexen <tuexen@freebsd.org> wrote:
>> > Hi Jack,
>> >
>> > I could recreate the problem. When the problem occurs, we see
>> >
>> > rx_nxt_check = n
>> > rx_nxt_refresh = n + 1
>> >
>> > (This was also reported in a mail from Karim)
>> >
>> > This means that the *whole* receive ring has no buffers anymore. This can
>> > occur if, for some amount of time, no clusters are available.
>> >
>> > Now outside of the driver, at some point of time, clusters are freed.
>> > I don't think that igb_refresh_mbufs() gets called, since it only gets
>> > called from igb_rxeof(), which gets called when a packet has been received,
>> > which can not happen since the receive ring is empty. So how can the driver
>> > know? I have no idea. Maybe we can periodically check for such an event
>> > and call igb_refresh_mbufs().
>> >
>> > Does this make sense to you?
>> >
>> > Best regards
>> > Michael
>> >
>> >
>> > On Feb 9, 2011, at 8:32 AM, Jack Vogel wrote:
>> >
>> > > Hmmm, well so much for that theory :)
>> > >
>> > > Jack
>> > >
>> > >
>> > > On Tue, Feb 8, 2011 at 4:06 PM, Karim Fodil-Lemelin <fodillemlinkarim@gmail.com> wrote:
>> > >
>> > >
>> > > 2011/2/8 Jack Vogel <jfvogel@gmail.com>
>> > >
>> > >
>> > > I have been following this, and thinking about it. I still am working
>> > > from a theoretical standpoint, but based on a patch I got quite a long
>> > > time back and never quite grokked, I believe now that I might have a
>> > > solution.
>> > >
>> > > The original PR and patch was kern/150516 from Beezar Liu, I was never
>> > > quite comfortable with the code changes, nor convinced that it was a real
>> > > issue and not a misunderstanding. However I think now that this very
>> > > report might be behind what we are seeing today. I have a slightly
>> > > different approach to solving it, of course it remains to be seen if it
>> > > handles it properly.
>> > >
>> > > Please try the patch I've attached, I'm open to further correction or
>> > > polishing of the changes. And thanks to Beezar for his original report
>> > > and changes, this is not for em, but if this eliminates the problem it's
>> > > clearly needed in all drivers.
>> > >
>> > > Jack
>> > >
>> > >
>> > > Hi Jack,
>> > >
>> > > Thanks for your help. I tried your patch and it didn't work so I added
>> > > a couple of printf to see if the added code was getting hit:
>> > >
>> > > --- a/freebsd/sys/dev/e1000/if_igb.c
>> > > +++ b/freebsd/sys/dev/e1000/if_igb.c
>> > > @@ -612,7 +612,7 @@ igb_attach(device_t dev)
>> > >             device_get_nameunit(dev));
>> > >
>> > >         INIT_DEBUGOUT("igb_attach: end");
>> > > -
>> > > +       printf("this driver has a patch from Jack Vogel\n");
>> > >         return (0);
>> > >
>> > >  err_late:
>> > > @@ -4131,6 +4131,7 @@ igb_rxeof(struct igb_queue *que, int count, int *done)
>> > >                 struct mbuf             *sendmp, *mh, *mp;
>> > >                 struct igb_rx_buf       *rxbuf;
>> > >                 u16                     hlen, plen, hdr, vtag;
>> > > +               int                     commit;
>> > >                 bool                    eop = FALSE;
>> > >
>> > >                 cur = &rxr->rx_base[i];
>> > > @@ -4255,10 +4256,23 @@ next_desc:
>> > >                 bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
>> > >                     BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
>> > >
>> > > +               commit = i;     /* capture the old index */
>> > > +
>> > >                 /* Advance our pointers to the next descriptor. */
>> > >                 if (++i == adapter->num_rx_desc)
>> > >                         i = 0;
>> > >                 /*
>> > > +               ** Sanity test for ring full, if this
>> > > +               ** happens we need to refresh immediately
>> > > +               ** or refresh may deadlock.
>> > > +               */
>> > > +               if (i == rxr->next_to_refresh) {
>> > > +                       igb_refresh_mbufs(rxr, commit);
>> > > +                       printf("igb_refresh_mbufs called with commit %d\n", commit);
>> > > +                       processed = 0;
>> > > +               }
>> > > +
>> > > +               /*
>> > >                 ** Send to the stack or LRO
>> > >                 */
>> > >                 if (sendmp != NULL) {
>> > >
>> > > Here are the results:
>> > >
>> > > # dmesg | grep Vogel
>> > > this driver has a patch from Jack Vogel
>> > > this driver has a patch from Jack Vogel
>> > >
>> > > # netstat -m
>> > > 60453/52707/113160 mbufs in use (current/cache/total)
>> > > 48416/51584/100000/100000 mbuf clusters in use (current/cache/total/max)
>> > > 2894/690 mbuf+clusters out of packet secondary zone in use (current/cache)
>> > > 11946/854/12800/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
>> > > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>> > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>> > > 164834K/119760K/284595K bytes allocated to network (current/cache/total)
>> > > 0/339/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> > > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>> > > 0/4/6656 sfbufs in use (current/peak/max)
>> > > 0 requests for sfbufs denied
>> > > 0 requests for sfbufs delayed
>> > > 0 requests for I/O initiated by sendfile
>> > > 0 calls to protocol drain routines
>> > > # dmesg | grep commit
>> > >
>> > > At this point RX has hung.
>> > >
>> > > Somehow the check (i == rxr->next_to_refresh) is never true in this
>> > > case. Also, I did read kern/150516 and couldn't wrap my head around the
>> > > patch for the em driver that Beezar Liu suggested.
>> > >
>> > > Regards,
>> > >
>> > > Karim.
>> > >
>> > >
>> >
>> >
>>
>>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
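
The stall Michael describes in the quoted thread (the receive ring runs out of
buffers while no clusters are available, and igb_refresh_mbufs() is only
reachable from igb_rxeof(), which needs a received packet to run) can be
illustrated with a toy model. This is only a sketch under stated assumptions,
not driver code: toy_ring, toy_refresh(), toy_rx_packet() and toy_watchdog()
are hypothetical names, the ring is reduced to a counter of posted buffers,
and the "periodic check" is simply the idea floated in the thread, not the
fix that was eventually committed.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RING_SIZE 4         /* hypothetical; the real ring has 1024 descriptors */

struct toy_ring {
	int posted;             /* descriptors that currently hold a buffer */
	unsigned long missed;   /* packets dropped because nothing was posted */
};

/* Refill empty descriptors from the cluster pool; returns how many were refilled. */
static int
toy_refresh(struct toy_ring *r, int *pool)
{
	int n = 0;

	while (r->posted < TOY_RING_SIZE && *pool > 0) {
		(*pool)--;
		r->posted++;
		n++;
	}
	assert(r->posted <= TOY_RING_SIZE);
	return (n);
}

/*
 * One packet arrives. Without a posted buffer the packet is missed and
 * the receive path (and therefore the refill) never runs. Otherwise the
 * buffer goes up the stack and the receive path tries to refill.
 */
static bool
toy_rx_packet(struct toy_ring *r, int *pool)
{
	if (r->posted == 0) {
		r->missed++;
		return (false);
	}
	r->posted--;                    /* buffer handed to the stack */
	(void)toy_refresh(r, pool);     /* the only refill on the RX path */
	return (true);
}

/* Hypothetical periodic check: refill even though no packet was received. */
static int
toy_watchdog(struct toy_ring *r, int *pool)
{
	if (r->posted == TOY_RING_SIZE)
		return (0);
	return (toy_refresh(r, pool));
}
```

Starting from a full ring and an empty pool, TOY_RING_SIZE calls to
toy_rx_packet() drain the ring. Putting clusters back into the pool afterwards
changes nothing for toy_rx_packet(): it only bumps the missed counter, much
like the missed_packets statistic above. A single toy_watchdog() call is what
gets reception going again in this model.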

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?AANLkTik3hzgiLB_d5B0=ieaE6NmXed3f2Xydie-X5DAN>
