From: Mike Tancsa
Date: Fri, 09 Apr 2010 09:17:07 -0400
To: pyunyh@gmail.com, Jack Vogel
Cc: freebsd-stable@freebsd.org
Subject: Re: em driver regression
Message-Id: <201004091317.o39DHFEl049965@lava.sentex.ca>
In-Reply-To: <20100408230750.GR5734@michelle.cdnetworks.com>
References: <201004081313.o38DD4JM041821@lava.sentex.ca>
 <7.1.0.9.0.20100408091756.10652be0@sentex.net>
 <201004081446.o38EkU7h042296@lava.sentex.ca>
 <20100408181741.GI5734@michelle.cdnetworks.com>
 <201004081831.o38IVR3s043434@lava.sentex.ca>
 <20100408205626.GN5734@michelle.cdnetworks.com>
 <201004082105.o38L5DCH044187@lava.sentex.ca>
 <20100408230750.GR5734@michelle.cdnetworks.com>

At 07:07 PM 4/8/2010, Pyun YongHyeon wrote:
>On Thu, Apr 08, 2010 at 02:06:09PM -0700, Jack Vogel wrote:
> > Only one device support by em does multiqueue right now, and that is
> > Hartwell, 82574.
> >
>
>Thanks for the info.
>
>Mike, here is updated patch.
>Now UDP bulk TX transfer performance
>recovered a lot(about 890Mbps) but it still shows bad numbers
>compared to other controllers. For example, bce(4) shows about
>958Mbps for the same load.
>During the testing I found a strong indication of packet reordering
>issue of drbr interface. If I forcibly change to use single TX
>queue, em(4) got 950Mbps as it used to be.
>
>Jack, as we talked about possible drbr issue with igb(4), UDP
>transfer seems to suffer from packet reordering issue here. Can we
>make em(4)/igb(4) use single TX queue until we solve drbr interface
>issue? Given that only one em(4) controller supports multiqueue,
>dropping multiqueue support for em(4) does not look bad to me.

No watchdog errors overnight. I wonder if the issue was due to 100Mb,
or the patch from current fixed it. I will try today with the new
patch below! I am guessing the rejection was due to the RX/TX fix?

        ---Mike

Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/e1000/if_em.c
|===================================================================
|--- sys/dev/e1000/if_em.c      (revision 206403)
|+++ sys/dev/e1000/if_em.c      (working copy)
--------------------------
Patching file if_em.c using Plan A...
Hunk #1 succeeded at 812 with fuzz 2.
Hunk #2 succeeded at 834 (offset -4 lines).
Hunk #3 succeeded at 869 (offset -4 lines).
Hunk #4 succeeded at 913 (offset -4 lines).
Hunk #5 succeeded at 941 (offset -4 lines).
Hunk #6 succeeded at 1439 (offset -4 lines).
Hunk #7 succeeded at 1452 (offset -4 lines).
Hunk #8 succeeded at 1472 (offset -4 lines).
Hunk #9 succeeded at 1532 (offset -4 lines).
Hunk #10 succeeded at 1549 (offset -4 lines).
Hunk #11 failed at 1909.
Hunk #12 succeeded at 3617 (offset 2 lines).
Hunk #13 succeeded at 4069 (offset -6 lines).
Hunk #14 succeeded at 4087 (offset 2 lines).
Hunk #15 succeeded at 4187 (offset -6 lines).
1 out of 15 hunks failed--saving rejects to if_em.c.rej
Hmm...
The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/e1000/if_em.h
|===================================================================
|--- sys/dev/e1000/if_em.h      (revision 206403)
|+++ sys/dev/e1000/if_em.h      (working copy)
--------------------------
Patching file if_em.h using Plan A...
Hunk #1 succeeded at 223.
done
1(ich10)# less if_em.c.rej
***************
*** 1908,1919 ****
        bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map,
            BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
        E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), i);
-       txr->watchdog_time = ticks;
-       /* Call cleanup if number of TX descriptors low */
-       if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
-               em_txeof(txr);
-
        return (0);
  }
--- 1909,1915 ----
        bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map,
            BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
        E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), i);
        return (0);
  }
0(ich10)#

> > Jack
> >
> >
> > On Thu, Apr 8, 2010 at 2:05 PM, Mike Tancsa wrote:
> >
> > > At 04:56 PM 4/8/2010, Pyun YongHyeon wrote:
> > >
> > >> On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
> > >> > At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
> > >> >
> > >> > >Try this patch. It should fix the issue. It seems Jack forgot to
> > >> > >strip CRC bytes as old em(4) didn't strip it, probably to
> > >> > >workaround silicon bug of old em(4) controllers.
> > >> >
> > >> > Thanks! The attached patch does indeed fix the dhclient issue.
> > >> >
> > >> >
> > >> > >It seems there are also TX issues here. The system load is too high
> > >> > >and sometimes system is not responsive while TX is in progress.
> > >> > >Because I initiated TCP bulk transfers, TSO should reduce CPU load
> > >> > >a lot but it didn't so I guess it could also be related watchdog
> > >> > >timeouts you've seen. I'll see what can be done.
> > >> >
> > >> > Thanks for looking into that as well!!
> > >> >
> > >> > ---Mike
> > >> >
> > >>
> > >> Mike,
> > >>
> > >> Here is patch I'm working on. This patch fixes high system load and
> > >> system is very responsive as before. But it seems there is still
> > >> some TX issue here. Bulk UDP performance is very poor(< 700Mbps)
> > >> and I have no idea what caused this at this moment.
> > >>
> > >> BTW, I have trouble to reproduce watchdog timeouts. I'm not sure
> > >> whether latest fix from Jack cured it. By chance does your
> > >> controller support multi TX/RX queues? You can check whether em(4)
> > >> uses multi-queues with "vmstat -i". If em(4) use multi-queue you
> > >> may have multiple irq output for em0.
> > >>
> > >
> > > Hi,
> > > I will give it a try later tonight! This one does not seem to.
> > >
> > > 0(ich10)# vmstat -i
> > > interrupt                          total       rate
> > > irq16: uhci0+                         30          0
> > > irq18: ehci0 uhci5                158419         17
> > > irq19: fwohci0++                      86          0
> > > irq21: uhci1                          17          0
> > > irq23: uhci3 ehci1                     2          0
> > > cpu0: timer                     18570305       1994
> > > irq256: igb0                          80          0
> > > irq257: igb0                         255          0
> > > irq258: igb0                          66          0
> > > irq259: igb0                          32          0
> > > irq260: igb0                           2          0
> > > irq261: igb1                        2679          0
> > > irq262: igb1                         998          0
> > > irq263: igb1                        2468          0
> > > irq264: igb1                        6361          0
> > > irq265: igb1                           2          0
> > > irq266: em0                        33910          3
> > > irq267: ahci1                      15317          1
> > > cpu1: timer                     18557074       1993
> > > cpu3: timer                     18557168       1993
> > > cpu2: timer                     18557108       1993
> > > Total                           74462379       7998
> > > 0(ich10)#
> > >
>

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
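
The "vmstat -i" check suggested in the quoted text (several irq lines for
one device hint at multiple queues) can be scripted. What follows is only
a minimal sketch, not part of the patch under discussion: the function
name irq_lines_per_device and the SAMPLE constant are hypothetical, and
the sample is an excerpt of the vmstat output quoted above. It assumes
each interrupt row has the form "irqN: device [device...] total rate".

```python
import re
from collections import Counter

# Excerpt of the "vmstat -i" output quoted in this message.
SAMPLE = """\
interrupt                          total       rate
irq16: uhci0+                         30          0
irq18: ehci0 uhci5                158419         17
cpu0: timer                     18570305       1994
irq256: igb0                          80          0
irq257: igb0                         255          0
irq258: igb0                          66          0
irq259: igb0                          32          0
irq260: igb0                           2          0
irq261: igb1                        2679          0
irq262: igb1                         998          0
irq263: igb1                        2468          0
irq264: igb1                        6361          0
irq265: igb1                           2          0
irq266: em0                        33910          3
irq267: ahci1                      15317          1
Total                           74462379       7998
"""

# Assumed row layout: "irqN: dev [dev ...] <total> <rate>".
IRQ_ROW = re.compile(r"irq(\d+):\s+(.*?)\s+\d+\s+\d+\s*$")


def irq_lines_per_device(vmstat_output):
    """Count how many irq rows mention each device name."""
    counts = Counter()
    for line in vmstat_output.splitlines():
        match = IRQ_ROW.match(line)
        if match is None:
            continue  # header, "cpuN: timer" and "Total" rows
        for device in match.group(2).split():
            # A trailing "+" marks further devices sharing the line.
            counts[device.rstrip("+")] += 1
    return counts


if __name__ == "__main__":
    for device, lines in sorted(irq_lines_per_device(SAMPLE).items()):
        print(f"{device}: {lines} irq line(s)")
```

On the excerpt above this counts a single irq line for em0 and five each
for igb0 and igb1, which matches the reading "This one does not seem to."
A count above one only suggests multiple queues; it does not by itself
say which lines are TX queues, RX queues, or link interrupts.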