Date:      Tue, 14 Sep 2010 13:59:36 -0400
From:      Mike Tancsa <mike@sentex.net>
To:        pyunyh@gmail.com
Cc:        freebsd-stable@freebsd.org, jfvogel@gmail.com
Subject:   Re: RELENG_7 em problems (and RELENG_8)
Message-ID:  <201009141759.o8EHxcZ0013539@lava.sentex.ca>
In-Reply-To: <20100817200020.GE6482@michelle.cdnetworks.com>
References:  <201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> <20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> <20100817185208.GA6482@michelle.cdnetworks.com> <201008171955.o7HJt67T087902@lava.sentex.ca> <20100817200020.GE6482@michelle.cdnetworks.com>

Hi Jack,
         Any plans to commit the patch below? I have been running it
on a number of boxes and it works as expected with no side effects.

         ---Mike


At 04:00 PM 8/17/2010, Pyun YongHyeon wrote:
>On Tue, Aug 17, 2010 at 03:55:12PM -0400, Mike Tancsa wrote:
> > At 02:52 PM 8/17/2010, Pyun YongHyeon wrote:
> >
> > >Here is an updated patch for HEAD and stable/8.
> > >http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch
> > >
> > >It seems to work as expected in my limited test environment. If
> >
> > Thanks! The patch applies cleanly and all works as expected now! I am
> > no longer able to trigger the bug. I just use the stock unmodified
> > driver normally, so no multiple Tx queues.
> >
>
>Glad to hear that. Thanks for testing!
>
> > # vmstat -i
> > interrupt                          total       rate
> > irq256: em0                          149          0
> > irq257: em1                            3          0
> > irq259: em3                          971          2
> > irq260: ahci0                       1520          3
> >
> >
> >
> > em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >         options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
> >         ether 00:15:17:xx:xx:xx
> >         inet6 fe80::215:17ff:fexx:xxxx%em3 prefixlen 64 scopeid 0x4
> >         inet 192.168.xx.xx netmask 0xffffff00 broadcast 192.168.xx.xx
> >         nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
> >         media: Ethernet autoselect (100baseTX <full-duplex>)
> >         status: active
> >
> >
> > em3@pci0:3:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00
> >     vendor     = 'Intel Corporation'
> >     device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> >     class      = network
> >     subclass   = ethernet
> >     cap 01[c8] = powerspec 2  supports D0 D3  current D0
> >     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> >     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> >     cap 11[a0] = MSI-X supports 5 messages in map 0x1c
> >
> >
> >
> > patch < em.csum_tso.20100817.patch
> > Hmm...  Looks like a unified diff to me...
> > The text leading up to this was:
> > --------------------------
> > |Index: sys/dev/e1000/if_em.c
> > |===================================================================
> > |--- sys/dev/e1000/if_em.c      (revision 211398)
> > |+++ sys/dev/e1000/if_em.c      (working copy)
> > --------------------------
> > Patching file sys/dev/e1000/if_em.c using Plan A...
> > Hunk #1 succeeded at 237.
> > Hunk #2 succeeded at 1730.
> > Hunk #3 succeeded at 1759.
> > Hunk #4 succeeded at 1930.
> > Hunk #5 succeeded at 3148.
> > Hunk #6 succeeded at 3351.
> > Hunk #7 succeeded at 3533.
> > Hunk #8 succeeded at 3590.
> > Hunk #9 succeeded at 3603.
> > Hmm...  The next patch looks like a unified diff to me...
> > The text leading up to this was:
> > --------------------------
> > |Index: sys/dev/e1000/if_em.h
> > |===================================================================
> > |--- sys/dev/e1000/if_em.h      (revision 211398)
> > |+++ sys/dev/e1000/if_em.h      (working copy)
> > --------------------------
> > Patching file sys/dev/e1000/if_em.h using Plan A...
> > Hunk #1 succeeded at 284.
> > done
> >
> >         ---Mike
> >
> >
> > >you're using multiple Tx queues with em(4) it would be better to
> > >disable Tx checksum offloading, as the driver always has to create a
> > >new checksum context for each frame. This effectively disables
> > >pipelined Tx data DMA, which in turn greatly slows down Tx
> > >performance for small frames. The reason the driver has to
> > >create a new checksum context when it uses multiple Tx queues is
> > >a hardware limitation. The controller tracks only the last
> > >context descriptor that was written, so the driver does not know
> > >the state of the checksum context configured in the other Tx queues.
> > >Hope this helps.
> > >
> > >>
> > >>
> > >>         ---Mike
> > >>
> > >>
> > >> At 03:36 PM 7/2/2010, Pyun YongHyeon wrote:
> > >> >On Fri, Jul 02, 2010 at 01:39:22PM -0400, Mike Tancsa wrote:
> > >> >> Hi Jack,
> > >> >>         Just a followup to the email below. I now saw what appears
> > >> >> to be the same problem on RELENG_8, but on a different nic and with
> > >> >> VLANs.  So not sure if this is a general em problem, a problem
> > >> >> specific to some em NICs, or a TSO problem in general.  The issue
> > >> >> seemed to be triggered when I added a new vlan based on
> > >> >>
> > >> >> em3@pci0:14:0:0:        class=0x020000 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
> > >> >>     vendor     = 'Intel Corporation'
> > >> >>     device     = 'Intel PRO/1000 PL Network Adaptor (82573L)'
> > >> >>     class      = network
> > >> >>     subclass   = ethernet
> > >> >>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > >> >>     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> > >> >>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > >> >>
> > >> >> pci14: <ACPI PCI bus> on pcib5
> > >> >> em3: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x6000-0x601f mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
> > >> >> em3: Using MSI interrupt
> > >> >> em3: [FILTER]
> > >> >> em3: Ethernet address: 00:30:48:9f:eb:81
> > >> >>
> > >> >> em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > >> >>         options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
> > >> >>         ether 00:30:48:9f:eb:81
> > >> >>         inet 10.255.255.254 netmask 0xfffffffc broadcast 10.255.255.255
> > >> >>         media: Ethernet autoselect (1000baseT <full-duplex>)
> > >> >>         status: active
> > >> >>
> > >> >> I had to disable tso, rxcsum and txcsum in order to see the devices on
> > >> >> the other side of the two vlans trunked off em3.  Unfortunately, the
> > >> >> other sides were switches 100km and 500km away, so I didn't have any
> > >> >> tcpdump capabilities to diagnose the issue.  I had already created
> > >> >> one vlan off this NIC and all was fine.  A few weeks later, I added a
> > >> >> new one and I could no longer telnet into the remote switches from
> > >> >> the local machine... But I could telnet into the switches from
> > >> >> machines other than the problem box. Hence, it would appear to be a
> > >> >> general TSO issue, no? I disabled tso on the nic (I didn't disable
> > >> >> net.inet.tcp.tso as I forgot about that). Still nothing. I could
> > >> >> always ping the remote devices, but no tcp services.  I then
> > >> >> remembered this issue from before, so I tried disabling tso on the
> > >> >> NIC. Still nothing. Then I disabled rxcsum and txcsum and I could
> > >> >> then telnet into the remote devices.
> > >> >>
> > >> >> This newly observed issue was from a buildworld on Mon Jun 14
> > >> >> 11:29:12 EDT 2010.
> > >> >>
> > >> >> I will try to recreate the issue locally again to see if I can
> > >> >> trigger the problem on demand.  Any thoughts on what it might be?
> > >> >> Perhaps an issue specific to certain em nics?
> > >> >>
> > >> >
> > >> >http://www.freebsd.org/cgi/query-pr.cgi?pr=141843
> > >> >I'm not sure whether you're seeing the same issue though.
> > >> >I haven't had a chance to try the latest em(4) on stable/7.
> > >>
> > >> --------------------------------------------------------------------
> > >> Mike Tancsa,                                      tel +1 519 651 3400
> > >> Sentex Communications,                            mike@sentex.net
> > >> Providing Internet since 1994                    www.sentex.net
> > >> Cambridge, Ontario Canada                         www.sentex.net/mike
> > >>
> >
> > --------------------------------------------------------------------
> > Mike Tancsa,                                      tel +1 519 651 3400
> > Sentex Communications,                            mike@sentex.net
> > Providing Internet since 1994                    www.sentex.net
> > Cambridge, Ontario Canada                         www.sentex.net/mike
> >

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
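
Pyun YongHyeon's quoted explanation of why multiple Tx queues force a new
checksum context for every frame can be illustrated with a small model. The
following is only a sketch under stated assumptions and is not code from
if_em.c or from the patch: the struct names, fields, and functions are all
hypothetical, and the single-queue "reuse the previous context when the
offload parameters match" behaviour is inferred from the explanation rather
than taken from the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified offload parameters carried by a context descriptor. */
struct csum_ctx {
    uint8_t ip_hdr_len;   /* IP header length in bytes */
    uint8_t l4_proto;     /* 6 = TCP, 17 = UDP */
};

/* Hypothetical model of what the driver remembers about the last context. */
struct tx_model {
    struct csum_ctx last; /* last context the driver wrote */
    bool have_last;       /* false until the first context is written */
    unsigned ctx_written; /* context descriptors emitted so far */
};

static bool same_ctx(struct csum_ctx a, struct csum_ctx b)
{
    return a.ip_hdr_len == b.ip_hdr_len && a.l4_proto == b.l4_proto;
}

/*
 * Queue one frame. Returns true if a new context descriptor had to be
 * written. With multiple Tx queues the driver cannot trust its cached
 * context, because the controller tracks only the last context written
 * on any queue, so it writes a fresh one for every frame.
 */
bool tx_frame(struct tx_model *m, struct csum_ctx want, bool multi_queue)
{
    assert(m != NULL);
    if (!multi_queue && m->have_last && same_ctx(m->last, want))
        return false;     /* assumed single-queue fast path: reuse */
    m->last = want;
    m->have_last = true;
    m->ctx_written++;
    return true;
}

/* Count context descriptors needed for a burst of identical TCP/IPv4 frames. */
unsigned ctx_descs_for_burst(size_t nframes, bool multi_queue)
{
    struct tx_model m = {0};
    struct csum_ctx tcp4 = { .ip_hdr_len = 20, .l4_proto = 6 };

    for (size_t i = 0; i < nframes; i++)
        (void)tx_frame(&m, tcp4, multi_queue);
    return m.ctx_written;
}
```

In this model a burst of identical frames needs one context descriptor in the
single-queue case and one per frame in the multi-queue case, which is the
per-frame overhead the explanation blames for slow small-frame Tx performance.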
