Date:      Thu, 27 Oct 2011 17:17:34 +0400
From:      Emil Muratov <gpm@hotplug.ru>
To:        Hooman Fazaeli <hoomanfazaeli@gmail.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Message-ID:  <4EA959EE.2070806@hotplug.ru>
In-Reply-To: <4EA91836.2040508@gmail.com>
References:  <CAAAm0r0RXEJo4UiKS=Ui0e5OQTg6sg-xcYf3mYB5%2Bvk8i8557w@mail.gmail.com>	<4E8F157A.40702@sentex.net>	<CAAAm0r2JH43Rct7UxQK2duH1p43Nepnj5mpb6bXo==DPayhJLg@mail.gmail.com>	<4E8F51D4.1060509@sentex.net>	<CACqU3MVwLaepFymZJkaVk6p=SpykGhqs=VYFjLh9fP9S=AxDhg@mail.gmail.com>	<CAAAm0r1DKvoL9=Ket9up=4%2B5xiCzTTZJK99FhF9jcCA28B0M%2BA@mail.gmail.com>	<CAAAm0r3XdsMHZh%2BP_NF-txZasdExzwZ8ymmGQgGhJQds0fOiBQ@mail.gmail.com>	<CAAAm0r1iS3z-7CBJ=xYDf%2BJOA1Q2nU0O54Twbyb7FjvgWHjKVw@mail.gmail.com>	<4EA7E203.3020306@sepehrs.com>	<CAAAm0r3Nr2t8cCetPkFnLQ-3KwqHw_0SpqbtvYPRUkSP=9n8CA@mail.gmail.com>	<4EA80818.3030504@sentex.net>	<4EA80F88.4000400@hotplug.ru> <4EA82715.2000404@gmail.com>	<4EA8FA40.7010504@hotplug.ru> <4EA91836.2040508@gmail.com>


>>
>> Hi Hooman
>>
>> Here is what I got when the script triggered just as the interface
>> locked up
>>
>>
>> 11.10.26-23:39:10 ... interface em0 is down...
>>
>> FreeBSD ion.hotplug.ru 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Oct 20 20:20:25 MSD 2011     root@epia.home.lan:/usr/obj/usr/src/sys/ION6debug  amd64
>> 11:39PM  up  1:12, 2 users, load averages: 0.26, 0.48, 0.58
>>
>>
>>  == vmstat -i ==
>> interrupt                          total       rate
>> irq22: nfe0                     16644480       3865
>> cpu0: timer                      8610122       1999
>> irq256: ahci0                     606705        140
>> irq257: em0:rx 0                 3896622        904
>> irq258: em0:tx 0                 2762957        641
>> irq259: em0:link                     620          0
>> cpu3: timer                      8609499       1999
>> cpu1: timer                      8609499       1999
>> cpu2: timer                      8609499       1999
>> Total                           58350003      13550
>>
>>  == netstat -ind ==
>> Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll Drop
>> usbus     0 <Link#1>                               0     0     0        0     0     0    0
>> usbus     0 <Link#2>                               0     0     0        0     0     0    0
>> nfe0   1500 <Link#3>      00:25:22:21:86:89  7157140     0     0 12266747     0     0    0
>> nfe0   1500 fe80::225:22f fe80::225:22ff:fe        0     -     -       85     -     -    -
>> nfe0   1500 10.16.128.0/1 10.16.189.71             0     -     -    48135     -     -    -
>> em0    9000 <Link#4>      00:1b:21:ab:bf:4a  5465087   623     0  2862028     0     0  113
>> em0    9000 192.168.168.0 192.168.168.1       764085     -     -  1005078     -     -    -
>> em0    9000 fe80::21b:21f fe80::21b:21ff:fe       45     -     -      252     -     -    -
>> em0    9000 2002:d58d:871 2002:d58d:8715:1:       73     -     -       38     -     -    -
>> wifi   1500 <Link#7>      00:1b:21:ab:bf:4a      347     0     0      350     0     0    0
>> wifi   1500 192.168.168.6 192.168.168.65           0     -     -        0     -     -    -
>> wifi   1500 fe80::225:x   fe80::225:x:x            0     -     -      349     -     -    -
>> wifi   1500 2002:x:x      2002:x:x:2:              0     -     -        0     -     -    -
>> wifio  1500 <Link#8>      00:1b:21:ab:bf:4a    59559     0     0   114639     0     0    0
>> wifio  1500 192.168.168.8 192.168.168.81           0     -     -      160     -     -    -
>> wifio  1500 fe80::225:x   fe80::225:x:x            0     -     -        0     -     -    -
>> stf0   1280 <Link#9>                            5725     0     0     6125   420     0    0
>> stf0   1280 2002:x:x      2002:x:x::1           1878     -     -     1121     -     -    -
>> ng0*   1500 <Link#10>                              0     0     0        0     0     0    0
>> ng1*   1500 <Link#11>                              0     0     0        0     0     0    0
>> ng2    1492 <Link#12>                        7143733     0     0 12234436     0     0    0
>> ng2    1492 213.141.x.x   213.141.x.x        4735932     -     -  8480089     -     -    -
>> ng2    1492 fe80::x:x     fe80::x:x:x              0     -     -        1     -     -    -
>> tun0   1455 <Link#13>                            350     0     0      172     0     0    0
>> tun0   1455 fe80::225:x   fe80::225:x:x            0     -     -        2     -     -    -
>> tun0   1455 192.168.169.1 192.168.169.1          117     -     -      167     -     -    -
>>
>> Oct 26 23:39:11 ion kernel: em0: hw tdh = 975, hw tdt = 944
>> Oct 26 23:39:11 ion kernel: em0: hw rdh = 960, hw rdt = 959
>> Oct 26 23:39:11 ion kernel: em0: Tx Queue Status = 1
>> Oct 26 23:39:11 ion kernel: em0: TX descriptors avail = 31
>> Oct 26 23:39:11 ion kernel: em0: Tx Descriptors avail failure = 0
>> Oct 26 23:39:11 ion kernel: em0: RX discarded packets = 0
>> Oct 26 23:39:11 ion kernel: em0: RX Next to Check = 960
>> Oct 26 23:39:11 ion kernel: em0: RX Next to Refresh = 959
>>
>> net.inet.ip.intr_queue_maxlen: 4096
>> net.inet.ip.intr_queue_drops: 0
>> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
>> dev.em.0.%driver: em
>> dev.em.0.%location: slot=0 function=0
>> dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0xa01f class=0x020000
>> dev.em.0.%parent: pci2
>> dev.em.0.nvm: -1
>> dev.em.0.debug: -1
>> dev.em.0.rx_int_delay: 200
>> dev.em.0.tx_int_delay: 200
>> dev.em.0.rx_abs_int_delay: 4096
>> dev.em.0.tx_abs_int_delay: 4096
>> dev.em.0.rx_processing_limit: 100
>> dev.em.0.flow_control: 3
>> dev.em.0.eee_control: 0
>> dev.em.0.link_irq: 648
>> dev.em.0.mbuf_alloc_fail: 0
>> dev.em.0.cluster_alloc_fail: 0
>> dev.em.0.dropped: 0
>> dev.em.0.tx_dma_fail: 0
>> dev.em.0.rx_overruns: 0
>> dev.em.0.watchdog_timeouts: 0
>> dev.em.0.device_control: 1477444168
>> dev.em.0.rx_control: 100827170
>> dev.em.0.fc_high_water: 11264
>> dev.em.0.fc_low_water: 9764
>> dev.em.0.queue0.txd_head: 975
>> dev.em.0.queue0.txd_tail: 944
>> dev.em.0.queue0.tx_irq: 2762762
>> dev.em.0.queue0.no_desc_avail: 0
>> dev.em.0.queue0.rxd_head: 960
>> dev.em.0.queue0.rxd_tail: 959
>> dev.em.0.queue0.rx_irq: 3895860
>> dev.em.0.mac_stats.excess_coll: 0
>> dev.em.0.mac_stats.single_coll: 0
>> dev.em.0.mac_stats.multiple_coll: 0
>> dev.em.0.mac_stats.late_coll: 0
>> dev.em.0.mac_stats.collision_count: 0
>> dev.em.0.mac_stats.symbol_errors: 0
>> dev.em.0.mac_stats.sequence_errors: 0
>> dev.em.0.mac_stats.defer_count: 0
>> dev.em.0.mac_stats.missed_packets: 647
>> dev.em.0.mac_stats.recv_no_buff: 0
>> dev.em.0.mac_stats.recv_undersize: 0
>> dev.em.0.mac_stats.recv_fragmented: 0
>> dev.em.0.mac_stats.recv_fragmented: 0
>> dev.em.0.mac_stats.recv_oversize: 0
>> dev.em.0.mac_stats.recv_jabber: 0
>> dev.em.0.mac_stats.recv_errs: 0
>> dev.em.0.mac_stats.crc_errs: 0
>> dev.em.0.mac_stats.alignment_errs: 0
>> dev.em.0.mac_stats.coll_ext_errs: 0
>> dev.em.0.mac_stats.xon_recvd: 438789
>> dev.em.0.mac_stats.xon_txd: 366
>> dev.em.0.mac_stats.xoff_recvd: 438789
>> dev.em.0.mac_stats.xoff_txd: 1013
>> dev.em.0.mac_stats.total_pkts_recvd: 5465524
>> dev.em.0.mac_stats.good_pkts_recvd: 4587299
>> dev.em.0.mac_stats.bcast_pkts_recvd: 1102
>> dev.em.0.mac_stats.mcast_pkts_recvd: 162
>> dev.em.0.mac_stats.rx_frames_64: 325765
>> dev.em.0.mac_stats.rx_frames_65_127: 1029229
>> dev.em.0.mac_stats.rx_frames_128_255: 118432
>> dev.em.0.mac_stats.rx_frames_256_511: 11360
>> dev.em.0.mac_stats.rx_frames_512_1023: 100708
>> dev.em.0.mac_stats.rx_frames_1024_1522: 3001805
>> dev.em.0.mac_stats.good_octets_recvd: 4648591681
>> dev.em.0.mac_stats.good_octets_txd: 2203060494
>> dev.em.0.mac_stats.total_pkts_txd: 3780652
>> dev.em.0.mac_stats.good_pkts_txd: 3779273
>> dev.em.0.mac_stats.bcast_pkts_txd: 89
>> dev.em.0.mac_stats.mcast_pkts_txd: 534
>> dev.em.0.mac_stats.tx_frames_64: 1323163
>> dev.em.0.mac_stats.tx_frames_65_127: 850801
>> dev.em.0.mac_stats.tx_frames_128_255: 193136
>> dev.em.0.mac_stats.tx_frames_256_511: 64088
>> dev.em.0.mac_stats.tx_frames_512_1023: 47149
>> dev.em.0.mac_stats.tx_frames_1024_1522: 1300936
>> dev.em.0.mac_stats.tso_txd: 429804
>> dev.em.0.mac_stats.tso_ctx_fail: 0
>> dev.em.0.interrupts.asserts: 44
>> dev.em.0.interrupts.rx_pkt_timer: 0
>> dev.em.0.interrupts.rx_abs_timer: 0
>> dev.em.0.interrupts.tx_pkt_timer: 0
>> dev.em.0.interrupts.tx_abs_timer: 0
>> dev.em.0.interrupts.tx_queue_empty: 0
>> dev.em.0.interrupts.tx_queue_min_thresh: 0
>> dev.em.0.interrupts.rx_desc_min_thresh: 0
>> dev.em.0.interrupts.rx_overrun: 0
>>
>> ifconfig em0
>> em0: flags=8c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 9000
>>         description: LAN
>>         options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
>>         ether 00:1b:21:ab:bf:4a
>>         inet 192.168.168.1 netmask 0xffffffc0 broadcast 192.168.168.63
>>         inet6 fe80::21b:21ff:feab:bf4a%em0 prefixlen 64 scopeid 0x4
>>         inet6 2002:x:x:1::1 prefixlen 64
>>         nd6 options=1<PERFORMNUD>
>>         media: Ethernet autoselect (1000baseT <full-duplex>)
>>         status: active
>>
> What device is at the other end of link?
>

Just a simple SOHO router/switch/Wi-Fi AP. I use it as a gigabit switch
with dot1q support and as a Wi-Fi AP.
Its management capabilities and features are very limited. Here is what
I got from the switch info. It is rather obscure; for example, I have no
idea what Dot3StatsFCSErrors means. But at least I have had no problems
with this switch and another NIC, an NVIDIA LOM running the nfe driver.


root@tplink:~# swconfig dev rtl8366rb port 1 show
Port 1:
         link: port:1 link:up speed:1000baseT full-duplex tx-pause rx-pause
         mib: Port 1 MIB counters
IfInOctets                          : 7538704547344
EtherStatsOctets                    : 7538876056305
EtherStatsUnderSizePkts             : 0
EtherFragments                      : 3
EtherStatsPkts64Octets              : 662581957
EtherStatsPkts65to127Octets         : 2535622995
EtherStatsPkts128to255Octets        : 96290375
EtherStatsPkts256to511Octets        : 42103721
EtherStatsPkts512to1023Octets       : 83464044
EtherStatsPkts1024to1518Octets      : 384590702
EtherOversizeStats                  : 22227404
EtherStatsJabbers                   : 38641
IfInUcastPkts                       : 3757127256
EtherStatsMulticastPkts             : 69494916
EtherStatsBroadcastPkts             : 109605
EtherStatsDropEvents                : 69198572
Dot3StatsFCSErrors                  : 149421
Dot3StatsSymbolErrors               : 90698
Dot3InPauseFrames                   : 68996650
Dot3ControlInUnknownOpcodes         : 0
IfOutOctets                         : 8216482296676
Dot3StatsSingleCollisionFrames      : 0
Dot3StatMultipleCollisionFrames     : 0
Dot3sDeferredTransmissions          : 1474621219
Dot3StatsLateCollisions             : 0
EtherStatsCollisions                : 0
Dot3StatsExcessiveCollisions        : 0
Dot3OutPauseFrames                  : 1512177530
Dot1dBasePortDelayExceededDiscards  : 0
Dot1dTpPortInDiscards               : 9143
IfOutUcastPkts                      : 3920068852
IfOutMulticastPkts                  : 9715117
IfOutBroadcastPkts                  : 1006622
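
As far as I can tell from the EtherLike-MIB (RFC 3635), dot3StatsFCSErrors
counts received frames that have a whole number of octets but fail the
frame check sequence (CRC); I am assuming the switch uses the name in the
same sense. To put such counters in proportion, here is a minimal sketch
that pulls a named counter out of the swconfig text and expresses it per
million frames. The helper names mib_counter and ppm are made up for this
example, and the input is assumed to have the "Name : value" layout shown
above.

```sh
#!/bin/sh
# Print the value of one named counter from "swconfig ... show" text on stdin.
# Lines are expected to look like "Dot3StatsFCSErrors      : 149421".
mib_counter() {
    awk -F: -v name="$1" '
        { key = $1; gsub(/[ \t]/, "", key) }
        key == name { val = $2; gsub(/[ \t]/, "", val); print val; exit }
    '
}

# Integer parts-per-million of PART relative to TOTAL (0 if TOTAL is 0).
ppm() {
    if [ "$2" -eq 0 ]; then echo 0; else echo $(( $1 * 1000000 / $2 )); fi
}

# Two counters copied from the dump above.
dump='Dot3StatsFCSErrors                  : 149421
IfInUcastPkts                       : 3757127256'
fcs=$(printf '%s\n' "$dump" | mib_counter Dot3StatsFCSErrors)
rx=$(printf '%s\n' "$dump" | mib_counter IfInUcastPkts)
echo "FCS errors per million unicast frames received: $(ppm "$fcs" "$rx")"
# prints: FCS errors per million unicast frames received: 39
```

These are lifetime counters for the port, so they say nothing about when
the errors happened; comparing two dumps taken some minutes apart would be
more telling.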


> The input errors netstat reports correspond to the missed_packets 
> reported by the driver.
> Also the driver has been OACTIVE when the problem occurred.
> There are also a strange number of xon/xoff frames transmitted and 
> received by the driver.
>
> You may try these settings and see if they help:
>
> - hw.em.fc_setting=0 (in /boot/loader.conf)
> - hw.em.rxd="4096" (in /boot/loader.conf)
> - hw.em.txd="4096" (in /boot/loader.conf)
> - Fix speed and duplex at both link sides. After doing that, confirm
>   on the FreeBSD box (with ifconfig) and the other device (with
>   whatever command it provides) that the same speed and duplex is
>   used by both devices.
>
Thanks for taking a look at this and for the tips. I'll make the changes
and see if they help.
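
Regarding the OACTIVE observation: the head/tail registers in the debug
output quoted above can be read with simple ring arithmetic. In the usual
Intel descriptor-ring layout the NIC owns the descriptors from head up to
(but not including) tail. The sketch below assumes a 1024-entry ring; the
ring size does not appear in the output, so that number is an assumption,
although it is consistent with the reported "TX descriptors avail = 31".
The function names are invented for the example.

```sh
#!/bin/sh
# Descriptors handed to the NIC and not yet processed: the distance from
# head to tail, wrapping at the ring size.
ring_pending() {   # usage: ring_pending HEAD TAIL RING_SIZE
    echo $(( ($2 - $1 + $3) % $3 ))
}

# Descriptors still free for the driver to fill.
ring_free() {      # usage: ring_free HEAD TAIL RING_SIZE
    echo $(( $3 - $(ring_pending "$1" "$2" "$3") ))
}

# Values from the log: tdh = 975, tdt = 944, rdh = 960, rdt = 959.
# The ring size of 1024 is an assumption, not taken from the log.
echo "TX pending: $(ring_pending 975 944 1024)"   # prints: TX pending: 993
echo "TX free:    $(ring_free 975 944 1024)"      # prints: TX free:    31
echo "RX buffers: $(ring_pending 960 959 1024)"   # prints: RX buffers: 1023
```

Under that assumption the TX ring is almost completely full of frames
waiting for the hardware, which fits the OACTIVE flag, while the RX ring
is fully stocked with empty buffers.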

> You also have high values for dev.em.0.rx/tx_[abs]_int_delay. If you
> have set them manually, remove them or replace them with these in
> loader.conf:
>
> hw.em.rx_int_delay=0
> hw.em.tx_int_delay=66
> hw.em.tx_abs_int_delay=66
> hw.em.rx_abs_int_delay=66
>
>
Yes, indeed, that was my blind tuning: I was changing things here and
there because of those lockups. I will revert these to the defaults.
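
To confirm after a reboot that the delays really are back at the suggested
values, something like the following could be run against the sysctl
output. It is only a sketch: check_em_delays is a made-up helper name, the
device is assumed to be em0, and the expected numbers are simply the four
values quoted above (which, as far as I know, are also what the driver
reports by default).

```sh
#!/bin/sh
# Read "sysctl dev.em.0" style output ("name: value") on stdin and print
# every interrupt-delay knob whose value differs from the suggested one.
check_em_delays() {
    awk -F': ' '
        BEGIN {
            want["dev.em.0.rx_int_delay"]     = 0
            want["dev.em.0.tx_int_delay"]     = 66
            want["dev.em.0.rx_abs_int_delay"] = 66
            want["dev.em.0.tx_abs_int_delay"] = 66
        }
        ($1 in want) && ($2 + 0 != want[$1]) {
            printf "%s is %s, suggested %d\n", $1, $2, want[$1]
        }
    '
}

# Example with two lines in the same format as the dump quoted earlier:
printf '%s\n' 'dev.em.0.rx_int_delay: 200' 'dev.em.0.tx_abs_int_delay: 66' |
    check_em_delays
# prints: dev.em.0.rx_int_delay is 200, suggested 0
```

On the FreeBSD box itself the input would come from
"sysctl dev.em.0 | check_em_delays"; no output means all four knobs match.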





