Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 24 Sep 2013 11:16:57 +0300
From:      Oleg Tarasov <adpacifyer@gmail.com>
To:        freebsd-net@freebsd.org
Subject:   FreeBSD 9 em driver high inerrupt load + network crashes
Message-ID:  <CAA3OJXkeiUv1YLtcjbO2TX0_O1KK8cVRW5g3VYo5XuA2aPbQVQ@mail.gmail.com>

next in thread | raw e-mail | index | archive | help
Hello,

I am currently experiencing a problem on one of my servers. Interestingly
it is the only one with such problem however it is the most loaded one.

Sometimes on this server network becomes completely unresponsitive forcing
me to attach console and reboot the system. There are no records in the
logs that could possibly point me to any direction.

However I am seing unexpected CPU interrupt load on this server:
CPU:  0.7% user,  0.0% nice,  0.2% system, 18.1% interrupt, 80.9% idle

# vmstat -i
interrupt                          total       rate
irq16: em0 aac0+              8776771724      99009     <---    This is
definitely a problem
irq23: ehci1                      177381          2
cpu0:timer                       8442700         95
irq264: em1:rx 0                12525811        141
irq265: em1:tx 0                10743747        121
irq266: em1:link                    3893          0
cpu1:timer                       5513791         62
cpu2:timer                       5937620         66
cpu3:timer                      84101501        948
Total                         8904218168     100446

This server is using ipfw firewall with ipfw nat. The kernel is compiled
with IPSEC support which has 8 rules in SPD table.

The network card causing high load is em0 facing internal network. In ipfw
there are just plain allow rules on it. All nat conversions and ipsec
operations are done on another interface (em1) facing internet.

I see both em0 and aac0 sharing irq 16 so maybe it is a RAID controller
causing conflicts or whatever. Currently I can't find a solution or a
method of more in-depth diagnostics of the matter.

Please advice what can possibly cause the irq overload and how can I fight
it?

Kernel config:
=========================================
cpu             HAMMER
ident           SGP

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug
symbols

# IPFW

options         IPFIREWALL              #firewall
options         IPFIREWALL_VERBOSE      #enable logging to syslogd(8)
options         IPFIREWALL_VERBOSE_LIMIT=300    #limit verbosity
options         IPFIREWALL_DEFAULT_TO_ACCEPT    #allow everything by default
options         IPFIREWALL_FORWARD      #packet destination changes
options         IPFIREWALL_NAT          #ipfw kernel nat support
options         IPDIVERT                #divert sockets

# NAT
options         LIBALIAS                # libalias library, performing NAT

# IPSEC

options         IPSEC
device          crypto

# Below is a common GENERIC config
=========================================

dmesg output:
=========================================
FreeBSD 9.1-RELEASE-p7 #2: Mon Sep 23 10:02:56 EEST 2013
    ******@****************:/usr/obj/usr/src/sys/SGP amd64
CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (3392.37-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x206a7  Family = 6  Model = 2a  Stepping =
7

Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>

Features2=0x1f9ae3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
real memory  = 8589934592 (8192 MB)
avail memory = 8141856768 (7764 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <INTEL  DH67BL  >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
 cpu2 (AP): APIC ID:  4
 cpu3 (AP): APIC ID:  6
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
ctl: CAM Target Layer loaded
cryptosoft0: <software crypto> on motherboard
acpi0: <INTEL DH67BL> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Event timer "HPET3" frequency 14318180 Hz quality 440
Event timer "HPET4" frequency 14318180 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
aac0: <Adaptec RAID 2405> mem 0xf7800000-0xf79fffff irq 16 at device 0.0 on
pci1
aac0: Enabling 64-bit address support
aac0: Enable Raw I/O
aac0: Enable 64-bit array
aac0: New comm. interface enabled
aac0: Adaptec 2405, aac driver 2.1.9-1
aacp0 on aac0
aacp1 on aac0
aacp2 on aac0
vgapci0: <VGA-compatible display> port 0xf000-0xf03f mem
0xf7400000-0xf77fffff,0xe0000000-0xefffffff irq 16 at device 2.0 on pci0
agp0: <SandyBridge desktop GT2+ IG> on vgapci0
agp0: aperture size is 256M, detected 65532k stolen memory
pci0: <simple comms> at device 22.0 (no driver attached)
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xf7e02000-0xf7e023ff irq 16
at device 26.0 on pci0
usbus0: EHCI version 1.0
usbus0 on ehci0
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci2
pci3: <ACPI PCI bus> on pcib3
em0: <Intel(R) PRO/1000 Legacy Network Connection 1.0.4> port 0xe000-0xe03f
mem 0xf7d40000-0xf7d5ffff,0xf7d20000-0xf7d3ffff irq 16 at device 0.0 on pci3
em0: Ethernet address: 90:e2:ba:17:59:d5
pcib4: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
pci4: <ACPI PCI bus> on pcib4
em1: <Intel(R) PRO/1000 Network Connection 7.3.2> port 0xd000-0xd01f mem
0xf7cc0000-0xf7cdffff,0xf7c00000-0xf7c7ffff,0xf7ce0000-0xf7ce3fff irq 18 at
device 0.0 on pci4
em1: Using MSIX interrupts with 3 vectors
em1: Ethernet address: 68:05:ca:05:75:34
ehci1: <EHCI (generic) USB 2.0 controller> mem 0xf7e01000-0xf7e013ff irq 23
at device 29.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
orm0: <ISA Option ROMs> at iomem
0xc0000-0xcdfff,0xce000-0xd47ff,0xd4800-0xd57ff,0xd5800-0xd67ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
Timecounters tick every 1.000 msec
IPsec: Initialized Security Association Processing.
ipfw2 (+ipv6) initialized, divert enabled, nat enabled, rule-based
forwarding enabled, default to accept, logging disabled
aacd0 on aac0
aacd0: 476150MB (975155200 sectors)
usbus0: 480Mbps High Speed USB v2.0
usbus1: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
ugen0.2: <vendor 0x8087> at usbus0
uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on
usbus0
ugen1.2: <vendor 0x8087> at usbus1
uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on
usbus1
uhub2: 6 ports with 6 removable, self powered
uhub3: 8 ports with 8 removable, self powered
ugen0.3: <CHESEN> at usbus0
ukbd0: <CHESEN PS2 to USB Converter, class 0/0, rev 1.10/0.10, addr 3> on
usbus0
kbd2 at ukbd0
ums0: <CHESEN PS2 to USB Converter, class 0/0, rev 1.10/0.10, addr 3> on
usbus0
ums0: 5 buttons and [XYZ] coordinates ID=1
pass0 at aacp0 bus 0 scbus0 target 0 lun 0
pass0: < ST3500630AS 3.AA> Fixed Uninstalled SCSI-5 device (offline)
pass0: 3.300MB/s transfers
pass1 at aacp0 bus 0 scbus0 target 1 lun 0
pass1: < ST3500630AS 3.AA> Fixed Uninstalled SCSI-5 device (offline)
pass1: 3.300MB/s transfers
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
=========================================

# sysctl -a | grep dev.em

dev.em.0.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.4
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x107c subvendor=0x8086
subdevice=0x1376 class=0x020000
dev.em.0.%parent: pci3
dev.em.0.nvm: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 3
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.tx_desc_fail1: 0
dev.em.0.tx_desc_fail2: 35079
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1077674561
dev.em.0.rx_control: 32770
dev.em.0.fc_high_water: 47104
dev.em.0.fc_low_water: 45604
dev.em.0.fifo_workaround: 0
dev.em.0.fifo_reset: 0
dev.em.0.txd_head: 83
dev.em.0.txd_tail: 83
dev.em.0.rxd_head: 2
dev.em.0.rxd_tail: 1
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 15546581
dev.em.0.mac_stats.good_pkts_recvd: 15546581
dev.em.0.mac_stats.bcast_pkts_recvd: 74727
dev.em.0.mac_stats.mcast_pkts_recvd: 0
dev.em.0.mac_stats.rx_frames_64: 2240358
dev.em.0.mac_stats.rx_frames_65_127: 2026067
dev.em.0.mac_stats.rx_frames_128_255: 1667746
dev.em.0.mac_stats.rx_frames_256_511: 757367
dev.em.0.mac_stats.rx_frames_512_1023: 1081371
dev.em.0.mac_stats.rx_frames_1024_1522: 7773672
dev.em.0.mac_stats.good_octets_recvd: 11071669736
dev.em.0.mac_stats.good_octets_txd: 3680329427
dev.em.0.mac_stats.total_pkts_txd: 12769896
dev.em.0.mac_stats.good_pkts_txd: 12769896
dev.em.0.mac_stats.bcast_pkts_txd: 90
dev.em.0.mac_stats.mcast_pkts_txd: 2802292
dev.em.0.mac_stats.tx_frames_64: 5525466
dev.em.0.mac_stats.tx_frames_65_127: 2387324
dev.em.0.mac_stats.tx_frames_128_255: 2280501
dev.em.0.mac_stats.tx_frames_256_511: 337074
dev.em.0.mac_stats.tx_frames_512_1023: 204906
dev.em.0.mac_stats.tx_frames_1024_1522: 2034625
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.3.2
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=0
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086
subdevice=0xa01f class=0x020000
dev.em.1.%parent: pci4
dev.em.1.nvm: -1
dev.em.1.debug: -1
dev.em.1.fc: 3
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 100
dev.em.1.eee_control: 1
dev.em.1.link_irq: 3924
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 3
dev.em.1.rx_overruns: 0
dev.em.1.watchdog_timeouts: 0
dev.em.1.device_control: 1074790984
dev.em.1.rx_control: 67141634
dev.em.1.fc_high_water: 18432
dev.em.1.fc_low_water: 16932
dev.em.1.queue0.txd_head: 502
dev.em.1.queue0.txd_tail: 506
dev.em.1.queue0.tx_irq: 10924208
dev.em.1.queue0.no_desc_avail: 0
dev.em.1.queue0.rxd_head: 130
dev.em.1.queue0.rxd_tail: 129
dev.em.1.queue0.rx_irq: 12733678
dev.em.1.mac_stats.excess_coll: 142
dev.em.1.mac_stats.single_coll: 759301
dev.em.1.mac_stats.multiple_coll: 290042
dev.em.1.mac_stats.late_coll: 1049816
dev.em.1.mac_stats.collision_count: 1767409
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 1959182
dev.em.1.mac_stats.missed_packets: 4794
dev.em.1.mac_stats.recv_no_buff: 0
dev.em.1.mac_stats.recv_undersize: 0
dev.em.1.mac_stats.recv_fragmented: 0
dev.em.1.mac_stats.recv_oversize: 0
dev.em.1.mac_stats.recv_jabber: 0
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.xon_recvd: 0
dev.em.1.mac_stats.xon_txd: 0
dev.em.1.mac_stats.xoff_recvd: 0
dev.em.1.mac_stats.xoff_txd: 0
dev.em.1.mac_stats.total_pkts_recvd: 13461258
dev.em.1.mac_stats.good_pkts_recvd: 13456464
dev.em.1.mac_stats.bcast_pkts_recvd: 156733
dev.em.1.mac_stats.mcast_pkts_recvd: 0
dev.em.1.mac_stats.rx_frames_64: 1001246
dev.em.1.mac_stats.rx_frames_65_127: 5655384
dev.em.1.mac_stats.rx_frames_128_255: 3983776
dev.em.1.mac_stats.rx_frames_256_511: 539762
dev.em.1.mac_stats.rx_frames_512_1023: 260024
dev.em.1.mac_stats.rx_frames_1024_1522: 2016272
dev.em.1.mac_stats.good_octets_recvd: 4428966188
dev.em.1.mac_stats.good_octets_txd: 12327752567
dev.em.1.mac_stats.total_pkts_txd: 16190870
dev.em.1.mac_stats.good_pkts_txd: 16190870
dev.em.1.mac_stats.bcast_pkts_txd: 5
dev.em.1.mac_stats.mcast_pkts_txd: 0
dev.em.1.mac_stats.tx_frames_64: 547777
dev.em.1.mac_stats.tx_frames_65_127: 2630586
dev.em.1.mac_stats.tx_frames_128_255: 2404829
dev.em.1.mac_stats.tx_frames_256_511: 1283850
dev.em.1.mac_stats.tx_frames_512_1023: 1104051
dev.em.1.mac_stats.tx_frames_1024_1522: 8219777
dev.em.1.mac_stats.tso_txd: 234428
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 1990
dev.em.1.interrupts.rx_pkt_timer: 0
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 0
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 0



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAA3OJXkeiUv1YLtcjbO2TX0_O1KK8cVRW5g3VYo5XuA2aPbQVQ>