Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 7 Sep 2017 13:51:11 +0800
From:      Sepherosa Ziehau <sephe@freebsd.org>
To:        Paul Koch <paul.koch@akips.com>
Cc:        Julian Elischer <julian@freebsd.org>, freebsd-stable@freebsd.org,  Hongjiang Zhang <honzhan@microsoft.com>
Subject:   Re: 11.1 running on HyperV hn interface hangs
Message-ID:  <CAMOc5cx-dES8ABM8oNNKfT50ZUebVtpkKJn9Vtc_nQrRQ2vHgQ@mail.gmail.com>
In-Reply-To: <20170907150711.025e4e41@akips.com>
References:  <20170906193309.796c79ed@akips.com> <3f96c7d0-4fbd-26cb-5c84-8868d12eb427@ingresso.co.uk> <14997bba-aac2-947b-9b78-04af41ea29b0@freebsd.org> <CAMOc5cxQp1P3WnDf=Hgo8X0LmQfjCTowO=ya8VsWhu4cdTeajQ@mail.gmail.com> <20170907150711.025e4e41@akips.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Weird, your traffic pattern does not even belong to anything heavy.
Sending is mainly UDP, which will never be able to saturate the TX
buffer ring causing the RXBUF ACK sending failure.  This is weird.
Anyhow, make sure to test this patch:
8762017-Sep-07 02:19 hn_inc_txbr.diff

On Thu, Sep 7, 2017 at 1:07 PM, Paul Koch <paul.koch@akips.com> wrote:
> On Thu, 7 Sep 2017 10:22:40 +0800
> Sepherosa Ziehau <sephe@freebsd.org> wrote:
>
>> Is it possible to tell me your workload?  e.g. TX heavy or RX heavy.
>> Enabled TSO or not.  Details like how the send syscalls are issue will
>> be interesting.  And your Windows version, include the patch level,
>> etc.
>>
>> Please try the following patch:
>> https://people.freebsd.org/~sephe/hn_dec_txdesc.diff
>>
>> Thanks,
>> sephe
>
> Hi Sephe,
>
> Here's a bit of an explanation of the environment...
>
> AKIPS Network Monitor workload:
> - 22000 devices (routers/switches/APs/etc)
> - 123000 interfaces (60 snmp polling)
> - 131 netflow exporters
> - ~1500 pings per second
> - ~1000 snmp requests/responses per second (~1.9 million MIB object/min)
> - ~250 netflow packets/sec (~4500 flows/sec incoming)
> - ~130 syslog messages/sec (incoming)
> - ~200 snmp traps/sec (incoming)
>
> The ping/snmp poller is a single monolithic process (no threads).
> Separate processes for each of the syslog/trap/netflow collection.
>
> SNMP requests are sent using the sendto() system call over a non-blocking UDP
> socket for both IPv4 and v6.  We set the UDP socket receive buffer size to
> 4 Mbytes.  Nothing really complex with it.
>
> Pings are interlaced with snmp requests so we limit the bursty nature of
> small back-to-back packets (eliminates issues with switch interfaces dropping
> bursts of packets).  Ping requests are sent using a raw icmp socket.  We
> don't read the responses from the icmp socket, instead we put the interface
> into promiscuous mode and use the BPF info to measure the tx/rx RTT values.
>
> Syslog daemon just listens on a UDP socket with a 4 Mbyte receive buffer.
> Same with the snmp trap daemon.
>
>
> Here's some links to performance graphs of the VM:
>  https://www.akips.com/downloads/hyperv-fbsd11.1p1/system-graphs-last2h.pdf
>  https://www.akips.com/downloads/hyperv-fbsd11.1p1/system-graphs-last24h.pdf
>  https://www.akips.com/downloads/hyperv-fbsd11.1p1/system-graphs-last7d.pdf
>
> The OS was upgraded to 11.1p1 at 5pm on the 5th Sep.  The hn0 interface hung
> at 7:36pm.  The interface hung three times before we reverted to 11.0p9.  It
> takes a few hours after rebooting the VM before the interface hangs.
>
>
> Microsoft Host is running Windows 2012 R2.  Waiting for patch level info from
> the customer.
>
> I'll have to get the customer to spin up a new VM before trying your patch.
>
>
> Here's some info (after a reboot of the VM)
>
> Guest VM dmesg:
>
> FreeBSD 11.1-RELEASE-p1 #0 r322350: Thu Aug 10 22:16:21 UTC 2017
>     root@shed31.akips.com:/usr/obj/usr/src/sys/GENERIC amd64
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM
> 4.0.0)
> VT(vga): text 80x25
> Hyper-V Version: 6.3.9600 [SP18]
>   Features=0xe7f<VPRUNTIME,TMREFCNT,SYNIC,SYNTM,APIC,HYPERCALL,VPINDEX,REFTSC,IDLE,TMFREQ>
>   PM Features=0x0 [C2]
>   Features3=0x7b2<DEBUG,XMMHC,IDLE,NUMA,TMFREQ,SYNCMC,CRASH>
> Timecounter "Hyper-V" frequency 10000000 Hz quality 2000
> CPU: Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz (2300.00-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
>   Features=0x1f83fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,SS,HTT>
>   Features2=0x80002001<SSE3,CX16,HV>
>   AMD Features=0x20100800<SYSCALL,NX,LM>
>   AMD Features2=0x1<LAHF>
> Hypervisor: Origin = "Microsoft Hv"
> real memory  = 34359738368 (32768 MB)
> avail memory = 33325903872 (31782 MB)
> Event timer "LAPIC" quality 100
> ACPI APIC Table: <VRTUAL MICROSFT>
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s)
> random: unblocking device.
> ioapic0: Changing APIC ID to 0
> ioapic0 <Version 1.1> irqs 0-23 on motherboard
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #2 Launched!
> Timecounter "Hyper-V-TSC" frequency 10000000 Hz quality 3000
> random: entropy device external interface
> kbd1 at kbdmux0
> netmap: loaded module
> module_register_init: MOD_LOAD (vesa, 0xffffffff80f5b220, 0) error 19
> nexus0
> vtvga0: <VT VGA driver> on motherboard
> cryptosoft0: <software crypto> on motherboard
> acpi0: <VRTUAL MICROSFT> on motherboard
> acpi0: Power Button (fixed)
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> Event timer "RTC" frequency 32768 Hz quality 0
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> vmbus0: <Hyper-V Vmbus> on pcib0
> pci0: <ACPI PCI bus> on pcib0
> isab0: <PCI-ISA bridge> at device 7.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel PIIX4 UDMA33 controller> port
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0
> ata0: <ATA channel> at channel 0 on atapci0
> ata1: <ATA channel> at channel 1 on atapci0
> pci0: <bridge> at device 7.3 (no driver attached)
> vgapci0: <VGA-compatible display> mem 0xf8000000-0xfbffffff irq 11 at device
> 8.0 on pci0
> vgapci0: Boot video device
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: model IntelliMouse Explorer, device ID 4
> uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
> fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on
> acpi0
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> vmbus_res0: <Hyper-V Vmbus Resource> irq 5,7 on acpi0
> orm0: <ISA Option ROM> at iomem 0xc0000-0xcbfff on isa0
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> ppc0: cannot reserve I/O port range
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)
> Timecounters tick every 1.000 msec
> usb_needs_explore_all: no devclass
> nvme cam probe device init
> vmbus0: version 3.0
> hvet0: <Hyper-V event timer> on vmbus0
> Event timer "Hyper-V" frequency 10000000 Hz quality 1000
> storvsc0: <Hyper-V IDE> on vmbus0
> hvkbd0: <Hyper-V KBD> on vmbus0
> hvheartbeat0: <Hyper-V Heartbeat> on vmbus0
> hvkvp0: <Hyper-V KVP> on vmbus0
> hvshutdown0: <Hyper-V Shutdown> on vmbus0
> hvvss0: <Hyper-V VSS> on vmbus0
> storvsc1: <Hyper-V SCSI> on vmbus0
> hn0: <Hyper-V Network Interface> on vmbus0
> da0 at blkvsc0 bus 0 scbus2 target 1 lun 0
> da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
> da0: 300.000MB/s transfers
> da0: Command Queueing enabled
> da0: 307200MB (629145600 512 byte sectors)
> cd0 at ata1 bus 0 scbus1 target 0 lun 0
> cd0: <Msft Virtual CD/ROM 1.0> Removable CD-ROM SPC-3 SCSI device
> cd0: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
> cd0: Attempt to query device size failed: NOT READY, Medium not present
> hn0: Ethernet address: 00:15:5d:31:21:0f
> hn0: link state changed to UP
> Trying to mount root from ufs:/dev/gpt/akips-root1 [rw]...
> Setting hostuuid: 2b0f0733-401e-7646-9724-a89072a2f005.
> Setting hostid: 0xd87511b7.
> Starting file system checks:
> /dev/gpt/akips-root1: FILE SYSTEM CLEAN; SKIPPING CHECKS
> /dev/gpt/akips-root1: clean, 961795 free (227 frags, 120196 blocks, 0.0%
> fragmentation)
> /dev/gpt/akips-root0: FILE SYSTEM CLEAN; SKIPPING CHECKS
> /dev/gpt/akips-root0: clean, 977258 free (146 frags, 122139 blocks, 0.0%
> fragmentation)
> Mounting local filesystems:.
> ELF ldconfig
> path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/perl5/5.24/mach/CORE
> 32-bit compatibility ldconfig path: /usr/lib32
> Setting hostname: xxxxxx
> Setting up harvesting:
> [UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
> Feeding entropy: .
> Limiting icmp unreach response from 358 to 200 packets/sec
> Limiting icmp unreach response from 464 to 200 packets/sec
> Starting Network: lo0 hn0.
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
>         inet 127.0.0.1 netmask 0xff000000
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         groups: lo
> hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=51b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO>
>         ether 00:15:5d:31:21:0f
>         hwaddr 00:15:5d:31:21:0f
>         inet 172.31.5.13 netmask 0xffffff00 broadcast 172.31.5.255
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (10Gbase-T <full-duplex>)
>         status: active
> Starting devd.
> add host 127.0.0.1: gateway lo0 fib 0: route already in table
> add net default: gateway 172.31.5.254
> add host ::1: gateway lo0 fib 0: route already in table
> add net fe80::: gateway ::1
> add net ff02::: gateway ::1
> add net ::ffff:0.0.0.0: gateway ::1
> add net ::0.0.0.0: gateway ::1
> Creating and/or trimming log files.
> Starting syslogd.
> Setting date via ntp.
> Limiting icmp unreach response from 398 to 200 packets/sec
> Limiting icmp unreach response from 350 to 200 packets/sec
> Limiting icmp unreach response from 333 to 200 packets/sec
> Limiting icmp unreach response from 501 to 200 packets/sec
> Limiting icmp unreach response from 336 to 200 packets/sec
> Limiting icmp unreach response from 351 to 200 packets/sec
> No core dumps found.
> Clearing /tmp (X related).
> Updating motd:.
> Mounting late filesystems:.
> Starting ntpd.
> Limiting icmp unreach response from 424 to 200 packets/sec
> postfix/postfix-script: starting the Postfix mail system
> vfs.zfs.zfetch.max_distance: 8388608 -> 67108864
> vfs.usermount: 0 -> 1
> net.inet.tcp.sendbuf_inc: 8192 -> 16384
> net.inet.tcp.recvbuf_inc: 16384 -> 524288
> net.inet.tcp.keepidle: 7200000 -> 60000
> net.inet.tcp.keepintvl: 75000 -> 30000
> net.inet.tcp.keepinit: 75000 -> 30000
> net.inet.udp.blackhole: 0 -> 1
> net.inet.tcp.blackhole: 0 -> 1
> net.inet.sctp.blackhole: 0 -> 1
> net.link.ether.inet.log_arp_movements: 1 -> 0
> kern.init_shutdown_timeout: 120 -> 300
> root: Clearing temporary files
> root: Clearing core files
> root: Clearing state machines
> root: Prefetching database files
> root: Starting time daemon
> 2017-09-06 09:03:46 nm-timed common.c:6032 DEBUG: Create fmap tmap 1032 bytes
> root: Starting joat daemon
> root: Starting watchdog
> 2017-09-06 09:03:46 nm-joatd common.c:5802 INFO: Start
> Sep  6 09:03:46 bull nm-joatd: Start
> Configuring vt: blanktime.
> Performing sanity check on sshd configuration.
> Starting sshd.
> Starting cron.
> --------------------------------------------------------------------
>
>
> ifconfig:
>
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
>         inet 127.0.0.1 netmask 0xff000000
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         groups: lo
> hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=51b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO>
>         ether 00:15:5d:31:21:0f
>         hwaddr 00:15:5d:31:21:0f
>         inet 172.31.5.13 netmask 0xffffff00 broadcast 172.31.5.255
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (10Gbase-T <full-duplex>)
>         status: active
> ------------------------------------------------------------------------
>
>
> Some sysctl info for the hn device
>
> hw.hn.tx_agg_pkts: -1
> hw.hn.tx_agg_size: -1
> hw.hn.lro_mbufq_depth: 512
> hw.hn.tx_swq_depth: 0
> hw.hn.tx_ring_cnt: 0
> hw.hn.chan_cnt: 0
> hw.hn.use_if_start: 0
> hw.hn.use_txdesc_bufring: 1
> hw.hn.tx_taskq_mode: 0
> hw.hn.tx_taskq_cnt: 1
> hw.hn.lro_entry_count: 128
> hw.hn.direct_tx_size: 128
> hw.hn.tx_chimney_size: 0
> hw.hn.tso_maxlen: 65535
> hw.hn.trust_hostip: 1
> hw.hn.trust_hostudp: 1
> hw.hn.trust_hosttcp: 1
> dev.hn.0.vf:
> dev.hn.0.polling: 0
> dev.hn.0.agg_pkts: -1
> dev.hn.0.agg_size: -1
> dev.hn.0.rndis_agg_align: 8
> dev.hn.0.rndis_agg_pkts: 8
> dev.hn.0.rndis_agg_size: 4026531839
> dev.hn.0.rss_ind_size: 128
> dev.hn.0.rss_hash: 1701<TOEPLITZ,IP4,TCP4,IP6,TCP6>
> dev.hn.0.rxfilter: d<DIRECT,ALLMULTI,BROADCAST>
> dev.hn.0.hwassist: 17<CSUM_IP,CSUM_IP_UDP,CSUM_IP_TCP,CSUM_IP_TSO>
> dev.hn.0.caps:
> 3ff<VLAN,MTU,IPCS,TCP4CS,TCP6CS,UDP4CS,UDP6CS,TSO4,TSO6,HASHVAL>
> dev.hn.0.ndis_version: 6.30 dev.hn.0.nvs_version: 327680
> dev.hn.0.channel.13.sub.3.br.tx.state: rindex:122136 windex:122136 imask:0
> ravail:0 wavail:520192 dev.hn.0.channel.13.sub.3.br.rx.state: rindex:157368
> windex:157368 imask:0 ravail:0 wavail:520192 dev.hn.0.channel.13.sub.3.mnf: 1
> dev.hn.0.channel.13.sub.3.cpu: 3
> dev.hn.0.channel.13.sub.3.chanid: 16
> dev.hn.0.channel.13.sub.2.br.tx.state: rindex:116032 windex:116032 imask:0
> ravail:0 wavail:520192 dev.hn.0.channel.13.sub.2.br.rx.state: rindex:150096
> windex:150096 imask:0 ravail:0 wavail:520192 dev.hn.0.channel.13.sub.2.mnf: 1
> dev.hn.0.channel.13.sub.2.cpu: 2
> dev.hn.0.channel.13.sub.2.chanid: 15
> dev.hn.0.channel.13.sub.1.br.tx.state: rindex:236264 windex:236264 imask:0
> ravail:0 wavail:520192 dev.hn.0.channel.13.sub.1.br.rx.state: rindex:304448
> windex:304448 imask:0 ravail:0 wavail:520192 dev.hn.0.channel.13.sub.1.mnf: 1
> dev.hn.0.channel.13.sub.1.cpu: 1
> dev.hn.0.channel.13.sub.1.chanid: 14
> dev.hn.0.channel.13.br.tx.state: rindex:147520 windex:147520 imask:0 ravail:0
> wavail:520192 dev.hn.0.channel.13.br.rx.state: rindex:167776 windex:167776
> imask:0 ravail:0 wavail:520192 dev.hn.0.channel.13.mnf: 1
> dev.hn.0.channel.13.cpu: 0
> dev.hn.0.rx_ring_inuse: 4
> dev.hn.0.rx_ring_cnt: 4
> dev.hn.0.rx_ack_failed: 0
> dev.hn.0.small_pkts: 538
> dev.hn.0.csum_trusted: 0
> dev.hn.0.csum_udp: 8734
> dev.hn.0.csum_tcp: 18
> dev.hn.0.csum_ip: 9758
> dev.hn.0.trust_hostip: 1
> dev.hn.0.trust_hostudp: 1
> dev.hn.0.trust_hosttcp: 1
> dev.hn.0.lro_ackcnt_lim: 2
> dev.hn.0.lro_length_lim: 18000
> dev.hn.0.lro_tried: 18
> dev.hn.0.lro_flushed: 18
> dev.hn.0.lro_queued: 18
> dev.hn.0.rx.3.pktbuf_len: 16384
> dev.hn.0.rx.3.rss_pkts: 2223
> dev.hn.0.rx.3.packets: 2223
> dev.hn.0.rx.2.pktbuf_len: 16384
> dev.hn.0.rx.2.rss_pkts: 2186
> dev.hn.0.rx.2.packets: 2186
> dev.hn.0.rx.1.pktbuf_len: 16384
> dev.hn.0.rx.1.rss_pkts: 4323
> dev.hn.0.rx.1.packets: 4323
> dev.hn.0.rx.0.pktbuf_len: 16384
> dev.hn.0.rx.0.rss_pkts: 1026
> dev.hn.0.rx.0.packets: 1275
> dev.hn.0.agg_align: 8
> dev.hn.0.agg_pktmax: 8
> dev.hn.0.agg_szmax: 6144
> dev.hn.0.tx_ring_inuse: 4
> dev.hn.0.tx_ring_cnt: 4
> dev.hn.0.sched_tx: 1
> dev.hn.0.direct_tx_size: 128
> dev.hn.0.tx_chimney_size: 6144
> dev.hn.0.tx_chimney_max: 6144
> dev.hn.0.txdesc_cnt: 512
> dev.hn.0.tx_chimney_tried: 1345
> dev.hn.0.tx_chimney: 1345
> dev.hn.0.tx_collapsed: 0
> dev.hn.0.agg_flush_failed: 0
> dev.hn.0.txdma_failed: 0
> dev.hn.0.send_failed: 0
> dev.hn.0.no_txdescs: 0
> dev.hn.0.tx.3.sends: 0
> dev.hn.0.tx.3.packets: 0
> dev.hn.0.tx.3.oactive: 0
> dev.hn.0.tx.2.sends: 0
> dev.hn.0.tx.2.packets: 0
> dev.hn.0.tx.2.oactive: 0
> dev.hn.0.tx.1.sends: 19
> dev.hn.0.tx.1.packets: 19
> dev.hn.0.tx.1.oactive: 0
> dev.hn.0.tx.0.sends: 1326
> dev.hn.0.tx.0.packets: 1425
> dev.hn.0.tx.0.oactive: 0
> dev.hn.0.%parent: vmbus0
> dev.hn.0.%pnpinfo: classid=f8615163-df3e-46c5-913f-f2d2f965ed0e
> deviceid=503f1ad8-7dad-4d5f-9f16-6c4383e28d12 dev.hn.0.%location:
> dev.hn.0.%driver: hn
> dev.hn.0.%desc: Hyper-V Network Interface
> dev.hn.%parent:
> ------------------------------------------------------------
>



-- 
Tomorrow Will Never Die



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAMOc5cx-dES8ABM8oNNKfT50ZUebVtpkKJn9Vtc_nQrRQ2vHgQ>