Date:      Tue, 26 Nov 2019 07:55:11 +0200
From:      Artem Viklenko <artem@viklenko.net>
To:        bsd-lists@BSDforge.com, freebsd-net@freebsd.org
Subject:   Re: How to remove watchdog?
Message-ID:  <a326ce0a-2d9c-a357-13c2-9d027b933dc8@viklenko.net>
In-Reply-To: <a4babd3b49218255887ead10d3110b3f@udns.ultimatedns.net>
References:  <a4babd3b49218255887ead10d3110b3f@udns.ultimatedns.net>

Hi!

I have several small boxes with Realtek NICs acting as routers/firewalls,
and I had the same issues. The FreeBSD driver didn't work, at least for me,
so I switched to Realtek's driver. But after some time traffic stopped
passing through my routers. I did some investigation and found that the
issue is 9k mbufs. As far as I understand, the more traffic you push, the
more issues with 9k mbufs appear, due to memory fragmentation.
You can check it with 'vmstat -z | grep mbuf' (look at the FAIL column).

So I decided to do a very dirty hack: I changed Jumbo_Frame_9k
to Jumbo_Frame_4k in if_re.c from Realtek's latest 1.95 driver.
It compiles and works on FreeBSD 10.x and 11.x
(the vendor says the driver is for older versions of FreeBSD),
and there are no more issues.
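
The edit described above amounts to a whole-word rename of one identifier in
the driver source. Here is a minimal sketch of that rename in Python, assuming
nothing else in if_re.c needs to change. The function name and the sample line
are made up for illustration and are not taken from the Realtek driver.

```python
import re
from typing import Tuple

def swap_jumbo_constant(source: str) -> Tuple[str, int]:
    """Replace whole-word Jumbo_Frame_9k with Jumbo_Frame_4k.

    Returns the patched text and the number of replacements made.
    """
    # \b leaves longer identifiers that merely contain the name untouched.
    return re.subn(r"\bJumbo_Frame_9k\b", "Jumbo_Frame_4k", source)

if __name__ == "__main__":
    # Hypothetical line, only to show the effect; the real if_re.c differs.
    sample = "size = Jumbo_Frame_9k;"
    patched, count = swap_jumbo_constant(sample)
    print(count, patched)
```

To apply it to a real file, read if_re.c into a string, pass it through the
function and write the result back, keeping a copy of the original first.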

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP

mbuf_packet:            256, 2362080,       2,    1263, 2054916,   0,   0
mbuf:                   256, 2362080,     514,    1776,3460790080,   0,   0
mbuf_cluster:          2048, 369076,    1265,      31,  154081,   0,   0
mbuf_jumbo_page:       4096, 184537,     513,     294,1592339809,   0,   0
mbuf_jumbo_9k:         9216,  54677,       0,       0,       0,   0,   0
mbuf_jumbo_16k:       16384,  30756,       0,       0,       0,   0,   0

Now the driver uses mbuf_jumbo_page instead of mbuf_jumbo_9k, and there
are no failures.
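
The same check can be scripted. The sketch below parses text in the
'vmstat -z' layout shown in the table above and returns the FAIL counter for
each mbuf zone. It assumes the column order in that table (SIZE, LIMIT, USED,
FREE, REQ, FAIL, SLEEP); other releases may print a different set of columns.
The function name is made up for illustration.

```python
from typing import Dict

def mbuf_zone_failures(vmstat_output: str) -> Dict[str, int]:
    """Map each mbuf zone name to its FAIL counter."""
    failures: Dict[str, int] = {}
    for line in vmstat_output.splitlines():
        # Zone lines look like "name:  v1, v2, ..."; the header has no colon.
        name, sep, rest = line.partition(":")
        name = name.strip()
        if not sep or not name.startswith("mbuf"):
            continue
        fields = [f.strip() for f in rest.split(",")]
        # Assumed order after the name: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP
        if len(fields) < 7:
            continue
        failures[name] = int(fields[5])
    return failures

if __name__ == "__main__":
    # Two rows copied from the table above.
    sample = (
        "mbuf_jumbo_page:       4096, 184537,     513,     294,1592339809,   0,   0\n"
        "mbuf_jumbo_9k:         9216,  54677,       0,       0,       0,   0,   0\n"
    )
    for zone, fails in mbuf_zone_failures(sample).items():
        print(zone, "FAIL =", fails)
```

On a live system the input would be the output of 'vmstat -z'; a nonzero FAIL
value for mbuf_jumbo_9k would match the problem described here.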

I'm OK with MTU 1500 in my environment, and I don't know whether MTU 9000
will work with this change. But at least it is stable now, even after 100
days of uptime (just rebooted after upgrading to 11.3-RELEASE-p5).

Hope this helps.


26.11.19 02:44, Chris wrote:
> Or at least make it non fatal.
> OK here's the story; I'm experimenting with a multiport NIC (re(4))
> as we hope to start using multiport 10G NICs.
> Any of the re's we've used in the past have been very stable, which
> is why I picked the one I did for this experiment. This one has been
> performing rock solid for some 4 to 6 mos, under full time use. That
> is until the last week. Where we're seeing:
> watchdog timeout
> repeated frequently. Which is ultimately fatal. ifconfig up/down will
> not resuscitate it. Nor will service ifconfig restart, or plugging/
> unplugging the cable(s). Bouncing the server is the only cure. Which
> is unacceptable. Any, and All suggestions, or insight into the matter
> GREATLY appreciated. Note; while this is an old 11.1, we're not planning
> to up this box until we can confirm this can be cured. :)
> 
> Details follow:
> 11.1-STABLE r327867 amd64
> 
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> watchdog timeout
> 
> rc.conf(5)
> ifconfig_re0="inet AA.BBB.CC.XX netmask 255.255.255.0 rxcsum txcsum tso4"
> ifconfig_re1="inet AA.BBB.CC.WW netmask 255.255.255.0 rxcsum txcsum tso4"
> ifconfig_re1_alias0="inet AA.BBB.CC.ZZ netmask 255.255.255.0"
> 
> ifconfig(8)
> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>      options=8219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,LINKSTATE>
>      ether 00:13:3b:0f:13:44
>      hwaddr 00:13:3b:0f:13:44
>      inet6 fe80::213:3bff:fe0f:1344%re0 prefixlen 64 scopeid 0x1 
>      inet AA.BBB.CC.XX netmask 0xffffff00 broadcast 24.113.41.255 
>      nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
>      media: Ethernet autoselect (1000baseT <full-duplex>)
>      status: active
> re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>      options=8219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,LINKSTATE>
>      ether 00:13:3b:0f:13:45
>      hwaddr 00:13:3b:0f:13:45
>      inet AA.BBB.CC.WW netmask 0xffffff00 broadcast 24.113.41.255 
>      inet AA.BBB.CC.ZZ netmask 0xffffff00 broadcast 24.113.41.255 
>      inet6 fe80::213:3bff:fe0f:1345%re1 prefixlen 64 scopeid 0x2
>      nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
>      media: Ethernet autoselect (1000baseT <full-duplex>)
>      status: active
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>      options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>      inet6 ::1 prefixlen 128
>      inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>      inet 127.0.0.1 netmask 0xff000000
>      nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>      groups: lo
> pciconf(8)
> re0@pci0:5:0:0:    class=0x020000 card=0x012310ec chip=0x816810ec rev=0x07 hdr=0x00
>     vendor     = 'Realtek Semiconductor Co., Ltd.'
>     device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
>     class      = network
>     subclass   = ethernet
> re1@pci0:6:0:0:    class=0x020000 card=0x012310ec chip=0x816810ec rev=0x07 hdr=0x00
>     vendor     = 'Realtek Semiconductor Co., Ltd.'
>     device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
>     class      = network
>     subclass   = ethernet
> 
> Thanks again!
> 
> --Chris
> 
> 

-- 
Regards!


