Date:      Tue, 18 Feb 2014 05:07:43 +0100
From:      Robert Sevat <robert.sevat@live.nl>
To:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Cc:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   re driver crashing under load, can reproduce it.
Message-ID:  <DUB114-W897C47A4F6106694AF38AD87980@phx.gbl>

Hey,

I've got a small server on which the network driver crashes completely the instant I put any network load on it. The only way to fix it is by rebooting the machine; until then the interface is completely unresponsive to ifconfig up or down.

I've seen a bunch of errors already:

re0: watchdog timeout
re0: link state changed to DOWN
re0: link state changed to UP

It starts with that, with the up/down changes repeating a few hundred times, before the driver completely crashes and locks up.
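
To put a number on "a few hundred times", the messages could be counted straight from the log. This is only a minimal sketch: it assumes the kernel messages look exactly like the three lines quoted above, and the function name summarize is made up for illustration.

```python
import re
from collections import Counter

# Patterns for the two kinds of kernel messages quoted above.
LINK_RE = re.compile(r"(\w+): link state changed to (UP|DOWN)")
WATCHDOG_RE = re.compile(r"(\w+): watchdog timeout")

def summarize(lines):
    """Count watchdog timeouts and link transitions per interface.

    Hypothetical helper; returns a Counter keyed by (interface, event).
    """
    counts = Counter()
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            counts[(m.group(1), "link " + m.group(2))] += 1
            continue
        m = WATCHDOG_RE.search(line)
        if m:
            counts[(m.group(1), "watchdog timeout")] += 1
    return counts

if __name__ == "__main__":
    # The three lines from this message; a real run would read the log file.
    sample = [
        "re0: watchdog timeout",
        "re0: link state changed to DOWN",
        "re0: link state changed to UP",
    ]
    for (ifname, event), n in sorted(summarize(sample).items()):
        print(ifname, event, n)
```

Feeding it the lines of /var/log/messages (or the output of dmesg) instead of the sample would give per-interface totals that are easier to compare between test runs.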

Feb 18 00:49:33 transmission-video transmission-daemon[1791]: UDP Failed to set receive buffer: No buffer space available (tr-udp.c:59)
Feb 18 00:49:33 transmission-video transmission-daemon[1791]: UDP Failed to set receive buffer: requested 4194304, got 42080 (tr-udp.c:78)

I've also had this, so I've already set the buffer to 4194304 with: sysctl net.inet.udp.recvspace=4194304.

After I did this, transmission stopped complaining for a bit. An hour later the re driver crashed again. This time, after the reboot, the driver refused to work at all. I had to remove that setting from sysctl.conf and set it back to 42080 before the driver would work again.
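
The "requested 4194304, got 42080" message is consistent with an application asking for a larger SO_RCVBUF and then reading back what the kernel actually granted. A minimal sketch of that request/read-back pattern follows; the function name probe_udp_rcvbuf is made up, and this is not Transmission's code. On FreeBSD the per-socket ceiling is governed by kern.ipc.maxsockbuf; whether raising it would help on this machine is an assumption that has not been tested here.

```python
import socket

def probe_udp_rcvbuf(requested):
    """Ask for a UDP receive buffer of `requested` bytes.

    Returns (error, granted): error is None if setsockopt succeeded,
    otherwise the OSError it raised (e.g. ENOBUFS); granted is the
    size the kernel reports afterwards.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        error = None
        try:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        except OSError as exc:
            error = exc
        granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        return error, granted

if __name__ == "__main__":
    err, got = probe_udp_rcvbuf(4194304)
    print("requested 4194304, got", got, "error:", err)
```

Running it before and after a sysctl change shows whether the change actually affects what a socket can get, without having to wait for the daemon to log it.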

netstat -sl re0: http://pastebin.com/NmDWJJ6k

This does show that a lot of udp packets are dropped due to full buffers:
"4880 dropped due to no socket
        2708 broadcast/multicast datagrams undelivered
        139828 dropped due to full socket buffers"
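
Since these counters are cumulative, what matters is how fast they grow under load. A rough sketch of comparing two snapshots of the netstat -s text, assuming every counter line has the "number description" shape quoted above; parse_counters and delta are hypothetical helper names.

```python
import re

# A counter line is optional indentation, a number, then a description.
COUNTER_RE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")

def parse_counters(text):
    """Map each 'N description' line to {description: N}."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER_RE.match(line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters

def delta(before, after):
    """Counters that grew between two snapshots, with their growth."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value > before.get(name, 0)
    }

if __name__ == "__main__":
    # The three lines quoted in this message.
    sample = """\
4880 dropped due to no socket
        2708 broadcast/multicast datagrams undelivered
        139828 dropped due to full socket buffers
"""
    for name, value in parse_counters(sample).items():
        print(value, name)
```

Taking one snapshot before starting the rsync upload and one just before the lock-up, then passing both through delta, would show which drop counters move during the failure.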

I have also gotten:
"Feb 15 02:39:00 incognitus kernel: sonewconn: pcb 0xfffff80028d28620: Listen queue overflow: 193 already in queue awaiting acceptance
Feb 15 02:39:03 incognitus last message repeated 207 times"

After googling a bit I have tried multiple things:

I disabled ACPI in the BIOS and enabled ErP to ensure nothing weird happens with power states. I've also disabled powerd in rc.conf.

Because I also got these messages in dmesg: "ip6addrctl: socket(UDP): No buffer space available" I've disabled IPv6 on the machine.

ip6addrctl_enable="NO"
ip6addrctl_policy="ipv4_prefer"
ipv6_network_interfaces="none"
ipv6_active_all_interfaces="NO"

I have also disabled MSI-X and MSI in /boot/loader.conf because this was suggested by others.

hw.re.msi_disable="1"
hw.re.msix_disable="1"

I have also disabled hardware checksum offloading with ifconfig:

ifconfig re0 -txcsum
ifconfig re0 -rxcsum

I've tried forcing the NIC to use full-duplex 1000BaseTX, since some people suggested it was due to auto-negotiation failure. When I did this the entire driver locked up completely and refused to work until I rebooted the machine.

ifconfig re0 media 1000BaseTX mediaopt full-duplex

This is on a machine that runs:

root@incognitus:/ # uname -a
FreeBSD incognitus.indylix.nl 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r261411: Sun Feb  2 21:51:04 CET 2014     robert@Incognitus:/usr/obj/usr/src/sys/Pf  amd64

I've only added PF support to the kernel. It happens with PF enabled or disabled, makes no difference.

I ran pfSense 2.1 on this machine for about 3-4 months without any of these problems, also while putting significant load on it (120 Mbit internet). But now that it runs FreeBSD 10.0 it is highly unstable as soon as I push any traffic. I can manually trigger the crash by starting an rsync upload to another server; this upload does roughly 80 Mbit of traffic and crashes the driver within a few gigabytes. Adding a few torrents to Transmission that push a fair bit of network traffic has the same effect. Only the re driver crashes: the machine itself stays up and responsive, only the network stops working. Crashes can be triggered within 10 minutes.

root@incognitus:/ #  pciconf -lcv
re0@pci0:1:0:0: class=0x020000 card=0xe0001458 chip=0x816810ec rev=0x06 hdr=0x00
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168B PCI Express Gigabit Ethernet controller'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3  supports D0 D1 D2 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit
    cap 10[70] = PCI-Express 2 endpoint IRQ 1 max data 128(128) link x1(x1)
                 speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 11[b0] = MSI-X supports 4 messages
                 Table in map 0x20[0x0], PBA in map 0x20[0x800]
    cap 03[d0] = VPD
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 0002[140] = VC 1 max VC0
    ecap 0003[160] = Serial 1 01000000684ce000

This is on a Gigabyte GA-C847N with the Realtek RTL8111F network card.

Is there anything I could try? Commands to run? Or extra info you'd like to have? I'm pretty much out of ideas.

(Except of course buying a different Intel NIC, which I will resort to if I can't get this resolved, since the machine is unworkable now. I would rather help debug a problem in the driver.)

Kind regards,

Robert Sevat
