Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 03 Feb 2011 16:53:32 -0500
From:      dieterbsd@engineer.com
To:        bug-followup@FreeBSD.org
Cc:        freebsd-net@freebsd.org, freebsd-firewire@freebsd.org, freebsd-drivers@freebsd.org, freebsd-usb@freebsd.org
Subject:   Re: kern/118093: firewire bus reset hogs CPU, causing data to be lost
Message-ID:  <8CD9203EF6F6EE0-1760-B81@web-mmc-d01.sysops.aol.com>

next in thread | raw e-mail | index | archive | help
New data: USB causes the same problem that firewire does.

FreeBSD 8.0
amd64
nfe0: <NVIDIA nForce4 CK804 MCP9 Networking Adapter> port 0xb400-0xb407=20
mem 0xfebf9000-0xfebf9fff irq 23 at device 10.0 on pci0
bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev.=20
0x4101> mem 0xfe4f0000-0xfe4fffff irq 19 at device 0.0 on pci5
fwohci1: <NEC uPD72871/2> mem 0xfdeff000-0xfdefffff irq 19 at device=20
8.0 on pci2
ehci0: <NVIDIA nForce4 USB 2.0 controller> mem 0xfebfe000-0xfebfe0ff=20
irq 22 at device 2.1 on pci0

Plugging in a USB_to_RS-232 bridge generates:

ugen0.2: <Prolific Technology Inc.> at usbus0
uplcom0: <Prolific Technology Inc. USB-Serial Controller D, class 0/0,=20
rev 1.10/
4.00, addr 2> on usbus0

And networking is locked out for too long, and data is lost.  (The node=20
sending data
via Ethernet has too small a transmit buffer, but is a closed source=20
closed hardware
black box and cannot be fixed.  Once data is lost it is lost forever,=20
it cannot be
recreated, so this is very bad.)  The FreeBSD box receiving data has a=20
very large
receive buffer, but if another device driver locks out networking it=20
doesn't help.

Changing the printf(9)s to log(9)s fixes the problem, but this is not a=20
good workaround,
there are LOTS of printfs in the kernel, I keep hitting new ones. =20
Printf(9) alone isn't
the problem, adding printfs to chown(2) does not cause the problem, but=20
printfs from
device drivers do. (usb, firewire, ...)

My theory is there is a locking conflict.

Here is lock profiling data for USB_RS-232 bridge being plugged in=20
printfs locking out networking:

debug.lock.prof.enable: 0
debug.lock.prof.reset: 0
debug.lock.prof.stats:
      max  wait_max       total  wait_total       count    avg wait_avg=20
cnt_hold cnt_lock name
   251240         0      499982           0         303   1650      0  0=
=20
      0 /usr/src/sys/dev/usb/usb_request.c:322 (sx:0123456789ABCDEF -=20
USB device SX lock)
   104255         0      110429           0           2  55214      0  0=
=20
      0 /usr/src/sys/dev/usb/usb_device.c:2459 (sx:123456789ABCDEF - USB=
=20
config SX lock)

   104217         0    29670949           0       47087    630      0  0=
=20
      0 /usr/src/sys/kern/uipc_sockbuf.c:148 (sx:so_rcv_sx)  =20
<---------<<<<

   101962         0      103089           0          27   3818      0  0=
=20
     0 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant)
    51071         0       81088           0           3  27029      0  0=
=20
     0 /usr/src/sys/kern/kern_cons.c:422 (spin mutex:cnputs_mtx)
    23049         0       31351           0           2  15675      0  0=
=20
     0 /usr/src/sys/kern/vfs_syscalls.c:3508 (lockmgr:ufs)
    12408         0      978544           0         971   1007      0  0=
=20
     0 /usr/src/sys/kern/vfs_vnops.c:604 (lockmgr:ufs)
    11466         0      744372           0       11567     64      0  0=
=20
     0 /usr/src/sys/kern/vfs_bio.c:2559 (lockmgr:bufwait)
    11141         0      501155           0        3850    130      0  0=
=20
     0 /usr/src/sys/kern/vfs_bio.c:1835 (lockmgr:bufwait)
     6115         0       61720           0          26   2373      0  0=
=20
      0 /usr/src/sys/dev/usb/usb_request.c:322 (sx:123456789ABCDEF - USB=
=20
device SX lock)
     5551         0       16749           0          19    881      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:syncer)
     3043         0        3749           0           6    624      0  0=
=20
      0 /usr/src/sys/ufs/ffs/ffs_vfsops.c:1321 (sleep mutex:struct mount=
=20
mtx)
     2333         0        9901           0          36    275      0  0=
=20
     0 /usr/src/sys/kern/vfs_mount.c:2231 (sleep mutex:struct mount mtx)
     2055         0        9462           0         100     94      0  0=
=20
      0 /usr/src/sys/kern/subr_hints.c:117 (sleep mutex:kernel=20
environment)
     1649         0       79272           0        3846     20      0  0=
=20
     0 /usr/src/sys/kern/vfs_bio.c:2963 (sleep mutex:vm object)
     1636         0       36196           0       15515      2      0  0=
=20
      0 /usr/src/sys/vm/vm_page.c:1052 (sleep mutex:vm page queue free=20
mutex)
     1205         0       55601           0       15411      3      0  0=
=20
     0 /usr/src/sys/kern/vfs_bio.c:2545 (sleep mutex:bufobj interlock)
     1076         0        1076           0           1   1076      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:ufs)
     1069         0       31117           0          30   1037      0  0=
=20
     0 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:nfe0)
      848         0        3752           0        1359      2      0  0=
=20
      0 /usr/src/sys/ufs/ffs/ffs_vfsops.c:1297 (sleep mutex:vnode=20
interlock)
      754         0       10016           0        3846      2      0  0=
=20
      0 /usr/src/sys/ufs/ffs/ffs_softdep.c:4274 (sleep mutex:Softdep=20
Lock)
      739         0        2463           0           4    615      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:devfs)
      540         0        3370           0        1365      2      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:3179 (sleep mutex:vnode interlock)
      525         0       80852           0         160    505      0  0=
=20
     0 /usr/src/sys/dev/uart/uart_cpu.h:92 (spin mutex:uart_hwmtx)
      512         0        2065           0          23     89      0  0=
=20
     0 /usr/src/sys/kern/vfs_default.c:599 (lockmgr:bufwait)
      487         0       97058           0         481    201      0  0=
=20
     0 /usr/src/sys/vm/vm_pager.c:311 (lockmgr:bufwait)
      467         0        2069           0          12    172      0  0=
=20
     0 /usr/src/sys/dev/usb/usb_busdma.c:547 (sleep mutex:uplcom)
      455         0        2978           0          18    165      0  0=
=20
     0 /usr/src/sys/dev/usb/usb_transfer.c:286 (sleep mutex:uplcom)
      423         0         727           0           2    363      0  0=
=20
     0 /usr/src/sys/vm/uma_core.c:1565 (sleep mutex:UMA lock)
      367         0      230166           0       47087      4      0  0=
=20
     0 /usr/src/sys/netinet/tcp_usrreq.c:708 (rw:tcpinp)
      358         0        1387           0          11    126      0  0=
=20
     0 /usr/src/sys/kern/kern_synch.c:241 (sleep mutex:Giant)
      339         0       72085           0       46119      1      0  0=
=20
     0 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:so_rcv)
      332         0      225434           0       93206      2      0  0=
=20
     0 /usr/src/sys/kern/uipc_socket.c:1441 (sleep mutex:so_rcv)
      332         0       19529           0       11551      1      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:1927 (sleep mutex:bufobj interlock)
      331         0      162523           0       94887      1      0  0=
=20
     0 /usr/src/sys/netinet/tcp_output.c:280 (sleep mutex:so_snd)
      330         0       73512           0       47087      1      0  0=
=20
     0 /usr/src/sys/kern/uipc_socket.c:1843 (sleep mutex:so_rcv)
      329         0       88069           0       47771      1      0  0=
=20
     0 /usr/src/sys/kern/uipc_socket.c:1687 (sleep mutex:so_rcv)
      329         0         348           0          12     29      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:1955 (sleep mutex:Syncer mtx)
      320         0       66185           0        3353     19      0  0=
=20
     0 /usr/src/sys/kern/vfs_cluster.c:868 (lockmgr:bufwait)
      303         0        7410           0        4360      1      0  0=
=20
     0 /usr/src/sys/kern/vfs_bio.c:3944 (sleep mutex:bufobj interlock)
debug.lock.prof.rejected: 0
debug.lock.prof.skipcount: 0
debug.lock.prof.skipspin: 0

Here is lock profiling data for firewire bus reset printfs locking out=20
Ethernet:

      max  wait_max       total  wait_total       count    avg wait_avg=20
cnt_hold cnt_lock name
   373345         0      397495           0         643    618      0  0=
=20
     0 /usr/src/sys/kern/vfs_lookup.c:497 (lockmgr:ufs)
   373325         0      390815           0        3140    124      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:2218 (sleep mutex:vnode interlock)

   359514         0    32199808           0       45582    706      0  0=
=20
      0 /usr/src/sys/kern/uipc_sockbuf.c:148 (sx:so_rcv_sx)  =20
<---------<<<<

    63827       320      746196         740       70067     10      0  0=
=20
     4 /usr/src/sys/dev/firewire/fwohci.c:2107 (sleep mutex:firewire)
    49927         0      165929           0           8  20741      0  0=
=20
     0 /usr/src/sys/kern/vfs_syscalls.c:3508 (lockmgr:ufs)
    33358         0     1707461           0        1106   1543      0  0=
=20
     0 /usr/src/sys/kern/vfs_vnops.c:604 (lockmgr:ufs)
    30294         0      401224           0        2641    151      0  0=
=20
     0 /usr/src/sys/vm/vm_map.c:3526 (sx:user map)
    28432         0      934681           0          74  12630      0  0=
=20
     0 /usr/src/sys/kern/kern_cons.c:422 (spin mutex:cnputs_mtx)
    28296         0       66892           0        2641     25      0  0=
=20
     0 /usr/src/sys/vm/vm_fault.c:297 (sleep mutex:vm object)
    25020         0       26667           0          45    592      0  0=
=20
     0 /usr/src/sys/kern/kern_exit.c:825 (sx:proctree)
    19027         0     1225496           0       11937    102      0  0=
=20
     0 /usr/src/sys/kern/vfs_bio.c:2559 (lockmgr:bufwait)
    18887         0      906297           0        4140    218      0  0=
=20
     0 /usr/src/sys/kern/vfs_bio.c:1835 (lockmgr:bufwait)
    12101         0       23459           0        1675     14      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:2342 (sleep mutex:vnode interlock)
     7448         0      170841           0        2229     76      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:2083 (lockmgr:ufs)
     7342         0       21792           0        2169     10      0  0=
=20
     0 /usr/src/sys/kern/vfs_cache.c:390 (rw:Name Cache)
     4648         0       10831           0          11    984      0  0=
=20
     0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:syncer)
     3536         0       35352           0        9218      3      0  0=
=20
     0 /usr/src/sys/amd64/amd64/pmap.c:3989 (sleep mutex:pmap)
     3039         0       34283           0        2641     12      0  0=
=20
     0 /usr/src/sys/vm/vm_fault.c:937 (sleep mutex:vm object)
     2825         0       53949           0        3495     15      0  0=
=20
     0 /usr/src/sys/vm/vm_fault.c:1010 (sleep mutex:vm object)
     2286         0        7169           0         547     13      0  0=
=20
     0 /usr/src/sys/kern/kern_sig.c:983 (sleep mutex:process lock)
     2083      4010        9037        5380        2143      4      2  0=
=20
     2 /usr/src/sys/kern/vfs_subr.c:2118 (sleep mutex:vnode interlock)
     2083        57        2089          57           4    522     14  0=
=20
     1 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:pipe mutex)
     2022         0      181449           0         515    352      0  0=
=20
     0 /usr/src/sys/vm/vm_pager.c:311 (lockmgr:bufwait)
     1997         0        3397           0           2   1698      0  0=
=20
     0 /usr/src/sys/kern/kern_sysctl.c:1513 (sx:sysctl mem)
     1994         0        4890           0         117     41      0  0=
=20
     0 /usr/src/sys/kern/kern_sysctl.c:1521 (sx:sysctl lock)
     1971         0        4156           0          13    319      0  0=
=20
     0 /usr/src/sys/kern/kern_sysctl.c:1417 (sleep mutex:Giant)
     1965         0       31838           0        3471      9      0  0=
=20
     0 /usr/src/sys/vm/vm_object.c:482 (sleep mutex:vm object)
     1926         0       17695           0          83    213      0  0=
=20
     0 /usr/src/sys/vm/vm_object.c:541 (sleep mutex:vm object)


There is at least one line in common that looks networking related,=20
/usr/src/sys/kern/uipc_sockbuf.c:148 (sx:so_rcv_sx)

int
sblock(struct sockbuf *sb, int flags)
{

        KASSERT((flags & SBL_VALID) =3D=3D flags,
            ("sblock: flags invalid (0x%x)", flags));

        if (flags & SBL_WAIT) {
                if ((sb->sb_flags & SB_NOINTR) ||
                    (flags & SBL_NOINTR)) {
                        sx_xlock(&sb->sb_sx);
                        return (0);
                }
                return (sx_xlock_sig(&sb->sb_sx));
        } else {
                if (sx_try_xlock(&sb->sb_sx) =3D=3D 0)
                        return (EWOULDBLOCK);
                return (0);
        }
}

It is not clear to me what this code is doing. (comments might help)
It is not clear to me why networking and a USB_to_RS-232 bridge would
have a lock conflict. It is not clear to me why networking and firewire
would have a lock conflict.

How do we fix this?





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?8CD9203EF6F6EE0-1760-B81>