Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 3 Feb 2011 22:00:24 GMT
From:      dieterbsd@engineer.com
To:        freebsd-firewire@FreeBSD.org
Subject:   Re: kern/118093: firewire bus reset hogs CPU, causing data to be lost
Message-ID:  <201102032200.p13M0OCn030576@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help
The following reply was made to PR kern/118093; it has been noted by GNATS.

From: dieterbsd@engineer.com
To: bug-followup@FreeBSD.org
Cc: freebsd-firewire@freebsd.org, freebsd-usb@freebsd.org,
        freebsd-drivers@freebsd.org, freebsd-net@freebsd.org,
        freebsd@sopwith.solgatos.com
Subject: Re: kern/118093: firewire bus reset hogs CPU, causing data to be lost
Date: Thu, 03 Feb 2011 16:53:32 -0500

 New data: USB causes the same problem that firewire does.
 
 FreeBSD 8.0
 amd64
 nfe0: <NVIDIA nForce4 CK804 MCP9 Networking Adapter> port 0xb400-0xb407=20
 mem 0xfebf9000-0xfebf9fff irq 23 at device 10.0 on pci0
 bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev.=20
 0x4101> mem 0xfe4f0000-0xfe4fffff irq 19 at device 0.0 on pci5
 fwohci1: <NEC uPD72871/2> mem 0xfdeff000-0xfdefffff irq 19 at device=20
 8.0 on pci2
 ehci0: <NVIDIA nForce4 USB 2.0 controller> mem 0xfebfe000-0xfebfe0ff=20
 irq 22 at device 2.1 on pci0
 
 Plugging in a USB_to_RS-232 bridge generates:
 
 ugen0.2: <Prolific Technology Inc.> at usbus0
 uplcom0: <Prolific Technology Inc. USB-Serial Controller D, class 0/0,=20
 rev 1.10/
 4.00, addr 2> on usbus0
 
 And networking is locked out for too long, and data is lost.  (The node=20
 sending data
 via Ethernet has too small a transmit buffer, but is a closed source=20
 closed hardware
 black box and cannot be fixed.  Once data is lost it is lost forever,=20
 it cannot be
 recreated, so this is very bad.)  The FreeBSD box receiving data has a=20
 very large
 receive buffer, but if another device driver locks out networking it=20
 doesn't help.
 
 Changing the printf(9)s to log(9)s fixes the problem, but this is not a=20
 good workaround,
 there are LOTS of printfs in the kernel, I keep hitting new ones. =20
 Printf(9) alone isn't
 the problem, adding printfs to chown(2) does not cause the problem, but=20
 printfs from
 device drivers do. (usb, firewire, ...)
 
 My theory is there is a locking conflict.
 
 Here is lock profiling data for USB_RS-232 bridge being plugged in=20
 printfs locking out networking:
 
 debug.lock.prof.enable: 0
 debug.lock.prof.reset: 0
 debug.lock.prof.stats:
       max  wait_max       total  wait_total       count    avg wait_avg=20
 cnt_hold cnt_lock name
    251240         0      499982           0         303   1650      0  0=
 =20
       0 /usr/src/sys/dev/usb/usb_request.c:322 (sx:0123456789ABCDEF -=20
 USB device SX lock)
    104255         0      110429           0           2  55214      0  0=
 =20
       0 /usr/src/sys/dev/usb/usb_device.c:2459 (sx:123456789ABCDEF - USB=
 =20
 config SX lock)
 
    104217         0    29670949           0       47087    630      0  0=
 =20
       0 /usr/src/sys/kern/uipc_sockbuf.c:148 (sx:so_rcv_sx)  =20
 <---------<<<<
 
    101962         0      103089           0          27   3818      0  0=
 =20
      0 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant)
     51071         0       81088           0           3  27029      0  0=
 =20
      0 /usr/src/sys/kern/kern_cons.c:422 (spin mutex:cnputs_mtx)
     23049         0       31351           0           2  15675      0  0=
 =20
      0 /usr/src/sys/kern/vfs_syscalls.c:3508 (lockmgr:ufs)
     12408         0      978544           0         971   1007      0  0=
 =20
      0 /usr/src/sys/kern/vfs_vnops.c:604 (lockmgr:ufs)
     11466         0      744372           0       11567     64      0  0=
 =20
      0 /usr/src/sys/kern/vfs_bio.c:2559 (lockmgr:bufwait)
     11141         0      501155           0        3850    130      0  0=
 =20
      0 /usr/src/sys/kern/vfs_bio.c:1835 (lockmgr:bufwait)
      6115         0       61720           0          26   2373      0  0=
 =20
       0 /usr/src/sys/dev/usb/usb_request.c:322 (sx:123456789ABCDEF - USB=
 =20
 device SX lock)
      5551         0       16749           0          19    881      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:syncer)
      3043         0        3749           0           6    624      0  0=
 =20
       0 /usr/src/sys/ufs/ffs/ffs_vfsops.c:1321 (sleep mutex:struct mount=
 =20
 mtx)
      2333         0        9901           0          36    275      0  0=
 =20
      0 /usr/src/sys/kern/vfs_mount.c:2231 (sleep mutex:struct mount mtx)
      2055         0        9462           0         100     94      0  0=
 =20
       0 /usr/src/sys/kern/subr_hints.c:117 (sleep mutex:kernel=20
 environment)
      1649         0       79272           0        3846     20      0  0=
 =20
      0 /usr/src/sys/kern/vfs_bio.c:2963 (sleep mutex:vm object)
      1636         0       36196           0       15515      2      0  0=
 =20
       0 /usr/src/sys/vm/vm_page.c:1052 (sleep mutex:vm page queue free=20
 mutex)
      1205         0       55601           0       15411      3      0  0=
 =20
      0 /usr/src/sys/kern/vfs_bio.c:2545 (sleep mutex:bufobj interlock)
      1076         0        1076           0           1   1076      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:ufs)
      1069         0       31117           0          30   1037      0  0=
 =20
      0 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:nfe0)
       848         0        3752           0        1359      2      0  0=
 =20
       0 /usr/src/sys/ufs/ffs/ffs_vfsops.c:1297 (sleep mutex:vnode=20
 interlock)
       754         0       10016           0        3846      2      0  0=
 =20
       0 /usr/src/sys/ufs/ffs/ffs_softdep.c:4274 (sleep mutex:Softdep=20
 Lock)
       739         0        2463           0           4    615      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:devfs)
       540         0        3370           0        1365      2      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:3179 (sleep mutex:vnode interlock)
       525         0       80852           0         160    505      0  0=
 =20
      0 /usr/src/sys/dev/uart/uart_cpu.h:92 (spin mutex:uart_hwmtx)
       512         0        2065           0          23     89      0  0=
 =20
      0 /usr/src/sys/kern/vfs_default.c:599 (lockmgr:bufwait)
       487         0       97058           0         481    201      0  0=
 =20
      0 /usr/src/sys/vm/vm_pager.c:311 (lockmgr:bufwait)
       467         0        2069           0          12    172      0  0=
 =20
      0 /usr/src/sys/dev/usb/usb_busdma.c:547 (sleep mutex:uplcom)
       455         0        2978           0          18    165      0  0=
 =20
      0 /usr/src/sys/dev/usb/usb_transfer.c:286 (sleep mutex:uplcom)
       423         0         727           0           2    363      0  0=
 =20
      0 /usr/src/sys/vm/uma_core.c:1565 (sleep mutex:UMA lock)
       367         0      230166           0       47087      4      0  0=
 =20
      0 /usr/src/sys/netinet/tcp_usrreq.c:708 (rw:tcpinp)
       358         0        1387           0          11    126      0  0=
 =20
      0 /usr/src/sys/kern/kern_synch.c:241 (sleep mutex:Giant)
       339         0       72085           0       46119      1      0  0=
 =20
      0 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:so_rcv)
       332         0      225434           0       93206      2      0  0=
 =20
      0 /usr/src/sys/kern/uipc_socket.c:1441 (sleep mutex:so_rcv)
       332         0       19529           0       11551      1      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:1927 (sleep mutex:bufobj interlock)
       331         0      162523           0       94887      1      0  0=
 =20
      0 /usr/src/sys/netinet/tcp_output.c:280 (sleep mutex:so_snd)
       330         0       73512           0       47087      1      0  0=
 =20
      0 /usr/src/sys/kern/uipc_socket.c:1843 (sleep mutex:so_rcv)
       329         0       88069           0       47771      1      0  0=
 =20
      0 /usr/src/sys/kern/uipc_socket.c:1687 (sleep mutex:so_rcv)
       329         0         348           0          12     29      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:1955 (sleep mutex:Syncer mtx)
       320         0       66185           0        3353     19      0  0=
 =20
      0 /usr/src/sys/kern/vfs_cluster.c:868 (lockmgr:bufwait)
       303         0        7410           0        4360      1      0  0=
 =20
      0 /usr/src/sys/kern/vfs_bio.c:3944 (sleep mutex:bufobj interlock)
 debug.lock.prof.rejected: 0
 debug.lock.prof.skipcount: 0
 debug.lock.prof.skipspin: 0
 
 Here is lock profiling data for firewire bus reset printfs locking out=20
 Ethernet:
 
       max  wait_max       total  wait_total       count    avg wait_avg=20
 cnt_hold cnt_lock name
    373345         0      397495           0         643    618      0  0=
 =20
      0 /usr/src/sys/kern/vfs_lookup.c:497 (lockmgr:ufs)
    373325         0      390815           0        3140    124      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:2218 (sleep mutex:vnode interlock)
 
    359514         0    32199808           0       45582    706      0  0=
 =20
       0 /usr/src/sys/kern/uipc_sockbuf.c:148 (sx:so_rcv_sx)  =20
 <---------<<<<
 
     63827       320      746196         740       70067     10      0  0=
 =20
      4 /usr/src/sys/dev/firewire/fwohci.c:2107 (sleep mutex:firewire)
     49927         0      165929           0           8  20741      0  0=
 =20
      0 /usr/src/sys/kern/vfs_syscalls.c:3508 (lockmgr:ufs)
     33358         0     1707461           0        1106   1543      0  0=
 =20
      0 /usr/src/sys/kern/vfs_vnops.c:604 (lockmgr:ufs)
     30294         0      401224           0        2641    151      0  0=
 =20
      0 /usr/src/sys/vm/vm_map.c:3526 (sx:user map)
     28432         0      934681           0          74  12630      0  0=
 =20
      0 /usr/src/sys/kern/kern_cons.c:422 (spin mutex:cnputs_mtx)
     28296         0       66892           0        2641     25      0  0=
 =20
      0 /usr/src/sys/vm/vm_fault.c:297 (sleep mutex:vm object)
     25020         0       26667           0          45    592      0  0=
 =20
      0 /usr/src/sys/kern/kern_exit.c:825 (sx:proctree)
     19027         0     1225496           0       11937    102      0  0=
 =20
      0 /usr/src/sys/kern/vfs_bio.c:2559 (lockmgr:bufwait)
     18887         0      906297           0        4140    218      0  0=
 =20
      0 /usr/src/sys/kern/vfs_bio.c:1835 (lockmgr:bufwait)
     12101         0       23459           0        1675     14      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:2342 (sleep mutex:vnode interlock)
      7448         0      170841           0        2229     76      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:2083 (lockmgr:ufs)
      7342         0       21792           0        2169     10      0  0=
 =20
      0 /usr/src/sys/kern/vfs_cache.c:390 (rw:Name Cache)
      4648         0       10831           0          11    984      0  0=
 =20
      0 /usr/src/sys/kern/vfs_subr.c:1693 (lockmgr:syncer)
      3536         0       35352           0        9218      3      0  0=
 =20
      0 /usr/src/sys/amd64/amd64/pmap.c:3989 (sleep mutex:pmap)
      3039         0       34283           0        2641     12      0  0=
 =20
      0 /usr/src/sys/vm/vm_fault.c:937 (sleep mutex:vm object)
      2825         0       53949           0        3495     15      0  0=
 =20
      0 /usr/src/sys/vm/vm_fault.c:1010 (sleep mutex:vm object)
      2286         0        7169           0         547     13      0  0=
 =20
      0 /usr/src/sys/kern/kern_sig.c:983 (sleep mutex:process lock)
      2083      4010        9037        5380        2143      4      2  0=
 =20
      2 /usr/src/sys/kern/vfs_subr.c:2118 (sleep mutex:vnode interlock)
      2083        57        2089          57           4    522     14  0=
 =20
      1 /usr/src/sys/kern/kern_mutex.c:147 (sleep mutex:pipe mutex)
      2022         0      181449           0         515    352      0  0=
 =20
      0 /usr/src/sys/vm/vm_pager.c:311 (lockmgr:bufwait)
      1997         0        3397           0           2   1698      0  0=
 =20
      0 /usr/src/sys/kern/kern_sysctl.c:1513 (sx:sysctl mem)
      1994         0        4890           0         117     41      0  0=
 =20
      0 /usr/src/sys/kern/kern_sysctl.c:1521 (sx:sysctl lock)
      1971         0        4156           0          13    319      0  0=
 =20
      0 /usr/src/sys/kern/kern_sysctl.c:1417 (sleep mutex:Giant)
      1965         0       31838           0        3471      9      0  0=
 =20
      0 /usr/src/sys/vm/vm_object.c:482 (sleep mutex:vm object)
      1926         0       17695           0          83    213      0  0=
 =20
      0 /usr/src/sys/vm/vm_object.c:541 (sleep mutex:vm object)
 
 
 There is at least one line in common that looks networking related,=20
 /usr/src/sys/kern/uipc_sockbuf.c:148 (sx:so_rcv_sx)
 
 int
 sblock(struct sockbuf *sb, int flags)
 {
 
         KASSERT((flags & SBL_VALID) =3D=3D flags,
             ("sblock: flags invalid (0x%x)", flags));
 
         if (flags & SBL_WAIT) {
                 if ((sb->sb_flags & SB_NOINTR) ||
                     (flags & SBL_NOINTR)) {
                         sx_xlock(&sb->sb_sx);
                         return (0);
                 }
                 return (sx_xlock_sig(&sb->sb_sx));
         } else {
                 if (sx_try_xlock(&sb->sb_sx) =3D=3D 0)
                         return (EWOULDBLOCK);
                 return (0);
         }
 }
 
 It is not clear to me what this code is doing. (comments might help)
 It is not clear to me why networking and a USB_to_RS-232 bridge would
 have a lock conflict. It is not clear to me why networking and firewire
 would have a lock conflict.
 
 How do we fix this?
 
 



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201102032200.p13M0OCn030576>