From owner-freebsd-net@FreeBSD.ORG Thu Oct 8 00:54:03 2009 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C9D91065696 for ; Thu, 8 Oct 2009 00:54:03 +0000 (UTC) (envelope-from mdounin@mdounin.ru) Received: from mdounin.cust.ramtel.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mx1.freebsd.org (Postfix) with ESMTP id 22D318FC1E for ; Thu, 8 Oct 2009 00:54:02 +0000 (UTC) Received: from mdounin.ru (mdounin.cust.ramtel.ru [81.19.69.81]) by mdounin.cust.ramtel.ru (Postfix) with ESMTP id 1A4D31700D for ; Thu, 8 Oct 2009 04:37:23 +0400 (MSD) Date: Thu, 8 Oct 2009 04:37:23 +0400 From: Maxim Dounin To: freebsd-net@freebsd.org Message-ID: <20091008003723.GD1229@mdounin.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.20 (2009-06-14) Subject: deadlock with pf uid rules + syncache X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Oct 2009 00:54:03 -0000 Hello! We with ru@ investigated lockups our colleagues observing on RELENG_7 machines, and was able to track [at least some of] them down to the following deadlock: 1. syncache does syn+ack retransmit on timeout. it holds tcp_sc_head mutex, and output packet processed by pf which in turn waits for tcp lock to lookup uid. 2. em0 dispatches new ack packet to listen socket, tcp_input() holds tcp lock, syncache lookup requires tcp_sc_head. Here is data from ddb: db> bt 19 Tracing pid 19 tid 100011 td 0xc6ca0000 sched_switch(c6ca0000,0,1,c06388b0,a38b7a66,...) at sched_switch+0x320 mi_switch(1,0,c08455c6,2e3,c091d634,...) at mi_switch+0x143 turnstile_wait(c7881140,c6d76d80,1,bd5,c6d76d86,...) at turnstile_wait+0x234 _rw_rlock(c0966b8c,c73619f6,bd5,c736392c,ce3,...) at _rw_rlock+0x166 pf_socket_lookup(2,c6bc6a80,0,ce3,c6bc6a1c,...) at pf_socket_lookup+0xc8 pf_test_tcp(c6bc6b18,c6bc6b14,2,c7236400,c73c7800,...) at pf_test_tcp+0x86f pf_test(2,c704e800,c6bc6b74,0,0,...) at pf_test+0xde1 pf_check_out(0,c6bc6b74,c704e800,2,0,...) at pf_check_out+0x43 pfil_run_hooks(c0965e40,c6bc6c04,c704e800,2,0,...) at pfil_run_hooks+0x78 ip_output(c73c7800,0,c6bc6bd8,0,0,...) at ip_output+0x7c2 syncache_respond(c8929418,0,0,0,c0842ffa,...) at syncache_respond+0x255 syncache_timer(c6fd7ac0,0,c0842ffa,f1,1,...) at syncache_timer+0x127 softclock(0,0,c083f980,4cb,c6ce5070,...) at softclock+0x21f ithread_loop(c6c95330,c6bc6d38,c083f823,323,c6c97840,...) at ithread_loop+0x145 fork_exit(c05e7520,c6c95330,c6bc6d38) at fork_exit+0x94 fork_trampoline() at fork_trampoline+0x8 --- trap 0, eip = 0, esp = 0xc6bc6d70, ebp = 0 --- db> bt 31 Tracing pid 31 tid 100031 td 0xc6d76d80 sched_switch(c6d76d80,0,1,c06388ee,a38981fa,...) at sched_switch+0x320 mi_switch(1,0,c08455c6,2e3,c091d7d8,...) at mi_switch+0x143 turnstile_wait(c7881be0,c6ca0000,0,c6fd7ac0,1f0,...) at turnstile_wait+0x234 _mtx_lock_sleep(c6fd7ac0,c6d76d80,0,c0850acb,1f0,...) at _mtx_lock_sleep+0x10e _mtx_lock_flags(c6fd7ac0,0,c0850acb,1f0,e6f87b14,...) at _mtx_lock_flags+0x67 syncache_lookup(e6f87aec,e6f87a8c,e6f8799c,c07db700,c091d8b8,...) at syncache_lookup+0x5e syncache_expand(e6f87aec,e6f87b14,c7593024,e6f87b2c,c753d600,...) at syncache_expand+0x1e tcp_input(c753d600,14,c704e800,1,0,...) at tcp_input+0x459 ip_input(c753d600,c057d932,800,c704e800,800,...) at ip_input+0x620 netisr_dispatch(2,c753d600,0,c7257537,3ac,...) at netisr_dispatch+0x55 ether_demux(c704e800,c753d600,3,0,3,...) at ether_demux+0x1aa ether_input(c704e800,c753d600,c6d76e18,c08baa84,c082d68e,...) at ether_input+0x343 ether_demux(c6da6000,c753d600,3,0,3,...) at ether_demux+0x107 ether_input(c6da6000,c753d600,c082d68e,11ba,c084572e,...) at ether_input+0x343 em_rxeof(e6f87ca4,c05f860d,c6da159c,8,c6dc832c,...) at em_rxeof+0x499 em_handle_rxtx(c6dc4000,1,c0845360,52,c6da159c,...) at em_handle_rxtx+0x2e taskqueue_run(c6da1580,c6da159c,c0836cc3,0,c083f823,...) at taskqueue_run+0x10b taskqueue_thread_loop(c6dc835c,e6f87d38,c083f823,323,c6dcbb00,...) at taskqueue_thread_loop+0x68 fork_exit(c0636ee0,c6dc835c,e6f87d38) at fork_exit+0x94 fork_trampoline() at fork_trampoline+0x8 --- trap 0, eip = 0, esp = 0xe6f87d70, ebp = 0 --- db> show turnstile 0xc7881be0 Lock: 0xc6fd7ac0 - (sleep mutex) tcp_sc_head Lock Owner: 0xc6ca0000 (tid 100011, pid 19, "swi4: clock sio") Shared Waiters: empty Exclusive Waiters: 0xc6d76d80 (tid 100031, pid 31, "em0 taskq") Pending Threads: empty db> show turnstile 0xc7881140 Lock: 0xc0966b8c - (rw) tcp Lock Owner: 0xc6d76d80 (tid 100031, pid 31, "em0 taskq") Shared Waiters: 0xc6ca0000 (tid 100011, pid 19, "swi4: clock sio") 0xc7994d80 (tid 100192, pid 1833, "httpd") 0xc7883b40 (tid 100282, pid 1831, "httpd") 0xc7b1a000 (tid 100252, pid 1832, "httpd") Exclusive Waiters: 0xc6c9ab40 (tid 100013, pid 21, "swi1: net") 0xc7971000 (tid 100202, pid 1120, "nginx") 0xc7a04480 (tid 100204, pid 1122, "nginx") 0xc79a9000 (tid 100278, pid 1830, "httpd") Pending Threads: empty Unfortunately "show alllocks" output wasn't properly captured by textdump (as witness_list_lock() uses printf() for output, not db_printf() which is captured), but here is some reconstruction (lines from textdump's ddb.txt and msgbuf.txt, my memory about their relevant order): db> show alllocks Process 1319 (postgres) thread 0xc804a480 (100326) exclusive sx so_rcv_sx r = 0 (0xc7b088ac) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148 Process 1117 (sshd) thread 0xc708b240 (100057) exclusive sx so_rcv_sx r = 0 (0xc7b06a4c) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148 Process 956 (sshd) thread 0xc72476c0 (100119) exclusive sx so_rcv_sx r = 0 (0xc7d1bd8c) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148 Process 55 (g_mirror home) thread 0xc70c2d80 (100055) exclusive sx gmirror:lock r = 0 (0xc705442c) locked @ /usr/src/sys/modules/geom/geom_mirror/../../../geom/mirror/g_mirror.c:1858 Process 31 (em0 taskq) thread 0xc6d76d80 (100031) exclusive rw tcpinp r = 0 (0xc751d36c) locked @ /usr/src/sys/netinet/tcp_input.c:480 exclusive rw tcp r = 0 (0xc0966b8c) locked @ /usr/src/sys/netinet/tcp_input.c:400 Process 27 (swi6: Giant taskq) thread 0xc6ca06c0 (100027) exclusive sleep mutex Giant r = 0 (0xc0910630) locked @ /usr/src/sys/kern/kern_intr.c:1125 Process 19 (swi4: clock sio) thread 0xc6ca0000 (100011) shared rw PFil hook read/write mutex r = 0 (0xc0965e58) locked @ /usr/src/sys/net/pfil.c:73 exclusive sleep mutex tcp_sc_head r = 0 (0xc6fd7ac0) locked @ /usr/src/sys/kern/kern_timeout.c:241 The machine in question runs RELENG_7 from something about 2009-10-05. Please let me know if you need some more info. Maxim Dounin