From: Pawel Jakub Dawidek
Date: Sun, 3 Aug 2008 09:32:40 +0200
To: hackers@FreeBSD.org
Subject: Linker deadlock.
Message-ID: <20080803073240.GC2371@garage.freebsd.pl>

Hi.

The linker can easily deadlock when we try to load the same kernel module
from two processes at the same time. This is because we drop kld_sx in
linker_load_file() and reacquire it, which leads to a LOR (lock order
reversal), because we already hold the vnode lock at that point.
Interesting backtraces below.

First process:

db> tr 3066
Tracing pid 3066 tid 100090 td 0x8514b240
sched_switch(8514b240,0,104,177,bb6bbb2e,...) at sched_switch+0x40e
mi_switch(104,0,80681605,1ca,0,...) at mi_switch+0x200
sleepq_switch(8514b240,0,80681605,237,80a281ec,...) at sleepq_switch+0x14d
sleepq_wait(80a281ec,0,8067a18b,3,0,...) at sleepq_wait+0x63
_sx_xlock_hard(80a281ec,8514b240,0,8067a1cf,1a0,...) at _sx_xlock_hard+0x2c6
_sx_xlock(80a281ec,0,8067a1cf,1a0,0,...) at _sx_xlock+0x99
linker_load_module(853a1264,0,83ba8940,83ba893c,83ba8938,...) at linker_load_module+0xa4a
linker_load_dependencies(84fb8500,bb74,8539f000,2adc,156000,...) at linker_load_dependencies+0x194
link_elf_load_file(806b74e0,8557e4c0,83ba8c24,17c,0,...) at link_elf_load_file+0x4f0
linker_load_module(0,83ba8c4c,8067a1cf,3cd,280cb730,...) at linker_load_module+0x8db
kern_kldload(8514b240,8592d400,83ba8c70,0,b395eb11,...) at kern_kldload+0xc8
[...]

db> show lock 0x80a281ec
 class: sx
 name: kernel linker
 state: XLOCK: 0x8514bd80 (tid 100117, pid 3065, "zpool")
 waiters: exclusive

Second process:

db> tr 3065
Tracing pid 3065 tid 100117 td 0x8514bd80
sched_switch(8514bd80,0,104,177,bb7e358b,...) at sched_switch+0x40e
mi_switch(104,0,80681605,1ca,50,...) at mi_switch+0x200
sleepq_switch(8514bd80,0,80681605,237,8523d9c0,...) at sleepq_switch+0x14d
sleepq_wait(8523d9c0,50,806906bb,4,0,...) at sleepq_wait+0x63
__lockmgr_args(8523d9c0,80100,8523da28,0,0,...) at __lockmgr_args+0x9a5
vop_stdlock(83bd2660,8508aa80,2,80100,8523d968,...) at vop_stdlock+0x65
VOP_LOCK1_APV(806c3560,83bd2660,806d2ac0,8523d968,80100,...) at VOP_LOCK1_APV+0xa5
_vn_lock(8523d968,80100,8068815b,802,804c9cb4,...) at _vn_lock+0x5e
vget(8523d968,80100,8514bd80,1b7,8065d00f,...) at vget+0xc9
cache_lookup(85090158,83bd2a00,83bd2a14,0,84f3b400,...) at cache_lookup+0x4c2
nfs_lookup(83bd2838,80688e43,806d2720,80000,85090158,...) at nfs_lookup+0x101
VOP_LOOKUP_APV(806c3560,83bd2838,8068783d,1bd,83bd2a00,...) at VOP_LOOKUP_APV+0xe5
lookup(83bd29e8,8068783d,e0,c0,8506e52c,...) at lookup+0x52e
namei(83bd29e8,81159a38,80a352b4,4,8067be1f,...) at namei+0x48b
vn_open_cred(83bd29e8,83bd2a4c,0,84f3b400,0,...) at vn_open_cred+0x2ba
vn_open(83bd29e8,83bd2a4c,0,0,806b2a00,...) at vn_open+0x33
linker_lookup_file(3,0,3,8514bd80,0,...) at linker_lookup_file+0x163
linker_load_module(0,83bd2c4c,8067a1cf,3cd,280cb730,...) at linker_load_module+0x7bd
kern_kldload(8514bd80,85a7e400,83bd2c70,0,b395eb11,...) at kern_kldload+0xc8
[...]

db> show vnode 0x8523d968
vnode 0x8523d968: tag nfs, type VREG
    usecount 1, writecount 0, refcount 189 mountedhere 0
    flags ()
    v_object 0x852489b0 ref 0 pages 372
    lock type nfs: EXCL by thread 0x8514b240 (pid 3066)
    with exclusive waiters pending
#0 0x804c2e5d at __lockmgr_args+0xa6d
#1 0x80546c85 at vop_stdlock+0x65
#2 0x8065dcd5 at VOP_LOCK1_APV+0xa5
#3 0x805627ee at _vn_lock+0x5e
#4 0x80557419 at vget+0xc9
#5 0x805444b2 at cache_lookup+0x4c2
#6 0x805c3b51 at nfs_lookup+0x101
#7 0x8065ee65 at VOP_LOOKUP_APV+0xe5
#8 0x8054a9be at lookup+0x52e
#9 0x8054b5eb at namei+0x48b
#10 0x805621da at vn_open_cred+0x2ba
#11 0x80562463 at vn_open+0x33
#12 0x804f45e8 at link_elf_load_file+0x68
#13 0x804c0f9b at linker_load_module+0x8db
#14 0x804c1568 at kern_kldload+0xc8
#15 0x804c1624 at kldload+0x74
#16 0x80650513 at syscall+0x283
#17 0x80634e40 at Xint0x80_syscall+0x20
[...]

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!