From owner-freebsd-current Sat Nov 30 11:15:40 2002
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7FE9537B401
	for ; Sat, 30 Nov 2002 11:15:36 -0800 (PST)
Received: from beastie.mckusick.com (beastie.mckusick.com [209.31.233.184])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DCBED43EE5
	for ; Sat, 30 Nov 2002 11:15:32 -0800 (PST)
	(envelope-from mckusick@beastie.mckusick.com)
Received: from beastie.mckusick.com (localhost [127.0.0.1])
	by beastie.mckusick.com (8.12.3/8.12.3) with ESMTP id gAUJFW59081565;
	Sat, 30 Nov 2002 11:15:32 -0800 (PST)
	(envelope-from mckusick@beastie.mckusick.com)
Message-Id: <200211301915.gAUJFW59081565@beastie.mckusick.com>
To: Sean Kelly
Subject: Re: UFS Snapshot deadlock
Cc: current@FreeBSD.ORG
In-Reply-To: Your message of "Wed, 30 Oct 2002 03:57:52 CST."
	<20021030095752.GA1868@edgemaster.zombie.org>
Date: Sat, 30 Nov 2002 11:15:32 -0800
From: Kirk McKusick
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Your deadlock should now be fixed.

	Kirk McKusick

=-=-=-=-=

From: Kirk McKusick
Date: Fri, 29 Nov 2002 23:27:12 -0800 (PST)
To: cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: cvs commit: src/sys/ufs/ffs ffs_snapshot.c
X-FreeBSD-CVS-Branch: HEAD

mckusick    2002/11/29 23:27:12 PST

  Modified files:
    sys/ufs/ffs          ffs_snapshot.c
  Log:
  Fix two deadlocks in snapshots:

  1) Release the snapshot file lock while suspending the system.
     Otherwise a process trying to read the lock may block on its
     containing directory preventing the suspension from completing.
     Thanks to Sean Kelly for finding this deadlock.

  2) Replace some bdwrite's with bawrite's so as not to fill all the
     buffers with dirty data.
     The buffers could not be cleaned as the snapshot vnode was locked
     hence the system could deadlock when making snapshots of really
     massive filesystems. Thanks to Hidetoshi Shimokawa for figuring
     this out.

  Sponsored by:   DARPA & NAI Labs.

  Revision  Changes    Path
  1.51      +7 -2      src/sys/ufs/ffs/ffs_snapshot.c

=-=-=-=-=-=

Date: Wed, 30 Oct 2002 03:57:52 -0600
From: Sean Kelly
To: current@FreeBSD.ORG
Subject: UFS Snapshot deadlock

While playing with UFS snapshots on a UFS2 filesystem I mounted
specifically for this purpose, I encountered a little problem. It seems
I have processes deadlocked on each other.

Steps to repeat:
/# mount /dev/ad2a /mnt ; cd /mnt
/dev/ad2a on /mnt (ufs, local, soft-updates, multilabel)  # UFS2
/mnt# cd /mnt; mount -u -o snapshot /mnt/snapshot /mnt
*switch vtys*
/# cd /mnt; ls -l
*ls deadlocks*
*I get bored and ^C the mount on the other vty about 30 minutes later*
/mnt# ls
*this ls deadlocks too*

For the record, /mnt was a new filesystem. It had *nothing* in it. No
directories or anything.

So now, I've got these:
 UID  PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT    TIME COMMAND
   0 1133  669   0  -4  0 692 548 ufs    D+   v1 0:00.00 ls
1001  939  856   0  -4  0 696 560 ufs    D+   v2 0:00.00 ls -l
   0  937    1   0  -4  0 560 336 ufs    D    v1 0:00.65 mount -u -o snapshot /mnt/snapshot /mnt

Now for some numbers.

db> trace 937
mi_switch(c71aab60,50,c03375c6,c7,c03ad2f8) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,1000040,600,e6,3a9) at acquire+0xa7
lockmgr(c75098dc,1010002,c7509818,c71aab60,e5b076a8) at lockmgr+0x2f7
vop_stdlock(e5b076c4,e5b076e0,c021e306,e5b076c4,0) at vop_stdlock+0x2c
ufs_vnoperate(e5b076c4,0,c033dd28,e5b076e0,c01ba4a5) at ufs_vnoperate+0x18
vn_lock(c7509818,10002,c71aab60,815,c7509818) at vn_lock+0xd6
vget(c7509818,2,c71aab60,470,0) at vget+0xd6
ffs_sync(c74c5400,1,c726a780,c71aab60,c74f1000) at ffs_sync+0x126
vfs_write_suspend(c74c5400,c74ffcb8,d351f08c,1,c2c06e80) at vfs_write_suspend+0x70
ffs_snapshot(c74c5400,bfbffd1d,70,c033990d,252) at ffs_snapshot+0xa48
ffs_mount(c74c5400,c745ce80,bfbff000,e5b07bf0,c71aab60) at ffs_mount+0x548
vfs_mount(c71aab60,c6d2b780,c745ce80,1010000,bfbff000) at vfs_mount+0x85e
mount(c71aab60,e5b07d14,c03590ba,409,4) at mount+0xb8
syscall(2f,2f,2f,bfbfeffc,bfbff9f4) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d

db> trace 939
mi_switch(c74260d0,50,c03375c6,c7,1cc) at mi_switch+0x158
msleep(c74ffd7c,c03a9688,50,c034f732,0) at msleep+0x3b4
acquire(c74ffd7c,1000040,600,e6,3ab) at acquire+0xa7
lockmgr(c74ffd7c,1010002,c74ffcb8,c74260d0,e5bfd83c) at lockmgr+0x2f7
vop_stdlock(e5bfd858,e5bfd874,c021e306,e5bfd858,246) at vop_stdlock+0x2c
ufs_vnoperate(e5bfd858,246,0,c74f1000,0) at ufs_vnoperate+0x18
vn_lock(c74ffcb8,10002,c74260d0,7f,3) at vn_lock+0xd6
vget(c74ffcb8,10002,c74260d0,7f,c74260d0) at vget+0xd6
ufs_ihashget(c74cce00,3,2,e5bfd98c,e5bfd8f0) at ufs_ihashget+0xd2
ffs_vget(c74c5400,3,2,e5bfd98c,e5bfd994) at ffs_vget+0x44
ufs_lookup(e5bfdac0,e5bfdafc,c0207a24,e5bfdac0,e5bfdc3c) at ufs_lookup+0xdae
ufs_vnoperate(e5bfdac0,e5bfdc3c,e5bfdc50,3ab,c74260d0) at ufs_vnoperate+0x18
vfs_cache_lookup(e5bfdb70,e5bfdb9c,c020bd39,e5bfdb70,c7509818) at vfs_cache_lookup+0x2e4
ufs_vnoperate(e5bfdb70,c7509818,e5bfdc50,e5bfdb5c,c74260d0) at ufs_vnoperate+0x18
lookup(e5bfdc28,0,c033d6ad,a4,c74260d0) at lookup+0x309
namei(e5bfdc28,c03ade38,c03ade10,c03b42a0,0) at namei+0x1e0
lstat(c74260d0,e5bfdd14,c03590ba,409,2) at lstat+0x52
syscall(2f,2f,2f,80d3200,80d1040) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x805838b, esp = 0xbfbff3dc, ebp = 0xbfbff468 ---

db> trace 1133
mi_switch(c6d31680,50,c03375c6,c7,2) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,1000040,600,e6,46d) at acquire+0xa7
lockmgr(c75098dc,1030002,c7509818,c6d31680,e3887ad0) at lockmgr+0x2f7
vop_stdlock(e3887aec,e3887b08,c021e306,e3887aec,0) at vop_stdlock+0x2c
ufs_vnoperate(e3887aec,0,c033e1ac,360,c01e3af0) at ufs_vnoperate+0x18
vn_lock(c7509818,20002,c6d31680,e3887b5c,c6d31680) at vn_lock+0xd6
lookup(e3887c28,0,c033d6ad,a4,c6d31680) at lookup+0x8e
namei(e3887c28,c03ade38,c03ade10,c03b42a0,0) at namei+0x1e0
stat(c6d31680,e3887d14,c03590ba,409,2) at stat+0x52
syscall(2f,2f,2f,80d3080,80d1000) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (188, FreeBSD ELF32, stat), eip = 0x80583b3, esp = 0xbfbff4dc, ebp = 0xbfbff568 ---

db> x/x 0xc74ffd7c, 20
0xc74ffd7c:     c03a9688        1200440         0               1
0xc74ffd8c:     500001          c034f732        6               3a9
0xc74ffd9c:     c74ffd7c        c6be9500        c74c5400        0
0xc74ffdac:     0               c74ffdac        968             c74ffcb8
0xc74ffdbc:     0               0               1               0
0xc74ffdcc:     0               0               0               0
0xc74ffddc:     ffffffff        c0370e80        c033dd98        c033dd98
0xc74ffdec:     30000           c74cb734        c7508010        c03acfd8

db> x/x 0xc75098dc, 10
0xc75098dc:     c03a9358        1200440         0               3
0xc75098ec:     500001          c034f732        6               3ab
0xc75098fc:     c75098dc        c6be9500        c74c5400        0
0xc750990c:     0               c750990c        93c             c7509818

(gdb) list *( ufs_lookup+0xdae)
0xc02bd86e is in ufs_lookup (/usr/src/sys/ufs/ufs/ufs_lookup.c:602).
597             } else if (dp->i_number == dp->i_ino) {
598                     VREF(vdp);      /* we want ourself, ie "." */
599                     *vpp = vdp;
600             } else {
601                     error = VFS_VGET(pdp->v_mount, dp->i_ino, LK_EXCLUSIVE, &tdp);
602                     if (error)
603                             return (error);
604                     if (!lockparent || !(flags & ISLASTCN)) {
605                             VOP_UNLOCK(pdp, 0, td);
606                             cnp->cn_flags |= PDIRUNLOCK;

-- 
Sean Kelly         | PGP KeyID: 77042C7B
smkelly@zombie.org | http://www.zombie.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message