Message-ID: <50DD6423.5090305@incore.de>
Date: Fri, 28 Dec 2012 10:19:31 +0100
From: Andreas Longwitz
To: Konstantin Belousov
Cc: freebsd-stable@freebsd.org, fs@freebsd.org
Subject: Re: FS hang with suspfs when creating snapshot on a UFS + GJOURNAL setup
References: <50DC30F6.1050904@incore.de> <20121227133355.GI82219@kib.kiev.ua> <50DC8999.8000708@incore.de> <20121227194145.GM82219@kib.kiev.ua>
In-Reply-To: <20121227194145.GM82219@kib.kiev.ua>

Konstantin Belousov wrote:
>>> On Thu, Dec 27, 2012 at 12:28:54PM +0100, Andreas Longwitz wrote:
>> db> alltrace (pid 18 and 7126)
>>
>> Tracing command g_journal switcher pid 18 tid 100076 td 0xffffff0002bd5000
>> sched_switch() at sched_switch+0xde
>> mi_switch() at mi_switch+0x186
>> sleepq_wait() at sleepq_wait+0x42
>> __lockmgr_args() at __lockmgr_args+0x49b
>> ffs_copyonwrite() at ffs_copyonwrite+0x19a
>> ffs_geom_strategy() at ffs_geom_strategy+0x1b5
>> bufwrite() at bufwrite+0xe9
>> ffs_sbupdate() at ffs_sbupdate+0x12a
>> g_journal_ufs_clean() at g_journal_ufs_clean+0x3e
>> g_journal_switcher() at g_journal_switcher+0xe5e
>> fork_exit() at fork_exit+0x11f
>> fork_trampoline() at fork_trampoline+0xe
>> --- trap 0, rip = 0, rsp = 0xffffff8242ca8cf0, rbp = 0 ---
>>
>> Tracing command mksnap_ffs pid 7126 tid 100157 td 0xffffff000807a470
>> sched_switch() at sched_switch+0xde
>> mi_switch() at mi_switch+0x186
>> sleepq_wait() at sleepq_wait+0x42
>> _sleep() at _sleep+0x373
>> vn_start_write() at vn_start_write+0xdf
>> ffs_snapshot() at ffs_snapshot+0xe2b
> Can you look up the line number for the ffs_snapshot+0xe2b ?

(kgdb) list *ffs_snapshot+0xe2b
0xffffffff8056287b is in ffs_snapshot (/usr/src/sys/ufs/ffs/ffs_snapshot.c:676).
671             /*
672              * Resume operation on filesystem.
673              */
674             vfs_write_resume(vp->v_mount);
675             vn_start_write(NULL, &wrtmp, V_WAIT);
676             if (collectsnapstats && starttime.tv_sec > 0) {
677                     nanotime(&endtime);
678                     timespecsub(&endtime, &starttime);
679                     printf("%s: suspended %ld.%03ld sec, redo %ld of %d\n",
680                         vp->v_mount->mnt_stat.f_mntonname, (long)endtime.tv_sec,

> I think the bug is that vn_start_write() is called while the snaplock
> is owned, after the out1 label in ffs_snapshot() (I am looking at the
> HEAD code).

You are right, the vn_start_write() is just after the out1 label.

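To make the suspected circular wait concrete, here is a minimal userspace
sketch in C. It is not kernel code and not a proposed patch. It only models
the state read off the two traces: mksnap_ffs sleeps in vn_start_write()
while owning the snaplock, and the g_journal switcher sleeps in
ffs_copyonwrite() on that lock. That the switcher owns the write suspension
at this point is an assumption (it is what vn_start_write() would be waiting
on). The struct, enum, and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical resource ids, for illustration only. */
enum resource { RES_NONE = -1, RES_SNAPLOCK = 0, RES_SUSPENSION = 1 };

struct waiter {
    const char *name;        /* thread name as shown in the traces */
    enum resource holds;     /* what the thread owns while it sleeps */
    enum resource waits_for; /* what the thread is sleeping on */
};

/* Return the index of the thread that owns res, or -1 if nobody does. */
static int
owner_of(const struct waiter *t, int n, enum resource res)
{
    for (int i = 0; i < n; i++)
        if (res != RES_NONE && t[i].holds == res)
            return i;
    return -1;
}

/* Follow "waits for the owner of" edges; return 1 if they lead back to start. */
static int
in_wait_cycle(const struct waiter *t, int n, int start)
{
    assert(start >= 0 && start < n);
    int cur = start;
    for (int step = 0; step < n; step++) {
        int next = owner_of(t, n, t[cur].waits_for);
        if (next < 0)
            return 0;   /* resource is free: no circular wait on this path */
        if (next == start)
            return 1;   /* closed the loop: circular wait */
        cur = next;
    }
    return 0;
}

/* State as traced (ownership of the suspension is an assumption). */
int
hang_as_traced(void)
{
    const struct waiter t[] = {
        { "g_journal switcher", RES_SUSPENSION, RES_SNAPLOCK },
        { "mksnap_ffs", RES_SNAPLOCK, RES_SUSPENSION },
    };
    return in_wait_cycle(t, 2, 0);
}

/* Same state, but mksnap_ffs does not own the snaplock when it sleeps. */
int
hang_without_snaplock(void)
{
    const struct waiter t[] = {
        { "g_journal switcher", RES_SUSPENSION, RES_SNAPLOCK },
        { "mksnap_ffs", RES_NONE, RES_SUSPENSION },
    };
    return in_wait_cycle(t, 2, 0);
}
```

In this model hang_as_traced() reports a cycle, and
hang_without_snaplock() does not, which matches the diagnosis that the
problem is calling vn_start_write() with the snaplock still owned.
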
>> ffs_mount() at ffs_mount+0x65a
>> vfs_donmount() at vfs_donmount+0xdc5
>> nmount() at nmount+0x63
>> amd64_syscall() at amd64_syscall+0x1f4
>> Xfast_syscall() at Xfast_syscall+0xfc
>> --- syscall (378, FreeBSD ELF64, nmount), rip = 0x18069e35c, rsp =
>> 0x7fffffffe358, rbp = 0x7fffffffedc7 ---

-- 
Andreas Longwitz