Message-Id: <201303232241.r2NMfmOI013789@svn.freebsd.org>
From: Konstantin Belousov
Date: Sat, 23 Mar 2013 22:41:48 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org
Subject: svn commit: r248667 - stable/9/sys/ufs/ffs
X-SVN-Group: stable-9

Author: kib
Date: Sat Mar 23 22:41:48 2013
New Revision: 248667
URL: http://svnweb.freebsd.org/changeset/base/248667

Log:
  MFC r247387:
  Inode block must not be read or written while cg block buffer is owned.

Modified:
  stable/9/sys/ufs/ffs/ffs_alloc.c
Directory Properties:
  stable/9/sys/   (props changed)

Modified: stable/9/sys/ufs/ffs/ffs_alloc.c
==============================================================================
--- stable/9/sys/ufs/ffs/ffs_alloc.c	Sat Mar 23 22:23:15 2013	(r248666)
+++ stable/9/sys/ufs/ffs/ffs_alloc.c	Sat Mar 23 22:41:48 2013	(r248667)
@@ -1724,6 +1724,17 @@ fail:
 	return (0);
 }
 
+static inline struct buf *
+getinobuf(struct inode *ip, u_int cg, u_int32_t cginoblk, int gbflags)
+{
+	struct fs *fs;
+
+	fs = ip->i_fs;
+	return (getblk(ip->i_devvp, fsbtodb(fs, ino_to_fsba(fs,
+	    cg * fs->fs_ipg + cginoblk)), (int)fs->fs_bsize, 0, 0,
+	    gbflags));
+}
+
 /*
  * Determine whether an inode can be allocated.
  *
@@ -1748,9 +1759,11 @@ ffs_nodealloccg(ip, cg, ipref, mode, unu
 	u_int8_t *inosused;
 	struct ufs2_dinode *dp2;
 	int error, start, len, loc, map, i;
+	u_int32_t old_initediblk;
 
 	fs = ip->i_fs;
 	ump = ip->i_ump;
+check_nifree:
 	if (fs->fs_cs(fs, cg).cs_nifree == 0)
 		return (0);
 	UFS_UNLOCK(ump);
@@ -1762,13 +1775,13 @@ ffs_nodealloccg(ip, cg, ipref, mode, unu
 		return (0);
 	}
 	cgp = (struct cg *)bp->b_data;
+restart:
 	if (!cg_chkmagic(cgp) || cgp->cg_cs.cs_nifree == 0) {
 		brelse(bp);
 		UFS_LOCK(ump);
 		return (0);
 	}
 	bp->b_xflags |= BX_BKGRDWRITE;
-	cgp->cg_old_time = cgp->cg_time = time_second;
 	inosused = cg_inosused(cgp);
 	if (ipref) {
 		ipref %= fs->fs_ipg;
@@ -1796,7 +1809,6 @@ ffs_nodealloccg(ip, cg, ipref, mode, unu
 		panic("ffs_nodealloccg: block not in map");
 	}
 	ipref = i * NBBY + ffs(map) - 1;
-	cgp->cg_irotor = ipref;
 gotit:
 	/*
 	 * Check to see if we need to initialize more inodes.
@@ -1804,9 +1816,37 @@ gotit:
 	if (fs->fs_magic == FS_UFS2_MAGIC &&
 	    ipref + INOPB(fs) > cgp->cg_initediblk &&
 	    cgp->cg_initediblk < cgp->cg_niblk) {
-		ibp = getblk(ip->i_devvp, fsbtodb(fs,
-		    ino_to_fsba(fs, cg * fs->fs_ipg + cgp->cg_initediblk)),
-		    (int)fs->fs_bsize, 0, 0, 0);
+		old_initediblk = cgp->cg_initediblk;
+
+		/*
+		 * Free the cylinder group lock before writing the
+		 * initialized inode block. Entering the
+		 * babarrierwrite() with the cylinder group lock
+		 * causes lock order violation between the lock and
+		 * snaplk.
+		 *
+		 * Another thread can decide to initialize the same
+		 * inode block, but whichever thread first gets the
+		 * cylinder group lock after writing the newly
+		 * allocated inode block will update it and the other
+		 * will realize that it has lost and leave the
+		 * cylinder group unchanged.
+		 */
+		ibp = getinobuf(ip, cg, old_initediblk, GB_LOCK_NOWAIT);
+		brelse(bp);
+		if (ibp == NULL) {
+			/*
+			 * The inode block buffer is already owned by
+			 * another thread, which must initialize it.
+			 * Wait on the buffer to allow another thread
+			 * to finish the updates, with dropped cg
+			 * buffer lock, then retry.
+			 */
+			ibp = getinobuf(ip, cg, old_initediblk, 0);
+			brelse(ibp);
+			UFS_LOCK(ump);
+			goto check_nifree;
+		}
 		bzero(ibp->b_data, (int)fs->fs_bsize);
 		dp2 = (struct ufs2_dinode *)(ibp->b_data);
 		for (i = 0; i < INOPB(fs); i++) {
@@ -1823,8 +1863,29 @@ gotit:
 		 * loading of newly created filesystems.
 		 */
 		babarrierwrite(ibp);
-		cgp->cg_initediblk += INOPB(fs);
+
+		/*
+		 * After the inode block is written, try to update the
+		 * cg initediblk pointer. If another thread beat us
+		 * to it, then leave it unchanged as the other thread
+		 * has already set it correctly.
+		 */
+		error = bread(ip->i_devvp, fsbtodb(fs, cgtod(fs, cg)),
+		    (int)fs->fs_cgsize, NOCRED, &bp);
+		UFS_LOCK(ump);
+		ACTIVECLEAR(fs, cg);
+		UFS_UNLOCK(ump);
+		if (error != 0) {
+			brelse(bp);
+			return (error);
+		}
+		cgp = (struct cg *)bp->b_data;
+		if (cgp->cg_initediblk == old_initediblk)
+			cgp->cg_initediblk += INOPB(fs);
+		goto restart;
 	}
+	cgp->cg_old_time = cgp->cg_time = time_second;
+	cgp->cg_irotor = ipref;
 	UFS_LOCK(ump);
 	ACTIVECLEAR(fs, cg);
 	setbit(inosused, ipref);
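
The comments in the patch describe a general pattern: remember the value of cg_initediblk, drop the cg buffer before doing the slow inode block write, then re-read the cg block and advance the pointer only if it still holds the remembered value. The following is only a userspace sketch of that idea, not the kernel code. It assumes POSIX threads, and every name in it (meta_lock, initialized, slow_init, worker, run_demo, CHUNK, NTHREADS) is hypothetical. It also leaves out the GB_LOCK_NOWAIT wait-and-retry path from the patch, so several threads may repeat the same idempotent initialization.

```c
#include <assert.h>
#include <pthread.h>

#define CHUNK    4	/* hypothetical: items initialized per step, like INOPB(fs) */
#define NTHREADS 8	/* hypothetical number of racing allocators */

static pthread_mutex_t meta_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned initialized;	/* stands in for cgp->cg_initediblk */

/* Stands in for zeroing and writing one inode block; must be idempotent. */
static void
slow_init(unsigned start)
{
	(void)start;
}

static void *
worker(void *arg)
{
	unsigned old;

	(void)arg;
	pthread_mutex_lock(&meta_lock);
	old = initialized;			/* remember the value we saw */
	pthread_mutex_unlock(&meta_lock);	/* drop the lock before slow work */

	slow_init(old);

	pthread_mutex_lock(&meta_lock);
	if (initialized == old)			/* nobody beat us to the update */
		initialized += CHUNK;
	pthread_mutex_unlock(&meta_lock);	/* losers leave it unchanged */
	return (NULL);
}

/* Race NTHREADS workers once and return the final pointer value. */
unsigned
run_demo(void)
{
	pthread_t tid[NTHREADS];
	int i, rc;

	initialized = 0;
	for (i = 0; i < NTHREADS; i++) {
		rc = pthread_create(&tid[i], NULL, worker, NULL);
		assert(rc == 0);
	}
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	return (initialized);
}
```

The final value depends on scheduling, but each worker advances the pointer at most once, and only from the value it observed, so the result is always a multiple of CHUNK between CHUNK and NTHREADS * CHUNK. On older toolchains the sketch may need to be built with -pthread.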