From owner-svn-src-head@freebsd.org Tue Aug 16 21:53:58 2016 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D2C31BBC62C; Tue, 16 Aug 2016 21:53:58 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl [IPv6:2001:610:1108:5010::107]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailhost.stack.nl", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6A61E11FA; Tue, 16 Aug 2016 21:53:58 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id 32198B808A; Tue, 16 Aug 2016 23:53:54 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id 91F8928494; Tue, 16 Aug 2016 23:53:55 +0200 (CEST) Date: Tue, 16 Aug 2016 23:53:55 +0200 From: Jilles Tjoelker To: Konstantin Belousov Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r304180 - head/sys/ufs/ffs Message-ID: <20160816215355.GB12199@stack.nl> References: <201608151922.u7FJMOmT099410@repo.freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201608151922.u7FJMOmT099410@repo.freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Aug 2016 21:53:58 -0000 On Mon, Aug 15, 2016 at 07:22:24PM +0000, Konstantin Belousov wrote: > [snip] > @@ -254,15 +259,23 @@ loop: > if ((bp->b_vflags & BV_SCANNED) != 0) > continue; > bp->b_vflags |= BV_SCANNED; > - /* Flush indirects in order. */ > + /* > + * Flush indirects in order, if requested. > + * > + * Note that if only datasync is requested, we can > + * skip indirect blocks when softupdates are not > + * active. Otherwise we must flush them with data, > + * since dependencies prevent data block writes. > + */ > if (waitfor == MNT_WAIT && bp->b_lblkno <= -NDADDR && > - lbn_level(bp->b_lblkno) >= passes) > + (lbn_level(bp->b_lblkno) >= passes || > + ((flags & DATA_ONLY) != 0 && !DOINGSOFTDEP(vp)))) > continue; > if (bp->b_lblkno > lbn) > panic("ffs_syncvnode: syncing truncated data."); > if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) == 0) { > BO_UNLOCK(bo); > - } else if (wait != 0) { > + } else if (wait) { > if (BUF_LOCK(bp, > LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, > BO_LOCKPTR(bo)) != 0) { Hmm, some people interpret POSIX differently than I do, but I think it is clear that XBD 3.384 Synchronized I/O Data Integrity Completion requires the indirect blocks to be written (because "all file system information required to retrieve the data is successfully transferred"). The Linux man page matches this interpretation except that an fsync of the parent directory may be required. If the whole file is being considered, then any change to indirect blocks is needed to access some data (because the file was extended, a hole was filled or snapshotted data was overwritten). For other overwrites, there is no change to indirect blocks and no conflict with fdatasync's performance objectives. > @@ -330,31 +343,59 @@ next: > * these will be done with one sync and one async pass. > */ > if (bo->bo_dirty.bv_cnt > 0) { > - /* Write the inode after sync passes to flush deps. */ > - if (wait && DOINGSOFTDEP(vp) && (flags & NO_INO_UPDT) == 0) { > - BO_UNLOCK(bo); > - ffs_update(vp, 1); > - BO_LOCK(bo); > + if ((flags & DATA_ONLY) == 0) { > + still_dirty = true; > + } else { > + /* > + * For data-only sync, dirty indirect buffers > + * are ignored. > + */ > + still_dirty = false; > + TAILQ_FOREACH(bp, &bo->bo_dirty.bv_hd, b_bobufs) { > + if (bp->b_lblkno > -NDADDR) { > + still_dirty = true; > + break; > + } > + } > } > - /* switch between sync/async. */ > - wait = !wait; > - if (wait == 1 || ++passes < NIADDR + 2) > - goto loop; > + > + if (still_dirty) { > + /* Write the inode after sync passes to flush deps. */ > + if (wait && DOINGSOFTDEP(vp) && > + (flags & NO_INO_UPDT) == 0) { > + BO_UNLOCK(bo); > + ffs_update(vp, 1); > + BO_LOCK(bo); > + } > + /* switch between sync/async. */ > + wait = !wait; > + if (wait || ++passes < NIADDR + 2) > + goto loop; > #ifdef INVARIANTS > - if (!vn_isdisk(vp, NULL)) > - vn_printf(vp, "ffs_fsync: dirty "); > + if (!vn_isdisk(vp, NULL)) > + vn_printf(vp, "ffs_fsync: dirty "); > #endif > + } > } > BO_UNLOCK(bo); > error = 0; > - if ((flags & NO_INO_UPDT) == 0) > - error = ffs_update(vp, 1); > - if (DOINGSUJ(vp)) > - softdep_journal_fsync(VTOI(vp)); > + if ((flags & DATA_ONLY) == 0) { > + if ((flags & NO_INO_UPDT) == 0) > + error = ffs_update(vp, 1); > + if (DOINGSUJ(vp)) > + softdep_journal_fsync(VTOI(vp)); > + } > return (error); > } Ideally, a changed i_size would also cause the inode to be written, but I don't know how to determine that (there is no i_flag for it). Always writing the inode would defeat the point of fdatasync. -- Jilles Tjoelker