From: Kirk McKusick
To: Kostik Belousov
Cc: freebsd-fs@freebsd.org
Date: Fri, 25 Nov 2011 23:25:13 -0800
Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)?
Message-Id: <201111260725.pAQ7PDow056289@chez.mckusick.com>
In-reply-to: <20111123194444.GE50300@deviant.kiev.zoral.com.ua>

Kostik,

You are entirely correct when you say that SU and SU+J require that notification of a completed disk write mean that the data is on the disk (stable). The problem that arises is that (apparently) some tag-queue implementations report that tagged requests have been written when in fact they have not been written. I believe that the only way to ensure that a tagged request is on stable store is to send a BIO_BARRIER request to the disk.
The BIO_BARRIER request is not supposed to return until all I/O requests that were sent down prior to the BIO_BARRIER have been committed to stable store.

If in fact the disk hardware lies about tag completion, my proposed way for SU and SU+J to use BIO_BARRIER is to send one down periodically (say every 100ms) and then defer processing any I/O completions from before the barrier request until the BIO_BARRIER completes. Since most SU activity goes on in the background, the delay should not be too noticeable. The main place where it would likely show up is in fsync, which could be delayed for up to the period (100ms). For such cases, the fsync should issue its own BIO_BARRIER once it has initiated all of its required I/O.

	Kirk McKusick
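
For illustration only, here is a minimal user-space sketch in Python of the deferral scheme described above. It is not FreeBSD kernel code and does not use any GEOM interface: the class BarrierDeferrer and all of its method names are hypothetical, and the disk is modeled by the caller invoking write_reported_done() and barrier_done() by hand. The only point it tries to show is that a completion reported by the device is held until a barrier issued after that write has itself completed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BarrierDeferrer:
    """Hypothetical model: hold reported completions until a barrier confirms them."""
    on_stable: Callable[[int], None]                 # called once a write is trusted
    next_seq: int = 0                                # sequence number of the next write
    held: List[int] = field(default_factory=list)    # reported done, not yet trusted

    def issue_write(self) -> int:
        """Send a write down; return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def write_reported_done(self, seq: int) -> None:
        """The device claims completion; defer processing instead of trusting it."""
        self.held.append(seq)

    def issue_barrier(self) -> int:
        """Send a barrier; every write with seq < the returned cutoff precedes it."""
        return self.next_seq

    def barrier_done(self, cutoff: int) -> None:
        """The barrier completed, so writes issued before it are now on stable store."""
        ready = sorted(s for s in self.held if s < cutoff)
        self.held = [s for s in self.held if s >= cutoff]
        for seq in ready:
            self.on_stable(seq)


# Usage: two writes before the barrier, one after it.
processed: List[int] = []
d = BarrierDeferrer(on_stable=processed.append)
a, b = d.issue_write(), d.issue_write()
d.write_reported_done(a)
d.write_reported_done(b)
cutoff = d.issue_barrier()      # in the proposal, a timer would do this every ~100ms
c = d.issue_write()
d.write_reported_done(c)        # stays held until a later barrier completes
d.barrier_done(cutoff)          # only now are a and b processed
```

In this model an fsync path would call issue_barrier() right after its last issue_write() instead of waiting for the periodic one, which is what bounds its delay.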