Date:      Mon, 5 Dec 2011 15:02:54 +0400
From:      Lev Serebryakov <lev@FreeBSD.org>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        Kirk McKusick <mckusick@mckusick.com>, freebsd-fs@freebsd.org
Subject:   Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)?
Message-ID:  <62478423.20111205150254@serebryakov.spb.ru>
In-Reply-To: <20111205080148.GA1660@garage.freebsd.pl>
References:  <20111125110235.GB1642@garage.freebsd.pl> <201112020236.pB22aQi6059579@chez.mckusick.com> <20111205080148.GA1660@garage.freebsd.pl>

Hello, Pawel.
You wrote on 5 December 2011, 12:01:48:

> Again, different layers:) The reordering I worry about is at disk level.
> When you write a bunch of blocks they are really stored in the disk write
> cache. The disk sends you an ack even though the data is not on stable
> storage yet. Under the assumption that the previous data is safe you send
> more data, which is also placed in the disk write cache. At some point the
> disk decides to flush its write cache, but it will most likely sort the
> blocks by offset, which will mess up your initial ordering, and the data
> you sent last may hit the disk first.

> To prevent this reordering you either need to turn off the disk
> write cache or send BIO_FLUSH between those two write groups.
  As far as I understand Kostik and Kirk, this is OK as long as the disk
is able to flush its whole cache in case of a crash or power failure;
disks (or controllers) that cannot do so are considered deeply
broken.
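
  To make the quoted reordering scenario concrete, here is a toy model
in Python. It is only a sketch under stated assumptions: ToyDisk, run()
and the block offsets are invented for illustration and do not
correspond to any FreeBSD, GEOM or drive interface. The model assumes
an unprotected cache that destages blocks sorted by offset and loses
whatever is left on power failure.

```python
class ToyDisk:
    """Toy model of a disk with a volatile write cache (hypothetical)."""

    def __init__(self):
        self.cache = {}   # offset -> data, acknowledged but not yet stable
        self.stable = {}  # offset -> data, really on stable storage

    def write(self, offset, data):
        # The write is acknowledged as soon as it lands in the cache.
        self.cache[offset] = data

    def flush(self, max_blocks=None):
        # Destage cached blocks sorted by offset (the assumed policy).
        # max_blocks models power being lost part-way through the destage.
        offsets = sorted(self.cache)
        if max_blocks is not None:
            offsets = offsets[:max_blocks]
        for off in offsets:
            self.stable[off] = self.cache.pop(off)

    def power_loss(self):
        # Unprotected cache: everything not yet destaged is gone.
        self.cache.clear()


def run(with_barrier):
    disk = ToyDisk()
    disk.write(900, "data block")       # group 1: must be stable first
    if with_barrier:
        disk.flush()                    # stands in for BIO_FLUSH
    disk.write(100, "pointer to 900")   # group 2: depends on group 1
    disk.flush(max_blocks=1)            # power fails after one block
    disk.power_loss()
    return disk.stable
```

  In this model run(False) leaves only the block at offset 100 on
stable storage (the pointer without the data it points to), while
run(True) never has block 100 stable without block 900. A disk that can
always destage its whole cache corresponds to power_loss() never
discarding anything, in which case the order does not matter.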

  If the disk (or controller) cannot lose data in any crash, it is not
so important when and how the data is written: a READ served from the
cache returns the "written" data, so there is no time frame in which
the outer layers (driver, GEOM and UFS) could see unordered (or old)
data at all. So the disk (or controller) can ACK a write as soon as the
data is in its (battery/super-capacitor/whatever protected) cache memory.

  A software (GEOM) cache has one big flaw: even if the computer as a
whole (and the RAM where the cache resides, as part of the computer)
is protected by a battery (a UPS with notification about battery and
line power state), a software crash could prevent the cache from being
flushed to a stable device (disk). In this scenario, ACKing data
that has been copied into the cache but not yet written out (even if
only to the disk's hardware cache, not the disk itself) is dangerous.
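
  The same point as a toy model in Python, again only a sketch:
ToyWriteBackCache, software_crash() and demo() are hypothetical names,
and a plain dict stands in for the device below the cache. It assumes
the cache ACKs a write as soon as the data is copied into RAM.

```python
class ToyWriteBackCache:
    """Toy software write-back cache in front of a device (hypothetical)."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the device below
        self.dirty = {}         # acknowledged blocks that live only in RAM

    def write(self, offset, data):
        # ACK as soon as the data is copied into the cache.
        self.dirty[offset] = data
        return "ACK"

    def read(self, offset):
        # Reads are served from the cache first, so callers always see
        # the latest acknowledged data while the cache is alive.
        if offset in self.dirty:
            return self.dirty[offset]
        return self.backing.get(offset)

    def flush(self):
        # Push all dirty blocks down to the device.
        self.backing.update(self.dirty)
        self.dirty.clear()

    def software_crash(self):
        # A panic loses the RAM contents even though power never failed.
        self.dirty.clear()


def demo():
    device = {}
    cache = ToyWriteBackCache(device)
    ack = cache.write(7, "journal record")
    seen_before = cache.read(7)   # the written data is visible
    cache.software_crash()        # no chance to call flush()
    seen_after = cache.read(7)    # acknowledged data is gone
    return ack, seen_before, seen_after
```

  Here demo() returns an ACK and the written data before the crash,
and nothing after it: the upper layer was told the write was done, but
no UPS can help because flush() was never reached.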

-- 
// Black Lion AKA Lev Serebryakov <lev@FreeBSD.org>



