Date: Sun, 16 Dec 2007 18:22:25 +0100
From: Bernd Walter <ticso@cicely12.cicely.de>
To: Darren Reed <darrenr@freebsd.org>
Cc: freebsd-current@freebsd.org, ticso@cicely.de, Ivan Voras <ivoras@freebsd.org>
Subject: Re: ZFS melting under postgres...
Message-ID: <20071216172225.GB51627@cicely12.cicely.de>
In-Reply-To: <4764F282.7030706@freebsd.org>
References: <ADCCD5E6-A792-49B9-A346-753176C12F2E@tamu.edu> <fjuljp$cvb$1@ger.gmane.org> <476343B4.8080208@FreeBSD.org> <fk09p8$b16$1@ger.gmane.org> <86tzmk54tt.fsf@ds4.des.no> <fk0ue7$bp$1@ger.gmane.org> <476419CD.9070401@terranova.net> <fk1j0l$o4l$1@ger.gmane.org> <20071216024259.GI48684@cicely12.cicely.de> <4764F282.7030706@freebsd.org>
On Sun, Dec 16, 2007 at 08:40:18PM +1100, Darren Reed wrote:
> Bernd Walter wrote:
> ...
> > One problem is with the data blocks being that big: when writing
> > 512 bytes you effectively do a read-modify-write of a larger physical
> > block.
> > This can be handled quite well with a larger FS block.
> > The much bigger problem is with power loss when writing such a
> > maintenance block.
> > You lose a very large area of logical blocks when this fails,
> > since a 4k maintenance block contains the allocation for several hundred
> > kB of logical data blocks.
> > In other words - you possibly lose data blocks that were not written
> > for a long time, and the database wouldn't expect a problem with that data.
> > Even for ZIL it is very questionable if you lose a large data area,
> > since the purpose is to have the data that was already synced readable
> > after a power loss.
> ...
>
> ZFS doesn't suffer from this problem because the design
> is to always write a new section of data rather than
> overwrite "current" data.

You missed the point:
The filesystem doesn't overwrite written data, but the media does so
internally to manage itself.
You can lose data which hasn't been written at all, since there is a
large dependency chain with flash media.

> So if you lose power in the middle of a write to a data
> block, there is no damage to the old data.

Yes there is, because that's the way flash media works.
You write block x, and if something goes wrong, y is unreadable as well.
And those dependency areas are very huge.

-- 
B.Walter                http://www.bwct.de      http://www.fizon.de
bernd@bwct.de           info@bwct.de            support@fizon.de
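
To make the dependency argument above concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions, not a description of any real flash controller: the 4-byte mapping entry and the ToyFlash class with its write/read methods are made up for illustration, and only the 512-byte sector and the 4k maintenance block come from the mail. A real device may keep older copies of its mapping data and recover better than this.

```python
SECTOR_BYTES = 512        # logical sector size mentioned in the mail
MAP_BLOCK_BYTES = 4096    # "4k maintenance block" from the mail
MAP_ENTRY_BYTES = 4       # ASSUMPTION: size of one mapping entry

# How many logical sectors depend on one maintenance block.
ENTRIES_PER_MAP_BLOCK = MAP_BLOCK_BYTES // MAP_ENTRY_BYTES
BYTES_COVERED = ENTRIES_PER_MAP_BLOCK * SECTOR_BYTES


class ToyFlash:
    """Hypothetical flash device with one mapping table per maintenance block."""

    def __init__(self):
        self.pages = []        # physical pages, only ever appended to
        self.map_blocks = {}   # map block index -> {lba: page index}, None if torn

    def write(self, lba, payload, power_fails=False):
        # The payload itself always goes to a fresh physical page.
        self.pages.append(payload)
        new_page = len(self.pages) - 1
        idx = lba // ENTRIES_PER_MAP_BLOCK
        if power_fails:
            # Power is lost while the maintenance block is being rewritten:
            # in this model the whole mapping block becomes unreadable.
            self.map_blocks[idx] = None
            return
        table = dict(self.map_blocks.get(idx) or {})
        table[lba] = new_page
        self.map_blocks[idx] = table

    def read(self, lba):
        table = self.map_blocks.get(lba // ENTRIES_PER_MAP_BLOCK)
        if table is None:
            raise OSError("mapping for sector %d is unreadable" % lba)
        return self.pages[table[lba]]


if __name__ == "__main__":
    print("one maintenance block covers", BYTES_COVERED // 1024, "KiB")
    flash = ToyFlash()
    flash.write(0, b"old, long-synced data")
    flash.write(7, b"new data", power_fails=True)   # block x
    try:
        flash.read(0)                               # block y
    except OSError as err:
        print("sector 0 lost as well:", err)
```

With these assumed sizes one maintenance block covers 1024 sectors, which is in the "several hundred kB" range mentioned above. The filesystem never touched sector 0 in the failing write, yet it becomes unreadable because it shares a maintenance block with sector 7.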