Date:      Tue, 26 Oct 2010 12:17:01 -0700
From:      Chuck Swiger <cswiger@mac.com>
To:        "Marc G. Fournier" <scrappy@hub.org>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: fsync: Linux vs FreeBSD
Message-ID:  <8BE3CF91-D8C3-491D-9EFC-6CF9A547F280@mac.com>
In-Reply-To: <alpine.BSF.2.00.1010261530200.39682@hub.org>
References:  <alpine.BSF.2.00.1010261530200.39682@hub.org>

On Oct 26, 2010, at 11:33 AM, Marc G. Fournier wrote:
> Someone recently posted on one of the PostgreSQL Blogs concerning fsync on Linux/Windows/Mac OS X, but failed to make any comments on any of the BSDs ... the post has to do with how fsync works on the various OSs, and am curious as to whether or not this is something that also afflicts us:
> 
> http://rhaas.blogspot.com/2010/10/wal-reliability.html
> 
> From reading our man page, I see no warnings similar to what the other OSs
> have, specifically:
> 
> Mac OS X: For applications that require tighter guarantees about the
>          integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl
> 
> Linux: If the underlying hard disk has write caching enabled, then the
>       data may not really be on permanent storage when fsync() /
>       fdatasync() return.
> 
> So, do we hide the fact, or are, in fact, not afflicted by this?


Whether the data actually gets written and the drive's own cache gets flushed seems to depend on a sysctl called hw.ata.wc on FreeBSD, or on the dkctl setting on NetBSD. Write caching seems to always default to on, because otherwise people scream bloody murder about the factor-of-ten reduction in write performance with it off.  Further, by default (i.e., FFSv2 with soft updates), data changes are synced out when you do an fsync(), but metadata changes are done asynchronously, which is exactly what Mac OS X does.

In other words, if you have write-caching on, no effort is made to invoke ATA_FLUSHCACHE or SCSI "SYNCHRONIZE CACHE" to make sure that your disk has actually written the bits to permanent storage.
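For illustration, here is a minimal sketch in C of what an application can actually ask for. The helper names flush_to_disk() and write_durably() are made up for this example. On a platform that defines F_FULLFSYNC (Mac OS X, per the man page excerpt quoted above) it tries that fcntl first and falls back to plain fsync(); everywhere else it can only call fsync(), which, as described above, may return while the bits are still sitting in the drive's write cache.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: push fd's data as far toward stable storage as the
 * platform lets an application request. Returns 0 on success, -1 on error. */
static int flush_to_disk(int fd)
{
    assert(fd >= 0);
#ifdef F_FULLFSYNC
    /* Mac OS X: also ask the drive to flush its own cache. */
    if (fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
    /* The fcntl can fail on file systems that lack support;
     * fall through to an ordinary fsync() in that case. */
#endif
    /* With drive write caching on, this may return before the data is
     * really on the platters. */
    return fsync(fd);
}

/* Hypothetical helper: write the whole buffer, then flush it.
 * Returns 0 on success, -1 on error. */
static int write_durably(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                 /* handle short writes */
        len -= (size_t)n;
    }
    return flush_to_disk(fd);
}
```

A caller would open the file, call write_durably(fd, record, record_len), and treat the record as committed only on a 0 return, bearing in mind that on a drive with write caching enabled even that is not a guarantee.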

[ ... ]

http://www.usenix.org/publications/library/proceedings/usenix2000/general/full_papers/seltzer/seltzer_html/index.html

"Both journaling and Soft Updates systems ensure the integrity of meta-data operations, but they provide slightly different semantics. The four areas of difference are the durability of meta-data operations such as create and delete, the status of the file system after a reboot and recovery, the guarantees made about the data in files after recovery, and the ability to provide atomicity.

The original FFS implemented meta-data operations such as create, delete, and rename synchronously, guaranteeing that when the system call returned, the meta-data changes were persistent. Some FFS variants (e.g., Solaris) made deletes asynchronous and other variants (e.g., SVR4) made create and rename asynchronous. However, on FreeBSD, FFS does guarantee that create, delete, and rename operations are synchronous.  FFS-async makes no such guarantees, and furthermore does not guarantee that the resulting file system can be recovered (via fsck) to a consistent state after failure. Thus, instead of being a viable candidate for a production file system, FFS-async provides an upper bound on the performance one can expect to achieve with the FFS derivatives.

Soft Updates provides looser guarantees than FFS about when meta-data changes reach disk. Create, delete, and rename operations typically reach disk within 45 seconds of the corresponding system call, but can be delayed up to 90 seconds in certain boundary cases (a newly created file in a hierarchy of newly created directories)."
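An application that cannot tolerate that window for a newly created file can be conservative and fsync() both the file and its containing directory. The sketch below shows that general pattern in C; create_durably() is a made-up name, and whether the directory fsync() is strictly required on FFS with soft updates is not something the excerpt above settles, so treat it as a belt-and-braces assumption rather than a documented requirement.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create `name` inside directory `dir`, then fsync()
 * the new file and the directory that holds its entry.
 * Returns 0 on success, -1 on error (errno is preserved). */
static int create_durably(const char *dir, const char *name)
{
    int dfd, fd, rc = 0, saved_errno = 0;

    dfd = open(dir, O_RDONLY);
    if (dfd == -1)
        return -1;

    /* O_EXCL: fail rather than reuse an existing file. */
    fd = openat(dfd, name, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1) {
        saved_errno = errno;
        close(dfd);
        errno = saved_errno;
        return -1;
    }

    /* Flush the file itself, then the directory entry pointing at it. */
    if (fsync(fd) == -1 || fsync(dfd) == -1) {
        saved_errno = errno;
        rc = -1;
    }

    close(fd);
    close(dfd);
    if (rc == -1)
        errno = saved_errno;
    return rc;
}
```

The same caveat about drive write caching applies here: fsync() returning success only means the kernel has handed the blocks to the disk.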

Regards,
-- 
-Chuck



