Date:      Sat, 27 Sep 2008 15:16:11 -0400 (EDT)
From:      Charles Sprickman <spork@bway.net>
To:        freebsd-stable@FreeBSD.org
Subject:   Recommendations for servers running SATA drives
Message-ID:  <Pine.OSX.4.64.0809271453550.4630@toasty.nat.fasttrackmonkey.com>
In-Reply-To: <20080927064417.GA43638@icarus.home.lan>
References:  <20080921213426.GA13923@0lsen.net> <20080921215203.GC9494@icarus.home.lan> <20080921215930.GA25826@0lsen.net> <20080921220720.GA9847@icarus.home.lan> <249873145.20080926213341@takeda.tk> <20080927051413.GA42700@icarus.home.lan> <765067435.20080926223557@takeda.tk> <20080927064417.GA43638@icarus.home.lan>

I'm forking the thread on fsck/soft-updates in hopes of getting some 
practical advice based on the discussion here of background fsck, 
softupdates and write-caching on SATA drives.

On Fri, 26 Sep 2008, Jeremy Chadwick wrote:

> Let's be realistic.  We're talking about ATA and SATA hard disks, hooked
> up to on-board controllers -- these are the majority of users.  Those
> with ATA/SATA RAID controllers (not on-board RAID either; most/all of
> those do not let you disable drive write caching) *might* have a RAID
> BIOS menu item for disabling said feature.

While I would love to deploy every server with SAS, that's not practical 
in many cases, especially for light-duty servers that are not being pushed 
very hard.  I am taking my chances with multiple affordable drives and 
gmirror where I cannot throw in a 3Ware card.  I imagine that many 
non-desktop FreeBSD users are doing the same considering you can fetch a 
decent 1U box with plenty of storage for not much more than $1K.  I assume 
many here are in agreement on this point -- just making it clear that the 
bargain crowd is not some weird edge case in the userbase...

> Regardless of all of this, end-users should, in no way shape or form,
> be expected to go to great lengths to disable their disk's write cache.
> They will not, I can assure you.  Thus, we must assume: write caching
> on a disk will be enabled, period.  If a filesystem is engineered with
> that fact ignored, then the filesystem is either 1) worthless, or 2)
> serves a very niche purpose and should not be the default filesystem.

Arguments about defaults aside, this is my first question.  If I've got a 
server with multiple SATA drives mirrored with gmirror, is turning on 
write-caching a good idea?  What kind of performance impact should I 
expect?  What is the relationship between caching, soft-updates, and 
either NCQ or TCQ?

Here's an example of a Seagate, trimmed for brevity:

Protocol              Serial ATA v1.0
device model          ST3160811AS

Feature                      Support  Enable    Value           Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes       -      31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F

TCQ is clearly not supported, NCQ seems to be supported, but I don't know 
how to tell if it's actually enabled or not.  Write-caching is currently 
on.
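
One way to read that table from a script, as a minimal sketch: it assumes the 
listing is the kind printed by "atacontrol cap <device>" and that the 
"Support" and "Enable" columns directly follow the feature name.  The function 
name feature_state is made up for illustration, and the sample text is the 
table above rather than live output.

```shell
#!/bin/sh
# Sample capability table copied from the message above. On a live system
# this text would come from the tool that printed it (assumed here to be
# "atacontrol cap <device>").
caps='write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes       -      31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F'

# feature_state NAME (hypothetical helper): print the Support and Enable
# columns for the row whose text starts with NAME.
feature_state() {
    printf '%s\n' "$caps" | awk -v name="$1" '
        index($0, name) == 1 {
            # Everything after the feature name holds the columns.
            rest = substr($0, length(name) + 1)
            split(rest, col, " ")
            print "support=" col[1], "enable=" col[2]
        }'
}

feature_state "write cache"
feature_state "Native Command Queuing (NCQ)"
```

For the NCQ row the Enable field comes back as "-", so the listing alone does 
not settle whether NCQ is in use.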

The tradeoff is apparently performance vs. more reliable recovery should 
the machine lose power, smoke itself, etc., but all I've seen is anecdotal 
evidence of how bad performance gets.

FWIW, this machine in particular had its mainboard go up in smoke last 
week.  One drive was too far gone for gmirror to rebuild it without doing 
a "forget" and "insert".  The remaining drive was too screwy for 
background fsck, but a manual check in single-user left me with no real 
surprises or problems.

> The system is already up and the filesystems mounted.  If the error in
> question is of such severity that it would impact a user's ability to
> reliably use the filesystem, how do you expect constant screaming on
> the console will help?  A user won't know what it means; there is
> already evidence of this happening (re: mysterious ATA DMA errors which
> still cannot be figured out[6]).
>
> IMHO, a dirty filesystem should not be mounted until it's been fully
> analysed/scanned by fsck.  So again, people are putting faith into
> UFS2+SU despite actual evidence proving that it doesn't handle all
> scenarios.

I'll ask, but it seems like the consensus here is that background fsck, 
while the default, is best left disabled.  The cases where it might make 
sense are:

-desktop systems
-servers that have incredibly huge filesystems (and even there being able 
to selectively background fsck filesystems might be helpful)

The first example is obvious: people want a fast-booting desktop.  The 
second is trading long fsck times in single-user for some uncertainty.
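
For reference, background fsck is controlled by a single rc.conf knob.  A 
minimal fragment, shown only to illustrate where the default is changed, not 
as a recommendation either way:

```shell
# /etc/rc.conf
# Check dirty filesystems in the foreground at boot instead of letting
# fsck run in the background after they are mounted.
background_fsck="NO"
```

This setting applies to the whole machine, which is why selective 
per-filesystem control would be a separate matter.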

> The problem here is that when it was created, it was sort of an
> "experiment".  Now, when someone installs FreeBSD, UFS2 is the default
> filesystem used, and SU are enabled on every filesystem except the root
> fs.  Thus, we have now put ourselves into a situation where said
> feature ***must*** be reliable in all cases.
>
> You're also forgetting a huge focus of SU -- snapshots[1].  However, there
> are more than enough facts on the table at this point concluding that
> snapshots are causing more problems[7] than previously expected.  And
> there's further evidence filesystem snapshots shouldn't even be used in
> this way[8].

...

> Filesystems have to be reliable; data integrity is focus #1, and cannot
> be sacrificed.  Users and administrators *expect* a filesystem to be
> reliable.  No one is going to keep using a filesystem if it has
> disadvantages which can result in data loss or "waste of administrative
> time" (which I believe is what's occurring here).

The softupdates question seems tied quite closely to the write-caching 
question.  If write-caching "breaks" SU, that makes things tricky.  So 
another big question:

If write-caching is enabled, should SU be disabled?

And again, what kind of performance and/or reliability sacrifices are 
being made?
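
For anyone wanting to measure the tradeoff rather than guess, the two knobs in 
question can be sketched as follows.  This is a dry run that only prints what 
would change.  It assumes disks attached through the ata(4) driver, and the 
device name in FS is hypothetical.  Whether either change is a good idea is 
exactly the open question here.

```shell
#!/bin/sh
# Dry run: prints what would be changed instead of changing it.
# FS is a hypothetical device name; substitute a real one.
FS=${FS:-/dev/mirror/gm0s1f}

# Drive write cache for ata(4) disks is a loader tunable, read at boot.
wc_line='hw.ata.wc="0"'
echo "append to /boot/loader.conf: $wc_line"

# Soft updates are a per-filesystem flag, changed with tunefs(8) while the
# filesystem is unmounted or mounted read-only.
su_cmd="tunefs -n disable $FS"
echo "run from single-user mode: $su_cmd"
```

Running "tunefs -p" on the same device prints the current settings, which 
gives a before-and-after check.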

I'd love to hear some input from both admins dealing with this stuff in 
production and from any developers who are making decisions about the 
future direction of all of this.

Thanks,

Charles


> [1]: http://www.usenix.org/publications/library/proceedings/bsdcon02/mckusick/mckusick_html/index.html
> [6]: http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting
> [7]: http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues
> [8]: http://lists.freebsd.org/pipermail/freebsd-stable/2007-January/032070.html
>
> -- 
> | Jeremy Chadwick                                jdc at parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |


