From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 00:15:54 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 52A0A6D1
 for <fs@freebsd.org>; Sun,  9 Jun 2013 00:15:54 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay5-d.mail.gandi.net (relay5-d.mail.gandi.net
 [217.70.183.197])
 by mx1.freebsd.org (Postfix) with ESMTP id CE40C1FA0
 for <fs@freebsd.org>; Sun,  9 Jun 2013 00:15:53 +0000 (UTC)
Received: from mfilter16-d.gandi.net (mfilter16-d.gandi.net [217.70.178.144])
 by relay5-d.mail.gandi.net (Postfix) with ESMTP id 67E4541C056;
 Sun,  9 Jun 2013 02:15:36 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter16-d.gandi.net
Received: from relay5-d.mail.gandi.net ([217.70.183.197])
 by mfilter16-d.gandi.net (mfilter16-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id q9rpAwHWpxl3; Sun,  9 Jun 2013 02:15:34 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay5-d.mail.gandi.net (Postfix) with ESMTPSA id 2F0BD41C054;
 Sun,  9 Jun 2013 02:15:34 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id 548CB73A1C; Sat,  8 Jun 2013 17:15:32 -0700 (PDT)
Date: Sat, 8 Jun 2013 17:15:32 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Steven Hartland <killing@multiplay.co.uk>
Subject: Re: Changing the default for ZFS atime to off?
Message-ID: <20130609001532.GA21540@icarus.home.lan>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608213331.GB18201@icarus.home.lan>
 <01719722FD8A41B4A4366611972A703A@multiplay.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <01719722FD8A41B4A4366611972A703A@multiplay.co.uk>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 00:15:54 -0000

On Sun, Jun 09, 2013 at 12:34:29AM +0100, Steven Hartland wrote:
> ----- Original Message ----- From: "Jeremy Chadwick"
> <jdc@koitsu.org>
> To: "Steven Hartland" <smh@freebsd.org>
> Cc: <fs@freebsd.org>
> Sent: Saturday, June 08, 2013 10:33 PM
> Subject: Re: Changing the default for ZFS atime to off?
> 
> 
> >On Sat, Jun 08, 2013 at 07:54:04PM +0100, Steven Hartland wrote:
> >>One of the first changes we make here when installing machines
> >>here to changing atime=off on all ZFS pool roots.
> >>
> >>I know there are a few apps which can rely on atime updates
> >>such as qmail and possibly postfix, but those seem like special
> >>cases for which admins should enable atime instead of the other
> >>way round.
> >>
> >>This is going to of particular interest for flash based storage
> >>which should avoid unnessacary writes to reduce wear, but it will
> >>also help improve performance in general.
> >>
> >>So what do people think is it worth considering changing the
> >>default from atime=on to atime=off moving forward?
> >>
> >>If so what about UFS, same change?
> >
> >I **strongly** oppose this change, for one key reason: the classic
> >Berkeley UNIX mail spool format (known as "mbox"), which is still
> >predominantly used on most UNIX systems today.
> >
> >Mail clients which read mbox files require a combination of atime and
> >mtime to determine if new mail has arrived within the mailbox.  If
> >mtime > atime, then there's new mail.  Not all mail clients support
> >alternate methods of detection (for example mutt has check_mbox_size,
> >which has had bugs/problems in the past (Google check_mbox_size),
> >and is fallible in other ways).
> ..
> 
> To clarify when I say "by default" this only effect newly created
> pools / volumes, it would not effect any existing volumes and hence
> couldn't break existing installs.
> 
> As I mentioned there are apps, mainly mail focused ones, which rely
> on on atime, but thats easy to keep working by ensuring these are
> stored on volumes which do have atime=on.

The problem is that your proposed change (to set atime=off as the
default) means the administrator:

1. Has to be aware that the default is now atime=off going forward,
and thus,

2. Must manually set atime=on on filesystems where it matters, which may
also mean creating a separate filesystem just for certain
purposes/tasks (which may not be possible with UFS after-the-fact).

The reality of #1, I'm sorry to say, is that barring some kind of mass
announcement on every single FreeBSD mailing list (I don't mean just
-announce, I mean EVERY LIST) to inform people of this change, as well
as some gigantic 72pt font text on www.freebsd.org telling people, most
people are not going to know about it.  I know that reality doesn't work
in your favour, but it's how things are.  A single line in the Release
Notes is going to be overlooked.

I cannot even begin to cover all the situations/cases of #2, so I'll
just do a brain dump as I think:

i) ZFS: You might think this is as easy as creating a separate
filesystem that's for /var/mail -- it is not that simple.  Many people
have their mail delivered to mboxes within $HOME, i.e. ~user/Mail, and
/var/mail never gets used.  It worsens when you consider people are
being insane with ZFS filesystems, such as creating a separate
filesystem for every single user on the system.

ii) With UFS, you might think it's as easy as removing noatime from
/etc/fstab for /var, but it isn't -- same situation as (i).

iii) There is the situation with UFS and bsdinstall where you can choose
the "quick and easy" partitioning/filesystem setu results in one big /
and that's all.  Now the admin has to remove noatime from /etc/fstab and
basically loses any benefit noatime provided per your proposal.

iv) It is very common for setups to have two separate places for mail
storage, i.e. the default is /var/mail/username, but users with a
.forward and/or .procmailrc may be siphoning mail to $HOME/Mail/folder
instead.  So now you have two filesystems where atime needs to be
enabled.

v) Non-mail-related stuff, meaning there may actually be users and
administrators who rely upon access times to indicate something.

None of these touche base on what Bruce Evans stated too: that atime=on
by default is a requirement to be POSIX-compliant.  That's also
confirmed here at Wikipedia WRT stat(2) (which also mentions some other
software that relies on atime too):

http://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime

> The messaging and changes to installers which support ZFS root
> installs, such as mfsbsd, would need to be included in this but
> I don't see that as a blocker.

See above -- I think you are assuming mail always gets stored on one
filesystem, which quite often not the case.

> I suggesting this now as it seems like its time to consider that
> the vast majority of systems don't need this option for all volumes
> and the performance and reliability of systems are in question if
> we don't consider it.

My personal feeling is that this is extremely hasty -- do we have any
idea how much software relies on atime?  Because I certainly don't.

Sorry for sounding rude (I don't mean to be, I just can't be bothered to
phrase it differently), but: were you yourself even aware that atime was
relied upon/used for classic UNIX mailboxes?  I get the impression you
weren't, which just strengthens my point.

For example, I use atime everywhere, simply because I do not know what
might break/stop working reliably if atime was disabled on some
filesystems.  I do not know the internals of every single daemon and
program on a system (does anyone?), so I must take the stance of
choosing stability/reliability.

All said and done: I do appreciate having this discussion, particularly
publicly on a list.  Too many "key changes" in FreeBSD in the past few
years have been results of closed-door meetings of sorts (private mail
or in-person *con meetings), so the fact this is public is good.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 00:49:01 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 82AC2DB9
 for <fs@freebsd.org>; Sun,  9 Jun 2013 00:49:01 +0000 (UTC)
 (envelope-from prvs=18721298a7=killing@multiplay.co.uk)
Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 by mx1.freebsd.org (Postfix) with ESMTP id 26ECD11CC
 for <fs@freebsd.org>; Sun,  9 Jun 2013 00:49:00 +0000 (UTC)
Received: from r2d2 ([82.69.141.170])
 by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 (MDaemon PRO v10.0.4) with ESMTP id md50004223432.msg
 for <fs@freebsd.org>; Sun, 09 Jun 2013 01:48:59 +0100
X-Spam-Processed: mail1.multiplay.co.uk, Sun, 09 Jun 2013 01:48:59 +0100
 (not processed: message from valid local sender)
X-MDDKIM-Result: neutral (mail1.multiplay.co.uk)
X-MDRemoteIP: 82.69.141.170
X-Return-Path: prvs=18721298a7=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
X-MDaemon-Deliver-To: fs@freebsd.org
Message-ID: <459E2FCADB4E40079066E4ABDBE47AFE@multiplay.co.uk>
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Jeremy Chadwick" <jdc@koitsu.org>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608213331.GB18201@icarus.home.lan>
 <01719722FD8A41B4A4366611972A703A@multiplay.co.uk>
 <20130609001532.GA21540@icarus.home.lan>
Subject: Re: Changing the default for ZFS atime to off?
Date: Sun, 9 Jun 2013 01:48:57 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
 reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 00:49:01 -0000

----- Original Message ----- 
From: "Jeremy Chadwick" <jdc@koitsu.org>

>> To clarify when I say "by default" this only effect newly created
>> pools / volumes, it would not effect any existing volumes and hence
>> couldn't break existing installs.
>> 
>> As I mentioned there are apps, mainly mail focused ones, which rely
>> on on atime, but thats easy to keep working by ensuring these are
>> stored on volumes which do have atime=on.
> 
> The problem is that your proposed change (to set atime=off as the
> default) means the administrator:
> 
> 1. Has to be aware that the default is now atime=off going forward,
> and thus,
> 
> 2. Must manually set atime=on on filesystems where it matters, which may
> also mean creating a separate filesystem just for certain
> purposes/tasks (which may not be possible with UFS after-the-fact).
> 
> The reality of #1, I'm sorry to say, is that barring some kind of mass
> announcement on every single FreeBSD mailing list (I don't mean just
> -announce, I mean EVERY LIST) to inform people of this change, as well
> as some gigantic 72pt font text on www.freebsd.org telling people, most
> people are not going to know about it.  I know that reality doesn't work
> in your favour, but it's how things are.  A single line in the Release
> Notes is going to be overlooked.
> 
> I cannot even begin to cover all the situations/cases of #2, so I'll
> just do a brain dump as I think:
> 
> i) ZFS: You might think this is as easy as creating a separate
> filesystem that's for /var/mail -- it is not that simple.  Many people
> have their mail delivered to mboxes within $HOME, i.e. ~user/Mail, and
> /var/mail never gets used.  It worsens when you consider people are
> being insane with ZFS filesystems, such as creating a separate
> filesystem for every single user on the system.
> 
> ii) With UFS, you might think it's as easy as removing noatime from
> /etc/fstab for /var, but it isn't -- same situation as (i).
> 
> iii) There is the situation with UFS and bsdinstall where you can choose
> the "quick and easy" partitioning/filesystem setu results in one big /
> and that's all.  Now the admin has to remove noatime from /etc/fstab and
> basically loses any benefit noatime provided per your proposal.

The initial question was for ZFS, with UFS being secondary, but yes
UFS isn't as easy as UFS.
 
> iv) It is very common for setups to have two separate places for mail
> storage, i.e. the default is /var/mail/username, but users with a
> .forward and/or .procmailrc may be siphoning mail to $HOME/Mail/folder
> instead.  So now you have two filesystems where atime needs to be
> enabled.

Could that not be covered by: /var /home for the common case at least?

> v) Non-mail-related stuff, meaning there may actually be users and
> administrators who rely upon access times to indicate something.
> 
> None of these touche base on what Bruce Evans stated too: that atime=on
> by default is a requirement to be POSIX-compliant.  That's also
> confirmed here at Wikipedia WRT stat(2) (which also mentions some other
> software that relies on atime too):
> 
> http://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime

So yes others think its a less than stellar idea ;-)

>> The messaging and changes to installers which support ZFS root
>> installs, such as mfsbsd, would need to be included in this but
>> I don't see that as a blocker.
> 
> See above -- I think you are assuming mail always gets stored on one
> filesystem, which quite often not the case.

Its still seems simple to fix, see above.

>> I suggesting this now as it seems like its time to consider that
>> the vast majority of systems don't need this option for all volumes
>> and the performance and reliability of systems are in question if
>> we don't consider it.
> 
> My personal feeling is that this is extremely hasty -- do we have any
> idea how much software relies on atime?  Because I certainly don't.

Hasty no, just opening the idea up for discussion ;-)

> Sorry for sounding rude (I don't mean to be, I just can't be bothered to
> phrase it differently), but: were you yourself even aware that atime was
> relied upon/used for classic UNIX mailboxes?  I get the impression you
> weren't, which just strengthens my point.

Yes I am aware, which is why I mentioned mail in my original post.

> For example, I use atime everywhere, simply because I do not know what
> might break/stop working reliably if atime was disabled on some
> filesystems.  I do not know the internals of every single daemon and
> program on a system (does anyone?), so I must take the stance of
> choosing stability/reliability.

I did already mention, we set atime=off on everything and have never had
an issue, there's been similar mentions on the illumos list too.

Now that doesn't mean its suitable for everthing, mail has already been
mentioned, but thats still seems like a small set of use cases where its
required.

I guess where I'm coming from is making better for the vast majority.

I believe there's no point in configuring for a rare case by default
when it will make the much more common case worse.

> All said and done: I do appreciate having this discussion, particularly
> publicly on a list.  Too many "key changes" in FreeBSD in the past few
> years have been results of closed-door meetings of sorts (private mail
> or in-person *con meetings), so the fact this is public is good.

Everyone has their different uses of any OS, different experience etc,
so things like this need open discussion IMO.

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 01:04:47 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2EC1B1C7
 for <fs@freebsd.org>; Sun,  9 Jun 2013 01:04:47 +0000 (UTC)
 (envelope-from cross+freebsd@distal.com)
Received: from mail.distal.com (mail.distal.com [IPv6:2001:470:e24c:200::ae25])
 by mx1.freebsd.org (Postfix) with ESMTP id DB8A41254
 for <fs@freebsd.org>; Sun,  9 Jun 2013 01:04:46 +0000 (UTC)
Received: from magrathea.distal.com (magrathea.distal.com
 [IPv6:2001:470:e24c:200:ea06:88ff:feca:960e]) (authenticated bits=0)
 by mail.distal.com (8.14.3/8.14.3) with ESMTP id r5914iip011649
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
 Sat, 8 Jun 2013 21:04:45 -0400 (EDT)
Subject: Re: Changing the default for ZFS atime to off?
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Content-Type: text/plain; charset=us-ascii
From: Chris Ross <cross+freebsd@distal.com>
X-Priority: 3
In-Reply-To: <459E2FCADB4E40079066E4ABDBE47AFE@multiplay.co.uk>
Date: Sat, 8 Jun 2013 21:04:44 -0400
Content-Transfer-Encoding: quoted-printable
Message-Id: <CA7987A0-2B4B-444D-AE37-FC5E8736C3BB@distal.com>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608213331.GB18201@icarus.home.lan>
 <01719722FD8A41B4A4366611972A703A@multiplay.co.uk>
 <20130609001532.GA21540@icarus.home.lan>
 <459E2FCADB4E40079066E4ABDBE47AFE@multiplay.co.uk>
To: "Steven Hartland" <killing@multiplay.co.uk>
X-Mailer: Apple Mail (2.1503)
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.2
 (mail.distal.com [IPv6:2001:470:e24c:200::ae25]);
 Sat, 08 Jun 2013 21:04:45 -0400 (EDT)
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 01:04:47 -0000


  I agree strongly with Jeremy's general opinion.  But, am far less =
established
in the community, so only wanted to make a couple of small points.

On Jun 8, 2013, at 20:48 , "Steven Hartland" <killing@multiplay.co.uk> =
wrote:
> I guess where I'm coming from is making better for the vast majority.
>=20
> I believe there's no point in configuring for a rare case by default
> when it will make the much more common case worse.

  I think the point being made, and certainly in my mind reading this =
thread,
is that you're considering the "rare" case to be more rare than you =
factually
know it to be, and more importantly (IMO), you're considering "worse" on
something that I consider a very small issue.  I understand the reasons =
we
choose to turn off atime (by adding it to the kernel, at the time, in =
1994) at
UUNET for the USENET filesystems.  It was just too much activity.  But, =
for
a less than 110% active system, and given the relatively small number of =
things
that are accessed far more often than they're updated, I just don't =
think it's that
big of an issue.

  And, yes, I'm aware of the flash write issue, and I side with turning =
off there,
though I wouldn't be default.  (And, defaulting filesystem parameters =
based on
some impression of the underlying hardware seems risky at best anyway.)

  I think there are a small number of cases where it's an issue, and =
those people,
yourself included, already know how to solve the problem.  Myself, =
personally,
running only small systems, have never turned off atime updates.  Don't =
feel
any need to.  For specific heavy-load production systems, _everything_ =
is
looked at with a fine-toothed-comb.  No reason to "default" something =
that
only those systems need.

>> All said and done: I do appreciate having this discussion, =
particularly
>> publicly on a list.  Too many "key changes" in FreeBSD in the past =
few
>> years have been results of closed-door meetings of sorts (private =
mail
>> or in-person *con meetings), so the fact this is public is good.
>=20
> Everyone has their different uses of any OS, different experience etc,
> so things like this need open discussion IMO.

  I agree very much, and while my opinions may not match many others, =
I've
been very pleased to read this discussion.  Thank you for bringing it =
up.

                                                - Chris


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 02:58:45 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 3FC07E2A
 for <fs@freebsd.org>; Sun,  9 Jun 2013 02:58:45 +0000 (UTC)
 (envelope-from prvs=18721298a7=killing@multiplay.co.uk)
Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 by mx1.freebsd.org (Postfix) with ESMTP id D9DDA190C
 for <fs@freebsd.org>; Sun,  9 Jun 2013 02:58:44 +0000 (UTC)
Received: from r2d2 ([82.69.141.170])
 by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 (MDaemon PRO v10.0.4) with ESMTP id md50004224407.msg
 for <fs@freebsd.org>; Sun, 09 Jun 2013 03:58:43 +0100
X-Spam-Processed: mail1.multiplay.co.uk, Sun, 09 Jun 2013 03:58:43 +0100
 (not processed: message from valid local sender)
X-MDDKIM-Result: neutral (mail1.multiplay.co.uk)
X-MDRemoteIP: 82.69.141.170
X-Return-Path: prvs=18721298a7=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
X-MDaemon-Deliver-To: fs@freebsd.org
Message-ID: <798D298E63D34820AF2D804E6123997B@multiplay.co.uk>
From: "Steven Hartland" <killing@multiplay.co.uk>
To: <kpneal@pobox.com>,
	<fs@freebsd.org>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608200522.GA77122@neutralgood.org>
Subject: Re: Changing the default for ZFS atime to off?
Date: Sun, 9 Jun 2013 03:58:38 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
 reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 02:58:45 -0000

----- Original Message ----- 
From: <kpneal@pobox.com>


> On Sat, Jun 08, 2013 at 07:54:04PM +0100, Steven Hartland wrote:
>> One of the first changes we make here when installing machines
>> here to changing atime=off on all ZFS pool roots.
>> 
>> I know there are a few apps which can rely on atime updates
>> such as qmail and possibly postfix, but those seem like special
>> cases for which admins should enable atime instead of the other
>> way round.
> 
> I believe mutt also uses them. Basically, any mail program using mbox mail
> folders uses them to correctly report which mailboxes have not been read
> yet.
> 
> There are probably other cases as well. I don't think they should be
> discounted simply because nobody here who bothers to speak up runs into
> them.
> 
> Turning off atime creates surprises for users.
> 
>> This is going to of particular interest for flash based storage
>> which should avoid unnessacary writes to reduce wear, but it will
>> also help improve performance in general.
>> 
>> So what do people think is it worth considering changing the
>> default from atime=on to atime=off moving forward?
> 
> I vote no. At least, don't change it unless the filesystem is actually on
> a flash device. Otherwise we risk breakage down the road because something
> that used to work doesn't work on a fresh FreeBSD install.

I don't think having different defaults for different disks would be a good
thing as that would just cause confusion.

Would updating the installers to enable atime on the volumes that require
it be an acceptable solution?

> Has anyone done any kind of study to see exactly how much I/O is caused
> by having atime updates be enabled? Does it _really_ make that much of
> a difference to performance, and would it _really_ help prolong the life
> of flash devices?

I've just done some a very basic tests here on an 8.3-RELEASE machine:-
1. make buildkernel # atime=on adds 2k writes totalling 27MB
2. find /usr/src # atime=on adds 100 writes totaling 3MB

    Regards
    Steve


================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 02:59:44 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4B6EFEBF;
 Sun,  9 Jun 2013 02:59:44 +0000 (UTC)
 (envelope-from amvandemore@gmail.com)
Received: from mail-pd0-f173.google.com (mail-pd0-f173.google.com
 [209.85.192.173])
 by mx1.freebsd.org (Postfix) with ESMTP id 262121919;
 Sun,  9 Jun 2013 02:59:43 +0000 (UTC)
Received: by mail-pd0-f173.google.com with SMTP id v14so2328773pde.18
 for <multiple recipients>; Sat, 08 Jun 2013 19:59:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=bK7ZJ30mOgv+o+SdSrCGZ6glYF6nqX0U4ob2VewHOSw=;
 b=q85BECHlYON5f9ydymr1f/uF0gziErmNIbGERxcnSLdnYxXX2t/5HRnurZblx8VCE9
 rbGJUAntksUi/SemdEQ6WNlRKM+/BGPQhBm1J/0zpvNnSY0GfMRuNU3mUgetm1nkNU7B
 34h7v7mI5fiPYWltWBOm13yp5gElRmO/rY2M/S2q/d2oAYobp0cWMQCAXVo7umbXZW1K
 34uUBbRKwEtXwpwY8HX3gdNR3xCBmVD4hrC10cKM0CUtYP3VzYRQl087yorCMkJwtaEn
 RgJ9/iWt3tI+YYugeyd++IknPM3u2xuhpyrsm4VVdDCg0cxHoLxjmrz8O2ts265YcjL2
 +/6A==
MIME-Version: 1.0
X-Received: by 10.66.175.205 with SMTP id cc13mr8616764pac.191.1370746783517; 
 Sat, 08 Jun 2013 19:59:43 -0700 (PDT)
Received: by 10.70.31.195 with HTTP; Sat, 8 Jun 2013 19:59:43 -0700 (PDT)
In-Reply-To: <20130608213331.GB18201@icarus.home.lan>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608213331.GB18201@icarus.home.lan>
Date: Sat, 8 Jun 2013 21:59:43 -0500
Message-ID: <CA+tpaK0QH51tpaOvK9H6-O86-CiuhCUrsiyXeA7OrGW3rwAfTw@mail.gmail.com>
Subject: Re: Changing the default for ZFS atime to off?
From: Adam Vande More <amvandemore@gmail.com>
To: Jeremy Chadwick <jdc@koitsu.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: Steven Hartland <smh@freebsd.org>, fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 02:59:44 -0000

On Sat, Jun 8, 2013 at 4:33 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:

> I **strongly** oppose this change, for one key reason: the classic
> Berkeley UNIX mail spool format (known as "mbox"), which is still
> predominantly used on most UNIX systems today.
>
> Mail clients which read mbox files require a combination of atime and
> mtime to determine if new mail has arrived within the mailbox.  If
> mtime > atime, then there's new mail.  Not all mail clients support
> alternate methods of detection (for example mutt has check_mbox_size,
> which has had bugs/problems in the past (Google check_mbox_size),
> and is fallible in other ways).
>
> Further points:
>
> - FreeBSD comes with sendmail (MTA/MDA), which supports only mbox
>   natively
> - FreeBSD comes with mail/Mail/mailx (client), which only supports
>   only mbox natively
> - FreeBSD comes with biff/comsat, as well as from(1), which supports
>   only mbox natively
>

Most modern linuce use relatime eg the benefits of noatime and preserving
functionality for mail stuff.

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/Relatime.html

-- 
Adam Vande More

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 03:04:58 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 601E01BE;
 Sun,  9 Jun 2013 03:04:58 +0000 (UTC)
 (envelope-from delphij@gmail.com)
Received: from mail-qc0-x22f.google.com (mail-qc0-x22f.google.com
 [IPv6:2607:f8b0:400d:c01::22f])
 by mx1.freebsd.org (Postfix) with ESMTP id 18D1D1A86;
 Sun,  9 Jun 2013 03:04:58 +0000 (UTC)
Received: by mail-qc0-f175.google.com with SMTP id k14so2355367qcv.20
 for <multiple recipients>; Sat, 08 Jun 2013 20:04:57 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=Q2fwWn7BP/WwDilp1kAoUkHimjEBFLi5yc+0oBcmmnU=;
 b=Je0bGJ+IXL/7nrWYrbqCKLMkEJilfMJ+HIiAR4CD5nz86/FfnpDQlR31ytpm3IRBrY
 TfgFsc32eNieISaCSD39RmKDkOznQSjvH3MJD4mxxmFuidsovR3pN9prNSV09/t5/Z2Q
 ZbweUQkYrzkEDGzpRUBj3rygo2laNLMS32UcTbT9MU+jxMZNQBXk5ZSI/kRCj6wopMF/
 nKNwU3snPdsJTsgkSj7pBALLKZDgFo7k/0I4h2qwS3HOIvI1KKRP7Gt7LLn8K0FWMoZc
 k3Zy6CkDqiapAD763Y2Nc/33qmiqLMfYvt1oPqYE1ZdwveaQXdBxxCIPOOSJSO4eZzTk
 8LCw==
MIME-Version: 1.0
X-Received: by 10.224.51.7 with SMTP id b7mr8862370qag.8.1370747097551; Sat,
 08 Jun 2013 20:04:57 -0700 (PDT)
Received: by 10.49.42.73 with HTTP; Sat, 8 Jun 2013 20:04:57 -0700 (PDT)
In-Reply-To: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
Date: Sat, 8 Jun 2013 20:04:57 -0700
Message-ID: <CAGMYy3s9V_FYWBBoCPyak9DpM_XZJ_ReVrOZqgmGmJ_KzcKzVA@mail.gmail.com>
Subject: Re: Changing the default for ZFS atime to off?
From: Xin LI <delphij@gmail.com>
To: Steven Hartland <smh@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 03:04:58 -0000

I'd suggest implementing relative atime in VFS layer first:

https://github.com/delphij/freebsd/commit/6a199821fbdbf424027499d4a0f8f113f6943e16

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 03:14:23 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 99DE83C9
 for <fs@freebsd.org>; Sun,  9 Jun 2013 03:14:23 +0000 (UTC)
 (envelope-from prvs=18721298a7=killing@multiplay.co.uk)
Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 by mx1.freebsd.org (Postfix) with ESMTP id 3F9AF1B0E
 for <fs@freebsd.org>; Sun,  9 Jun 2013 03:14:22 +0000 (UTC)
Received: from r2d2 ([82.69.141.170])
 by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 (MDaemon PRO v10.0.4) with ESMTP id md50004224506.msg
 for <fs@freebsd.org>; Sun, 09 Jun 2013 04:14:22 +0100
X-Spam-Processed: mail1.multiplay.co.uk, Sun, 09 Jun 2013 04:14:22 +0100
 (not processed: message from valid local sender)
X-MDDKIM-Result: neutral (mail1.multiplay.co.uk)
X-MDRemoteIP: 82.69.141.170
X-Return-Path: prvs=18721298a7=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
X-MDaemon-Deliver-To: fs@freebsd.org
Message-ID: <8C34552BD7074953A74E0443BAD1CCB7@multiplay.co.uk>
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Adam Vande More" <amvandemore@gmail.com>,
 "Jeremy Chadwick" <jdc@koitsu.org>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608213331.GB18201@icarus.home.lan>
 <CA+tpaK0QH51tpaOvK9H6-O86-CiuhCUrsiyXeA7OrGW3rwAfTw@mail.gmail.com>
Subject: Re: Changing the default for ZFS atime to off?
Date: Sun, 9 Jun 2013 04:14:18 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
 reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 03:14:23 -0000


----- Original Message ----- 
From: "Adam Vande More" <amvandemore@gmail.com>
To: "Jeremy Chadwick" <jdc@koitsu.org>
Cc: "Steven Hartland" <smh@freebsd.org>; <fs@freebsd.org>
Sent: Sunday, June 09, 2013 3:59 AM
Subject: Re: Changing the default for ZFS atime to off?


> On Sat, Jun 8, 2013 at 4:33 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
> 
>> I **strongly** oppose this change, for one key reason: the classic
>> Berkeley UNIX mail spool format (known as "mbox"), which is still
>> predominantly used on most UNIX systems today.
>>
>> Mail clients which read mbox files require a combination of atime and
>> mtime to determine if new mail has arrived within the mailbox.  If
>> mtime > atime, then there's new mail.  Not all mail clients support
>> alternate methods of detection (for example mutt has check_mbox_size,
>> which has had bugs/problems in the past (Google check_mbox_size),
>> and is fallible in other ways).
>>
>> Further points:
>>
>> - FreeBSD comes with sendmail (MTA/MDA), which supports only mbox
>>   natively
>> - FreeBSD comes with mail/Mail/mailx (client), which only supports
>>   only mbox natively
>> - FreeBSD comes with biff/comsat, as well as from(1), which supports
>>   only mbox natively
>>
> 
> Most modern linuce use relatime eg the benefits of noatime and preserving
> functionality for mail stuff.
> 
> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/Relatime.html

Now thats a clever idea, like it.

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 04:46:20 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 0EF2E478
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 04:46:20 +0000 (UTC)
 (envelope-from rcartwri@asu.edu)
Received: from mail-wg0-x22e.google.com (mail-wg0-x22e.google.com
 [IPv6:2a00:1450:400c:c00::22e])
 by mx1.freebsd.org (Postfix) with ESMTP id 9C0F71045
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 04:46:19 +0000 (UTC)
Received: by mail-wg0-f46.google.com with SMTP id c11so574758wgh.1
 for <freebsd-fs@freebsd.org>; Sat, 08 Jun 2013 21:46:18 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :content-type:x-gm-message-state;
 bh=crxu/YAp6wm4o+Ec8vQytjho0IEdl5iVyOIoflllzRU=;
 b=l/EXFp0qriZlYFSiXogmkklJsfMrf3DKXMhr8E8K4ffJE8FZIGllA+iuQNtSo/4YGh
 8gfmqjDA2GaAtjqMlcMJvTCKrufjwV6Vk6UPXDQmR5p6ImZNrwaMms2GLcBoZ9Tx3UQw
 AmGUiOHYvQcwaI0rJuk6uMebyoakv3CGyXk+8IIl9lE+X1wI6CBkreNcsSENEOBQ4EcU
 bUUvBLHzzLHZ+RXazXmUhk/1DNNSSoz4gk+wFCBoDPwoAXVvPLorTKnngOWf8vayA37m
 ODvRJ0ygVoWMDDl0PmLp5Cb4/Obs8s7h9kfSYF/jtFv7BMtAIIcpWbF5MeZTZk6D3/BD
 yDVg==
MIME-Version: 1.0
X-Received: by 10.180.185.44 with SMTP id ez12mr2041578wic.7.1370753178216;
 Sat, 08 Jun 2013 21:46:18 -0700 (PDT)
Received: by 10.180.76.114 with HTTP; Sat, 8 Jun 2013 21:46:18 -0700 (PDT)
In-Reply-To: <CALOkxuxb0HBMXOwP6Z4JjvBk+btnriCeCdhcKfmP=FyNEaseTA@mail.gmail.com>
References: <CALOkxuzH81UFuVZifJNxyuo6+hu9mCPB1TC91dn5fkjVLFqTKw@mail.gmail.com>
 <CALOkxuxb0HBMXOwP6Z4JjvBk+btnriCeCdhcKfmP=FyNEaseTA@mail.gmail.com>
Date: Sat, 8 Jun 2013 21:46:18 -0700
Message-ID: <CALOkxuxT9Oko1jm40GzYZbcTNu4epNgh8BoVhc-cH2UKNNbSvw@mail.gmail.com>
Subject: Re: ZFS and Glabel
From: "Reed A. Cartwright" <cartwright@asu.edu>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Gm-Message-State: ALoCoQnYp8BcflJUSX0m8B6NbC9U3wMOAwwPbVIE7Wp3165LWSROFEXHx2TGB169IOXi8K5pBoYr
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 04:46:20 -0000

I'm looking at my dmesg.boot to figure out what settings I need to
wire down my HDDs.  I read the cam(4) documentation but I'm not sure I
know what I'm doing.  Any advice would be helpful.

Let's assume that I want to wire everything down to their current
positions, what should I put in loader.conf?  I'll paste below some of
my hardware configuration and lines from dmesg.boot that I think I
need to look at.

I have 4 LSI cards in the system: mps0, mps1, mps2, mps3.

mps0: <LSI SAS2008> port 0xd000-0xd0ff mem
0xdff3c000-0xdff3ffff,0xdff40000-0xdff7ffff irq 24 at device 0.0 on
pci5
mps1: <LSI SAS2008> port 0xc000-0xc0ff mem
0xdfe3c000-0xdfe3ffff,0xdfe40000-0xdfe7ffff irq 44 at device 0.0 on
pci4
mps2: <LSI SAS2008> port 0xb000-0xb0ff mem
0xdfd3c000-0xdfd3ffff,0xdfd40000-0xdfd7ffff irq 32 at device 0.0 on
pci3
mps3: <LSI SAS2008> port 0xe000-0xe0ff mem
0xdbf3c000-0xdbf3ffff,0xdbf40000-0xdbf7ffff irq 56 at device 0.0 on
pci65

I have drives attached to two of those cards:

da0 at mps0 bus 0 scbus0 target 0 lun 0
da1 at mps0 bus 0 scbus0 target 1 lun 0
da2 at mps0 bus 0 scbus0 target 2 lun 0
da3 at mps0 bus 0 scbus0 target 3 lun 0
da4 at mps0 bus 0 scbus0 target 4 lun 0
da5 at mps0 bus 0 scbus0 target 5 lun 0
da6 at mps0 bus 0 scbus0 target 6 lun 0
da7 at mps0 bus 0 scbus0 target 7 lun 0
da8 at mps3 bus 0 scbus9 target 0 lun 0
da9 at mps3 bus 0 scbus9 target 1 lun 0
da10 at mps3 bus 0 scbus9 target 2 lun 0
da11 at mps3 bus 0 scbus9 target 3 lun 0
da12 at mps3 bus 0 scbus9 target 4 lun 0

# camcontrol devlist -v
scbus0 on mps0 bus 0:
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 0 lun 0 (pass8,da7)
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 1 lun 0 (pass9,da8)
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 2 lun 0 (pass6,da5)
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 3 lun 0 (pass7,da6)
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 4 lun 0 (pass13,da12)
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 5 lun 0 (pass12,da11)
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 6 lun 0 (pass11,da10)
<ATA Hitachi HUA72202 A3EA>        at scbus0 target 7 lun 0 (pass10,da9)
scbus1 on mps1 bus 0:
scbus2 on mps2 bus 0:
scbus3 on ahcich0 bus 0:
<>                                 at scbus3 target -1 lun -1 ()
scbus4 on ahcich1 bus 0:
<>                                 at scbus4 target -1 lun -1 ()
scbus5 on ahcich2 bus 0:
<>                                 at scbus5 target -1 lun -1 ()
scbus6 on ahcich3 bus 0:
<>                                 at scbus6 target -1 lun -1 ()
scbus7 on ata0 bus 0:
<>                                 at scbus7 target -1 lun -1 ()
scbus8 on ata1 bus 0:
<>                                 at scbus8 target -1 lun -1 ()
scbus9 on mps3 bus 0:
<ATA WDC WD2503ABYX-0 1S02>        at scbus9 target 0 lun 0 (da0,pass0)
<ATA WDC WD2503ABYX-0 1S02>        at scbus9 target 1 lun 0 (da1,pass1)
<ATA WDC WD1003FBYX-0 1V02>        at scbus9 target 2 lun 0 (da2,pass2)
<ATA D2CSTK251M11-048 2.25>        at scbus9 target 3 lun 0 (da3,pass3)
<ATA INTEL SSDSA2CW30 0362>        at scbus9 target 4 lun 0 (da4,pass4)
scbus10 on umass-sim0 bus 0:
<KVM vmDisk-CD 0.01>               at scbus10 target 0 lun 0 (cd0,pass5)
scbus-1 on xpt0 bus 0:
<>                                 at scbus-1 target -1 lun -1 (xpt0)


# zpool status
  pool: storage
 state: ONLINE
 scan: scrub repaired 0 in 18h56m with 0 errors on Mon May 13 22:00:51 2013
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            da6     ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da8     ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da10    ONLINE       0     0     0
            da11    ONLINE       0     0     0
            da12    ONLINE       0     0     0
        cache
          da3       ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
 scan: scrub repaired 0 in 0h29m with 0 errors on Mon May 13 03:34:18 2013
config:

        NAME                                            STATE     READ
WRITE CKSUM
        zroot                                           ONLINE       0
    0     0
          mirror-0                                      ONLINE       0
    0     0
            gptid/8e7b4f79-7367-11e1-8722-00259058939a  ONLINE       0
    0     0
            gptid/910120c5-7367-11e1-8722-00259058939a  ONLINE       0
    0     0

errors: No known data errors


--
Reed A. Cartwright, PhD
Assistant Professor of Genomics, Evolution, and Bioinformatics
School of Life Sciences
Center for Evolutionary Medicine and Informatics
The Biodesign Institute
Arizona State University
-
Address: The Biodesign Institute, PO Box 875301, Tempe, AZ 85287-5301 USA
Packages: The Biodesign Institute, 1001 S. McAllister Ave, Tempe, AZ
85287-5301 USA
Office: Biodesign A-224A, 1-480-965-9949

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 06:54:48 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 9BEB0F86
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 06:54:48 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net
 [217.70.183.195])
 by mx1.freebsd.org (Postfix) with ESMTP id 22FEE1AE9
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 06:54:47 +0000 (UTC)
Received: from mfilter4-d.gandi.net (mfilter4-d.gandi.net [217.70.178.134])
 by relay3-d.mail.gandi.net (Postfix) with ESMTP id 12A71A80B9;
 Sun,  9 Jun 2013 08:54:36 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter4-d.gandi.net
Received: from relay3-d.mail.gandi.net ([217.70.183.195])
 by mfilter4-d.gandi.net (mfilter4-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id 8CZWnTkV-RuU; Sun,  9 Jun 2013 08:54:34 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id 08309A80B1;
 Sun,  9 Jun 2013 08:54:33 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id 402CF73A1C; Sat,  8 Jun 2013 23:54:30 -0700 (PDT)
Date: Sat, 8 Jun 2013 23:54:30 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: "Reed A. Cartwright" <cartwright@asu.edu>
Subject: Re: ZFS and Glabel
Message-ID: <20130609065430.GA28206@icarus.home.lan>
References: <CALOkxuzH81UFuVZifJNxyuo6+hu9mCPB1TC91dn5fkjVLFqTKw@mail.gmail.com>
 <CALOkxuxb0HBMXOwP6Z4JjvBk+btnriCeCdhcKfmP=FyNEaseTA@mail.gmail.com>
 <CALOkxuxT9Oko1jm40GzYZbcTNu4epNgh8BoVhc-cH2UKNNbSvw@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CALOkxuxT9Oko1jm40GzYZbcTNu4epNgh8BoVhc-cH2UKNNbSvw@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 06:54:48 -0000

On Sat, Jun 08, 2013 at 09:46:18PM -0700, Reed A. Cartwright wrote:
> I'm looking at my dmesg.boot to figure out what settings I need to
> wire down my HDDs.  I read the cam(4) documentation but I'm not sure I
> know what I'm doing.  Any advice would be helpful.
> 
> Let's assume that I want to wire everything down to their current
> positions, what should I put in loader.conf?  I'll paste below some of
> my hardware configuration and lines from dmesg.boot that I think I
> need to look at.
> 
> I have 4 LSI cards in the system: mps0, mps1, mps2, mps3.
> 
> mps0: <LSI SAS2008> port 0xd000-0xd0ff mem
> 0xdff3c000-0xdff3ffff,0xdff40000-0xdff7ffff irq 24 at device 0.0 on
> pci5
> mps1: <LSI SAS2008> port 0xc000-0xc0ff mem
> 0xdfe3c000-0xdfe3ffff,0xdfe40000-0xdfe7ffff irq 44 at device 0.0 on
> pci4
> mps2: <LSI SAS2008> port 0xb000-0xb0ff mem
> 0xdfd3c000-0xdfd3ffff,0xdfd40000-0xdfd7ffff irq 32 at device 0.0 on
> pci3
> mps3: <LSI SAS2008> port 0xe000-0xe0ff mem
> 0xdbf3c000-0xdbf3ffff,0xdbf40000-0xdbf7ffff irq 56 at device 0.0 on
> pci65
> 
> I have drives attached to two of those cards:
> 
> da0 at mps0 bus 0 scbus0 target 0 lun 0
> da1 at mps0 bus 0 scbus0 target 1 lun 0
> da2 at mps0 bus 0 scbus0 target 2 lun 0
> da3 at mps0 bus 0 scbus0 target 3 lun 0
> da4 at mps0 bus 0 scbus0 target 4 lun 0
> da5 at mps0 bus 0 scbus0 target 5 lun 0
> da6 at mps0 bus 0 scbus0 target 6 lun 0
> da7 at mps0 bus 0 scbus0 target 7 lun 0

> da8 at mps3 bus 0 scbus9 target 0 lun 0
> da9 at mps3 bus 0 scbus9 target 1 lun 0
> da10 at mps3 bus 0 scbus9 target 2 lun 0
> da11 at mps3 bus 0 scbus9 target 3 lun 0
> da12 at mps3 bus 0 scbus9 target 4 lun 0
>
> {snip}

As usual, the situation is insane because you have so many controllers
on the system (more than just mps(4)) -- specifically 11 separate
controllers or systems using CAM (hence scbus0 to scbus10).

Below is for mps(4).  If you want to wire down ahci(4), things are
a bit different, but you can read this post of mine:

http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071851.html

Enjoy:

hint.scbus.0.at="mps0"
hint.scbus.1.at="mps1"
hint.scbus.2.at="mps2"
hint.scbus.9.at="mps3"
hint.da.0.at="scbus0"
hint.da.1.at="scbus0"
hint.da.2.at="scbus0"
hint.da.3.at="scbus0"
hint.da.4.at="scbus0"
hint.da.5.at="scbus0"
hint.da.6.at="scbus0"
hint.da.7.at="scbus0"
hint.da.8.at="scbus9"
hint.da.9.at="scbus9"
hint.da.10.at="scbus9"
hint.da.11.at="scbus9"
hint.da.12.at="scbus9"
hint.da.13.at="scbus9"
hint.da.14.at="scbus9"
hint.da.15.at="scbus9"
hint.da.16.at="scbus1"
hint.da.17.at="scbus1"
hint.da.18.at="scbus1"
hint.da.19.at="scbus1"
hint.da.20.at="scbus1"
hint.da.21.at="scbus1"
hint.da.22.at="scbus1"
hint.da.23.at="scbus1"
hint.da.24.at="scbus2"
hint.da.25.at="scbus2"
hint.da.26.at="scbus2"
hint.da.27.at="scbus2"
hint.da.28.at="scbus2"
hint.da.29.at="scbus2"
hint.da.30.at="scbus2"
hint.da.31.at="scbus2"
hint.da.0.target="0"
hint.da.1.target="1"
hint.da.2.target="2"
hint.da.3.target="3"
hint.da.4.target="4"
hint.da.5.target="5"
hint.da.6.target="6"
hint.da.7.target="7"
hint.da.8.target="0"
hint.da.9.target="1"
hint.da.10.target="2"
hint.da.11.target="3"
hint.da.12.target="4"
hint.da.13.target="5"
hint.da.14.target="6"
hint.da.15.target="7"
hint.da.16.target="0"
hint.da.17.target="1"
hint.da.18.target="2"
hint.da.19.target="3"
hint.da.20.target="4"
hint.da.21.target="5"
hint.da.22.target="6"
hint.da.23.target="7"
hint.da.24.target="0"
hint.da.25.target="1"
hint.da.26.target="2"
hint.da.27.target="3"
hint.da.28.target="4"
hint.da.29.target="5"
hint.da.30.target="6"
hint.da.31.target="7"

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 10:40:24 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 6CB7EEC8
 for <fs@freebsd.org>; Sun,  9 Jun 2013 10:40:24 +0000 (UTC)
 (envelope-from ml@my.gd)
Received: from mail-wg0-x234.google.com (mail-wg0-x234.google.com
 [IPv6:2a00:1450:400c:c00::234])
 by mx1.freebsd.org (Postfix) with ESMTP id 0666F1095
 for <fs@freebsd.org>; Sun,  9 Jun 2013 10:40:23 +0000 (UTC)
Received: by mail-wg0-f52.google.com with SMTP id z12so3523073wgg.19
 for <fs@freebsd.org>; Sun, 09 Jun 2013 03:40:23 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=references:mime-version:in-reply-to:content-type
 :content-transfer-encoding:message-id:cc:x-mailer:from:subject:date
 :to:x-gm-message-state;
 bh=9N+NLraiOty4QUjVP50QXOHB1IUkRH1XedUPb5pWWPA=;
 b=U+Xv4bPY+eCx9APmqFQ7xUyJoWMUhB22lZyK+t/uuSJYHaEHY1oDx1zw/UYF3CybpW
 U3o39YuJTJZ+Wb17JWhTNscb5GIAEEdWRAHHpivYQDaL7V6Hd1j1Px8w8miaInfQoZqI
 IwWR96HgdUHWHKOteO3h7JH2wW1lbNUuvuGm6OD8bdYWF8iYgddm8RZXiPzZ98rA/ruH
 4yojs7GjS8/UN+9tMLFMW12rCRfRSfNxVprrydjFWZx4n9DOB8NWOHcKIL02bgdK231H
 LBW/Jb7Yt8EQK/papN6pvwmFVuAtzy2EX96ovUqxNnfBTAfPG3mkWtUoDzk6uO3ZEO2y
 xHBQ==
X-Received: by 10.180.89.140 with SMTP id bo12mr2448667wib.22.1370774423141;
 Sun, 09 Jun 2013 03:40:23 -0700 (PDT)
Received: from [192.168.0.9]
 (AAubervilliers-652-1-225-231.w83-112.abo.wanadoo.fr. [83.112.232.231])
 by mx.google.com with ESMTPSA id ft10sm5532746wib.7.2013.06.09.03.40.21
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Sun, 09 Jun 2013 03:40:22 -0700 (PDT)
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
Mime-Version: 1.0 (1.0)
In-Reply-To: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
Content-Type: text/plain;
	charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Message-Id: <2AC5E8F4-3AF1-4EA5-975D-741506AC70A5@my.gd>
X-Mailer: iPhone Mail (10B144)
From: Damien Fleuriot <ml@my.gd>
Subject: Re: Changing the default for ZFS atime to off?
Date: Sun, 9 Jun 2013 12:39:17 +0200
To: Steven Hartland <smh@freebsd.org>
X-Gm-Message-State: ALoCoQlUgO9W/8K2n7fCgPkhgsxEN6lAa7nkTrhfhhOmwdDOyrT+Z4Hx/WcCBxymmmDLO/JnsiYM
Cc: "<fs@freebsd.org>" <fs@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 10:40:24 -0000


On 8 Jun 2013, at 20:54, "Steven Hartland" <smh@freebsd.org> wrote:

> One of the first changes we make here when installing machines
> here to changing atime=3Doff on all ZFS pool roots.
>=20
> I know there are a few apps which can rely on atime updates
> such as qmail and possibly postfix, but those seem like special
> cases for which admins should enable atime instead of the other
> way round.
>=20
> This is going to of particular interest for flash based storage
> which should avoid unnessacary writes to reduce wear, but it will
> also help improve performance in general.
>=20
> So what do people think is it worth considering changing the
> default from atime=3Don to atime=3Doff moving forward?
>=20
> If so what about UFS, same change?
>=20


I strongly oppose the change for reasons already raised by many people regar=
ding the mbox file.

Besides, if atime should default to off on 2 filesystems and on on all other=
s, that would definitely create confusion.

Last, I believe it should be the admin's decision to turn atime off, just li=
ke it is his decision to turn compression on.

Don't mistake me, we turn atime=3Doff on every box, every filesystem, even o=
n Mac's HFS.
Yet I believe defaulting it to off is a mistake.=

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 11:45:35 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id EE63ECA1
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 11:45:35 +0000 (UTC)
 (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id 800051374
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 11:45:35 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r59BjSIC084468
 for <freebsd-fs@FreeBSD.org>; Sun, 9 Jun 2013 15:45:28 +0400 (MSK)
 (envelope-from marck@rinet.ru)
Date: Sun, 9 Jun 2013 15:45:28 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: freebsd-fs@FreeBSD.org
Subject: /tmp: change default to mdmfs and/or tmpfs?
Message-ID: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Sun, 09 Jun 2013 15:45:28 +0400 (MSK)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 11:45:36 -0000

Dear colleagues,

what do you think about stop using precious disk or even SSD resources for 
/tmp?

For last several (well, maybe over 10?) years I constantly use md (swap-backed) 
for /tmp, usually 128M in size, which is enough for most of our server needs.  
Some require more, but none more than 512M.  Regarding the options, we use
tmpmfs_flags="-S -n -o async -b 4096 -f 512"

Given more and more fixes/improvements committed to tmpfs, switching /tmp to it 
would be even better idea.

You thoughts?  Thank you!


-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 12:00:36 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 386FEEA
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 12:00:36 +0000 (UTC)
 (envelope-from loic.blot@unix-experience.fr)
Received: from smtp.smtpout.orange.fr (smtp03.smtpout.orange.fr
 [80.12.242.125]) by mx1.freebsd.org (Postfix) with ESMTP id ADB9115EB
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 12:00:34 +0000 (UTC)
Received: from [10.42.69.5] ([82.120.202.131]) by mwinf5d06 with ME
 id mBsx1l0082qcW6A03Bsx5c; Sun, 09 Jun 2013 13:52:57 +0200
Message-ID: <1370779193.2018.10.camel@Nerz-PC>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
From: =?ISO-8859-1?Q?Lo=EFc?= BLOT <loic.blot@unix-experience.fr>
To: freebsd-fs@freebsd.org
Date: Sun, 09 Jun 2013 13:59:53 +0200
In-Reply-To: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
Organization: UNIX Experience Fr
Content-Type: multipart/signed; micalg="pgp-sha256";
 protocol="application/pgp-signature"; boundary="=-S1znKLqXGdPP+SXnlk4B"
X-Mailer: Evolution 3.8.3 
Mime-Version: 1.0
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: loic.blot@unix-experience.fr
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 12:00:36 -0000


--=-S1znKLqXGdPP+SXnlk4B
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hello Dmitry,
I agree with you. /tmp is a temporary filesystem. On machines (both
servers and clients), i think /tmp must be like Linux, cleared at reboot
because it's a temporary FS.
The Linux point i don't agree is size of /tmp (on Linux: ram-memory/2).
This formula is quite good for system with lower than 2Gb of RAM but i
think 2Go is sufficient to system with more RAM (we could control this a
install of in the fstab after ?).
--=20
Best regards,
Lo=C3=AFc BLOT,=20
UNIX systems, security and network expert
http://www.unix-experience.fr


Le dimanche 09 juin 2013 =C3=A0 15:45 +0400, Dmitry Morozovsky a =C3=A9crit=
 :

> Dear colleagues,
>=20
> what do you think about stop using precious disk or even SSD resources fo=
r=20
> /tmp?
>=20
> For last several (well, maybe over 10?) years I constantly use md (swap-b=
acked)=20
> for /tmp, usually 128M in size, which is enough for most of our server ne=
eds. =20
> Some require more, but none more than 512M.  Regarding the options, we us=
e
> tmpmfs_flags=3D"-S -n -o async -b 4096 -f 512"
>=20
> Given more and more fixes/improvements committed to tmpfs, switching /tmp=
 to it=20
> would be even better idea.
>=20
> You thoughts?  Thank you!
>=20
>=20

--=-S1znKLqXGdPP+SXnlk4B
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (GNU/Linux)

iF4EABEIAAYFAlG0bjkACgkQh290DZyz8uaPGQEApIRm/z1EYj4jJvdWHGnn2X+j
hLQTsdMMktHhYm2t0e8BANHRRbEGr1coZpLYnCJnzIS7YkFOLHhMvsMoSYVqlDlZ
=Esvp
-----END PGP SIGNATURE-----

--=-S1znKLqXGdPP+SXnlk4B--


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 12:18:31 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 9EF3B33D
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 12:18:31 +0000 (UTC)
 (envelope-from ronald-freebsd8@klop.yi.org)
Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl
 [195.190.28.81]) by mx1.freebsd.org (Postfix) with ESMTP id 62E961674
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 12:18:30 +0000 (UTC)
Received: from smtp.greenhost.nl ([213.108.104.138])
 by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
 (Exim 4.69) (envelope-from <ronald-freebsd8@klop.yi.org>)
 id 1UleZh-0004kY-P1
 for freebsd-fs@freebsd.org; Sun, 09 Jun 2013 14:18:22 +0200
Received: from dhcp-077-251-158-153.chello.nl ([77.251.158.153] helo=pinky)
 by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
 (Exim 4.72) (envelope-from <ronald-freebsd8@klop.yi.org>)
 id 1UleZe-0003CC-TS
 for freebsd-fs@freebsd.org; Sun, 09 Jun 2013 14:18:18 +0200
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: freebsd-fs@freebsd.org
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
Date: Sun, 09 Jun 2013 14:18:20 +0200
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: "Ronald Klop" <ronald-freebsd8@klop.yi.org>
Message-ID: <op.wyeu4ulx8527sy@pinky>
In-Reply-To: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
User-Agent: Opera Mail/12.15 (Win32)
X-Virus-Scanned: by clamav at smarthost1.samage.net
X-Spam-Level: /
X-Spam-Score: 0.8
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.1
X-Scan-Signature: 2ecd0b53b7de9511489f92806276a3d7
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 12:18:31 -0000

On Sun, 09 Jun 2013 13:45:28 +0200, Dmitry Morozovsky <marck@rinet.ru>  
wrote:

> Dear colleagues,
>
> what do you think about stop using precious disk or even SSD resources  
> for
> /tmp?
>
> For last several (well, maybe over 10?) years I constantly use md  
> (swap-backed)
> for /tmp, usually 128M in size, which is enough for most of our server  
> needs.
> Some require more, but none more than 512M.  Regarding the options, we  
> use
> tmpmfs_flags="-S -n -o async -b 4096 -f 512"
>
> Given more and more fixes/improvements committed to tmpfs, switching  
> /tmp to it
> would be even better idea.
>
> You thoughts?  Thank you!
>
>

What keeps you from putting this in fstab and stop using the tmpmfs  
rc.conf variable?
'tmpfs           /tmp            tmpfs   rw,size=536870912      0       0'

I thought tmpmfs/varmfs infrastructure was more for diskless/full-NFS  
systems anyways.

Ronald.

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 12:23:13 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id D3DA940A
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 12:23:13 +0000 (UTC)
 (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id 665A0169A
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 12:23:12 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r59CN95Y085950;
 Sun, 9 Jun 2013 16:23:09 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Sun, 9 Jun 2013 16:23:09 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Ronald Klop <ronald-freebsd8@klop.yi.org>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
In-Reply-To: <op.wyeu4ulx8527sy@pinky>
Message-ID: <alpine.BSF.2.00.1306091620100.48048@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <op.wyeu4ulx8527sy@pinky>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Sun, 09 Jun 2013 16:23:09 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 12:23:13 -0000

On Sun, 9 Jun 2013, Ronald Klop wrote:

> > what do you think about stop using precious disk or even SSD resources for
> > /tmp?
> > 
> > For last several (well, maybe over 10?) years I constantly use md
> > (swap-backed)
> > for /tmp, usually 128M in size, which is enough for most of our server
> > needs.
> > Some require more, but none more than 512M.  Regarding the options, we use
> > tmpmfs_flags="-S -n -o async -b 4096 -f 512"
> > 
> > Given more and more fixes/improvements committed to tmpfs, switching /tmp to
> > it
> > would be even better idea.
> > 
> > You thoughts?  Thank you!
> > 
> > 
> 
> What keeps you from putting this in fstab and stop using the tmpmfs rc.conf
> variable?
> 'tmpfs           /tmp            tmpfs   rw,size=536870912      0       0'
> 
> I thought tmpmfs/varmfs infrastructure was more for diskless/full-NFS systems
> anyways.

I do not see much difference here, to be honest.  Either way, you have 
memory-backed /tmp (though via using /etc/rc.d/tmp you can fine-tune FS options 
a bit easier, at least for my PoV)

The question is: shouldn't we treat this as a default at least for usual 
amd64/i386 installation with "non-embedded" quantity of RAM (like, e.g. > 
512M)?

-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 12:33:59 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 8F65651C
 for <fs@freebsd.org>; Sun,  9 Jun 2013 12:33:59 +0000 (UTC)
 (envelope-from lee@dilkie.com)
Received: from data.snhdns.com (data.snhdns.com [208.76.82.136])
 by mx1.freebsd.org (Postfix) with ESMTP id 5C5F116DE
 for <fs@freebsd.org>; Sun,  9 Jun 2013 12:33:59 +0000 (UTC)
Received: from [142.46.160.218] (port=60357 helo=[206.51.1.11])
 by data.snhdns.com with esmtpsa (TLSv1:DHE-RSA-AES256-SHA:256)
 (Exim 4.80) (envelope-from <lee@dilkie.com>)
 id 1UldxB-0004xR-Gt; Sun, 09 Jun 2013 07:38:33 -0400
Message-ID: <51B4693B.8020704@dilkie.com>
Date: Sun, 09 Jun 2013 07:38:35 -0400
From: Lee Dilkie <lee@dilkie.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: Steven Hartland <killing@multiplay.co.uk>
Subject: Re: Changing the default for ZFS atime to off?
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608213331.GB18201@icarus.home.lan>
 <CA+tpaK0QH51tpaOvK9H6-O86-CiuhCUrsiyXeA7OrGW3rwAfTw@mail.gmail.com>
 <8C34552BD7074953A74E0443BAD1CCB7@multiplay.co.uk>
In-Reply-To: <8C34552BD7074953A74E0443BAD1CCB7@multiplay.co.uk>
X-Enigmail-Version: 1.5.1
X-AntiAbuse: This header was added to track abuse,
 please include it with any abuse report
X-AntiAbuse: Primary Hostname - data.snhdns.com
X-AntiAbuse: Original Domain - freebsd.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - dilkie.com
X-Get-Message-Sender-Via: data.snhdns.com: authenticated_id: lee@dilkie.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 12:33:59 -0000


On 6/8/2013 11:14 PM, Steven Hartland wrote:
>
>> Most modern linuce use relatime eg the benefits of noatime and
>> preserving
>> functionality for mail stuff.
>>
>> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/Relatime.html
>>
>
> Now thats a clever idea, like it.
>

Indeed... very clever. caching atime itself. I like it too.

-lee


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 12:46:18 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id E0CAAA93
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 12:46:18 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay5-d.mail.gandi.net (relay5-d.mail.gandi.net
 [217.70.183.197])
 by mx1.freebsd.org (Postfix) with ESMTP id 82ACA178E
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 12:46:18 +0000 (UTC)
Received: from mfilter14-d.gandi.net (mfilter14-d.gandi.net [217.70.178.142])
 by relay5-d.mail.gandi.net (Postfix) with ESMTP id 500ED41C067;
 Sun,  9 Jun 2013 14:46:07 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter14-d.gandi.net
Received: from relay5-d.mail.gandi.net ([217.70.183.197])
 by mfilter14-d.gandi.net (mfilter14-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id WFI7CVzrHShu; Sun,  9 Jun 2013 14:46:05 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay5-d.mail.gandi.net (Postfix) with ESMTPSA id 1C89141C060;
 Sun,  9 Jun 2013 14:46:05 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id 4114D73A1C; Sun,  9 Jun 2013 05:46:03 -0700 (PDT)
Date: Sun, 9 Jun 2013 05:46:03 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Dmitry Morozovsky <marck@rinet.ru>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
Message-ID: <20130609124603.GA35681@icarus.home.lan>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 12:46:18 -0000

On Sun, Jun 09, 2013 at 03:45:28PM +0400, Dmitry Morozovsky wrote:
> Dear colleagues,
> 
> what do you think about stop using precious disk or even SSD resources for 
> /tmp?
> 
> For last several (well, maybe over 10?) years I constantly use md (swap-backed) 
> for /tmp, usually 128M in size, which is enough for most of our server needs.  
> Some require more, but none more than 512M.  Regarding the options, we use
> tmpmfs_flags="-S -n -o async -b 4096 -f 512"

Hold up.  Let's start with what you just gave.  Everything I'm talking
about below is for stable/9 by the way:

1. grep -r tmpfs /etc returns nothing, so I don't know where this magic
comes from,

2. tmpfs(5) documents none of these flags, and the flags you've given
cannot be mdconfig(8) flags because:

a) -S requires a sector size (you specified none),
b) -n would have no bearing given the context,
c) -o async applies only to vnode-backed models (default is malloc,
   and I see no -t vnode),
d) There is no -b flag,
e) The -f flag is for -t vnode only, and refers to a filename for the
   vnode-backing store.

So consider me very, very confused with what you've given.  Maybe the
flags were different on FreeBSD 6.x or 7.x or 8.x?  I haven't checked
http://www.freebsd.org/cgi/man.cgi yet.

> Given more and more fixes/improvements committed to tmpfs, switching /tmp to it 
> would be even better idea.
> 
> You thoughts?  Thank you!

As I understand it, there are (or were -- because I remember seeing them
repeatedly brought up on the mailing lists) problems with tmpfs.
Sometimes these issues would turn out to be with other filesystems (such
as unionfs), but other times not so much.

If my memory serves me correct, there are major complexities with
VM/memory management when intermixing tmpfs + ZFS + UFS on a system***.

Skimming lists and my memory, I come across these (and I recommend
anyone replying please read the full thread from that post onward):

http://lists.freebsd.org/pipermail/freebsd-current/2011-June/025459.html
http://lists.freebsd.org/pipermail/freebsd-current/2011-June/025461.html
http://lists.freebsd.org/pipermail/freebsd-fs/2013-January/016165.html

Be aware the -current thread posts I linked come from a thread started
asking if tmpfs should "really still be considered experimental or not".

Then there's this, which shows issues getting MFC'd to stable/9 but not
8.x, so one may want to be very careful about decisions where tmpfs gets
used by default going forward (but keep reading):

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/139312
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/159418
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/155411
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/171626

However PR 155411 claims the issue happens on 9.0-RELEASE as well, and
PR 139312 even mentions/brings up ZFS -- I have no idea what "State:
patched" means (is it fixed?  Is it committed?  Why isn't the PR closed?
etc.)

I also see this:

http://forums.freebsd.org/archive/index.php/t-30467.html

Where someone stated that excessive ARC usage on ZFS had an indirect
effect on tmpfs.  r233769 to stable/9 may have fixed this, but given the
history of all of this "juggling" of Feature X causing memory exhaustion
for Feature Y, and in turn affecting Feature Z, all within kernel space,
I really don't know how much I can trust all of this.

One should probably review the FreeBSD forums for other posts as well,
as gut feeling says there's probably more there too.

Now some more generic items:

tmpfs does not retain data across reboots -- that's by design, of
course.  I have concerns with regards to stuff that may end up in /tmp
that *should* persist across reboots and may surprise an administrator
that the files he/she placed in /tmp + reboot no longer appear.

While this may be considered a social problem of sorts, it definitely
requires one to reconsider use of /tmp (instead /var/tmp, for example)
for certain tasks.

In closing:

If you want to make bsdinstall ask/prompt the administrator "would you
like to use tmpfs for /tmp?", then I'm all for it -- sounds good to me.
But doing it by default would be something (at this time) I would not be
in favour of.  I just don't get the impression of stability from tmpfs
given its track record.  (Yes, I am paranoid in this regard)


*** -- For example I personally have experienced strange behaviour when
ZFS+UFS are used on the same system with massive amounts of I/O being
done between the two (my experience showed the ZFS ARC suddenly limiting
itself in a strange manner, to some abysmally small limit (much lower
than arc_max)).  In this case, I can only imagine tmpfs making things
"even worse" given added memory pressure and so on.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 13:01:57 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id A0EC6916
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 13:01:57 +0000 (UTC)
 (envelope-from fullermd@over-yonder.net)
Received: from thyme.infocus-llc.com (server.infocus-llc.com [206.156.254.44])
 by mx1.freebsd.org (Postfix) with ESMTP id 7E5F218F8
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 13:01:57 +0000 (UTC)
Received: from draco.over-yonder.net (c-75-65-60-66.hsd1.ms.comcast.net
 [75.65.60.66])
 (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by thyme.infocus-llc.com (Postfix) with ESMTPSA id A0C1637B4AE;
 Sun,  9 Jun 2013 08:01:56 -0500 (CDT)
Received: by draco.over-yonder.net (Postfix, from userid 100)
 id 3bSyKw1nrDzG2w; Sun,  9 Jun 2013 08:01:56 -0500 (CDT)
Date: Sun, 9 Jun 2013 08:01:56 -0500
From: "Matthew D. Fuller" <fullermd@over-yonder.net>
To: Ronald Klop <ronald-freebsd8@klop.yi.org>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
Message-ID: <20130609130156.GN61341@over-yonder.net>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <op.wyeu4ulx8527sy@pinky>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <op.wyeu4ulx8527sy@pinky>
X-Editor: vi
X-OS: FreeBSD <http://www.freebsd.org/>
User-Agent: Mutt/1.5.21-fullermd.4 (2010-09-15)
X-Virus-Scanned: clamav-milter 0.97.6 at thyme.infocus-llc.com
X-Virus-Status: Clean
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 13:01:57 -0000

On Sun, Jun 09, 2013 at 02:18:20PM +0200 I heard the voice of
Ronald Klop, and lo! it spake thus:
> 
> What keeps you from putting this in fstab and stop using the tmpmfs  
> rc.conf variable?
> 'tmpfs           /tmp            tmpfs   rw,size=536870912      0       0'

That makes a tmpfs(5) filesystem, not a ufs-on-md(8) filesystem like
rc.conf tmpmfs does.  Whether that matters depends on your own
peculiar situation, but they're not exactly the same.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 13:06:08 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 87FBAA06
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 13:06:08 +0000 (UTC)
 (envelope-from fullermd@over-yonder.net)
Received: from thyme.infocus-llc.com (server.infocus-llc.com [206.156.254.44])
 by mx1.freebsd.org (Postfix) with ESMTP id 65D4B191C
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 13:06:08 +0000 (UTC)
Received: from draco.over-yonder.net (c-75-65-60-66.hsd1.ms.comcast.net
 [75.65.60.66])
 (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by thyme.infocus-llc.com (Postfix) with ESMTPSA id 109ED37B4E4;
 Sun,  9 Jun 2013 08:00:38 -0500 (CDT)
Received: by draco.over-yonder.net (Postfix, from userid 100)
 id 3bSyJP3WXhzG2l; Sun,  9 Jun 2013 08:00:37 -0500 (CDT)
Date: Sun, 9 Jun 2013 08:00:37 -0500
From: "Matthew D. Fuller" <fullermd@over-yonder.net>
To: Jeremy Chadwick <jdc@koitsu.org>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
Message-ID: <20130609130037.GM61341@over-yonder.net>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <20130609124603.GA35681@icarus.home.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130609124603.GA35681@icarus.home.lan>
X-Editor: vi
X-OS: FreeBSD <http://www.freebsd.org/>
User-Agent: Mutt/1.5.21-fullermd.4 (2010-09-15)
X-Virus-Scanned: clamav-milter 0.97.6 at thyme.infocus-llc.com
X-Virus-Status: Clean
Cc: freebsd-fs@FreeBSD.org, Dmitry Morozovsky <marck@rinet.ru>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 13:06:08 -0000

On Sun, Jun 09, 2013 at 05:46:03AM -0700 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:
> 
> 1. grep -r tmpfs /etc returns nothing, so I don't know where this magic
> comes from,

Remembering the second 'm' in tmpmfs   8-}


> 2. tmpfs(5) documents none of these flags, and the flags you've given
> cannot be mdconfig(8) flags because:

Which is why they're mdmfs(8) flags (/etc/rc.d/tmp -> mount_md from
rc.subr -> mdmfs).


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 13:16:25 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 86721F63
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 13:16:25 +0000 (UTC)
 (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id F1F151985
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 13:16:24 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r59DGNIe088449;
 Sun, 9 Jun 2013 17:16:23 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Sun, 9 Jun 2013 17:16:23 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Jeremy Chadwick <jdc@koitsu.org>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
In-Reply-To: <20130609124603.GA35681@icarus.home.lan>
Message-ID: <alpine.BSF.2.00.1306091709350.48048@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <20130609124603.GA35681@icarus.home.lan>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Sun, 09 Jun 2013 17:16:23 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 13:16:25 -0000

On Sun, 9 Jun 2013, Jeremy Chadwick wrote:

> > what do you think about stop using precious disk or even SSD resources for 
> > /tmp?
> > 
> > For last several (well, maybe over 10?) years I constantly use md (swap-backed) 
> > for /tmp, usually 128M in size, which is enough for most of our server needs.  
> > Some require more, but none more than 512M.  Regarding the options, we use
> > tmpmfs_flags="-S -n -o async -b 4096 -f 512"
> 
> Hold up.  Let's start with what you just gave.  Everything I'm talking
> about below is for stable/9 by the way:

Don't mix md-backed tmp with tmpfs, see below:

> 1. grep -r tmpfs /etc returns nothing, so I don't know where this magic
> comes from,

it is /etc/rc.d/tmp with tmpmfs_* rc variables actually

> 2. tmpfs(5) documents none of these flags, and the flags you've given
> cannot be mdconfig(8) flags because:
> 
> a) -S requires a sector size (you specified none),
> b) -n would have no bearing given the context,
> c) -o async applies only to vnode-backed models (default is malloc,
>    and I see no -t vnode),
> d) There is no -b flag,
> e) The -f flag is for -t vnode only, and refers to a filename for the
>    vnode-backing store.

all these are related to mdmfs(8)

> So consider me very, very confused with what you've given.  Maybe the
> flags were different on FreeBSD 6.x or 7.x or 8.x?  I haven't checked
> http://www.freebsd.org/cgi/man.cgi yet.

Actually, there are two different questions (or kind of questions):
- are we considering switching off /tmp from real media-backed storage?
- is so, what are we selecting: memory/swap-backed UFS (mdmfs) or tmpfs?

> As I understand it, there are (or were -- because I remember seeing them
> repeatedly brought up on the mailing lists) problems with tmpfs.
> Sometimes these issues would turn out to be with other filesystems (such
> as unionfs), but other times not so much.
> 
> If my memory serves me correct, there are major complexities with
> VM/memory management when intermixing tmpfs + ZFS + UFS on a system***.

Yes, hence my question about status of tmpfs now.

And yes, I personally do *not* used tmpfs-backed /tmp on real productionj 
servers -- just mdmfs-backed.

OTOH, I *do* use tmpfs for my builder (for tinderbox for now, but I'm planning 
switch buildworld/buildkernel there too), with little issues yet.

[snip the rest, I have to dig a bit more to answer]

-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 13:17:01 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id CF63CFD9
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 13:17:01 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net
 [217.70.183.195])
 by mx1.freebsd.org (Postfix) with ESMTP id 8CACA198D
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 13:17:01 +0000 (UTC)
Received: from mfilter3-d.gandi.net (mfilter3-d.gandi.net [217.70.178.133])
 by relay3-d.mail.gandi.net (Postfix) with ESMTP id 7F4A4A80D0;
 Sun,  9 Jun 2013 15:16:50 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter3-d.gandi.net
Received: from relay3-d.mail.gandi.net ([217.70.183.195])
 by mfilter3-d.gandi.net (mfilter3-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id GvWTETafvw6Z; Sun,  9 Jun 2013 15:16:48 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id 47B63A80C4;
 Sun,  9 Jun 2013 15:16:48 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id 8EC7773A1C; Sun,  9 Jun 2013 06:16:46 -0700 (PDT)
Date: Sun, 9 Jun 2013 06:16:46 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: "Matthew D. Fuller" <fullermd@over-yonder.net>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
Message-ID: <20130609131646.GA37012@icarus.home.lan>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <20130609124603.GA35681@icarus.home.lan>
 <20130609130037.GM61341@over-yonder.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130609130037.GM61341@over-yonder.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@FreeBSD.org, Dmitry Morozovsky <marck@rinet.ru>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 13:17:01 -0000

On Sun, Jun 09, 2013 at 08:00:37AM -0500, Matthew D. Fuller wrote:
> On Sun, Jun 09, 2013 at 05:46:03AM -0700 I heard the voice of
> Jeremy Chadwick, and lo! it spake thus:
> > 
> > 1. grep -r tmpfs /etc returns nothing, so I don't know where this magic
> > comes from,
> 
> Remembering the second 'm' in tmpmfs   8-}
> 
> 
> > 2. tmpfs(5) documents none of these flags, and the flags you've given
> > cannot be mdconfig(8) flags because:
> 
> Which is why they're mdmfs(8) flags (/etc/rc.d/tmp -> mount_md from
> rc.subr -> mdmfs).

Thank you -- the magic has been discovered!  ;-)  I had never heard of
mdmfs(8) until now (mdconfig(8) sure, mdmfs(8) nope).

Looking at the source, this thing is just a "fancy wrapper" written in
C, using mdconfig(8) and newfs(8), as well as geom_uzip(4) (which I also
didn't know about until now) in some manner (not sure how that fits
into the puzzle).  I guess it's mainly a program for convenience.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 13:25:56 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 707E0458
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 13:25:56 +0000 (UTC)
 (envelope-from wblock@wonkity.com)
Received: from wonkity.com (wonkity.com [67.158.26.137])
 by mx1.freebsd.org (Postfix) with ESMTP id 35B4819C6
 for <freebsd-fs@FreeBSD.org>; Sun,  9 Jun 2013 13:25:55 +0000 (UTC)
Received: from wonkity.com (localhost [127.0.0.1])
 by wonkity.com (8.14.7/8.14.7) with ESMTP id r59DPtKv070635;
 Sun, 9 Jun 2013 07:25:55 -0600 (MDT)
 (envelope-from wblock@wonkity.com)
Received: from localhost (wblock@localhost)
 by wonkity.com (8.14.7/8.14.7/Submit) with ESMTP id r59DPtLM070632;
 Sun, 9 Jun 2013 07:25:55 -0600 (MDT)
 (envelope-from wblock@wonkity.com)
Date: Sun, 9 Jun 2013 07:25:55 -0600 (MDT)
From: Warren Block <wblock@wonkity.com>
To: Dmitry Morozovsky <marck@rinet.ru>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
In-Reply-To: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
Message-ID: <alpine.BSF.2.00.1306090711250.70087@wonkity.com>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (wonkity.com [127.0.0.1]); Sun, 09 Jun 2013 07:25:55 -0600 (MDT)
Cc: freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 13:25:56 -0000

On Sun, 9 Jun 2013, Dmitry Morozovsky wrote:

> Dear colleagues,
>
> what do you think about stop using precious disk or even SSD resources for
> /tmp?
>
> For last several (well, maybe over 10?) years I constantly use md (swap-backed)
> for /tmp, usually 128M in size, which is enough for most of our server needs.
> Some require more, but none more than 512M.  Regarding the options, we use
> tmpmfs_flags="-S -n -o async -b 4096 -f 512"
>
> Given more and more fixes/improvements committed to tmpfs, switching /tmp to it
> would be even better idea.
>
> You thoughts?  Thank you!

tmpfs has been working fine here for /tmp.  I also use it for /usr/obj. 
It does not tie up a fixed chunk of RAM, and can grow to large sizes if 
necessary.  And maximum size can be limited in fstab.  (Possible 
improvement: allow human-readable sizes instead of just blocks.)

One problem is that tmpfs is cleared by a reboot.  This would surprise 
users expecting the default behavior (clear_tmp_enable="NO"), and would 
require some prominent warnings in the release notes and maybe in the 
installer.  Or in the startup scripts: "/tmp on tmpfs, contents will be 
discarded on reboot".

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 14:09:07 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 06B80CC1
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 14:09:07 +0000 (UTC)
 (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id 6F4601AD5
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 14:09:06 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r59E950f090752;
 Sun, 9 Jun 2013 18:09:05 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Sun, 9 Jun 2013 18:09:05 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Jeremy Chadwick <jdc@koitsu.org>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
In-Reply-To: <20130609124603.GA35681@icarus.home.lan>
Message-ID: <alpine.BSF.2.00.1306091759530.48048@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <20130609124603.GA35681@icarus.home.lan>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Sun, 09 Jun 2013 18:09:05 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 14:09:07 -0000

On Sun, 9 Jun 2013, Jeremy Chadwick wrote:

[back to second part]

[and snip a lot here too]

> Where someone stated that excessive ARC usage on ZFS had an indirect
> effect on tmpfs.  r233769 to stable/9 may have fixed this, but given the
> history of all of this "juggling" of Feature X causing memory exhaustion
> for Feature Y, and in turn affecting Feature Z, all within kernel space,
> I really don't know how much I can trust all of this.
> 
> One should probably review the FreeBSD forums for other posts as well,
> as gut feeling says there's probably more there too.

.. that's why I'm trying to discuss this in public (maybe wrong list had been 
chosen, perhaps -stable@ would fit a bit more) -- to share knowledge, opinions 
and other related stuff ;)

> In closing:
> 
> If you want to make bsdinstall ask/prompt the administrator "would you
> like to use tmpfs for /tmp?", then I'm all for it -- sounds good to me.
> But doing it by default would be something (at this time) I would not be
> in favour of.  I just don't get the impression of stability from tmpfs
> given its track record.  (Yes, I am paranoid in this regard)

Agree at most.

> *** -- For example I personally have experienced strange behaviour when
> ZFS+UFS are used on the same system with massive amounts of I/O being
> done between the two (my experience showed the ZFS ARC suddenly limiting
> itself in a strange manner, to some abysmally small limit (much lower
> than arc_max)).  In this case, I can only imagine tmpfs making things
> "even worse" given added memory pressure and so on.

For our backup server, which uses rather huge 24*2T raidz2 and periodically 
synced on eSATA UFS, I sometimes seen speed drops, but nothing really bad.
It's stable/9 with 16G of RAM though, perhaps on systems where RAM is tighter 
the situation could be much worse...


-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 16:37:41 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id E4A0E1FD;
 Sun,  9 Jun 2013 16:37:41 +0000 (UTC)
 (envelope-from amvandemore@gmail.com)
Received: from mail-pd0-f177.google.com (mail-pd0-f177.google.com
 [209.85.192.177])
 by mx1.freebsd.org (Postfix) with ESMTP id BDECE111A;
 Sun,  9 Jun 2013 16:37:41 +0000 (UTC)
Received: by mail-pd0-f177.google.com with SMTP id p10so1071300pdj.8
 for <multiple recipients>; Sun, 09 Jun 2013 09:37:41 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=HuRdr8G1D01VyeayCEqb/wxl1tWu6SzrR37YxYA0d6I=;
 b=TgRZuto+mvf9mSNglXbvbD9t8pIr/Z8kUEW32XQ2zrzA/dWIb0g/ZHeHXeD+GW+nkS
 DjQnGrs519FXmSKETod+ltqPGPEr3tRgyy9OXGHkWsnamegR7So8RNjzqBCWpeSCt0AZ
 xORRN0g9zERMZahjymll9TtAutwKiJxFTnAiBbYEz7Msx+UPyaFDgMhirsfvjA9JjcUe
 mjn3HZ7m7pq4Azu8LWdmFrtxe36QKuc4z5/I8li4ryGRNvXs8gM+Mxxxzo8ruKumFiAs
 +RvrQij1LvY7LxAwNC6nnrd+MQUnaE7X+twcUahukfoJYDNa9seYj2K4JqAc9l8zyhdY
 +y9A==
MIME-Version: 1.0
X-Received: by 10.66.26.231 with SMTP id o7mr10647586pag.207.1370795861194;
 Sun, 09 Jun 2013 09:37:41 -0700 (PDT)
Received: by 10.70.31.195 with HTTP; Sun, 9 Jun 2013 09:37:41 -0700 (PDT)
In-Reply-To: <CAGMYy3s9V_FYWBBoCPyak9DpM_XZJ_ReVrOZqgmGmJ_KzcKzVA@mail.gmail.com>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <CAGMYy3s9V_FYWBBoCPyak9DpM_XZJ_ReVrOZqgmGmJ_KzcKzVA@mail.gmail.com>
Date: Sun, 9 Jun 2013 11:37:41 -0500
Message-ID: <CA+tpaK1b7tG5ZWsRKj6NDo-9agBKehwKcr88Zj3ude-LMEQm_Q@mail.gmail.com>
Subject: Re: Changing the default for ZFS atime to off?
From: Adam Vande More <amvandemore@gmail.com>
To: Xin LI <delphij@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: Steven Hartland <smh@freebsd.org>, fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 16:37:42 -0000

On Sat, Jun 8, 2013 at 10:04 PM, Xin LI <delphij@gmail.com> wrote:

> I'd suggest implementing relative atime in VFS layer first:
>
>
> https://github.com/delphij/freebsd/commit/6a199821fbdbf424027499d4a0f8f113f6943e16


Cool, looks like you were already on this. I would offer to test some, but
I'm pretty much ZFS only at this point.  I imagine there would be much less
objections to defaulting to relatime rather than noatime.  AFAIK, relatime
doesn't break any major tools.


-- 
Adam Vande More

From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 16:39:46 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 844C62E0
 for <fs@freebsd.org>; Sun,  9 Jun 2013 16:39:46 +0000 (UTC)
 (envelope-from prvs=18721298a7=killing@multiplay.co.uk)
Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 by mx1.freebsd.org (Postfix) with ESMTP id 2915C1131
 for <fs@freebsd.org>; Sun,  9 Jun 2013 16:39:45 +0000 (UTC)
Received: from r2d2 ([82.69.141.170])
 by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 (MDaemon PRO v10.0.4) with ESMTP id md50004231304.msg
 for <fs@freebsd.org>; Sun, 09 Jun 2013 17:39:44 +0100
X-Spam-Processed: mail1.multiplay.co.uk, Sun, 09 Jun 2013 17:39:44 +0100
 (not processed: message from valid local sender)
X-MDDKIM-Result: neutral (mail1.multiplay.co.uk)
X-MDRemoteIP: 82.69.141.170
X-Return-Path: prvs=18721298a7=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
X-MDaemon-Deliver-To: fs@freebsd.org
Message-ID: <3152D35416D047BCA14009F3108A8967@multiplay.co.uk>
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Damien Fleuriot" <ml@my.gd>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <2AC5E8F4-3AF1-4EA5-975D-741506AC70A5@my.gd>
Subject: Re: Changing the default for ZFS atime to off?
Date: Sun, 9 Jun 2013 17:39:42 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
 reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 16:39:46 -0000


----- Original Message ----- 
From: "Damien Fleuriot" <ml@my.gd>
To: "Steven Hartland" <smh@freebsd.org>
Cc: <fs@freebsd.org>
Sent: Sunday, June 09, 2013 11:39 AM
Subject: Re: Changing the default for ZFS atime to off?


> 
> On 8 Jun 2013, at 20:54, "Steven Hartland" <smh@freebsd.org> wrote:
> 
>> One of the first changes we make here when installing machines
>> here to changing atime=off on all ZFS pool roots.
>> 
>> I know there are a few apps which can rely on atime updates
>> such as qmail and possibly postfix, but those seem like special
>> cases for which admins should enable atime instead of the other
>> way round.
>> 
>> This is going to of particular interest for flash based storage
>> which should avoid unnessacary writes to reduce wear, but it will
>> also help improve performance in general.
>> 
>> So what do people think is it worth considering changing the
>> default from atime=on to atime=off moving forward?
>> 
>> If so what about UFS, same change?
>
> I strongly oppose the change for reasons already raised by many
> people regarding the mbox file.
> 
> Besides, if atime should default to off on 2 filesystems and on
> on all others, that would definitely create confusion.

A very valid point.

> Last, I believe it should be the admin's decision to turn atime
> off, just like it is his decision to turn compression on.

Trying to play devils advocate here; compression is off by default
because it uses resources and doesn't give a benefit for all cases.

Is that not the same as atime, and it should be an admins decision
to turn it on where it's wanted?

> Don't mistake me, we turn atime=off on every box, every
> filesystem, even on Mac's HFS.
> Yet I believe defaulting it to off is a mistake.

That's what prompted me to start this discussion. If a large portion
of users either disable atime already or would disable atime if they
knew about it, does that bring into question the current default?

Potentially a better solution would be to make atime an option
in the installer, as that helps educate admins that the option
exists, which is potentially the biggest issue here?

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 16:42:34 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id D00903AA
 for <fs@freebsd.org>; Sun,  9 Jun 2013 16:42:34 +0000 (UTC)
 (envelope-from prvs=18721298a7=killing@multiplay.co.uk)
Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 by mx1.freebsd.org (Postfix) with ESMTP id 7138D1146
 for <fs@freebsd.org>; Sun,  9 Jun 2013 16:42:34 +0000 (UTC)
Received: from r2d2 ([82.69.141.170])
 by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
 (MDaemon PRO v10.0.4) with ESMTP id md50004231359.msg
 for <fs@freebsd.org>; Sun, 09 Jun 2013 17:42:34 +0100
X-Spam-Processed: mail1.multiplay.co.uk, Sun, 09 Jun 2013 17:42:34 +0100
 (not processed: message from valid local sender)
X-MDDKIM-Result: neutral (mail1.multiplay.co.uk)
X-MDRemoteIP: 82.69.141.170
X-Return-Path: prvs=18721298a7=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
X-MDaemon-Deliver-To: fs@freebsd.org
Message-ID: <E8C0862413CE458CAD26703A9E91E8C7@multiplay.co.uk>
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Xin LI" <delphij@gmail.com>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <CAGMYy3s9V_FYWBBoCPyak9DpM_XZJ_ReVrOZqgmGmJ_KzcKzVA@mail.gmail.com>
Subject: Re: Changing the default for ZFS atime to off?
Date: Sun, 9 Jun 2013 17:42:31 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="UTF-8"; reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 16:42:34 -0000

----- Original Message ----- 
From: "Xin LI" <delphij@gmail.com>

> I'd suggest implementing relative atime in VFS layer first:
> 
> https://github.com/delphij/freebsd/commit/6a199821fbdbf424027499d4a0f8f113f6943e16
>

Cool, its this something your looking to commit to HEAD Xin?

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 17:14:08 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4215B257
 for <fs@freebsd.org>; Sun,  9 Jun 2013 17:14:08 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay4-d.mail.gandi.net (relay4-d.mail.gandi.net
 [217.70.183.196])
 by mx1.freebsd.org (Postfix) with ESMTP id D9BF312FE
 for <fs@freebsd.org>; Sun,  9 Jun 2013 17:14:07 +0000 (UTC)
Received: from mfilter10-d.gandi.net (mfilter10-d.gandi.net [217.70.178.139])
 by relay4-d.mail.gandi.net (Postfix) with ESMTP id BF177172089;
 Sun,  9 Jun 2013 19:13:56 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter10-d.gandi.net
Received: from relay4-d.mail.gandi.net ([217.70.183.196])
 by mfilter10-d.gandi.net (mfilter10-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id NT9OyW5jv1WZ; Sun,  9 Jun 2013 19:13:55 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay4-d.mail.gandi.net (Postfix) with ESMTPSA id 97F37172067;
 Sun,  9 Jun 2013 19:13:54 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id DEBAF73A1C; Sun,  9 Jun 2013 10:13:51 -0700 (PDT)
Date: Sun, 9 Jun 2013 10:13:51 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Steven Hartland <killing@multiplay.co.uk>
Subject: Re: Changing the default for ZFS atime to off?
Message-ID: <20130609171351.GA41133@icarus.home.lan>
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <2AC5E8F4-3AF1-4EA5-975D-741506AC70A5@my.gd>
 <3152D35416D047BCA14009F3108A8967@multiplay.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3152D35416D047BCA14009F3108A8967@multiplay.co.uk>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 17:14:08 -0000

On Sun, Jun 09, 2013 at 05:39:42PM +0100, Steven Hartland wrote:
> 
> ----- Original Message ----- From: "Damien Fleuriot" <ml@my.gd>
> To: "Steven Hartland" <smh@freebsd.org>
> Cc: <fs@freebsd.org>
> Sent: Sunday, June 09, 2013 11:39 AM
> Subject: Re: Changing the default for ZFS atime to off?
> 
> 
> >
> >On 8 Jun 2013, at 20:54, "Steven Hartland" <smh@freebsd.org> wrote:
> >
> >>One of the first changes we make here when installing machines
> >>here to changing atime=off on all ZFS pool roots.
> >>
> >>I know there are a few apps which can rely on atime updates
> >>such as qmail and possibly postfix, but those seem like special
> >>cases for which admins should enable atime instead of the other
> >>way round.
> >>
> >>This is going to of particular interest for flash based storage
> >>which should avoid unnessacary writes to reduce wear, but it will
> >>also help improve performance in general.
> >>
> >>So what do people think is it worth considering changing the
> >>default from atime=on to atime=off moving forward?
> >>
> >>If so what about UFS, same change?
> >
> >I strongly oppose the change for reasons already raised by many
> >people regarding the mbox file.
> >
> >Besides, if atime should default to off on 2 filesystems and on
> >on all others, that would definitely create confusion.
> 
> A very valid point.
> 
> >Last, I believe it should be the admin's decision to turn atime
> >off, just like it is his decision to turn compression on.
> 
> Trying to play devils advocate here; compression is off by default
> because it uses resources and doesn't give a benefit for all cases.

Not to mention ZFS on FreeBSD, specifically WRT compression and dedup,
still lack a separate priority class for their threads.  Info on that:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-October/012718.html
http://lists.freebsd.org/pipermail/freebsd-fs/2011-October/012726.html

While the discussion about atime default is fine/good to have, there are
bigger/more impacting than atime.  (Compression and dedup are something
people *really* want to use, and I understand -- hell, I'd be using
compression if it weren't for the above problem.  It's the sole blocker
for me -- really).

> Is that not the same as atime, and it should be an admins decision
> to turn it on where it's wanted?

While I understand you're playing devil's advocate, you will find I, as
well as most BSD people (in my experience), tend to err on the side of
caution.  That means atime=on as a default.

> >Don't mistake me, we turn atime=off on every box, every
> >filesystem, even on Mac's HFS.
> >Yet I believe defaulting it to off is a mistake.
> 
> That's what prompted me to start this discussion. If a large portion
> of users either disable atime already or would disable atime if they
> knew about it, does that bring into question the current default?

You've just encountered 1 user who sets atime=off on every box they
admin/maintain.

And you have me -- who has atime=on on every box he admins/maintains.

If you're looking for a vote, you won't get one that satisfies everyone,
nor the majority of FreeBSD users -- because most users are not
subscribed to the mailing list, do not visit the forums, etc..  They
install the OS + use it and live happily in their hobbit hole.

I also have no idea how this would impact the commercial companies who
rely on FreeBSD for their enterprise products.  I imagine their feedback
would (should?  Matter of opinion) hold more weight.

> Potentially a better solution would be to make atime an option
> in the installer, as that helps educate admins that the option
> exists, which is potentially the biggest issue here?

As I've stated in some other threads (probably on -stable), I'm all for
people adding options/checkboxes/etc. to bsdinstall to allow more
granularity during installation (vs. having to do things after-the-fact
or the "final shell" prior to rebooting).  If someone wants to add an
atime checkbox (checked == atime enabled) to the filesystem creation
phase, that's fantastic.

But I strongly feel that checkbox needs to default to checked/enabled,
solely so there are no "unwanted surprises" since we have no idea what
software they'll be using on the system.

There is also (still) the concern of POSIX compliance, which the BSDs
have historically been very strict about.  I guess you can hash that out
with Bruce.

Honestly the relatime thing from Linux sounds like a decent compromise.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Sun Jun  9 20:25:49 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 1521E7A8
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 20:25:49 +0000 (UTC)
 (envelope-from rcartwri@asu.edu)
Received: from mail-wg0-x22c.google.com (mail-wg0-x22c.google.com
 [IPv6:2a00:1450:400c:c00::22c])
 by mx1.freebsd.org (Postfix) with ESMTP id A47A71089
 for <freebsd-fs@freebsd.org>; Sun,  9 Jun 2013 20:25:48 +0000 (UTC)
Received: by mail-wg0-f44.google.com with SMTP id m15so3301962wgh.23
 for <freebsd-fs@freebsd.org>; Sun, 09 Jun 2013 13:25:47 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type:x-gm-message-state;
 bh=Xq5I2j4UaEqkJhZFPlUFAtbAZnv5+9q6B0USzNHIsLg=;
 b=OLudLj6WcgGVCkZZL8yrjZE7oL0rhBZz59ByaR3lrkzFsNBcEz8DB7qvOe4XorfHLh
 3KuQfwEo19SyRwPMiTk6nyRCEKRy1J8/UvyDzTPzG1SCj+6U0q5zyeNCNOKX+prrZS7D
 A/dCOzzgniKYpxpr/B5qje46iduomSdsNlOFskDkhFjtfjFK4QI5j6lF04m8XH2claz1
 TBylVXZvKilAAnlD3RExoQwZqamRCQhty14VUTftnt6lyycHlxCtrYOkRzJVoKVpYunz
 iak7B7Rj7QcBr1EHVcYOcFMPYJNySyuob/NmRXkmfvil/vdG2/GNsnCIs1i/gqIUQRuf
 PkOQ==
MIME-Version: 1.0
X-Received: by 10.194.123.9 with SMTP id lw9mr4104756wjb.24.1370809547842;
 Sun, 09 Jun 2013 13:25:47 -0700 (PDT)
Received: by 10.180.76.114 with HTTP; Sun, 9 Jun 2013 13:25:47 -0700 (PDT)
In-Reply-To: <20130609065430.GA28206@icarus.home.lan>
References: <CALOkxuzH81UFuVZifJNxyuo6+hu9mCPB1TC91dn5fkjVLFqTKw@mail.gmail.com>
 <CALOkxuxb0HBMXOwP6Z4JjvBk+btnriCeCdhcKfmP=FyNEaseTA@mail.gmail.com>
 <CALOkxuxT9Oko1jm40GzYZbcTNu4epNgh8BoVhc-cH2UKNNbSvw@mail.gmail.com>
 <20130609065430.GA28206@icarus.home.lan>
Date: Sun, 9 Jun 2013 13:25:47 -0700
Message-ID: <CALOkxuwVDYmJwFzgqWw_QFOQwXRjB0pisvLmXCR1YUnUgWPZqA@mail.gmail.com>
Subject: Re: ZFS and Glabel
From: "Reed A. Cartwright" <cartwright@asu.edu>
To: Jeremy Chadwick <jdc@koitsu.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Gm-Message-State: ALoCoQnwqpJD6hBCDxu4D/NnKeHWRVFbbG2p3vXTz2wE4M8NzpdiaW7a8MqhhuLGZxKzKY3eYWZ6
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Jun 2013 20:25:49 -0000

Thanks, it makes sense now.

Would it make sense to have a script that processes the output of "cam
devlist -v" to produce such an example output?

On Sat, Jun 8, 2013 at 11:54 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
> On Sat, Jun 08, 2013 at 09:46:18PM -0700, Reed A. Cartwright wrote:
>> I'm looking at my dmesg.boot to figure out what settings I need to
>> wire down my HDDs.  I read the cam(4) documentation but I'm not sure I
>> know what I'm doing.  Any advice would be helpful.
>>
>> Let's assume that I want to wire everything down to their current
>> positions, what should I put in loader.conf?  I'll paste below some of
>> my hardware configuration and lines from dmesg.boot that I think I
>> need to look at.
>>
>> I have 4 LSI cards in the system: mps0, mps1, mps2, mps3.
>>
>> mps0: <LSI SAS2008> port 0xd000-0xd0ff mem
>> 0xdff3c000-0xdff3ffff,0xdff40000-0xdff7ffff irq 24 at device 0.0 on
>> pci5
>> mps1: <LSI SAS2008> port 0xc000-0xc0ff mem
>> 0xdfe3c000-0xdfe3ffff,0xdfe40000-0xdfe7ffff irq 44 at device 0.0 on
>> pci4
>> mps2: <LSI SAS2008> port 0xb000-0xb0ff mem
>> 0xdfd3c000-0xdfd3ffff,0xdfd40000-0xdfd7ffff irq 32 at device 0.0 on
>> pci3
>> mps3: <LSI SAS2008> port 0xe000-0xe0ff mem
>> 0xdbf3c000-0xdbf3ffff,0xdbf40000-0xdbf7ffff irq 56 at device 0.0 on
>> pci65
>>
>> I have drives attached to two of those cards:
>>
>> da0 at mps0 bus 0 scbus0 target 0 lun 0
>> da1 at mps0 bus 0 scbus0 target 1 lun 0
>> da2 at mps0 bus 0 scbus0 target 2 lun 0
>> da3 at mps0 bus 0 scbus0 target 3 lun 0
>> da4 at mps0 bus 0 scbus0 target 4 lun 0
>> da5 at mps0 bus 0 scbus0 target 5 lun 0
>> da6 at mps0 bus 0 scbus0 target 6 lun 0
>> da7 at mps0 bus 0 scbus0 target 7 lun 0
>
>> da8 at mps3 bus 0 scbus9 target 0 lun 0
>> da9 at mps3 bus 0 scbus9 target 1 lun 0
>> da10 at mps3 bus 0 scbus9 target 2 lun 0
>> da11 at mps3 bus 0 scbus9 target 3 lun 0
>> da12 at mps3 bus 0 scbus9 target 4 lun 0
>>
>> {snip}
>
> As usual, the situation is insane because you have so many controllers
> on the system (more than just mps(4)) -- specifically 11 separate
> controllers or systems using CAM (hence scbus0 to scbus10).
>
> Below is for mps(4).  If you want to wire down ahci(4), things are
> a bit different, but you can read this post of mine:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071851.html
>
> Enjoy:
>
> hint.scbus.0.at="mps0"
> hint.scbus.1.at="mps1"
> hint.scbus.2.at="mps2"
> hint.scbus.9.at="mps3"
> hint.da.0.at="scbus0"
> hint.da.1.at="scbus0"
> hint.da.2.at="scbus0"
> hint.da.3.at="scbus0"
> hint.da.4.at="scbus0"
> hint.da.5.at="scbus0"
> hint.da.6.at="scbus0"
> hint.da.7.at="scbus0"
> hint.da.8.at="scbus9"
> hint.da.9.at="scbus9"
> hint.da.10.at="scbus9"
> hint.da.11.at="scbus9"
> hint.da.12.at="scbus9"
> hint.da.13.at="scbus9"
> hint.da.14.at="scbus9"
> hint.da.15.at="scbus9"
> hint.da.16.at="scbus1"
> hint.da.17.at="scbus1"
> hint.da.18.at="scbus1"
> hint.da.19.at="scbus1"
> hint.da.20.at="scbus1"
> hint.da.21.at="scbus1"
> hint.da.22.at="scbus1"
> hint.da.23.at="scbus1"
> hint.da.24.at="scbus2"
> hint.da.25.at="scbus2"
> hint.da.26.at="scbus2"
> hint.da.27.at="scbus2"
> hint.da.28.at="scbus2"
> hint.da.29.at="scbus2"
> hint.da.30.at="scbus2"
> hint.da.31.at="scbus2"
> hint.da.0.target="0"
> hint.da.1.target="1"
> hint.da.2.target="2"
> hint.da.3.target="3"
> hint.da.4.target="4"
> hint.da.5.target="5"
> hint.da.6.target="6"
> hint.da.7.target="7"
> hint.da.8.target="0"
> hint.da.9.target="1"
> hint.da.10.target="2"
> hint.da.11.target="3"
> hint.da.12.target="4"
> hint.da.13.target="5"
> hint.da.14.target="6"
> hint.da.15.target="7"
> hint.da.16.target="0"
> hint.da.17.target="1"
> hint.da.18.target="2"
> hint.da.19.target="3"
> hint.da.20.target="4"
> hint.da.21.target="5"
> hint.da.22.target="6"
> hint.da.23.target="7"
> hint.da.24.target="0"
> hint.da.25.target="1"
> hint.da.26.target="2"
> hint.da.27.target="3"
> hint.da.28.target="4"
> hint.da.29.target="5"
> hint.da.30.target="6"
> hint.da.31.target="7"
>
> --
> | Jeremy Chadwick                                   jdc@koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
>


-- 
Reed A. Cartwright, PhD
Assistant Professor of Genomics, Evolution, and Bioinformatics
School of Life Sciences
Center for Evolutionary Medicine and Informatics
The Biodesign Institute
Arizona State University
-
Address: The Biodesign Institute, PO Box 875301, Tempe, AZ 85287-5301 USA
Packages: The Biodesign Institute, 1001 S. McAllister Ave, Tempe, AZ
85287-5301 USA
Office: Biodesign A-224A, 1-480-965-9949

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 02:20:14 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 0F5C263D
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 02:20:14 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay5-d.mail.gandi.net (relay5-d.mail.gandi.net
 [217.70.183.197])
 by mx1.freebsd.org (Postfix) with ESMTP id AA69F1622
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 02:20:13 +0000 (UTC)
Received: from mfilter14-d.gandi.net (mfilter14-d.gandi.net [217.70.178.142])
 by relay5-d.mail.gandi.net (Postfix) with ESMTP id 15EC441C064;
 Mon, 10 Jun 2013 04:19:57 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter14-d.gandi.net
Received: from relay5-d.mail.gandi.net ([217.70.183.197])
 by mfilter14-d.gandi.net (mfilter14-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id A5-+RIutWa4L; Mon, 10 Jun 2013 04:19:55 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay5-d.mail.gandi.net (Postfix) with ESMTPSA id B5BCB41C05C;
 Mon, 10 Jun 2013 04:19:54 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id D00C773A1C; Sun,  9 Jun 2013 19:19:50 -0700 (PDT)
Date: Sun, 9 Jun 2013 19:19:50 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: "Reed A. Cartwright" <cartwright@asu.edu>
Subject: Re: ZFS and Glabel
Message-ID: <20130610021950.GA50356@icarus.home.lan>
References: <CALOkxuzH81UFuVZifJNxyuo6+hu9mCPB1TC91dn5fkjVLFqTKw@mail.gmail.com>
 <CALOkxuxb0HBMXOwP6Z4JjvBk+btnriCeCdhcKfmP=FyNEaseTA@mail.gmail.com>
 <CALOkxuxT9Oko1jm40GzYZbcTNu4epNgh8BoVhc-cH2UKNNbSvw@mail.gmail.com>
 <20130609065430.GA28206@icarus.home.lan>
 <CALOkxuwVDYmJwFzgqWw_QFOQwXRjB0pisvLmXCR1YUnUgWPZqA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CALOkxuwVDYmJwFzgqWw_QFOQwXRjB0pisvLmXCR1YUnUgWPZqA@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 02:20:14 -0000

On Sun, Jun 09, 2013 at 01:25:47PM -0700, Reed A. Cartwright wrote:
> Thanks, it makes sense now.
> 
> Would it make sense to have a script that processes the output of "cam
> devlist -v" to produce such an example output?
> 
> On Sat, Jun 8, 2013 at 11:54 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
> > On Sat, Jun 08, 2013 at 09:46:18PM -0700, Reed A. Cartwright wrote:
> >> I'm looking at my dmesg.boot to figure out what settings I need to
> >> wire down my HDDs.  I read the cam(4) documentation but I'm not sure I
> >> know what I'm doing.  Any advice would be helpful.
> >>
> >> Let's assume that I want to wire everything down to their current
> >> positions, what should I put in loader.conf?  I'll paste below some of
> >> my hardware configuration and lines from dmesg.boot that I think I
> >> need to look at.
> >>
> >> I have 4 LSI cards in the system: mps0, mps1, mps2, mps3.
> >>
> >> mps0: <LSI SAS2008> port 0xd000-0xd0ff mem
> >> 0xdff3c000-0xdff3ffff,0xdff40000-0xdff7ffff irq 24 at device 0.0 on
> >> pci5
> >> mps1: <LSI SAS2008> port 0xc000-0xc0ff mem
> >> 0xdfe3c000-0xdfe3ffff,0xdfe40000-0xdfe7ffff irq 44 at device 0.0 on
> >> pci4
> >> mps2: <LSI SAS2008> port 0xb000-0xb0ff mem
> >> 0xdfd3c000-0xdfd3ffff,0xdfd40000-0xdfd7ffff irq 32 at device 0.0 on
> >> pci3
> >> mps3: <LSI SAS2008> port 0xe000-0xe0ff mem
> >> 0xdbf3c000-0xdbf3ffff,0xdbf40000-0xdbf7ffff irq 56 at device 0.0 on
> >> pci65
> >>
> >> I have drives attached to two of those cards:
> >>
> >> da0 at mps0 bus 0 scbus0 target 0 lun 0
> >> da1 at mps0 bus 0 scbus0 target 1 lun 0
> >> da2 at mps0 bus 0 scbus0 target 2 lun 0
> >> da3 at mps0 bus 0 scbus0 target 3 lun 0
> >> da4 at mps0 bus 0 scbus0 target 4 lun 0
> >> da5 at mps0 bus 0 scbus0 target 5 lun 0
> >> da6 at mps0 bus 0 scbus0 target 6 lun 0
> >> da7 at mps0 bus 0 scbus0 target 7 lun 0
> >
> >> da8 at mps3 bus 0 scbus9 target 0 lun 0
> >> da9 at mps3 bus 0 scbus9 target 1 lun 0
> >> da10 at mps3 bus 0 scbus9 target 2 lun 0
> >> da11 at mps3 bus 0 scbus9 target 3 lun 0
> >> da12 at mps3 bus 0 scbus9 target 4 lun 0
> >>
> >> {snip}
> >
> > As usual, the situation is insane because you have so many controllers
> > on the system (more than just mps(4)) -- specifically 11 separate
> > controllers or systems using CAM (hence scbus0 to scbus10).
> >
> > Below is for mps(4).  If you want to wire down ahci(4), things are
> > a bit different, but you can read this post of mine:
> >
> > http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071851.html
> >
> > Enjoy:
> >
> > hint.scbus.0.at="mps0"
> > hint.scbus.1.at="mps1"
> > hint.scbus.2.at="mps2"
> > hint.scbus.9.at="mps3"
> > hint.da.0.at="scbus0"
> > hint.da.1.at="scbus0"
> > hint.da.2.at="scbus0"
> > hint.da.3.at="scbus0"
> > hint.da.4.at="scbus0"
> > hint.da.5.at="scbus0"
> > hint.da.6.at="scbus0"
> > hint.da.7.at="scbus0"
> > hint.da.8.at="scbus9"
> > hint.da.9.at="scbus9"
> > hint.da.10.at="scbus9"
> > hint.da.11.at="scbus9"
> > hint.da.12.at="scbus9"
> > hint.da.13.at="scbus9"
> > hint.da.14.at="scbus9"
> > hint.da.15.at="scbus9"
> > hint.da.16.at="scbus1"
> > hint.da.17.at="scbus1"
> > hint.da.18.at="scbus1"
> > hint.da.19.at="scbus1"
> > hint.da.20.at="scbus1"
> > hint.da.21.at="scbus1"
> > hint.da.22.at="scbus1"
> > hint.da.23.at="scbus1"
> > hint.da.24.at="scbus2"
> > hint.da.25.at="scbus2"
> > hint.da.26.at="scbus2"
> > hint.da.27.at="scbus2"
> > hint.da.28.at="scbus2"
> > hint.da.29.at="scbus2"
> > hint.da.30.at="scbus2"
> > hint.da.31.at="scbus2"
> > hint.da.0.target="0"
> > hint.da.1.target="1"
> > hint.da.2.target="2"
> > hint.da.3.target="3"
> > hint.da.4.target="4"
> > hint.da.5.target="5"
> > hint.da.6.target="6"
> > hint.da.7.target="7"
> > hint.da.8.target="0"
> > hint.da.9.target="1"
> > hint.da.10.target="2"
> > hint.da.11.target="3"
> > hint.da.12.target="4"
> > hint.da.13.target="5"
> > hint.da.14.target="6"
> > hint.da.15.target="7"
> > hint.da.16.target="0"
> > hint.da.17.target="1"
> > hint.da.18.target="2"
> > hint.da.19.target="3"
> > hint.da.20.target="4"
> > hint.da.21.target="5"
> > hint.da.22.target="6"
> > hint.da.23.target="7"
> > hint.da.24.target="0"
> > hint.da.25.target="1"
> > hint.da.26.target="2"
> > hint.da.27.target="3"
> > hint.da.28.target="4"
> > hint.da.29.target="5"
> > hint.da.30.target="6"
> > hint.da.31.target="7"

The script would be ugly and require one-offs per driver.  For example,
look at your "camcontrol devlist -v" output with regards to mps1 and
mps2.  There's no indication of what the bus #, target #, or lun #
should be because there are no disks on the controller.  I made an
educated guess based off of mps0/mps3 and previous familiarity (on the
lists -- I've never used one of these controllers) with mps(4).

Had you shown me "camcontrol devlist -v" output with only 1 controller
and 1 disk, I would have had to go purely off of what I've seen in the
past.

The behaviour could change as well, depending on firmware upgrades or
driver changes (many of the storage drivers in FreeBSD in the past 3-4
years have gone through massive changes), or even operational mode (RAID
vs. non-RAID), where the target then is always 0 but the lun #
increases, or maybe the bus number, or maybe a combination.

If someone really wants to take a stab at writing some script that does
this, be my guest, but I definitely don't.  :-)  There's just too many
one-offs or assumptions that have to be made which a human mind +
experience can do more reliably, IMO.

Because remember: the last thing you want to do is modify loader.conf
for wiring down and botch it/break it.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 03:05:28 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 6EDE9C35
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 03:05:28 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
 by mx1.freebsd.org (Postfix) with ESMTP id 46BBA182A
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 03:05:27 +0000 (UTC)
Received: from jre-mbp.elischer.org
 (ppp121-45-237-17.lns20.per1.internode.on.net [121.45.237.17])
 (authenticated bits=0)
 by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id r5A2d8Z9072116
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
 Sun, 9 Jun 2013 19:39:11 -0700 (PDT)
 (envelope-from julian@freebsd.org)
Message-ID: <51B53C4C.3030007@freebsd.org>
Date: Mon, 10 Jun 2013 10:39:08 +0800
From: Julian Elischer <julian@freebsd.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8;
 rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: Dmitry Morozovsky <marck@rinet.ru>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <20130609124603.GA35681@icarus.home.lan>
 <alpine.BSF.2.00.1306091709350.48048@woozle.rinet.ru>
In-Reply-To: <alpine.BSF.2.00.1306091709350.48048@woozle.rinet.ru>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 03:05:28 -0000

On 6/9/13 9:16 PM, Dmitry Morozovsky wrote:
> On Sun, 9 Jun 2013, Jeremy Chadwick wrote:
>
>>> what do you think about stop using precious disk or even SSD resources for
>>> /tmp?
>>>
>>> For last several (well, maybe over 10?) years I constantly use md (swap-backed)
>>> for /tmp, usually 128M in size, which is enough for most of our server needs.
>>> Some require more, but none more than 512M.  Regarding the options, we use
>>> tmpmfs_flags="-S -n -o async -b 4096 -f 512"
>> [...]

I sometimes use virtual filesystems but there are cases when I am 
looking to store HUGE amounts of trace data in /tmp and end up cursing 
and remounting a real disk partition.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 03:26:24 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 0AE27E53
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 03:26:24 +0000 (UTC)
 (envelope-from wblock@wonkity.com)
Received: from wonkity.com (wonkity.com [67.158.26.137])
 by mx1.freebsd.org (Postfix) with ESMTP id C10E818D8
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 03:26:23 +0000 (UTC)
Received: from wonkity.com (localhost [127.0.0.1])
 by wonkity.com (8.14.7/8.14.7) with ESMTP id r5A3QLJG075281;
 Sun, 9 Jun 2013 21:26:21 -0600 (MDT)
 (envelope-from wblock@wonkity.com)
Received: from localhost (wblock@localhost)
 by wonkity.com (8.14.7/8.14.7/Submit) with ESMTP id r5A3QLjg075278;
 Sun, 9 Jun 2013 21:26:21 -0600 (MDT)
 (envelope-from wblock@wonkity.com)
Date: Sun, 9 Jun 2013 21:26:21 -0600 (MDT)
From: Warren Block <wblock@wonkity.com>
To: Jeremy Chadwick <jdc@koitsu.org>
Subject: Re: ZFS and Glabel
In-Reply-To: <20130610021950.GA50356@icarus.home.lan>
Message-ID: <alpine.BSF.2.00.1306092122550.75158@wonkity.com>
References: <CALOkxuzH81UFuVZifJNxyuo6+hu9mCPB1TC91dn5fkjVLFqTKw@mail.gmail.com>
 <CALOkxuxb0HBMXOwP6Z4JjvBk+btnriCeCdhcKfmP=FyNEaseTA@mail.gmail.com>
 <CALOkxuxT9Oko1jm40GzYZbcTNu4epNgh8BoVhc-cH2UKNNbSvw@mail.gmail.com>
 <20130609065430.GA28206@icarus.home.lan>
 <CALOkxuwVDYmJwFzgqWw_QFOQwXRjB0pisvLmXCR1YUnUgWPZqA@mail.gmail.com>
 <20130610021950.GA50356@icarus.home.lan>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (wonkity.com [127.0.0.1]); Sun, 09 Jun 2013 21:26:21 -0600 (MDT)
Cc: freebsd-fs@freebsd.org, "Reed A. Cartwright" <cartwright@asu.edu>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 03:26:24 -0000

On Sun, 9 Jun 2013, Jeremy Chadwick wrote:

> On Sun, Jun 09, 2013 at 01:25:47PM -0700, Reed A. Cartwright wrote:
>> Thanks, it makes sense now.
>>
>> Would it make sense to have a script that processes the output of "cam
>> devlist -v" to produce such an example output?
>>>
...
>
> The script would be ugly and require one-offs per driver.  For example,
> look at your "camcontrol devlist -v" output with regards to mps1 and
> mps2.  There's no indication of what the bus #, target #, or lun #
> should be because there are no disks on the controller.  I made an
> educated guess based off of mps0/mps3 and previous familiarity (on the
> lists -- I've never used one of these controllers) with mps(4).
>
> Had you shown me "camcontrol devlist -v" output with only 1 controller
> and 1 disk, I would have had to go purely off of what I've seen in the
> past.

This all looks remarkably complicated and fragile compared to labels.

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 08:43:15 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 11B6025A;
 Mon, 10 Jun 2013 08:43:15 +0000 (UTC) (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id 94D0D1393;
 Mon, 10 Jun 2013 08:43:14 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r5A8hC7s051794;
 Mon, 10 Jun 2013 12:43:12 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Mon, 10 Jun 2013 12:43:12 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Julian Elischer <julian@freebsd.org>
Subject: Re: /tmp: change default to mdmfs and/or tmpfs?
In-Reply-To: <51B53C4C.3030007@freebsd.org>
Message-ID: <alpine.BSF.2.00.1306101242500.48048@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306091538490.48048@woozle.rinet.ru>
 <20130609124603.GA35681@icarus.home.lan>
 <alpine.BSF.2.00.1306091709350.48048@woozle.rinet.ru>
 <51B53C4C.3030007@freebsd.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Mon, 10 Jun 2013 12:43:12 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 08:43:15 -0000

On Mon, 10 Jun 2013, Julian Elischer wrote:

> > > > what do you think about stop using precious disk or even SSD resources
> > > > for
> > > > /tmp?
> > > > 
> > > > For last several (well, maybe over 10?) years I constantly use md
> > > > (swap-backed)
> > > > for /tmp, usually 128M in size, which is enough for most of our server
> > > > needs.
> > > > Some require more, but none more than 512M.  Regarding the options, we
> > > > use
> > > > tmpmfs_flags="-S -n -o async -b 4096 -f 512"
> > > [...]
> 
> I sometimes use virtual filesystems but there are cases when I am looking to
> store HUGE amounts of trace data in /tmp and end up cursing and remounting a
> real disk partition.

Hmm, don't /var/tmp exist for such a task?

-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 08:44:39 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id D8CF82F0
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 08:44:39 +0000 (UTC)
 (envelope-from pierre@lemazurier.fr)
Received: from mail.lemazurier.fr (mail.lemazurier.fr [62.147.151.66])
 by mx1.freebsd.org (Postfix) with ESMTP id 5557D13A9
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 08:44:39 +0000 (UTC)
Received: from [172.18.8.191] (zup50-1-88-186-33-16.fbx.proxad.net
 [88.186.33.16])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mail.lemazurier.fr (Postfix) with ESMTPSA id 20DAC23E15
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 10:44:31 +0200 (CEST)
Message-ID: <51B59257.3070500@lemazurier.fr>
Date: Mon, 10 Jun 2013 10:46:15 +0200
From: Pierre Lemazurier <pierre@lemazurier.fr>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:10.0.12) Gecko/20130116 Icedove/10.0.12
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: [ZFS] Raid 10 performance issues
References: <51B1EBD1.9010207@gmail.com> <51B1F726.7090402@lemazurier.fr>
In-Reply-To: <51B1F726.7090402@lemazurier.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 08:44:40 -0000

I add my /boot/loader.conf for more information :

zfs_load="YES"
vm.kmem_size="22528M"
vfs.zfs.arc_min="20480M"
vfs.zfs.arc_max="20480M"
vfs.zfs.prefetch_disable="0"
vfs.zfs.txg.timeout="5"
vfs.zfs.vdev.max_pending="10"
vfs.zfs.vdev.min_pending="4"
vfs.zfs.write_limit_override="0"
vfs.zfs.no_write_throttle="0"

Le 07/06/2013 17:07, Pierre Lemazurier a �crit :
> Hi, i think i suffer of write and read performance issues on my zpool.
>
> About my system and hardware :
>
> uname -a
> FreeBSD bsdnas 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec 4
> 09:23:10 UTC 2012
> root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>
> sysinfo -a : http://www.privatepaste.com/b32f34c938
>
> - 24 (4gbx6) GB DDR3 ECC :
> http://www.ec.kingston.com/ecom/configurator_new/partsinfo.asp?ktcpartno=KVR16R11D8/4HC
>
> - 14x this drive :
> http://www.wdc.com/global/products/specs/?driveID=1086&language=1
> - server :
> http://www.supermicro.com/products/system/1u/5017/sys-5017r-wrf.cfm?parts=show
>
> - CPU :
> http://ark.intel.com/fr/products/64594/Intel-Xeon-Processor-E5-2620-15M-Cache-2_00-GHz-7_20-GTs-Intel-QPI
>
> - chassis :
> http://www.supermicro.com/products/chassis/4u/847/sc847e16-rjbod1.cfm
> - HBA sas connector :
> http://www.lsi.com/products/storagecomponents/Pages/LSISAS9200-8e.aspx
> - Cable between chassis and server :
> http://www.provantage.com/supermicro-cbl-0166l~7SUPA01R.htm
>
> I use this command for test write speed :dd if=/dev/zero of=test.dd
> bs=2M count=10000
> I use this command for test read speed :dd if=test.dd of=/dev/null bs=2M
> count=10000
>
> Of course no compression on zfs dataset.
>
> Test on one of this disk format with UFS :
>
> Write :
> some gstat raising : http://www.privatepaste.com/dd31fafaa6
> speed around 140 mo/s and something like 1100 iops
> dd result : 20971520000 bytes transferred in 146.722126 secs (142933589
> bytes/sec)
>
> Read :
> I think I read on RAM (20971520000 bytes transferred in 8.813298 secs
> (2379531480 bytes/sec)).
> Then I make the test on all the drive (dd if=/dev/gpt/disk14.nop
> of=/dev/null bs=2M count=10000)
> some gstat raising : http://www.privatepaste.com/d022b7c480
> speed around 140 mo/s again an near 1100+ iops
> dd reslut : 20971520000 bytes transferred in 142.895212 secs (146761530
> bytes/sec)
>
>
> ZFS - I make my zpool on this way : http://www.privatepaste.com/e74d9cc3b9
>
> zpool status : http://www.privatepaste.com/0276801ef6
> zpool get all : http://www.privatepaste.com/74b37a2429
> zfs get all : http://www.privatepaste.com/e56f4a33f8
> zfs-stats -a : http://www.privatepaste.com/f017890aa1
> zdb : http://www.privatepaste.com/7d723c5556
>
> With this setup I hope to have near 7x more speed for write and near 14x
> for
> read than the UFS device alone. Then for be realistic, something like
> 850 mo/s for write and 1700 mo/s for read.
>
>
> ZFS � test :
>
> Write :
> gstat raising : http://www.privatepaste.com/7cefb9393a
> zpool iostat -v 1 of a fastest try : http://www.privatepaste.com/8ade4defbe
> dd result : 20971520000 bytes transferred in 54.326509 secs (386027381
> bytes/sec)
>
> 386 mo/s more than twice less than I expect.
>
>
> Read :
> I export and import the pool for limit the ARC effect. I don't know how
> to do better, I hope that sufficient.
> gstat raising : http://www.privatepaste.com/130ce43af1
> zpool iostat -v 1 : http://privatepaste.com/eb5f9d3432
> dd result : 20971520000 bytes transferred in 30.347214 secs (691052563
> bytes/sec)
>
> 690 mo/s 2,5x less than I expect.
>
>
> It's appear to not be an hardware issue, when I do a dd test of each
> whole disk at the same time with the command dd if=/dev/gpt/diskX
> of=/dev/null bs=1M count=10000, I have this gstat raising :
> http://privatepaste.com/df9f63fd4d
>
> Near 130 mo/s for each device, something like I expect.
>
> In your opinion where the problem come from ?
>
>
> Forgive me for my English, please keep easy language, i'm not realy easy
> with English.
> I can give you more information if you need.
>
> Many thanks for your help.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 10:03:43 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2AABC67F
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 10:03:43 +0000 (UTC)
 (envelope-from girgen@FreeBSD.org)
Received: from melon.pingpong.net (melon.pingpong.net [79.136.116.200])
 by mx1.freebsd.org (Postfix) with ESMTP id C1FD318B6
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 10:03:42 +0000 (UTC)
Received: from girgBook.local (citron2.pingpong.net [195.178.173.68])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by melon.pingpong.net (Postfix) with ESMTPSA id 0A4B616571;
 Mon, 10 Jun 2013 11:54:58 +0200 (CEST)
Message-ID: <51B5A277.2060904@FreeBSD.org>
Date: Mon, 10 Jun 2013 11:55:03 +0200
From: Palle Girgensohn <girgen@FreeBSD.org>
User-Agent: Postbox 3.0.8 (Macintosh/20130427)
MIME-Version: 1.0
To: Kirk McKusick <mckusick@mckusick.com>
Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?)
References: <201306022101.r52L19vg033389@chez.mckusick.com>
In-Reply-To: <201306022101.r52L19vg033389@chez.mckusick.com>
X-Enigmail-Version: 1.2.3
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, Dan Thomas <godders@gmail.com>,
 Jeff Roberson <jroberson@jroberson.net>, Julian Akehurst <julian@pingpong.se>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 10:03:43 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kirk McKusick skrev:
>> Date: Sun, 02 Jun 2013 22:35:23 +0200 From: Palle Girgensohn
>> <girgen@freebsd.org> To: Kirk McKusick <mckusick@mckusick.com> 
>> Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?) 
>> Cc: freebsd-fs@freebsd.org, Dan Thomas <godders@gmail.com>, Jeff
>> Roberson <jroberson@jroberson.net>, Julian Akehurst
>> <julian@pingpong.se>
>> 
>> --On 31 maj 2013 11.25.40 -0700 Kirk McKusick
>> <mckusick@mckusick.com> wrote:
>> 
>>> Your results are very enlightening. Especially the fact that you
>>> have to do a forcible unmount of the filesystem. What that tells
>>> me is that somehow we are getting vnodes that have phantom
>>> references. That is there is some system call where we get a
>>> reference on a vnode (vref, vget, or similar) that does not
>>> ultimately have a corresponding drop of the reference (vrele,
>>> vput, or similar). The net effect is that the file is held open
>>> despite the fact that there are no longer any connections to it.
>>> When you do the forcible unmount, the kernel walks the list of
>>> vnodes associated with the filesystem and does a vgone on each of
>>> them. That causes each to be inactivated which then triggers the
>>> release of their associated disk space. The reason that the
>>> unmount takes 20 seconds is to process all the releasing of the
>>> space. My guess is that there is an error path in some system
>>> call that is missing the vrele or vput.
>>> 
>>> Assuming that you are able to run some more tests on your test
>>> machine, the next step in narrowing down the set of code to look
>>> at is to try running your system with soft updates disabled. The
>>> idea is to find out whether the miss-matched references are in
>>> the soft updates code or are in one of the filesystem system
>>> calls themselves. To disable soft updates run the command `tunefs
>>> -n disable /pgsql' on the unmounted /pgsql filesystem. If the
>>> system then runs without the problem, I will know to search the
>>> soft updates code. If the problem persists, then I'll know to
>>> look in the system calls themselves. You may want to do some 
>>> preliminary tests to see how quickly the problem manifests
>>> itself. You can do this by running it for a short time (10
>>> minutes say) and then checking to see if you need to do a
>>> forcible unmount of the filesystem. Once you establish how long
>>> you have to run before you reliably have to do a forcible
>>> unmount, you will know how long to run the test with soft updates
>>> turned off. If you find that running with soft updates turned off
>>> makes your application run too slowly you can mount your
>>> filesystem asynchronously. Note however, that you should not run
>>> asynchronously if the data on the filesystem is critical as you
>>> may end up with an unrecoverable filesystem after a power
>>> failure or system crash. So only run asynchronously if you can
>>> afford to lose your filesystem.
>>> 
>>> Finally, it would be helpful if you could add two more commands
>>> to your diskspacecheck.sh script:
>>> 
>>> sysctl -a | egrep vnode mount -v
>>> 
>>> The first shows the vnode usage and the second shows the
>>> operational state of your filesystems.
>>> 
>>> Kirk McKusick
>> OK, I have now turned off soft updates. This is on the test server.
>> It is not as busy as the production machine, but I'll keep an eye
>> on it and will mail new results as soon as I see any evidence of
>> either that soft updates is the culprit or that it is not.
>> 
>> FWIW, I attach the script from this remount process as well, which
>> includes
>> 
>> sysctl -a | grep vnode ; mount -v.
>> 
>> Note that it is all in one script file this time.
>> 
>> Cheers, Palle
> 
> This looks good. Keep me posted.

After running for a number of days without soft updates, it seems to me
that the culprit is indeed in the soft updates code.

# df -k /pgsql; du -sk /pgsql
Filesystem  1024-blocks     Used    Avail Capacity  Mounted on
/dev/da2s1d   134763348 86339044 37643238    70%    /pgsql
86303252	/pgsql

Palle

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJRtaJ3AAoJEIhV+7FrxBJD+IkH/3FOoZ95VGE0fOWSuFIwVn8I
jvHiJ6qTx0zh17pZNnc+G0UpU5fHxCazD1yT6yCwfkWebWKXELXtfQMeZUMGi0AX
e94P0HJ2O4RQSMHC1rlWSLUidAB6m1ZtAtpXzgziB9P/Jonk78uFqRcTmZyMycsy
pxPFHsbywsjJm9FLF4ZuhiSPX57tbAKLQM3HYDMFQ/rHPJiBlkx7VVeON6svtmMO
bRZWnQTUXUAAMT1NDUEL8opGAO2S72+hFBiCjJsgS22SSq7KIMzAlJqq01L2svhH
o7KNAkN6lIMuJS9B2idjJWLVXG/vNQ1QBOha0VY80fIQYSYeZt25EGlXf3rYL6Y=
=Zmu2
-----END PGP SIGNATURE-----

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 11:06:47 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id CC96FFD5
 for <freebsd-fs@FreeBSD.org>; Mon, 10 Jun 2013 11:06:47 +0000 (UTC)
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id BE9321C81
 for <freebsd-fs@FreeBSD.org>; Mon, 10 Jun 2013 11:06:47 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r5AB6lL8096938
 for <freebsd-fs@FreeBSD.org>; Mon, 10 Jun 2013 11:06:47 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r5AB6ljf096936
 for freebsd-fs@FreeBSD.org; Mon, 10 Jun 2013 11:06:47 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 10 Jun 2013 11:06:47 GMT
Message-Id: <201306101106.r5AB6ljf096936@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
 owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@freebsd.org>
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 11:06:47 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/178999  fs         [zfs] dev entries for cloned zvol don't show up until 
o bin/178996   fs         [zfs] [patch] error in message with zfs mount -> there
o kern/178854  fs         [ufs] FreeBSD kernel crash in UFS
o kern/178713  fs         [nfs] [patch] Correct WebNFS support in NFS server and
o kern/178412  fs         [smbfs] Coredump when smbfs mounted
o kern/178388  fs         [zfs] [patch] allow up to 8MB recordsize
o kern/178349  fs         [zfs] zfs scrub on deduped data could be much less see
o kern/178329  fs         [zfs] extended attributes leak
o kern/178238  fs         [nullfs] nullfs don't release i-nodes on unlink.
f kern/178231  fs         [nfs] 8.3 nfsv4 client reports "nfsv4 client/server pr
o kern/178103  fs         [kernel] [nfs] [patch] Correct support of index files 
o kern/177985  fs         [zfs] disk usage problem when copying from one zfs dat
o kern/177971  fs         [nfs] FreeBSD 9.1 nfs client dirlist problem w/ nfsv3,
o kern/177966  fs         [zfs] resilver completes but subsequent scrub reports 
o kern/177658  fs         [ufs] FreeBSD panics after get full filesystem with uf
o kern/177536  fs         [zfs] zfs livelock (deadlock) with high write-to-disk 
o kern/177445  fs         [hast] HAST panic
f kern/177335  fs         [nfs] [panic] Sleeping on "vmopar" with the following 
o kern/177240  fs         [zfs] zpool import failed with state UNAVAIL but all d
o kern/176978  fs         [zfs] [panic] zfs send -D causes "panic: System call i
o kern/176857  fs         [softupdates] [panic] 9.1-RELEASE/amd64/GENERIC panic 
o bin/176253   fs         zpool(8): zfs pool indentation is misleading/wrong
o kern/176141  fs         [zfs] sharesmb=on makes errors for sharenfs, and still
o kern/175950  fs         [zfs] Possible deadlock in zfs after long uptime
o kern/175897  fs         [zfs] operations on readonly zpool hang
o kern/175179  fs         [zfs] ZFS may attach wrong device on move
o kern/175071  fs         [ufs] [panic] softdep_deallocate_dependencies: unrecov
o kern/174372  fs         [zfs] Pagefault appears to be related to ZFS
o kern/174315  fs         [zfs] chflags uchg not supported
o kern/174310  fs         [zfs] root point mounting broken on CURRENT with multi
o kern/174279  fs         [ufs] UFS2-SU+J journal and filesystem corruption
o kern/174060  fs         [ext2fs] Ext2FS system crashes (buffer overflow?)
o kern/173830  fs         [zfs] Brain-dead simple change to ZFS error descriptio
o kern/173718  fs         [zfs] phantom directory in zraid2 pool
f kern/173657  fs         [nfs] strange UID map with nfsuserd
o kern/173363  fs         [zfs] [panic] Panic on 'zpool replace' on readonly poo
o kern/173136  fs         [unionfs] mounting above the NFS read-only share panic
o kern/172942  fs         [smbfs] Unmounting a smb mount when the server became 
o kern/172348  fs         [unionfs] umount -f of filesystem in use with readonly
o kern/172334  fs         [unionfs] unionfs permits recursive union mounts; caus
o kern/171626  fs         [tmpfs] tmpfs should be noisier when the requested siz
o kern/171415  fs         [zfs] zfs recv fails with "cannot receive incremental 
o kern/170945  fs         [gpt] disk layout not portable between direct connect 
o bin/170778   fs         [zfs] [panic] FreeBSD panics randomly
o kern/170680  fs         [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA
o kern/170497  fs         [xfs][panic] kernel will panic whenever I ls a mounted
o kern/169945  fs         [zfs] [panic] Kernel panic while importing zpool (afte
o kern/169480  fs         [zfs] ZFS stalls on heavy I/O
o kern/169398  fs         [zfs] Can't remove file with permanent error
o kern/169339  fs         panic while " : > /etc/123"
o kern/169319  fs         [zfs] zfs resilver can't complete
o kern/168947  fs         [nfs] [zfs] .zfs/snapshot directory is messed up when 
o kern/168942  fs         [nfs] [hang] nfsd hangs after being restarted (not -HU
o kern/168158  fs         [zfs] incorrect parsing of sharenfs options in zfs (fs
o kern/167979  fs         [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste
o kern/167977  fs         [smbfs] mount_smbfs results are differ when utf-8 or U
o kern/167688  fs         [fusefs] Incorrect signal handling with direct_io
o kern/167685  fs         [zfs] ZFS on USB drive prevents shutdown / reboot
o kern/167612  fs         [portalfs] The portal file system gets stuck inside po
o kern/167272  fs         [zfs] ZFS Disks reordering causes ZFS to pick the wron
o kern/167260  fs         [msdosfs] msdosfs disk was mounted the second time whe
o kern/167109  fs         [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene
o kern/167105  fs         [nfs] mount_nfs can not handle source exports wiht mor
o kern/167067  fs         [zfs] [panic] ZFS panics the server
o kern/167065  fs         [zfs] boot fails when a spare is the boot disk
o kern/167048  fs         [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF
o kern/166912  fs         [ufs] [panic] Panic after converting Softupdates to jo
o kern/166851  fs         [zfs] [hang] Copying directory from the mounted UFS di
o kern/166477  fs         [nfs] NFS data corruption.
o kern/165950  fs         [ffs] SU+J and fsck problem
o kern/165521  fs         [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31
o kern/165392  fs         Multiple mkdir/rmdir fails with errno 31
o kern/165087  fs         [unionfs] lock violation in unionfs
o kern/164472  fs         [ufs] fsck -B panics on particular data inconsistency
o kern/164370  fs         [zfs] zfs destroy for snapshot fails on i386 and sparc
o kern/164261  fs         [nullfs] [patch] fix panic with NFS served from NULLFS
o kern/164256  fs         [zfs] device entry for volume is not created after zfs
o kern/164184  fs         [ufs] [panic] Kernel panic with ufs_makeinode
o kern/163801  fs         [md] [request] allow mfsBSD legacy installed in 'swap'
o kern/163770  fs         [zfs] [hang] LOR between zfs&syncer + vnlru leading to
o kern/163501  fs         [nfs] NFS exporting a dir and a subdir in that dir to 
o kern/162944  fs         [coda] Coda file system module looks broken in 9.0
o kern/162860  fs         [zfs] Cannot share ZFS filesystem to hosts with a hyph
o kern/162751  fs         [zfs] [panic] kernel panics during file operations
o kern/162591  fs         [nullfs] cross-filesystem nullfs does not work as expe
o kern/162519  fs         [zfs] "zpool import" relies on buggy realpath() behavi
o kern/161968  fs         [zfs] [hang] renaming snapshot with -r including a zvo
o kern/161864  fs         [ufs] removing journaling from UFS partition fails on 
o bin/161807   fs         [patch] add option for explicitly specifying metadata 
o kern/161579  fs         [smbfs] FreeBSD sometimes panics when an smb share is 
o kern/161533  fs         [zfs] [panic] zfs receive panic: system ioctl returnin
o kern/161438  fs         [zfs] [panic] recursed on non-recursive spa_namespace_
o kern/161424  fs         [nullfs] __getcwd() calls fail when used on nullfs mou
o kern/161280  fs         [zfs] Stack overflow in gptzfsboot
o kern/161205  fs         [nfs] [pfsync] [regression] [build] Bug report freebsd
o kern/161169  fs         [zfs] [panic] ZFS causes kernel panic in dbuf_dirty
o kern/161112  fs         [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3
o kern/160893  fs         [zfs] [panic] 9.0-BETA2 kernel panic
f kern/160860  fs         [ufs] Random UFS root filesystem corruption with SU+J 
o kern/160801  fs         [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o
o kern/160790  fs         [fusefs] [panic] VPUTX: negative ref count with FUSE
o kern/160777  fs         [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo
o kern/160706  fs         [zfs] zfs bootloader fails when a non-root vdev exists
o kern/160591  fs         [zfs] Fail to boot on zfs root with degraded raidz2 [r
o kern/160410  fs         [smbfs] [hang] smbfs hangs when transferring large fil
o kern/160283  fs         [zfs] [patch] 'zfs list' does abort in make_dataset_ha
o kern/159930  fs         [ufs] [panic] kernel core
o kern/159402  fs         [zfs][loader] symlinks cause I/O errors
o kern/159357  fs         [zfs] ZFS MAXNAMELEN macro has confusing name (off-by-
o kern/159356  fs         [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s
o kern/159351  fs         [nfs] [patch] - divide by zero in mountnfs()
o kern/159251  fs         [zfs] [request]: add FLETCHER4 as DEDUP hash option
o kern/159077  fs         [zfs] Can't cd .. with latest zfs version
o kern/159048  fs         [smbfs] smb mount corrupts large files
o kern/159045  fs         [zfs] [hang] ZFS scrub freezes system
o kern/158839  fs         [zfs] ZFS Bootloader Fails if there is a Dead Disk
o kern/158802  fs         amd(8) ICMP storm and unkillable process.
o kern/158231  fs         [nullfs] panic on unmounting nullfs mounted over ufs o
f kern/157929  fs         [nfs] NFS slow read
o kern/157399  fs         [zfs] trouble with: mdconfig force delete && zfs strip
o kern/157179  fs         [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov
o kern/156797  fs         [zfs] [panic] Double panic with FreeBSD 9-CURRENT and 
o kern/156781  fs         [zfs] zfs is losing the snapshot directory,
p kern/156545  fs         [ufs] mv could break UFS on SMP systems
o kern/156193  fs         [ufs] [hang] UFS snapshot hangs && deadlocks processes
o kern/156039  fs         [nullfs] [unionfs] nullfs + unionfs do not compose, re
o kern/155615  fs         [zfs] zfs v28 broken on sparc64 -current
o kern/155587  fs         [zfs] [panic] kernel panic with zfs
p kern/155411  fs         [regression] [8.2-release] [tmpfs]: mount: tmpfs : No 
o kern/155199  fs         [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104   fs         [zfs][patch] use /dev prefix by default when importing
o kern/154930  fs         [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828  fs         [msdosfs] Unable to create directories on external USB
o kern/154491  fs         [smbfs] smb_co_lock: recursive lock for object 1
p kern/154228  fs         [md] md getting stuck in wdrain state
o kern/153996  fs         [zfs] zfs root mount error while kernel is not located
o kern/153753  fs         [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716  fs         [zfs] zpool scrub time remaining is incorrect
o kern/153695  fs         [patch] [zfs] Booting from zpool created on 4k-sector 
o kern/153680  fs         [xfs] 8.1 failing to mount XFS partitions
o kern/153418  fs         [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351  fs         [zfs] locking directories/files in ZFS
o bin/153258   fs         [patch][zfs] creating ZVOLs requires `refreservation' 
s kern/153173  fs         [zfs] booting from a gzip-compressed dataset doesn't w
o bin/153142   fs         [zfs] ls -l outputs `ls: ./.zfs: Operation not support
o kern/153126  fs         [zfs] vdev failure, zpool=peegel type=vdev.too_small
o kern/152022  fs         [nfs] nfs service hangs with linux client [regression]
o kern/151942  fs         [zfs] panic during ls(1) zfs snapshot directory
o kern/151905  fs         [zfs] page fault under load in /sbin/zfs
o bin/151713   fs         [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648  fs         [zfs] disk wait bug
o kern/151629  fs         [fs] [patch] Skip empty directory entries during name 
o kern/151330  fs         [zfs] will unshare all zfs filesystem after execute a 
o kern/151326  fs         [nfs] nfs exports fail if netgroups contain duplicate 
o kern/151251  fs         [ufs] Can not create files on filesystem with heavy us
o kern/151226  fs         [zfs] can't delete zfs snapshot
o kern/150503  fs         [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501  fs         [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390  fs         [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336  fs         [nfs] mountd/nfsd became confused; refused to reload n
o kern/149208  fs         mksnap_ffs(8) hang/deadlock
o kern/149173  fs         [patch] [zfs] make OpenSolaris <sys/nvpair.h> installa
o kern/149015  fs         [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014  fs         [zfs] [patch] declarations in ZFS libraries/utilities 
o kern/149013  fs         [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504  fs         [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490  fs         [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368  fs         [zfs] ZFS hanging forever on 8.1-PRERELEASE
o kern/148138  fs         [zfs] zfs raidz pool commands freeze
o kern/147903  fs         [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881  fs         [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147420  fs         [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt 
o kern/146941  fs         [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786  fs         [zfs] zpool import hangs with checksum errors
o kern/146708  fs         [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528  fs         [zfs] Severe memory leak in ZFS on i386
o kern/146502  fs         [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712  fs         [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411  fs         [xfs] [panic] Kernel panics shortly after mounting an 
f bin/145309   fs         bsdlabel: Editing disk label invalidates the whole dev
o kern/145272  fs         [zfs] [panic] Panic during boot when accessing zfs on 
o kern/145246  fs         [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238  fs         [zfs] [panic] kernel panic on zpool clear tank
o kern/145229  fs         [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189  fs         [nfs] nfsd performs abysmally under load
o kern/144929  fs         [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447  fs         [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416  fs         [panic] Kernel panic on online filesystem optimization
s kern/144415  fs         [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234  fs         [zfs] Cannot boot machine with recent gptzfsboot code 
o kern/143825  fs         [nfs] [panic] Kernel panic on NFS client
o bin/143572   fs         [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212  fs         [nfs] NFSv4 client strange work ...
o kern/143184  fs         [zfs] [lor] zfs/bufwait LOR
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two 
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140661  fs         [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640  fs         [zfs] snapshot crash
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
p bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/138662  fs         [panic] ffs_blkfree: freeing free block
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
p kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
p kern/133174  fs         [msdosfs] [patch] msdosfs must support multibyte inter
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor"  on the mount poin
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: 
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/127787  fs         [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
o bin/127270   fs         fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895  fs         [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o kern/118318  fs         [nfs] NFS server hangs under special circumstances
o bin/118249   fs         [ufs] mv(1): moving a directory changes its mtime
o kern/118126  fs         [nfs] [patch] Poor NFS server write performance
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o kern/117954  fs         [ufs] dirhash on very large directories blocks the mac
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931  fs         lack of fsck_cd9660 prevents mounting iso images with 
o kern/116583  fs         [ffs] [hang] System freezes for short time when using 
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala 
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show 
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created 
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107  fs         [ufs] left-over fsck_snapshot after unfinished backgro
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear 
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
s bin/97498    fs         [request] newfs(8) has no option to clear the first 12
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [cd9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o bin/94810    fs         fsck(8) incorrectly reports 'file system marked clean'
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time 
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88555   fs         [panic] ffs_blkfree: freeing free frag on AMD 64
o bin/87966    fs         [patch] newfs(8): introduce -A flag for newfs to enabl
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o bin/85494    fs         fsck_ffs: unchecked use of cg_inosused macro etc.
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o bin/74779    fs         Background-fsck checks one filesystem twice and omits 
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o bin/70600    fs         fsck(8) throws files away when it can't grow lost+foun
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o bin/27687    fs         fsck(8) wrapper is not properly passing options to fsc
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

320 problems total.


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 11:12:57 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id B032CC89
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 11:12:57 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay4-d.mail.gandi.net (relay4-d.mail.gandi.net
 [217.70.183.196])
 by mx1.freebsd.org (Postfix) with ESMTP id 4FC3A1E9D
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 11:12:57 +0000 (UTC)
Received: from mfilter24-d.gandi.net (mfilter24-d.gandi.net [217.70.178.152])
 by relay4-d.mail.gandi.net (Postfix) with ESMTP id 55159172094;
 Mon, 10 Jun 2013 13:12:40 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter24-d.gandi.net
Received: from relay4-d.mail.gandi.net ([217.70.183.196])
 by mfilter24-d.gandi.net (mfilter24-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id E5B2Aw3VxnIO; Mon, 10 Jun 2013 13:12:38 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay4-d.mail.gandi.net (Postfix) with ESMTPSA id BCABB172092;
 Mon, 10 Jun 2013 13:12:37 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id C9C8873A1C; Mon, 10 Jun 2013 04:12:35 -0700 (PDT)
Date: Mon, 10 Jun 2013 04:12:35 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Pierre Lemazurier <pierre@lemazurier.fr>
Subject: Re: [ZFS] Raid 10 performance issues
Message-ID: <20130610111235.GB61858@icarus.home.lan>
References: <51B1EBD1.9010207@gmail.com> <51B1F726.7090402@lemazurier.fr>
 <51B59257.3070500@lemazurier.fr>
MIME-Version: 1.0
Content-Type: text/plain; charset=unknown-8bit
Content-Disposition: inline
In-Reply-To: <51B59257.3070500@lemazurier.fr>
User-Agent: Mutt/1.5.21 (2010-09-15)
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 11:12:57 -0000

On Mon, Jun 10, 2013 at 10:46:15AM +0200, Pierre Lemazurier wrote:
> I add my /boot/loader.conf for more information :
>=20
> zfs_load=3D"YES"
> vm.kmem_size=3D"22528M"
> vfs.zfs.arc_min=3D"20480M"
> vfs.zfs.arc_max=3D"20480M"
> vfs.zfs.prefetch_disable=3D"0"
> vfs.zfs.txg.timeout=3D"5"
> vfs.zfs.vdev.max_pending=3D"10"
> vfs.zfs.vdev.min_pending=3D"4"
> vfs.zfs.write_limit_override=3D"0"
> vfs.zfs.no_write_throttle=3D"0"

Please remove these variables:

vm.kmem_size=3D"22528M"
vfs.zfs.arc_min=3D"20480M"

You do not need to set vm.kmem_size any longer (that was addressed long
ago, during the mid-days of stable/8), and you should let the ARC shrink
if need be (my concern here is that possibly limiting the lower end of
the ARC size may be triggering some other portions of FreeBSD's VM or
ZFS to behave oddly.  No proof/evidence, just guesswork on my part).

At bare minimum, *definitely* remove the vm.kmem_size setting.

Next, please remove the following variables, as these serve no purpose
(they are the defaults in 9.1-RELEASE):

vfs.zfs.prefetch_disable=3D"0"
vfs.zfs.txg.timeout=3D"5"
vfs.zfs.vdev.max_pending=3D"10"
vfs.zfs.vdev.min_pending=3D"4"
vfs.zfs.write_limit_override=3D"0"
vfs.zfs.no_write_throttle=3D"0"

So in short all you should have in your loader.conf is:

zfs_load=3D"yes"
vfs.zfs.arc_max=3D"20480M"

> Le 07/06/2013 17:07, Pierre Lemazurier a =E9crit :
> >Hi, i think i suffer of write and read performance issues on my zpool.
> >
> >About my system and hardware :
> >
> >uname -a
> >FreeBSD bsdnas 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec 4
> >09:23:10 UTC 2012
> >root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
> >
> >sysinfo -a : http://www.privatepaste.com/b32f34c938

Going forward, I would recommend also providing "dmesg".  It is a lot
easier to read to most of us.

All I can work out is that your storage controller is mps(4), except I
can't see any of the important details about it.  dmesg would give that,
not this weird "sysinfo" thing.

I would also like to request "pciconf -lvbc" output.

> >- 24 (4gbx6) GB DDR3 ECC :
> >http://www.ec.kingston.com/ecom/configurator_new/partsinfo.asp?ktcpart=
no=3DKVR16R11D8/4HC
> >
> >- 14x this drive :
> >http://www.wdc.com/global/products/specs/?driveID=3D1086&language=3D1

Worth pointing out for readers:

These are 4096-byte sector 2TB WD Red drives.

> >- server :
> >http://www.supermicro.com/products/system/1u/5017/sys-5017r-wrf.cfm?pa=
rts=3Dshow
> >
> >- CPU :
> >http://ark.intel.com/fr/products/64594/Intel-Xeon-Processor-E5-2620-15=
M-Cache-2_00-GHz-7_20-GTs-Intel-QPI
> >
> >- chassis :
> >http://www.supermicro.com/products/chassis/4u/847/sc847e16-rjbod1.cfm
> >- HBA sas connector :
> >http://www.lsi.com/products/storagecomponents/Pages/LSISAS9200-8e.aspx
> >- Cable between chassis and server :
> >http://www.provantage.com/supermicro-cbl-0166l~7SUPA01R.htm
> >
> >I use this command for test write speed :dd if=3D/dev/zero of=3Dtest.d=
d
> >bs=3D2M count=3D10000
> >I use this command for test read speed :dd if=3Dtest.dd of=3D/dev/null=
 bs=3D2M
> >count=3D10000
> >
> >Of course no compression on zfs dataset.
> >
> >Test on one of this disk format with UFS :
> >
> >Write :
> >some gstat raising : http://www.privatepaste.com/dd31fafaa6
> >speed around 140 mo/s and something like 1100 iops
> >dd result : 20971520000 bytes transferred in 146.722126 secs (14293358=
9
> >bytes/sec)
> >
> >Read :
> >I think I read on RAM (20971520000 bytes transferred in 8.813298 secs
> >(2379531480 bytes/sec)).
> >Then I make the test on all the drive (dd if=3D/dev/gpt/disk14.nop
> >of=3D/dev/null bs=3D2M count=3D10000)
> >some gstat raising : http://www.privatepaste.com/d022b7c480
> >speed around 140 mo/s again an near 1100+ iops
> >dd reslut : 20971520000 bytes transferred in 142.895212 secs (14676153=
0
> >bytes/sec)

Looks about right for a single WD Red 2TB drive.  Important: THIS IS A
SINGLE DRIVE.

> >ZFS - I make my zpool on this way : http://www.privatepaste.com/e74d9c=
c3b9

Looks good to me.  This is effectively RAID-10 as you said (a stripe of
mirrors).

> >zpool status : http://www.privatepaste.com/0276801ef6
> >zpool get all : http://www.privatepaste.com/74b37a2429
> >zfs get all : http://www.privatepaste.com/e56f4a33f8
> >zfs-stats -a : http://www.privatepaste.com/f017890aa1
> >zdb : http://www.privatepaste.com/7d723c5556
> >
> >With this setup I hope to have near 7x more speed for write and near 1=
4x
> >for
> >read than the UFS device alone. Then for be realistic, something like
> >850 mo/s for write and 1700 mo/s for read.

Your hopes may be shattered by the reality of how controllers behave and
operate (performance-wise) as well as many other things, including some
ZFS tunables.  We shall see.

> >ZFS =96 test :
> >
> >Write :
> >gstat raising : http://www.privatepaste.com/7cefb9393a
> >zpool iostat -v 1 of a fastest try : http://www.privatepaste.com/8ade4=
defbe
> >dd result : 20971520000 bytes transferred in 54.326509 secs (386027381
> >bytes/sec)
> >
> >386 mo/s more than twice less than I expect.

One thing to be aware of: while the dd took 54 seconds, the I/O to the
pool probably continued for long after that.  Your average speed to each
disk at that time was (just estimating it here) ~55MBytes/second.

I would assume what you're seeing above is probably the speed between
/dev/zero and the ZFS ARC, with (of course) the controller and driver in
the way.

We know that your disks can do about 110-140MBytes/second each, so the
performance hit has got to be in one of the following places:

1. ZFS itself,
2. Controller, controller driver (mps(4)), or controller firmware,
3. On-die MCH (memory controller)
4. PCIe bus speed limitations or other whatnots.

The place to start is with #1, ZFS.

See the bottom of my mail for advice.

> >Read :
> >I export and import the pool for limit the ARC effect. I don't know ho=
w
> >to do better, I hope that sufficient.

You could have checked using "top -b" (before and after export); look
for the "ARC" line.

I tend to just reboot the system, but export should result in a full
pending I/O flush (from ARC, etc.) to all the devices.  I would do this
and wait about 15 seconds + check with gstat before doing more
performance tests.

> >gstat raising : http://www.privatepaste.com/130ce43af1
> >zpool iostat -v 1 : http://privatepaste.com/eb5f9d3432
> >dd result : 20971520000 bytes transferred in 30.347214 secs (691052563
> >bytes/sec)
> >690 mo/s 2,5x less than I expect.
> >
> >
> >It's appear to not be an hardware issue, when I do a dd test of each
> >whole disk at the same time with the command dd if=3D/dev/gpt/diskX
> >of=3D/dev/null bs=3D1M count=3D10000, I have this gstat raising :
> >http://privatepaste.com/df9f63fd4d
> >
> >Near 130 mo/s for each device, something like I expect.

You're thinking of hardware in too simply a fashion -- if only it were
that simple.

> >In your opinion where the problem come from ?

Not enough information at this time to narrow down where the issue is.

Things to try:

1. Start with the initial loader.conf modifications I stated.  The
vm.kmem_size removal may help.

2. Possibly trying vfs.zfs.no_write_throttle=3D"1" in loader.conf +
rebooting + re-doing this test.  What that tunable does:

https://blogs.oracle.com/roch/entry/the_new_zfs_write_throttle

You can also Google "vfs.zfs.no_write_throttle" and see that it's been
discussed quite a bit, including some folks saying performance
tremendously increases when they set this to 1. =20

3. Given the massive size of your disk array and how much memory you
have, you may also want to consider adjusting some of these (possibly
increasing vfs.zfs.txg.timeout to make I/O flushing to your disks happen
*less* often; I haven't tinkered with the other two):

vfs.zfs.txg.timeout=3D"5"
vfs.zfs.vdev.max_pending=3D"10"
vfs.zfs.vdev.min_pending=3D"4"

These also come to mind (these are the defaults):

vfs.zfs.write_limit_max=3D"1069071872"
vfs.zfs.write_limit_min=3D"33554432"

sysctl -d will give you descriptions of these.  I have never had to tune
any of these, however, but that's also because the pools I've built have
consisted of much smaller numbers of disks (3 or 4 at most).  I am also
used to ahci(4) and have avoided all other controllers for a multitude
of reasons (not saying that's the cause of your problem here, just
saying that's the stance I've chosen to take).

You might also try limiting your ARC maximum (vfs.zfs.arc_max) to
something smaller -- say, 8GBytes.  See if that has an effect.

4. "sysctl -a | grep zfs" is a very useful piece of information that you
should do along with "gstat" and "zpool iostat -v".  The counters and
information shown there are very, very helpful a lot of the time.  There
are particular ones that indicate certain performance-hindering
scenarios.

5. Your "UFS tests" only tested a single disk, while your ZFS tests
tested 14 disks in a RAID-10-like fashion.  You could try reproducing
the RAID-10 setup using gvinum(8) and use UFS and see what sort of
performance you get there.

6. Try re-doing the tests but with less drives involved -- say, 6
instead of 14.  See if the throughput to each drive is increased
compared to with 14 drives.

In general, "profiling" ZFS like this is tricky and requires folks who
are very much in-the-know and know how to go about accomplishing this
task.  Others more familiar with how to do this may need to step up to
the plate, but no support/response is guaranteed (if you need that, try
Solaris).

--=20
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 13:13:53 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id ABC907CE;
 Mon, 10 Jun 2013 13:13:53 +0000 (UTC) (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id 22B791758;
 Mon, 10 Jun 2013 13:13:52 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r5ADDi5a072378;
 Mon, 10 Jun 2013 17:13:44 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Mon, 10 Jun 2013 17:13:44 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: freebsd-fs@FreeBSD.org
Subject: hast: can't restore after disk failure
Message-ID: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Mon, 10 Jun 2013 17:13:45 +0400 (MSK)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 13:13:53 -0000

Dear colleagues,


stable/9 

FreeBSD cthulhu3 9.1-STABLE-NEWCARP FreeBSD 9.1-STABLE-NEWCARP #6 
r251443M: Thu Jun  6 02:54:36 MSK 2013

ada1 failed and has been replaced. gpart created. But I can't insert new disk 
into hast

root@cthulhu3:/# hastctl status
Name    Status   Role           Components
d0      complete secondary      /dev/ada0p1     cthulhu4
d1      -        init           /dev/ada1p1     cthulhu4
d2      complete secondary      /dev/ada2p1     cthulhu4
d3      complete secondary      /dev/ada3p1     cthulhu4
zil3    complete secondary      /dev/ada4p1     cthulhu4
zil4    complete secondary      /dev/ada4p2     cthulhu4

root@cthulhu3:/# hastctl role secondary d1
root@cthulhu3:/# hastctl list d1
d1:
  role: secondary
  provname: d1
  localpath: /dev/ada1p1
  extentsize: 0 (0B)
  keepdirty: 0
  remoteaddr: cthulhu4
  replication: memsync
  dirty: 0 (0B)
  statistics:
    reads: 0
    writes: 0
    deletes: 0
    flushes: 0
    activemap updates: 0
    local errors: read: 0, write: 0, delete: 0, flush: 0
root@cthulhu3:/# tail -2 /var/log/console.log
Jun 10 16:56:06 <console.info> cthulhu3 kernel: Jun 10 16:56:06 <daemon.err> 
cthulhu3 hastd[14379]: [d1] (secondary) Unable to read metadata from 
/dev/ada1p1: No such file or directory.
Jun 10 16:56:11 <console.info> cthulhu3 kernel: Jun 10 16:56:11 <daemon.err> 
cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
(pid=14379, exitcode=66).
Jun 10 16:56:16 <console.info> cthulhu3 kernel: Jun 10 16:56:16 <daemon.err> 
cthulhu3 hastd[14380]: [d1] (secondary) Unable to read metadata from 
/dev/ada1p1: No such file or directory.
Jun 10 16:56:20 <console.info> cthulhu3 kernel: Jun 10 16:56:20 <daemon.err> 
cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
(pid=14380, exitcode=66).

Any hints? Thanks!


-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 17:29:04 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 15103B89;
 Mon, 10 Jun 2013 17:29:04 +0000 (UTC)
 (envelope-from br@mail.bsdpad.com)
Received: from mail.bsdpad.com (mail.bsdpad.com [109.107.176.56])
 by mx1.freebsd.org (Postfix) with ESMTP id C34021762;
 Mon, 10 Jun 2013 17:29:03 +0000 (UTC)
Received: from mail.bsdpad.com ([109.107.176.56])
 by mail.bsdpad.com with smtp (Exim 4.80.1 (FreeBSD))
 (envelope-from <br@mail.bsdpad.com>)
 id 1Um5Gw-0002dy-Rv; Mon, 10 Jun 2013 16:48:46 +0000
Received: by mail.bsdpad.com (nbSMTP-1.00) for uid 1001
 br@mail.bsdpad.com; Mon, 10 Jun 2013 16:48:46 +0000 (UTC)
Date: Mon, 10 Jun 2013 16:48:46 +0000
From: Ruslan Bukin <br@bsdpad.com>
To: Steve Wills <swills@FreeBSD.org>
Subject: Re: dev entries for cloned zvol don't show up until after reboot
Message-ID: <20130610164846.GA10127@mail.bsdpad.com>
References: <8ea8b9c8074fd122f78c5eaa3b289805.squirrel@mouf.net>
 <51A2B533.8030504@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <51A2B533.8030504@FreeBSD.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 17:29:04 -0000

On Mon, May 27, 2013 at 01:21:55AM +0000, Steve Wills wrote:
> On 05/24/13 17:56, Steve Wills wrote:
> > Hi,
> > 
> > I've noticed that if I make zvol, create a snapshot of it, then clone
> > that, the /dev/zvol/* entries for it don't show up until after I reboot.
> > This is on r250925. Is this a known bug?
> 
> To add a bit more detail to this, the steps are:
> 
> zfs create -V 1G pool/somevol
> ls /dev/zvol/pool # witness somevol entries
> zfs create pool/somevol@somesnap
> ls /dev/zvol/pool # witness no new entries
> zfs clone pool/somvol@somesnap pool/anothervol
> ls /dev/zvol/pool # again witness no new entries
> reboot
> ls /dev/zvol/pool # witness missing entries appearing
> 
> I'll go ahead and submit a PR too in case that helps.


this patch for 9.1-stable works for me

--- zfs_ioctl.c	2013-06-09 23:54:22.386708932 +0400
+++ zfs_ioctl.c	2013-06-10 00:21:58.161708460 +0400
@@ -3299,6 +3299,12 @@
   	   if (error != 0)
 	      	     (void) dsl_destroy_head(fsname);
 		     }
+
+#ifdef __FreeBSD__
+        if (error == 0)
+                zvol_create_minors(fsname);
+#endif
+
	return (error);
 }

-Ruslan


From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 20:16:57 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2C4AC655
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 20:16:57 +0000 (UTC)
 (envelope-from to.my.trociny@gmail.com)
Received: from mail-bk0-x22f.google.com (mail-bk0-x22f.google.com
 [IPv6:2a00:1450:4008:c01::22f])
 by mx1.freebsd.org (Postfix) with ESMTP id B44701F17
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 20:16:56 +0000 (UTC)
Received: by mail-bk0-f47.google.com with SMTP id jg1so3326684bkc.20
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 13:16:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=sender:date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=tLSLZFEJhnbdtPtOrDcAzRLa7J4sWOPNEVVfvgA0L0U=;
 b=WkgLgEmsrV+AhrpNwba245wtGAgcZe6dxgIPnrz10HbX36JyhorvD0SaMDDdBb5B7J
 t7q0Hs+omFM6eIhtR5d0eq6fpfz2lJFLF++w4FfmAHziwWNDlUSe4L8w5XRhhVEopMoB
 nYpB3QggbyreKsoDsnMb8dr6qbW3a8GOsy9cba1Wsx/JowgPxGf1zolyvsuXMSBzm72c
 5fEWRaqDz9cuS7yqpWM71lyb+LV/ME1Qt71ji2Ds/v6HjuFrNiol3I5YMd+u+ORazwSx
 zSCE6DJD1eVpw4RtMFKoWw/c7EMKwkK2H9MslAmIQYG9aYa8tKdv6O5eXg0yc1DVkWhW
 l+Zw==
X-Received: by 10.204.71.77 with SMTP id g13mr1767464bkj.50.1370895415702;
 Mon, 10 Jun 2013 13:16:55 -0700 (PDT)
Received: from localhost ([178.150.115.244])
 by mx.google.com with ESMTPSA id og1sm4474296bkb.16.2013.06.10.13.16.54
 for <multiple recipients>
 (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
 Mon, 10 Jun 2013 13:16:55 -0700 (PDT)
Sender: Mikolaj Golub <to.my.trociny@gmail.com>
Date: Mon, 10 Jun 2013 23:16:51 +0300
From: Mikolaj Golub <trociny@FreeBSD.org>
To: Dmitry Morozovsky <marck@rinet.ru>
Subject: Re: hast: can't restore after disk failure
Message-ID: <20130610201650.GA2823@gmail.com>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 20:16:57 -0000

On Mon, Jun 10, 2013 at 05:13:44PM +0400, Dmitry Morozovsky wrote:
> Dear colleagues,
> 
> 
> stable/9 
> 
> FreeBSD cthulhu3 9.1-STABLE-NEWCARP FreeBSD 9.1-STABLE-NEWCARP #6 
> r251443M: Thu Jun  6 02:54:36 MSK 2013
> 
> ada1 failed and has been replaced. gpart created. But I can't insert new disk 
> into hast
> 
> root@cthulhu3:/# hastctl status
> Name    Status   Role           Components
> d0      complete secondary      /dev/ada0p1     cthulhu4
> d1      -        init           /dev/ada1p1     cthulhu4
> d2      complete secondary      /dev/ada2p1     cthulhu4
> d3      complete secondary      /dev/ada3p1     cthulhu4
> zil3    complete secondary      /dev/ada4p1     cthulhu4
> zil4    complete secondary      /dev/ada4p2     cthulhu4
> 
> root@cthulhu3:/# hastctl role secondary d1
> root@cthulhu3:/# hastctl list d1
> d1:
>   role: secondary
>   provname: d1
>   localpath: /dev/ada1p1
>   extentsize: 0 (0B)
>   keepdirty: 0
>   remoteaddr: cthulhu4
>   replication: memsync
>   dirty: 0 (0B)
>   statistics:
>     reads: 0
>     writes: 0
>     deletes: 0
>     flushes: 0
>     activemap updates: 0
>     local errors: read: 0, write: 0, delete: 0, flush: 0
> root@cthulhu3:/# tail -2 /var/log/console.log
> Jun 10 16:56:06 <console.info> cthulhu3 kernel: Jun 10 16:56:06 <daemon.err> 
> cthulhu3 hastd[14379]: [d1] (secondary) Unable to read metadata from 
> /dev/ada1p1: No such file or directory.
> Jun 10 16:56:11 <console.info> cthulhu3 kernel: Jun 10 16:56:11 <daemon.err> 
> cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
> (pid=14379, exitcode=66).
> Jun 10 16:56:16 <console.info> cthulhu3 kernel: Jun 10 16:56:16 <daemon.err> 
> cthulhu3 hastd[14380]: [d1] (secondary) Unable to read metadata from 
> /dev/ada1p1: No such file or directory.
> Jun 10 16:56:20 <console.info> cthulhu3 kernel: Jun 10 16:56:20 <daemon.err> 
> cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
> (pid=14380, exitcode=66).
> 
> Any hints? Thanks!

Have you run hastctl create to initialize metadata?

-- 
Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 20:40:10 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id BE3E4E39;
 Mon, 10 Jun 2013 20:40:10 +0000 (UTC) (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id 51336106F;
 Mon, 10 Jun 2013 20:40:09 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r5AKe82o096707;
 Tue, 11 Jun 2013 00:40:08 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Tue, 11 Jun 2013 00:40:08 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Mikolaj Golub <trociny@freebsd.org>
Subject: Re: hast: can't restore after disk failure
In-Reply-To: <20130610201650.GA2823@gmail.com>
Message-ID: <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Tue, 11 Jun 2013 00:40:09 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 20:40:10 -0000

On Mon, 10 Jun 2013, Mikolaj Golub wrote:

[snipall]

> > Jun 10 16:56:20 <console.info> cthulhu3 kernel: Jun 10 16:56:20 <daemon.err> 
> > cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
> > (pid=14380, exitcode=66).
> > 
> > Any hints? Thanks!
> 
> Have you run hastctl create to initialize metadata?

Yes, but did it naively:

hastctl create d1

and status still reported 0 as provider size...

Sould I provide more options to hastctl create?

-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Mon Jun 10 23:54:50 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id AFD799D2
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 23:54:50 +0000 (UTC)
 (envelope-from editor@callfortesting.org)
Received: from mail-pa0-x22d.google.com (mail-pa0-x22d.google.com
 [IPv6:2607:f8b0:400e:c03::22d])
 by mx1.freebsd.org (Postfix) with ESMTP id 8AB0C1B1D
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 23:54:50 +0000 (UTC)
Received: by mail-pa0-f45.google.com with SMTP id bi5so4899033pad.18
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 16:54:50 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:cc:subject
 :references:in-reply-to:content-type:content-transfer-encoding
 :x-gm-message-state;
 bh=0qtVKFxCgNYZUWlehrJAUMJ77lh6rAr2C72VacTmOdU=;
 b=QSNW7Xv6eee1WR/uJFB91s/+vaG10+RaZqPWnREv8+tKWbxU4S+o54mWvLJhLbmBlQ
 ts1qZipJ/EhXP7RgCj5bovBrmtMR02vJu98tL8ZsVShWwRb9skyUoZqUxO0skZx+cDLs
 nBN9ha/qN1ipvMdg3kYCWz53UuPoIrd693QCFPQMhdoWcyGLpD0HB46TdtoiJR8Nz6+1
 8uSgYxSy5CT5ceTmnKKkXoPT4fA8vv0ILXLR3Vmmz7PmTwueSBYdCkiFV6tqr26u+uHs
 hhLvp2PMt68oV2lWOxQ+FWhLbpbyE6hEKnrM4nGzt4jc76Ov8hoSkJ/mcwFrf4r17AC/
 VVXQ==
X-Received: by 10.66.166.107 with SMTP id zf11mr16207396pab.166.1370908490151; 
 Mon, 10 Jun 2013 16:54:50 -0700 (PDT)
Received: from MacBook-4.local (c-98-246-202-204.hsd1.or.comcast.net.
 [98.246.202.204])
 by mx.google.com with ESMTPSA id pl9sm12038436pbc.5.2013.06.10.16.54.48
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Mon, 10 Jun 2013 16:54:49 -0700 (PDT)
Message-ID: <51B66748.4000708@callfortesting.org>
Date: Mon, 10 Jun 2013 16:54:48 -0700
From: Michael Dexter <editor@callfortesting.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7;
 rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: Jeremy Chadwick <jdc@koitsu.org>
Subject: Re: ZFS panic on import under VMware
References: <51B0FADB.10302@callfortesting.org>
 <20130606215224.GA44910@icarus.home.lan>
In-Reply-To: <20130606215224.GA44910@icarus.home.lan>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Gm-Message-State: ALoCoQklBDOJAUPhPZwOVFAS0oau/spzpG9wYKl2h4DScrSRU3DAUq4vWQrRlOapZykpJf7el5ey
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jun 2013 23:54:50 -0000


Requested details included inline:

>> I have encountered a FreeNAS under VMware system that gives "ONLINE"
>> status for a pool with 'zfs import' but panics when a -f import is
>> done under FreeNAS 8.3 x64, FreeBSD 10 and Solaris 11 live DVD. The
>> host filesystem passes all checks.
>> Stopped at traverse_prefetch_metadata+0x44: movq 0x50(%rax),%rcx
>> I have posted screen shots of the import, panic and backtrace output:
>> http://cft.lv/zfs/2013-07-06/
>
> 1. On what OS (version, etc.) was the ZFS pool originally created?

FreeNAS 8.2

> 2. Was the pool originally created with compression or dedup enabled?
> (Answers to both of these questions is extremely important)

Dedup: never enabled
Compression: Yes (Believe FreeNAS Default, guessing lzjb)

> 3. How much memory are you allocating to the VMware instance?  (This is
> in partial relation to question #2)

6GB, up to 10GB when attempting to re-import

> 4. On what OS (version, etc.) are the panic/backtrace screenshots from?
> It looks to me like FreeBSD 10.x.

Correct. amd64. Panics with Solaris 11.1 & FreeNAS 8.3. No saved traces

> 5. Is there a reason you didn't try FreeBSD 9.1-RELEASE?  The state of
> FreeBSD 10.x (head/CURRENT) is usually in fluctuation, you should try
> something other than head.

Will try 9.1R

> 6. You're using VMware Workstation; where did the source ZFS pool come
> from?  Do you have physical disks attached to the machine and are using
> the "Use a physical disk" feature?  If you're using "disk images" made
> by something, what did you use?  Please provide all the details, how you
> did it, etc...

VMware ESXi 5.1 with no storage. All from HUS110 iSCSI, mounted as VMFS5 
Datastores. The failing disks are vmdk files in virtual disk mode which 
were created when installing FreeNAS.

> 7. Is there some reason you cannot try this on bare metal?

VMware does not appear to fully support ZFS pass-through. Not easy to 
convert two, 2TB images. Suggestions?

> 8. On FreeBSD 9.x (see above) or 10.x, during boot, drop to the loader
> prompt and issue "set vfs.zfs.prefetch_disable=1" followed by "boot".
> See if that has any impact during the "zpool import" phase.

Results in Fatal trap 12 on FreeBSD 10 and will try 9.X ASAP.

Thanks!

Michael

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 00:17:19 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id EA261FB0;
 Tue, 11 Jun 2013 00:17:19 +0000 (UTC)
 (envelope-from mckusick@mckusick.com)
Received: from chez.mckusick.com (chez.mckusick.com
 [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452])
 by mx1.freebsd.org (Postfix) with ESMTP id C80D91C30;
 Tue, 11 Jun 2013 00:17:19 +0000 (UTC)
Received: from chez.mckusick.com (localhost [127.0.0.1])
 by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id r5B0HFct074482;
 Mon, 10 Jun 2013 17:17:15 -0700 (PDT)
 (envelope-from mckusick@chez.mckusick.com)
Message-Id: <201306110017.r5B0HFct074482@chez.mckusick.com>
To: Palle Girgensohn <girgen@FreeBSD.org>
Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?) 
In-reply-to: <51B5A277.2060904@FreeBSD.org> 
Date: Mon, 10 Jun 2013 17:17:15 -0700
From: Kirk McKusick <mckusick@mckusick.com>
X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY
 autolearn=failed version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com
Cc: freebsd-fs@FreeBSD.org, Dan Thomas <godders@gmail.com>,
 Jeff Roberson <jroberson@jroberson.net>, Julian Akehurst <julian@pingpong.se>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 00:17:20 -0000

> Date: Mon, 10 Jun 2013 11:55:03 +0200
> From: Palle Girgensohn <girgen@FreeBSD.org>
> To: Kirk McKusick <mckusick@mckusick.com>
> CC: freebsd-fs@FreeBSD.org, Dan Thomas <godders@gmail.com>,
>         Jeff Roberson <jroberson@jroberson.net>,
>         Julian Akehurst <julian@pingpong.se>
> Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?)
> 
> Kirk McKusick skrev:
> 
> > This looks good. Keep me posted.
> 
> After running for a number of days without soft updates, it seems to me
> that the culprit is indeed in the soft updates code.
> 
> # df -k /pgsql; du -sk /pgsql
> Filesystem  1024-blocks     Used    Avail Capacity  Mounted on
> /dev/da2s1d   134763348 86339044 37643238    70%    /pgsql
> 86303252	/pgsql
> 
> Palle

OK, good to have it narrowed down. I will look to devise some
additional diagnostics that hopefully will help tease out the
bug. I'll hopefully get back to you soon.

	Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 06:07:49 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 062E37E9
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 06:07:49 +0000 (UTC)
 (envelope-from to.my.trociny@gmail.com)
Received: from mail-lb0-f179.google.com (mail-lb0-f179.google.com
 [209.85.217.179])
 by mx1.freebsd.org (Postfix) with ESMTP id 8413E1D27
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 06:07:47 +0000 (UTC)
Received: by mail-lb0-f179.google.com with SMTP id w20so7081620lbh.38
 for <freebsd-fs@freebsd.org>; Mon, 10 Jun 2013 23:07:46 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=9U8eaVVJ0OEtDWh5BaSk5gro1MqxiqZ1uz2le/qMtj0=;
 b=mJoAIhwtFp88CayPgaE8dbW5yQfYBv9UdAucpJvJdQ1Kppq+YPQYKCoVnyHV2G9jOG
 wJ8w98/+oxgAoUBzLJfBubV9X979TEHuJEH2EAZrsaQPVGQMvcT3WMzPMkd5qFM9o1mW
 NPsxJj1qs4Rn1d6KOvLAdMxlYvZ23kjAIeA89SUWPL7tpn7RqSPKD2+i29An8XYFGQou
 b3vcLRtJRP/OjlxvtGTjZ5ZM25Rs17XEUcMiwcAz2ETzLajshEwsQqH8dkhCfzzFzqHx
 sxIg2s1JkGtjc66b4yELtiYQS8cFRVh+8HuTpA09gi8F6DsRHvMCXF+b/09FjauOiefh
 AVcw==
X-Received: by 10.152.8.37 with SMTP id o5mr6452312laa.87.1370930866709;
 Mon, 10 Jun 2013 23:07:46 -0700 (PDT)
Received: from localhost ([188.230.122.226])
 by mx.google.com with ESMTPSA id z9sm5601702lae.7.2013.06.10.23.07.44
 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Mon, 10 Jun 2013 23:07:45 -0700 (PDT)
Date: Tue, 11 Jun 2013 09:07:42 +0300
From: Mikolaj Golub <to.my.trociny@gmail.com>
To: Dmitry Morozovsky <marck@rinet.ru>
Subject: Re: hast: can't restore after disk failure
Message-ID: <20130611060741.GA42231@gmail.com>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 06:07:49 -0000

On Tue, Jun 11, 2013 at 12:40:08AM +0400, Dmitry Morozovsky wrote:
> On Mon, 10 Jun 2013, Mikolaj Golub wrote:
> 
> [snipall]
> 
> > > Jun 10 16:56:20 <console.info> cthulhu3 kernel: Jun 10 16:56:20 <daemon.err> 
> > > cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
> > > (pid=14380, exitcode=66).
> > > 
> > > Any hints? Thanks!
> > 
> > Have you run hastctl create to initialize metadata?
> 
> Yes, but did it naively:
> 
> hastctl create d1

No errors?

> 
> and status still reported 0 as provider size...

I assume /dev/ada1p1 is present and readable/writable?

Symptoms are like if it did not exist.

> Sould I provide more options to hastctl create?

Usually no, until the disk is of larger size than the replaced one,
and you need manually specify the old mediasize.

-- 
Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 06:34:51 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 4443FD81
 for <fs@freebsd.org>; Tue, 11 Jun 2013 06:34:51 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 by mx1.freebsd.org (Postfix) with ESMTP id DBDA61E0E
 for <fs@freebsd.org>; Tue, 11 Jun 2013 06:34:50 +0000 (UTC)
Received: from tom.home (kostik@localhost [127.0.0.1])
 by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r5B6Ykds004347;
 Tue, 11 Jun 2013 09:34:46 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r5B6Ykds004347
Received: (from kostik@localhost)
 by tom.home (8.14.7/8.14.7/Submit) id r5B6Yktc004346;
 Tue, 11 Jun 2013 09:34:46 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Tue, 11 Jun 2013 09:34:46 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: missed clustering for small block sizes in cluster_wbuild()
Message-ID: <20130611063446.GJ3047@kib.kiev.ua>
References: <20130607044845.O24441@besplex.bde.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="hCS/BWoPfTdmYtZi"
Content-Disposition: inline
In-Reply-To: <20130607044845.O24441@besplex.bde.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home
Cc: fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 06:34:51 -0000


--hCS/BWoPfTdmYtZi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Jun 07, 2013 at 05:28:11AM +1000, Bruce Evans wrote:
> I think this is best fixed be fixed by removing the check above and
> checking here.  Then back out of the changes.  I don't know this code
> well enough to write the backing out easily.

Could you test this, please ?

diff --git a/sys/kern/vfs_cluster.c b/sys/kern/vfs_cluster.c
index b280317..9e1528e 100644
--- a/sys/kern/vfs_cluster.c
+++ b/sys/kern/vfs_cluster.c
@@ -766,7 +766,7 @@ cluster_wbuild(struct vnode *vp, long size, daddr_t sta=
rt_lbn, int len,
 {
 	struct buf *bp, *tbp;
 	struct bufobj *bo;
-	int i, j;
+	int i, j, jj;
 	int totalwritten =3D 0;
 	int dbsize =3D btodb(size);
=20
@@ -904,14 +904,10 @@ cluster_wbuild(struct vnode *vp, long size, daddr_t s=
tart_lbn, int len,
=20
 				/*
 				 * Check that the combined cluster
-				 * would make sense with regard to pages
-				 * and would not be too large
+				 * would make sense with regard to pages.
 				 */
-				if ((tbp->b_bcount !=3D size) ||
-				  ((bp->b_blkno + (dbsize * i)) !=3D
-				    tbp->b_blkno) ||
-				  ((tbp->b_npages + bp->b_npages) >
-				    (vp->v_mount->mnt_iosize_max / PAGE_SIZE))) {
+				if (tbp->b_bcount !=3D size ||
+				    bp->b_blkno + dbsize * i !=3D tbp->b_blkno) {
 					BUF_UNLOCK(tbp);
 					break;
 				}
@@ -964,6 +960,22 @@ cluster_wbuild(struct vnode *vp, long size, daddr_t st=
art_lbn, int len,
 						bp->b_pages[bp->b_npages] =3D m;
 						bp->b_npages++;
 					}
+					if (bp->b_npages > vp->v_mount->
+					    mnt_iosize_max / PAGE_SIZE) {
+						KASSERT(i !=3D 0, ("XXX"));
+						j++;
+						for (jj =3D 0; jj < j; jj++) {
+							vm_page_io_finish(tbp->
+							    b_pages[jj]);
+						}
+						vm_object_pip_subtract(
+						    m->object, j);
+						bqrelse(tbp);
+						bp->b_npages -=3D j;
+						VM_OBJECT_WUNLOCK(tbp->
+						    b_bufobj->bo_object);
+						goto finishcluster;
+					}
 				}
 				VM_OBJECT_WUNLOCK(tbp->b_bufobj->bo_object);
 			}

--hCS/BWoPfTdmYtZi
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (FreeBSD)

iQIcBAEBAgAGBQJRtsUGAAoJEJDCuSvBvK1B8FUP/3J+JkxgJ7jdQvkOyp21fibx
b/iiN19fnd3Ih1sDtLvKXFKDguf17vOxoECqnlhhRjrI8mJMsghqMjKJ11CUsZGq
9LrkXWCLiwtecuP7Rupu8hczAj+Msf1HGwZMtNVwDRAuhL9fE9WX/EiWXFV5D+Z+
04SniVO6Fu+v9ZlPVjCaGhJMDsuMrtsphdiDpRjivgqWN85dvrGur2I8hYm6PTD4
1qBwxv3j5IR3dqRBFkZc+jrYjpRjA5UIAtmJc+3iJlZvL4963od1m48x0L+Wdtit
+mfBDEaXN8gCAbtbN3QW1s+9WUKcqFcucYsECcb9wEjNi5aKjb0FLDPqEJWRoBVS
dqHwXc8bI3KM3+fVoTIhTJDgmkTaCZEvTSCkmGgS1e5B7f4H5My+X7jzaonf7pl3
Jeoab2J6dZ6mwu1xh9Kk+nN80NwGiujx5hW9NgBC7MD7xxvQL5JGmwz/HU1LGZfO
RL4Yi1g0dxobOWAfusK+rZDTEhzKts0vvrRgxk0O+LmEibx0WPcg/nkv8t60wE6U
4bBb7pxXudqTmmGuswKkGrNmP0F7HJFjn7kjkkbacnJRLPA9sOHh+MYW7V9dDIw5
T2Uk12Mm4aKX/hjHEWl6x87PIx384LyW9GiCkWcyFHqwH41SqYz7CgVUuNMqlv/P
f6YwIcq9W9OocqF26iZ0
=DAlx
-----END PGP SIGNATURE-----

--hCS/BWoPfTdmYtZi--

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 08:46:28 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 0DF31F70
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 08:46:28 +0000 (UTC)
 (envelope-from mxb@alumni.chalmers.se)
Received: from mail-la0-x22b.google.com (mail-la0-x22b.google.com
 [IPv6:2a00:1450:4010:c03::22b])
 by mx1.freebsd.org (Postfix) with ESMTP id 87554160C
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 08:46:27 +0000 (UTC)
Received: by mail-la0-f43.google.com with SMTP id gw10so6666799lab.30
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 01:46:25 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=content-type:mime-version:subject:from:in-reply-to:date:cc
 :content-transfer-encoding:message-id:references:to:x-mailer
 :x-gm-message-state;
 bh=qMS8s+keSx5TMEvVX4qXiamKGDpfRAyy9glszaQGKeA=;
 b=W2kljDhSycGwDHYoqXimTSXyJwu8agT1s3ygChxFeoQ7yAj4410umMSe92nBxuaLOY
 E8qvdWHqKSZF+xqDTmkt+YH719tdLFm6tOlE9BOy6v+VJzR8Sf3zWhYMUQjxG2n96iWu
 IgRhNAYrp2uJtDUae2f+APU796qKNqXmA1+xjwvbuYa4jmpm9Av1xaTWZmtdIueVkBCW
 FzL65vQ1N+acVKdyl9IunesvkPepnYhHfeuGuU3dkarbxVcszKirdDYcKjOg90uBaMyw
 03D+oj46SoVdzg9myy8mJagxNGTSAOhKEb0XSQkGoUDbktVVs6xxlR4+orjUHUaZHjL/
 prPw==
X-Received: by 10.152.121.106 with SMTP id lj10mr6861724lab.27.1370940385481; 
 Tue, 11 Jun 2013 01:46:25 -0700 (PDT)
Received: from grey.office.se.prisjakt.nu ([212.16.170.194])
 by mx.google.com with ESMTPSA id w4sm855521law.5.2013.06.11.01.46.23
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Tue, 11 Jun 2013 01:46:24 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.5 \(1508\))
Subject: Re: zpool export/import on failover - The pool metadata is corrupted
From: mxb <mxb@alumni.chalmers.se>
In-Reply-To: <20130606233417.GA46506@icarus.home.lan>
Date: Tue, 11 Jun 2013 10:46:22 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <61E414CF-FCD3-42BB-9533-A40EA934DB99@alumni.chalmers.se>
References: <D7F099CB-855F-43F8-ACB5-094B93201B4B@alumni.chalmers.se>
 <CAKYr3zyPLpLau8xsv3fCkYrpJVzS0tXkyMn4E2aLz29EMBF9cA@mail.gmail.com>
 <016B635E-4EDC-4CDF-AC58-82AC39CBFF56@alumni.chalmers.se>
 <20130606223911.GA45807@icarus.home.lan>
 <C3FC39B3-D09F-4E73-9476-3BFC8B817278@alumni.chalmers.se>
 <20130606233417.GA46506@icarus.home.lan>
To: Jeremy Chadwick <jdc@koitsu.org>
X-Mailer: Apple Mail (2.1508)
X-Gm-Message-State: ALoCoQkc8OJr4ravkNcLpOU7h/rIr026SuVs1m8vZtwDDxWpOfIwmf24Vr9kv+7aC7fnkKy/szrK
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 08:46:28 -0000


Thanks everyone whom replied.
Removing local L2ARC cache disks (da1,da2) indeed showed to be a cure to =
my problem.

Next is to test with add/remove after import/export as Jeremy suggested.

//mxb

On 7 jun 2013, at 01:34, Jeremy Chadwick <jdc@koitsu.org> wrote:

> On Fri, Jun 07, 2013 at 12:51:14AM +0200, mxb wrote:
>>=20
>> Sure, script is not perfects yet and does not handle many of stuff, =
but moving highlight from zpool import/export to the script itself not =
that
>> clever,as this works most of the time.
>>=20
>> Question is WHY ZFS corrupts metadata then it should not. Sometimes.
>> I'v seen stale of zpool then manually importing/exporting pool.
>>=20
>>=20
>> On 7 jun 2013, at 00:39, Jeremy Chadwick <jdc@koitsu.org> wrote:
>>=20
>>> On Fri, Jun 07, 2013 at 12:12:39AM +0200, mxb wrote:
>>>>=20
>>>> Then MASTER goes down, CARP on the second node goes MASTER =
(devd.conf, and script for lifting):
>>>>=20
>>>> root@nfs2:/root # cat /etc/devd.conf
>>>>=20
>>>>=20
>>>> notify 30 {
>>>> match "system"		"IFNET";
>>>> match "subsystem"	"carp0";
>>>> match "type"		"LINK_UP";
>>>> action "/etc/zfs_switch.sh active";
>>>> };
>>>>=20
>>>> notify 30 {
>>>> match "system"          "IFNET";
>>>> match "subsystem"       "carp0";
>>>> match "type"            "LINK_DOWN";
>>>> action "/etc/zfs_switch.sh backup";
>>>> };
>>>>=20
>>>> root@nfs2:/root # cat /etc/zfs_switch.sh
>>>> #!/bin/sh
>>>>=20
>>>> DATE=3D`date +%Y%m%d`
>>>> HOSTNAME=3D`hostname`
>>>>=20
>>>> ZFS_POOL=3D"jbod"
>>>>=20
>>>>=20
>>>> case $1 in
>>>> 	active)
>>>> 		echo "Switching to ACTIVE and importing ZFS" | mail -s =
''$DATE': '$HOSTNAME' switching to ACTIVE' root
>>>> 		sleep 10
>>>> 		/sbin/zpool import -f jbod
>>>> 		/etc/rc.d/mountd restart
>>>> 		/etc/rc.d/nfsd restart
>>>> 		;;
>>>> 	backup)
>>>> 		echo "Switching to BACKUP and exporting ZFS" | mail -s =
''$DATE': '$HOSTNAME' switching to BACKUP' root
>>>> 		/sbin/zpool export jbod
>>>> 		/etc/rc.d/mountd restart
>>>>               /etc/rc.d/nfsd restart
>>>> 		;;
>>>> 	*)
>>>> 		exit 0
>>>> 		;;
>>>> esac
>>>>=20
>>>> This works, most of the time, but sometimes I'm forced to re-create =
pool. Those machines suppose to go into prod.
>>>> Loosing pool(and data inside it) stops me from deploy this setup.
>>>=20
>>> This script looks highly error-prone.  Hasty hasty...  :-)
>>>=20
>>> This script assumes that the "zpool" commands (import and export) =
always
>>> work/succeed; there is no exit code ($?) checking being used.
>>>=20
>>> Since this is run from within devd(8): where does stdout/stderr go =
to
>>> when running a program/script under devd(8)?  Does it effectively go
>>> to the bit bucket (/dev/null)?  If so, you'd never know if the =
import or
>>> export actually succeeded or not (the export sounds more likely to =
be
>>> the problem point).
>>>=20
>>> I imagine there would be some situations where the export would fail
>>> (some files on filesystems under pool "jbod" still in use), yet CARP =
is
>>> already blindly assuming everything will be fantastic.  Surprise.
>>>=20
>>> I also do not know if devd.conf(5) "action" commands spawn a =
sub-shell
>>> (/bin/sh) or not.  If they don't, you won't be able to use things =
like"
>>> 'action "/etc/zfs_switch.sh active >> /var/log/failover.log";'.  You
>>> would then need to implement the equivalent of logging within your
>>> zfs_switch.sh script.
>>>=20
>>> You may want to consider the -f flag to zpool import/export
>>> (particularly export).  However there are risks involved -- userland
>>> applications which have an fd/fh open on a file which is stored on a
>>> filesystem that has now completely disappeared can sometimes crash
>>> (segfault) or behave very oddly (100% CPU usage, etc.) depending on =
how
>>> they're designed.
>>>=20
>>> Basically what I'm trying to say is that devd(8) being used as a =
form of
>>> HA (high availability) and load balancing is not always possible.
>>> Real/true HA (especially with SANs) is often done very differently =
(now
>>> you know why it's often proprietary.  :-) )
>=20
> Add error checking to your script.  That's my first and foremost
> recommendation.  It's not hard to do, really.  :-)
>=20
> After you do that and still experience the issue (e.g. you see no =
actual
> errors/issues during the export/import phases), I recommend removing
> the "cache" devices which are "independent" on each system from the =
pool
> entirely.  Quoting you (for readers, since I snipped it from my =
previous
> reply):
>=20
>>>> Note, that ZIL(mirrored) resides on external enclosure. Only L2ARC
>>>> is both local and external - da1,da2, da13s2, da14s2
>=20
> I interpret this to mean the primary and backup nodes (physical =
systems)
> have actual disks which are not part of the "external enclosure".  If
> that's the case -- those disks are always going to vary in their
> contents and metadata.  Those are never going to be 100% identical all
> the time (is this not obvious?).  I'm surprised your stuff has worked =
at
> all using that model, honestly.
>=20
> ZFS is going to bitch/cry if it cannot verify the integrity of certain
> things, all the way down to the L2ARC.  That's my understanding of it =
at
> least, meaning there must always be "some" kind of metadata that has =
to
> be kept/maintained there.
>=20
> Alternately you could try doing this:
>=20
> zpool remove jbod cache daX daY ...
> zpool export jbod
>=20
> Then on the other system:
>=20
> zpool import jbod
> zpool add jbod cache daX daY ...
>=20
> Where daX and daY are the disks which are independent to each system
> (not on the "external enclosure").
>=20
> Finally, it would also be useful/worthwhile if you would provide=20
> "dmesg" from both systems and for you to explain the physical wiring
> along with what device (e.g. daX) correlates with what exact thing on
> each system.  (We right now have no knowledge of that, and your terse
> explanations imply we do -- we need to know more)
>=20
> --=20
> | Jeremy Chadwick                                   jdc@koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
>=20


From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 14:21:12 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 8F998261;
 Tue, 11 Jun 2013 14:21:12 +0000 (UTC) (envelope-from smh@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 68A41171F;
 Tue, 11 Jun 2013 14:21:12 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r5BELCQj031401;
 Tue, 11 Jun 2013 14:21:12 GMT
 (envelope-from smh@freefall.freebsd.org)
Received: (from smh@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r5BELC9m031400;
 Tue, 11 Jun 2013 14:21:12 GMT (envelope-from smh)
Date: Tue, 11 Jun 2013 14:21:12 GMT
Message-Id: <201306111421.r5BELC9m031400@freefall.freebsd.org>
To: smh@FreeBSD.org, freebsd-fs@FreeBSD.org, smh@FreeBSD.org
From: smh@FreeBSD.org
Subject: Re: kern/178999: [zfs] dev entries for cloned zvol don't show up
 until after reboot
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 14:21:12 -0000

Synopsis: [zfs] dev entries for cloned zvol don't show up until after reboot

Responsible-Changed-From-To: freebsd-fs->smh
Responsible-Changed-By: smh
Responsible-Changed-When: Tue Jun 11 14:21:02 UTC 2013
Responsible-Changed-Why: 
I'll take it

http://www.freebsd.org/cgi/query-pr.cgi?pr=178999

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 20:28:41 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 270D11DC
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 20:28:41 +0000 (UTC)
 (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id A4AE71B14
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 20:28:40 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r5BKNqlJ087328;
 Wed, 12 Jun 2013 00:23:52 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Wed, 12 Jun 2013 00:23:52 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Mikolaj Golub <to.my.trociny@gmail.com>
Subject: Re: hast: can't restore after disk failure
In-Reply-To: <20130611060741.GA42231@gmail.com>
Message-ID: <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
 <20130611060741.GA42231@gmail.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Wed, 12 Jun 2013 00:23:52 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 20:28:41 -0000

On Tue, 11 Jun 2013, Mikolaj Golub wrote:

> On Tue, Jun 11, 2013 at 12:40:08AM +0400, Dmitry Morozovsky wrote:
> > On Mon, 10 Jun 2013, Mikolaj Golub wrote:
> > 
> > [snipall]
> > 
> > > > Jun 10 16:56:20 <console.info> cthulhu3 kernel: Jun 10 16:56:20 <daemon.err> 
> > > > cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
> > > > (pid=14380, exitcode=66).
> > > > 
> > > > Any hints? Thanks!
> > > 
> > > Have you run hastctl create to initialize metadata?
> > 
> > Yes, but did it naively:
> > 
> > hastctl create d1
> 
> No errors?

no visible, but hast instance ungracefully exits

> > and status still reported 0 as provider size...
> 
> I assume /dev/ada1p1 is present and readable/writable?
> 
> Symptoms are like if it did not exist.

nope, it does:

root@cthulhu3:/# diskinfo /dev/ada1p1
/dev/ada1p1     512     999654686720    1952450560      0       1048576 1936954 16      63
root@cthulhu3:/# diskinfo /dev/ada0p1
/dev/ada0p1     512     999653638144    1952448512      0       1048576 1936952 16      63


-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 21:08:35 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id AC819EBB
 for <freebsd-fs@FreeBSD.org>; Tue, 11 Jun 2013 21:08:35 +0000 (UTC)
 (envelope-from bra@fsn.hu)
Received: from people.fsn.hu (people.fsn.hu [195.228.252.137])
 by mx1.freebsd.org (Postfix) with ESMTP id 22A3B1CF8
 for <freebsd-fs@FreeBSD.org>; Tue, 11 Jun 2013 21:08:34 +0000 (UTC)
Received: by people.fsn.hu (Postfix, from userid 1001)
 id EDD4910CC062; Tue, 11 Jun 2013 23:01:27 +0200 (CEST)
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.3
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR:
 12.4903]
X-CRM114-CacheID: sfid-20130611_23012_80876E8A 
X-CRM114-Status: Good  ( pR: 12.4903 )
X-DSPAM-Result: Whitelisted
X-DSPAM-Processed: Tue Jun 11 23:01:27 2013
X-DSPAM-Confidence: 0.9938
X-DSPAM-Probability: 0.0000
X-DSPAM-Signature: 51b79027796977830491068
X-DSPAM-Factors: 27, From*Attila Nagy <bra@fsn.hu>, 0.00010, STABLE, 0.00371,
 disks, 0.00397, disks, 0.00397, 231, 0.00428, filter, 0.00505,
 filter, 0.00505, ZFS, 0.00555, ZFS, 0.00555, 1+14, 0.00555,
 2+21, 0.00555, Subject*ZFS, 0.00617, sysctl, 0.00617,
 From*Attila, 0.00617, To*FreeBSD.org, 0.00656, 158, 0.00693,
 474, 0.00693, 215, 0.00693, machines, 0.00739,
 machines, 0.00739, 1+19, 0.00792, 1+19, 0.00792,
 controller, 0.00874, load, 0.00893, load, 0.00893,
 files, 0.00965,
X-Spambayes-Classification: ham; 0.00
Received: from [192.168.3.2] (japan.t-online.co.hu [195.228.243.99])
 by people.fsn.hu (Postfix) with ESMTPSA id B8AAC10CC057
 for <freebsd-fs@FreeBSD.org>; Tue, 11 Jun 2013 23:01:26 +0200 (CEST)
Message-ID: <51B79023.5020109@fsn.hu>
Date: Tue, 11 Jun 2013 23:01:23 +0200
From: Attila Nagy <bra@fsn.hu>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
 rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0
MIME-Version: 1.0
To: freebsd-fs@FreeBSD.org
Subject: An order of magnitude higher IOPS needed with ZFS than UFS
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 21:08:35 -0000

Hi,

I have two identical machines. They have 14 disks hooked up to a HP 
smartarray (SA from now on) controller.
Both machines have the same SA configuration and layout: the disks are 
organized into mirror pairs (HW RAID1).

On the first machine, these mirrors are formatted with UFS2+SU (default 
settings), on the second machine they are used as separate zpools 
(please don't tell me that ZFS can do the same, I know). Atime is turned 
off, otherwise, no other modifications (zpool/zfs or sysctl parameters).
The file systems are loaded more or less evenly with serving of some kB 
to few megs files.

The machines act as NFS servers, so there is one, maybe important 
difference here: the UFS machine runs 8.3-RELEASE, while the ZFS one 
runs 9.1-STABLE@r248885.
They get the same type of load, and according to nfsstat and netstat, 
the loads don't explain the big difference which can be seen in disk 
IOs. In fact, the UFS host seems to be more loaded...

According to gstat on the UFS machine:
dT: 60.001s  w: 60.000s  filter: da
  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w %busy Name
     0     42     35    404    6.4      8    150  214.2 21.5| da0
     0     30     21    215    6.1      9    168  225.2 15.9| da1
     0     41     33    474    4.5      8    158  211.3 18.0| da2
     0     39     30    425    4.6      9    163  235.0 17.1| da3
     1     31     24    266    5.1      7     93  174.1 14.9| da4
     0     29     22    273    5.9      7     84  200.7 15.9| da5
     0     37     30    692    7.1      7    115  206.6 19.4| da6

and on the ZFS one:
dT: 60.001s  w: 60.000s  filter: da
  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w %busy Name
     0    228    201   1045   23.7     27    344   53.5 88.7| da0
     5    185    167    855   21.1     19    238   44.9 73.8| da1
    10    263    236   1298   34.9     27    454   53.3 99.9| da2
    10    255    235   1341   28.3     20    239   64.8 92.9| da3
    10    219    195    994   22.3     23    257   46.3 81.3| da4
    10    248    221   1213   22.4     27    264   55.8 90.2| da5
     9    231    213   1169   25.1     19    229   54.6 88.6| da6

I've seen a lot of cases where ZFS required more memory and CPU (and 
even IO) to handle the same load, but they were nowhere this bad (often 
a 10x increase).

Any ideas?

BTW, the file systems are 77-78% full according to df (so ZFS holds 
more, because UFS is -m 8).

Thanks,

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 21:21:18 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4F7614C0
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 21:21:18 +0000 (UTC)
 (envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca
 [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 158431DA7
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 21:21:17 +0000 (UTC)
X-Cloudmark-SP-Filtered: true
X-Cloudmark-SP-Result: v=1.1 cv=ME3lrcP4jFDzpPiCSQywCMKJiHtpRWeRXBDIYmR1BZg=
 c=1 sm=2
 a=ctSXsGKhotwA:10 a=FKkrIqjQGGEA:10 a=uNq0K1xFbOwA:10 a=IkcTkHD0fZMA:10
 a=6I5d2MoRAAAA:8 a=IIDPTtw_6pPhr65o58kA:9 a=QEXdDO2ut3YA:10
 a=IO5DDJVRER8A:10 a=jpxF4j0qNWYA:10 a=0X1wm-MWLxgA:10 a=izcmP9whcIMA:10
 a=UgQyK67jzVMA:10 a=KK3dN39wtEsA:10 a=zr1izwO6SH0A:10 a=SV7veod9ZcQA:10
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqMEANeTt1GDaFvO/2dsb2JhbABWA4M5SYJ0u1qBF3SCIwEBAQMBAQEBICsgCwUWDgoCAg0ZAikBCSYGCAcEARwEh2YGDKhbkUKBJoxKEH4kEAcRgjuBFAOTboENgkWBKYkDhxaDKyAygQM2
X-IronPort-AV: E=Sophos;i="4.87,847,1363147200"; d="scan'208";a="32625428"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
 ([131.104.91.206])
 by esa-annu.net.uoguelph.ca with ESMTP; 11 Jun 2013 17:20:09 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
 by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id BAA39B3F0D;
 Tue, 11 Jun 2013 17:20:09 -0400 (EDT)
Date: Tue, 11 Jun 2013 17:20:09 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Attila Nagy <bra@fsn.hu>
Message-ID: <253074981.119060.1370985609747.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <51B79023.5020109@fsn.hu>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 21:21:18 -0000

Attila Nagy wrote:
> Hi,
> 
> I have two identical machines. They have 14 disks hooked up to a HP
> smartarray (SA from now on) controller.
> Both machines have the same SA configuration and layout: the disks are
> organized into mirror pairs (HW RAID1).
> 
> On the first machine, these mirrors are formatted with UFS2+SU
> (default
> settings), on the second machine they are used as separate zpools
> (please don't tell me that ZFS can do the same, I know). Atime is
> turned
> off, otherwise, no other modifications (zpool/zfs or sysctl
> parameters).
> The file systems are loaded more or less evenly with serving of some
> kB
> to few megs files.
> 
> The machines act as NFS servers, so there is one, maybe important
> difference here: the UFS machine runs 8.3-RELEASE, while the ZFS one
> runs 9.1-STABLE@r248885.
> They get the same type of load, and according to nfsstat and netstat,
> the loads don't explain the big difference which can be seen in disk
> IOs. In fact, the UFS host seems to be more loaded...
> 
> According to gstat on the UFS machine:
> dT: 60.001s w: 60.000s filter: da
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
> 0 42 35 404 6.4 8 150 214.2 21.5| da0
> 0 30 21 215 6.1 9 168 225.2 15.9| da1
> 0 41 33 474 4.5 8 158 211.3 18.0| da2
> 0 39 30 425 4.6 9 163 235.0 17.1| da3
> 1 31 24 266 5.1 7 93 174.1 14.9| da4
> 0 29 22 273 5.9 7 84 200.7 15.9| da5
> 0 37 30 692 7.1 7 115 206.6 19.4| da6
> 
> and on the ZFS one:
> dT: 60.001s w: 60.000s filter: da
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
> 0 228 201 1045 23.7 27 344 53.5 88.7| da0
> 5 185 167 855 21.1 19 238 44.9 73.8| da1
> 10 263 236 1298 34.9 27 454 53.3 99.9| da2
> 10 255 235 1341 28.3 20 239 64.8 92.9| da3
> 10 219 195 994 22.3 23 257 46.3 81.3| da4
> 10 248 221 1213 22.4 27 264 55.8 90.2| da5
> 9 231 213 1169 25.1 19 229 54.6 88.6| da6
> 
> I've seen a lot of cases where ZFS required more memory and CPU (and
> even IO) to handle the same load, but they were nowhere this bad
> (often
> a 10x increase).
> 
> Any ideas?
> 
ken@ recently committed a change to the new NFS server to add file
handle affinity support to it. He reported that he had found that,
without file handle affinity, that ZFS's sequential reading heuristic
broke badly (or something like that, you can probably find the email
thread or maybe he will chime in).

Anyhow, you could try switching the FreeBSD 9 system to use the old
NFS server (assuming your clients are doing NFSv3 mounts) and see if
that has a significant effect. (For FreeBSD9, the old server has file
handle affinity, but the new server does not.)

rick

> BTW, the file systems are 77-78% full according to df (so ZFS holds
> more, because UFS is -m 8).
> 
> Thanks,
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 21:25:34 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2DA91837
 for <fs@FreeBSD.org>; Tue, 11 Jun 2013 21:25:34 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au
 [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id D1D5D1DE9
 for <fs@FreeBSD.org>; Tue, 11 Jun 2013 21:25:33 +0000 (UTC)
Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au
 (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23])
 by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id D91CD3C23B0;
 Wed, 12 Jun 2013 07:25:26 +1000 (EST)
Date: Wed, 12 Jun 2013 07:25:24 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: missed clustering for small block sizes in cluster_wbuild()
In-Reply-To: <20130611063446.GJ3047@kib.kiev.ua>
Message-ID: <20130612053543.X900@besplex.bde.org>
References: <20130607044845.O24441@besplex.bde.org>
 <20130611063446.GJ3047@kib.kiev.ua>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.0 cv=Q6eKePKa c=1 sm=1 a=r8sOWHbHUnAA:10
 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=zUlCpqlVHewA:10
 a=3cL_b2E_Z7FY-DCS104A:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117
Cc: fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 21:25:34 -0000

On Tue, 11 Jun 2013, Konstantin Belousov wrote:

> On Fri, Jun 07, 2013 at 05:28:11AM +1000, Bruce Evans wrote:
>> I think this is best fixed be fixed by removing the check above and
>> checking here.  Then back out of the changes.  I don't know this code
>> well enough to write the backing out easily.
>
> Could you test this, please ?

It works in limited testing.

I got a panic from not adapting for changed locking when merging it to
my version, and debugging this showed a problem.

> diff --git a/sys/kern/vfs_cluster.c b/sys/kern/vfs_cluster.c
> index b280317..9e1528e 100644
> --- a/sys/kern/vfs_cluster.c
> +++ b/sys/kern/vfs_cluster.c
> ...
> @@ -904,14 +904,10 @@ cluster_wbuild(struct vnode *vp, long size, daddr_t start_lbn, int len,
>
> 				/*
> 				 * Check that the combined cluster
> -				 * would make sense with regard to pages
> -				 * and would not be too large
> +				 * would make sense with regard to pages.
> 				 */

The comment needs more changes.  There is no check "with regard to
pages" now.  The old comment was poorly worded.  The code never made
an extra check that the cluster "would make sense with regard to pages"
here (that check was always later).  What it did was use the page count
in the largeness check.

> -				if ((tbp->b_bcount != size) ||
> -				  ((bp->b_blkno + (dbsize * i)) !=
> -				    tbp->b_blkno) ||
> -				  ((tbp->b_npages + bp->b_npages) >
> -				    (vp->v_mount->mnt_iosize_max / PAGE_SIZE))) {
> +				if (tbp->b_bcount != size ||
> +				    bp->b_blkno + dbsize * i != tbp->b_blkno) {
> 					BUF_UNLOCK(tbp);
> 					break;
> 				}

Contrary to what I said before, the caller doesn't always limit the
cluster size.  Only the cluster_write() caller does that.  The
vfs_bio_awrite() doesn't.  Now it is fairly common to allocate 1 too
many page and have to back out.  This happens even when everything is
aligned.  I observed the following:
- there were a lot of contiguous dirty buffers, and this loop happily built
   up a cluster with 17 pages, though mnt_iosize_max was only 17 pages.
   Perhaps the extra page is necessary if the part of the buffer to be
   written starts at a nonzero offset, but there was no offset in the case
   that I observed (can there be one, and if so, is it limited to an offset
   within the first page?  The general case needs 16 4K extra pages to write
   a 64K-block (when the offset of the area to be written is 64K-512).
- ...

> @@ -964,6 +960,22 @@ cluster_wbuild(struct vnode *vp, long size, daddr_t start_lbn, int len,
> 						bp->b_pages[bp->b_npages] = m;
> 						bp->b_npages++;
> 					}
> +					if (bp->b_npages > vp->v_mount->
> +					    mnt_iosize_max / PAGE_SIZE) {

- ...Then this detects that the 17th page is 1 too many and cleans up.

> +						KASSERT(i != 0, ("XXX"));
> +						j++;
> +						for (jj = 0; jj < j; jj++) {
> +							vm_page_io_finish(tbp->
> +							    b_pages[jj]);
> +						}
> +						vm_object_pip_subtract(
> +						    m->object, j);
> +						bqrelse(tbp);
> +						bp->b_npages -= j;
> +						VM_OBJECT_WUNLOCK(tbp->
> +						    b_bufobj->bo_object);
> +						goto finishcluster;
> +					}
> 				}
> 				VM_OBJECT_WUNLOCK(tbp->b_bufobj->bo_object);
> 			}

I think it would work and fix other bugs to check (tbp->b_bcount +
bp->b_bcount <= vp->v_mount->mnt_iosize_max) up front.  Drivers should
be able to handle an i/o size of b_bcount however many pages that
takes.  There must be a limit on b_pages, but it seems to be
non-critical and the limit on b_bcount gives one of
(mnt_iosize_max / PAGE_SIZE) rounded in some way and possibly increased
by 1 or doubled to account for offsets.  If mnt_iosize_max is not a
multiple of PAGE_SIZE, then the limit using pages doesn't even allow
covering mnt_iosize_max using pages, since the rounding down is
non-null.

I found this bug while debugging a recent PR about bad performance and
hangs under write pressure.  I only have 1 other clearly correct fix
for the bad performance.  msdosfs is missing read clustering for
read-before-write.  I didn't notice that this was necessary when I
implemented clustering for msdosfs a few years ago.  I thought that the
following patch was a complete fix, but have found more performance
problems in clustering:

@ diff -u2 msdosfs_vnops.c~ msdosfs_vnops.c
@ --- msdosfs_vnops.c~	Thu Feb  5 19:11:37 2004
@ +++ msdosfs_vnops.c	Wed Jun 12 04:01:19 2013
@ @@ -740,5 +756,19 @@
@  			 * The block we need to write into exists, so read it in.
@  			 */
@ -			error = bread(thisvp, bn, pmp->pm_bpcluster, cred, &bp);
@ +			if ((ioflag >> IO_SEQSHIFT) != 0 &&

This was cloned from the ffs version.  All ffs should call a common function
instead of duplicating the cluster_read/bread decision.  Similarly for
write clustering except there are more decisions.  But ffs and ext2fs do
this in UFS_BALLOC() and ext2fs_balloc() (?) where not all the info that
might be need is available.

I repeated the (ioflag >> IO_SEQSHIFT) calculation instead of copying to
a variable like ffs does, to localise this patch and to avoid copying
ffs's mounds of style bugs in the declaration of the variable.

@ +			    (vp->v_mount->mnt_flag & MNT_NOCLUSTERR) == 0) {
@ +				error = cluster_read(vp, dep->de_FileSize, bn,
@ +				    pmp->pm_bpcluster, NOCRED,
@ +#if 0
@ +				    (uio->uio_offset & pmp->pm_crbomask) +
@ +				    uio->uio_resid,

This part was copied from msdosfs_read().  msdosfs_read() uses the uio of
some reader.  Here the uio for read-before-write is for the writer.  It
isn't clear that either should be used here.  UFS_BALLOC() is not passed
the full uio info needed to make this caclulation, and it uses the fixed
size MAXBSIZE.  That is wrong in a different way.

This parameter is used to reduce latency.  It asks for a small cluster
of the specified size followed by read ahead of full clusters, with the
amount of read ahead controlled by vfs.read_max.  This gives clusters
of non-uniform sizes and offsets, especially when the reader uses small
blocks.  scottl recently added vfs.read_min which can be tuned to prevent
this.

I don't like many things in this area.  vfs.read_min works OK, but is
another hack.  The default should probably be to optimize for throughput
instead of latency (the reverse of the current one, but the curent one
is historical so it shouldn't be changed).  The units of vfs.read_max
and vfs.read_min are fs block sizes.  This is quite broken when the
cluster size is varied and sometimes small.  E.g., the old default
read_max of 8, with a block size of 512 then the read-ahead is limited
to a whole 4K.  The default is 64, which is still too small with small
block sizes.  But if you increase vfs.read_max a lot, then the
read-ahead becomes almost infinity when the block size is large.  In
my version, the units for vfs.read_max are 512-blocks (default 256 for
the old limit of 128K read-ahead with ffs's old default 16K-blocks.
The current limit of 64 seems excessive with ffs's current default of
32K-blocks (2048K read-ahead).  My units are mostly better, but I
just noticed that they have a different too-delicate interaction with
application and kernel block sizes...

@ +#else
@ +				    MAXPHYS,

The above gave sub-maximal clustering.  So does ffs's MAXBSIZE when
it is smaller than mnt_iosize_max.  In ~5.2, mnt_iosize_max is physical
and is usually DFLTPHYS == MAXBSIZE, so ffs's choice usually gives
maximal clusters.  However, in -current, mnt_iosize_max is virtual
and is usually MAXPHYS == 2 * MAXBSIZE.  So MAXPHYS is probably correct
here.

...  Then I noticed another problem.  MAXPHYS twice mnt_iosize_max,
so the cluster size is only mnt_iosize_max = DFLTPHYS = 64K.  This
apparently acts badly with vfs.read_max = 256 512-blocks.  I think
it breaks read-ahead.  Throughput drops by a factor of 4 for read-before
write relative to direct writes (not counting the factor of 2 for the
doubled i/o from the reads), although all the i/o sizes are 64K.
Increasing vfs.read_max by just 16 fixes this.  The throughput drop
is then only 10-20% (there must be some drop for the extra seeks).
I'm not sure if extra read-ahead is good or bad here.  More read-ahead
in read-before-write reduces seeks, but it may also break drives'
caching and sequential heuristics.  My drives are old and have small
caches and are very sensitive to the i/o pattern for read-before-write.

@ +#endif
@ +				    ioflag >> IO_SEQSHIFT, &bp);
@ +			} else {
@ +				error = bread(vp, bn, pmp->pm_bpcluster,
@ +				    NOCRED, &bp);
@ +			}
@  			if (error) {
@  				brelse(bp);

To complete getting mostly-full clusters for writing large files to msdosfs,
I hacked the block size heuristic some more to give larger blocks:

@ diff -u2 msdosfs_vfsops.c~ msdosfs_vfsops.c
@ --- msdosfs_vfsops.c~	Sun Jun 20 14:20:03 2004
@ +++ msdosfs_vfsops.c	Wed Jun 12 04:32:52 2013
@ @@ -519,7 +547,11 @@
@ 
@  	if (FAT12(pmp))
@ -		pmp->pm_fatblocksize = 3 * pmp->pm_BytesPerSec;
@ +		pmp->pm_fatblocksize = 3 * DEV_BSIZE;
@ +	else if (FAT16(pmp))
@ +		pmp->pm_fatblocksize = PAGE_SIZE;
@  	else
@ -		pmp->pm_fatblocksize = MSDOSFS_DFLTBSIZE;
@ +		pmp->pm_fatblocksize = DFLTPHYS;
@ +	pmp->pm_fatblocksize = roundup(pmp->pm_fatblocksize,
@ +	    pmp->pm_BytesPerSec);
@ 
@  	pmp->pm_fatblocksec = pmp->pm_fatblocksize / DEV_BSIZE;

I changed my version this long ago to use 3*DEV_BSIZE for FAT12 and 
PAGE_SIZE in all other cases.  3*pmp->pm_BytesPerSec is bogus since
IIRC a small file system doesn't need even 3 DEV_BSIZE sectors and
when the sector size is larger than DEV_BSIZE then it won't need 3
sectors.  MSDOSFS_DFLTBSIZE is 4096.  This and PAGE_SIZE are really
too small for huge FATs.  The FAT i/o size should really depend on
the size of the FAT, not on its type, but small sizes are more
robust and more efficient for sparse writes.  The larger size also
requires fewer buffers.  4K is not too bad, but 512 would be really
bad.

I just remembered why I like small blocks.  They are more robust, and
clustering makes them efficient.  But clustering of the FAT isn't done.
Clusters are normally written with bdwrite() but not B_CLUSTEROK.  I
think some clustering still occurs since !B_CLUSTEROK is not honored.
Clusters are read using bread().  I think this is followed breadn(),
giving the old limited read-ahead which isn't nearly enough with
4K-blocks.

DFLTPHYS or MAXPHYS wasn't usable as a default until geom made the max
i/o size virtual, since it didn't work for devices with a lower
limit.  Neither did MAXBSIZE.  Even 4K might be larger than the
device limit.

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 23:21:30 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id D5CC5ADB
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 23:21:30 +0000 (UTC)
 (envelope-from ken@kdm.org)
Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81])
 by mx1.freebsd.org (Postfix) with ESMTP id A7501138C
 for <freebsd-fs@freebsd.org>; Tue, 11 Jun 2013 23:21:30 +0000 (UTC)
Received: from nargothrond.kdm.org (localhost [127.0.0.1])
 by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id r5BNLOCn047835;
 Tue, 11 Jun 2013 17:21:24 -0600 (MDT)
 (envelope-from ken@nargothrond.kdm.org)
Received: (from ken@localhost)
 by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id r5BNLOgG047834;
 Tue, 11 Jun 2013 17:21:24 -0600 (MDT) (envelope-from ken)
Date: Tue, 11 Jun 2013 17:21:24 -0600
From: "Kenneth D. Merry" <ken@freebsd.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
Message-ID: <20130611232124.GA42577@nargothrond.kdm.org>
References: <51B79023.5020109@fsn.hu>
 <253074981.119060.1370985609747.JavaMail.root@erie.cs.uoguelph.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <253074981.119060.1370985609747.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.4.2i
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 23:21:30 -0000

On Tue, Jun 11, 2013 at 17:20:09 -0400, Rick Macklem wrote:
> Attila Nagy wrote:
> > Hi,
> > 
> > I have two identical machines. They have 14 disks hooked up to a HP
> > smartarray (SA from now on) controller.
> > Both machines have the same SA configuration and layout: the disks are
> > organized into mirror pairs (HW RAID1).
> > 
> > On the first machine, these mirrors are formatted with UFS2+SU
> > (default
> > settings), on the second machine they are used as separate zpools
> > (please don't tell me that ZFS can do the same, I know). Atime is
> > turned
> > off, otherwise, no other modifications (zpool/zfs or sysctl
> > parameters).
> > The file systems are loaded more or less evenly with serving of some
> > kB
> > to few megs files.
> > 
> > The machines act as NFS servers, so there is one, maybe important
> > difference here: the UFS machine runs 8.3-RELEASE, while the ZFS one
> > runs 9.1-STABLE@r248885.
> > They get the same type of load, and according to nfsstat and netstat,
> > the loads don't explain the big difference which can be seen in disk
> > IOs. In fact, the UFS host seems to be more loaded...
> > 
> > According to gstat on the UFS machine:
> > dT: 60.001s w: 60.000s filter: da
> > L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
> > 0 42 35 404 6.4 8 150 214.2 21.5| da0
> > 0 30 21 215 6.1 9 168 225.2 15.9| da1
> > 0 41 33 474 4.5 8 158 211.3 18.0| da2
> > 0 39 30 425 4.6 9 163 235.0 17.1| da3
> > 1 31 24 266 5.1 7 93 174.1 14.9| da4
> > 0 29 22 273 5.9 7 84 200.7 15.9| da5
> > 0 37 30 692 7.1 7 115 206.6 19.4| da6
> > 
> > and on the ZFS one:
> > dT: 60.001s w: 60.000s filter: da
> > L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
> > 0 228 201 1045 23.7 27 344 53.5 88.7| da0
> > 5 185 167 855 21.1 19 238 44.9 73.8| da1
> > 10 263 236 1298 34.9 27 454 53.3 99.9| da2
> > 10 255 235 1341 28.3 20 239 64.8 92.9| da3
> > 10 219 195 994 22.3 23 257 46.3 81.3| da4
> > 10 248 221 1213 22.4 27 264 55.8 90.2| da5
> > 9 231 213 1169 25.1 19 229 54.6 88.6| da6
> > 
> > I've seen a lot of cases where ZFS required more memory and CPU (and
> > even IO) to handle the same load, but they were nowhere this bad
> > (often
> > a 10x increase).
> > 
> > Any ideas?
> > 
> ken@ recently committed a change to the new NFS server to add file
> handle affinity support to it. He reported that he had found that,
> without file handle affinity, that ZFS's sequential reading heuristic
> broke badly (or something like that, you can probably find the email
> thread or maybe he will chime in).

That is correct.  The problem, if the I/O is sequential, is that simultaneous
requests for adjacent blocks in a file will get farmed out to different
threads in the NFS server.  These can easily go down into ZFS out of order,
and make the ZFS prefetch code think that the file is not being read
sequentially.  It blows away the zfetch stream, and you wind up with a lot
of I/O bandwidth getting used (with a lot of prefetching done and then
re-done), but not much performance.

The FHA code puts adjacent requests in a single file into the same thread,
so ZFS sees the requests in the right order.

Another change I made was to allow parallel writes to a file if the
underlying filesystem allows it.  (ZFS is the only filesystem that
allows that currently.)  That can help random writes.

Linux clients are more likely than FreeBSD and MacOS clients to queue a lot
of reads to the server.

> Anyhow, you could try switching the FreeBSD 9 system to use the old
> NFS server (assuming your clients are doing NFSv3 mounts) and see if
> that has a significant effect. (For FreeBSD9, the old server has file
> handle affinity, but the new server does not.)

If using the old NFS server helps, then the FHA code for the new server
will help as well.  Perhaps more, because the default FHA tuning parameters
have changed somewhat and parallel writes are now possible.

If you want to try out the FHA changes in stable/9, I just MFCed them,
change 251641.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG

From owner-freebsd-fs@FreeBSD.ORG  Tue Jun 11 23:39:10 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C7863C9B
 for <fs@FreeBSD.org>; Tue, 11 Jun 2013 23:39:10 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au
 [211.29.132.249])
 by mx1.freebsd.org (Postfix) with ESMTP id 77328148D
 for <fs@FreeBSD.org>; Tue, 11 Jun 2013 23:39:09 +0000 (UTC)
Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au
 (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23])
 by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 2EB0F104196B;
 Wed, 12 Jun 2013 09:39:08 +1000 (EST)
Date: Wed, 12 Jun 2013 09:39:07 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: missed clustering for small block sizes in cluster_wbuild()
In-Reply-To: <20130612053543.X900@besplex.bde.org>
Message-ID: <20130612085648.L836@besplex.bde.org>
References: <20130607044845.O24441@besplex.bde.org>
 <20130611063446.GJ3047@kib.kiev.ua>
 <20130612053543.X900@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.0 cv=K8x6hFqI c=1 sm=1 a=r8sOWHbHUnAA:10
 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=zUlCpqlVHewA:10
 a=hiBHK-Nd3Hw8SXYrQcMA:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117
Cc: fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Jun 2013 23:39:10 -0000

On Wed, 12 Jun 2013, Bruce Evans wrote:

> On Tue, 11 Jun 2013, Konstantin Belousov wrote:
>
>> On Fri, Jun 07, 2013 at 05:28:11AM +1000, Bruce Evans wrote:
>>> I think this is best fixed be fixed by removing the check above and
>>> checking here.  Then back out of the changes.  I don't know this code
>>> well enough to write the backing out easily.
>> 
>> Could you test this, please ?
>
> It works in limited testing.

> ...
> - there were a lot of contiguous dirty buffers, and this loop happily built
>  up a cluster with 17 pages, though mnt_iosize_max was only 17 pages.
>  Perhaps the extra page is necessary if the part of the buffer to be
>  written starts at a nonzero offset, but there was no offset in the case
>  that I observed (can there be one, and if so, is it limited to an offset
>  within the first page?  The general case needs 16 4K extra pages to write
>  a 64K-block (when the offset of the area to be written is 64K-512).

I now remember a bit more about how this works.  There is only a limited
amount of offseting.  The buffer might not be page-aligned relative to
the start of the disk.  Then the first page in the buffer must not all
be accessed (via this buffer) for i/o.  The first page is mapped at
bp->b_kvabase, but disk drivers must only access data starting at
bp->b_data, which is offset from bp->b_kvabase in the misaligned case.

I think this is the only relevant complication.  When misaligned buffers
are merged into a cluster buffer, they must all have the same misalignment
and size for the merge to work.  1 "extra" page, but no more, is always
required in the misaligned case to reach the full mnt_iosize_max.

msdosfs buffers may even be misaligned if they have size 64K!  All
msdosfs clusters may be misaligned if they have size >= PAGE_SIZE!
This is not good for performance, but should work.  Misalignment used
to be the usual case, since msdsofs metadata before the data clusters
tends to have an odd size in sectors and when the cluster size is >=
PAGE_SIZE the misalignment is preserved.  FreeBSD newfs_msdos shouldn't
produce misaligned buffers, but other systems' utilities might.  This
may also cause problems with the MAXBSIZE limit of 64K.  If it is a
hard limit on b_kvasize, then misaligned buffers of this size won't
be allowed.  If it only a limit on b_bcount, then there may be
fragmentation problems.

> ...
> I think it would work and fix other bugs to check (tbp->b_bcount +
> bp->b_bcount <= vp->v_mount->mnt_iosize_max) up front.  Drivers should
> be able to handle an i/o size of b_bcount however many pages that
> takes.  There must be a limit on b_pages, but it seems to be
> non-critical and the limit on b_bcount gives one of
> (mnt_iosize_max / PAGE_SIZE) rounded in some way and possibly increased
> by 1 or doubled to account for offsets.  If mnt_iosize_max is not a
> multiple of PAGE_SIZE, then the limit using pages doesn't even allow
> covering mnt_iosize_max using pages, since the rounding down is
> non-null.

I'm now trying the b_bcount check and not doing any backout later (just
print debugging info when it is reached).  The backout case is reached
even with the b_bcount check.  This is in the misaligned case.  The
misaligned case shouldn't break clustering since it is quite common.
It happens whenever the blocksize is small and the start of the cluster
is misaligned relative to the start of the disk.  If the block size is
larger, then all blocks may be misaligned.

> [read-before-write fix for msdosfs and generic problems with read-b4-write]
> ...  Then I noticed another problem.  MAXPHYS twice mnt_iosize_max,
> so the cluster size is only mnt_iosize_max = DFLTPHYS = 64K.  This
> apparently acts badly with vfs.read_max = 256 512-blocks.  I think
> it breaks read-ahead.  Throughput drops by a factor of 4 for read-before
> write relative to direct writes (not counting the factor of 2 for the
> doubled i/o from the reads), although all the i/o sizes are 64K.
> Increasing vfs.read_max by just 16 fixes this.  The throughput drop
> is then only 10-20% (there must be some drop for the extra seeks).
> I'm not sure if extra read-ahead is good or bad here.  More read-ahead
> in read-before-write reduces seeks, but it may also break drives'
> caching and sequential heuristics.  My drives are old and have small
> caches and are very sensitive to the i/o pattern for read-before-write.

I confirmed that this has something to do with the drive.  After reaching
a quiescent pattern with "dd bs=1k count=1024k conv=notrunc" for almost-
contiguous files (and 1k < fs block size, and fs = msdosfs with MAXPHYS
read-before-write), reads and writes alternate with reads some constant
distance ahead of writes.  The difference depends on vfs.read_max.  It is
sometimes a multiple of 128 512-blocks, but often not.  My drives don't
like some fixed distances.  I don't understand their pattern.  They seem
to prefer non-power-of-2 differnces.  Turning off read-ahead by setting
vfs.read_max to 0 gives the worset performance (reduce by another power
of 2).  The levels of reduced performance are quantized: one level at
7 times slower, one level at 4 times slower and one level at 10-20% slower.

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 00:20:01 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 7E37810F
 for <freebsd-fs@smarthost.ysv.freebsd.org>;
 Wed, 12 Jun 2013 00:20:01 +0000 (UTC)
 (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 5483A16D3
 for <freebsd-fs@smarthost.ysv.freebsd.org>;
 Wed, 12 Jun 2013 00:20:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r5C0K1I9057967
 for <freebsd-fs@freefall.freebsd.org>; Wed, 12 Jun 2013 00:20:01 GMT
 (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r5C0K0I0057965;
 Wed, 12 Jun 2013 00:20:00 GMT (envelope-from gnats)
Date: Wed, 12 Jun 2013 00:20:00 GMT
Message-Id: <201306120020.r5C0K0I0057965@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
Cc: 
From: Garrett Cooper <yaneurabeya@gmail.com>
Subject: Re: kern/172334: [unionfs] unionfs permits recursive union mounts;
 causes panic quickly
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Garrett Cooper <yaneurabeya@gmail.com>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 00:20:01 -0000

The following reply was made to PR kern/172334; it has been noted by GNATS.

From: Garrett Cooper <yaneurabeya@gmail.com>
To: bug-followup@FreeBSD.org, yaneurabeya@gmail.com
Cc: daichi@FreeBSD.org
Subject: Re: kern/172334: [unionfs] unionfs permits recursive union mounts;
 causes panic quickly
Date: Tue, 11 Jun 2013 17:10:20 -0700

 I finally got around to testing this. Yup -- the patch looks good to me.
 Thank you Daichi!
 -Garrett

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 02:17:04 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C5AB6AEC;
 Wed, 12 Jun 2013 02:17:04 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au
 [211.29.132.97])
 by mx1.freebsd.org (Postfix) with ESMTP id 750491980;
 Wed, 12 Jun 2013 02:17:03 +0000 (UTC)
Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au
 (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23])
 by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id A342D7804C2;
 Wed, 12 Jun 2013 11:48:12 +1000 (EST)
Date: Wed, 12 Jun 2013 11:48:11 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: "Kenneth D. Merry" <ken@FreeBSD.org>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
In-Reply-To: <20130611232124.GA42577@nargothrond.kdm.org>
Message-ID: <20130612104903.A1146@besplex.bde.org>
References: <51B79023.5020109@fsn.hu>
 <253074981.119060.1370985609747.JavaMail.root@erie.cs.uoguelph.ca>
 <20130611232124.GA42577@nargothrond.kdm.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.0 cv=Q6eKePKa c=1 sm=1 a=uNq0K1xFbOwA:10
 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=_0xpXSU753EA:10
 a=fMB5tdty3pWOc5zq9kgA:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117
Cc: freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 02:17:04 -0000

On Tue, 11 Jun 2013, Kenneth D. Merry wrote:

> On Tue, Jun 11, 2013 at 17:20:09 -0400, Rick Macklem wrote:
>> Attila Nagy wrote:
>>> ...
>>> I've seen a lot of cases where ZFS required more memory and CPU (and
>>> even IO) to handle the same load, but they were nowhere this bad
>>> (often
>>> a 10x increase).
>>>
>>> Any ideas?
>>>
>> ken@ recently committed a change to the new NFS server to add file
>> handle affinity support to it. He reported that he had found that,
>> without file handle affinity, that ZFS's sequential reading heuristic
>> broke badly (or something like that, you can probably find the email
>> thread or maybe he will chime in).
>
> That is correct.  The problem, if the I/O is sequential, is that simultaneous
> requests for adjacent blocks in a file will get farmed out to different
> threads in the NFS server.  These can easily go down into ZFS out of order,
> and make the ZFS prefetch code think that the file is not being read
> sequentially.  It blows away the zfetch stream, and you wind up with a lot
> of I/O bandwidth getting used (with a lot of prefetching done and then
> re-done), but not much performance.

I saw the nfsd's getting in each other's way when debugging nfs write
slowness some time ago.  I used the "fix" of using only 1 nfsd.  This
worked fine on a lightly-loaded nfs server and client doing nothing
nearly as heavy as the write benchmark for all other uses combined.
With this and some other changes that are supposed to be in -current
now, the write performance for large files was close to the drive's
maximum.  But reads were at best 75% of the maximum.  Maybe FHA fixes
the read case.

More recently, I noticed that vfs clustering works poorly partly because
it has too many, yet not enough sequential pointers.

There is a pointer (fp->f_nextoff and fp->f_seqcount) for the sequential
heuristic at the struct file level.  This is shared between reads and
writes, so mixed reads, writes and seeks break the heuristic for the
reads and writes in the case that the seeks are to get back to position
after the previous write (the rewrite benchmark in bonnie does this).
The seeks mean that the i/o is not really sequential although it is
sequential for the read part and the write part.  FreeBSD is only
trying to guess if these parts are sequential per-file.  Mixed reads
and writes on the same file shouldn't affect the guess any more than
non-mixed reads or writes on different files, or mixed reads and writes
on the same file when the kernel does the read to fill buffers before
partial writes.  However, at a lower level the only seeks that matter
re physical ones.  The per-file pointers should be combined somehow
to predict and minimize the physical seeks.  Nothing is done.  The
kernel read-before write does significant physical seeks but since
everything is below the file level the per-file pointer is not clobbered
so pure sequential writes are still guessed to be sequential although
they aren't really.

There is also a pointer (vp->v_lastw and vp->vp->v_lasta) for cluster
writes.  This is closer to the physical disk pointer that is needed,
but since it is per-vnode it share a fundamental design error with
the buffer cache (buffer cache code wants to access one vnode at a
time, vnode data and metadata may be very non-sequential).  vnodes
are below the file level, so this pointer gets clobbered by writes
(but not reads) on separate open files.  The clobbering keeps the
vnode pointer closer to the physical disk pointer if and only iff
all accesses are to the same vnode.  I think it mostly helps to
not try to track per-file sequentiality for writes, but the per-file
sequentiality guess is used for cluster writing too.  The 2 types
of sequentiality combine in a confusing way even if there is only
1 writer (but a reader on the same file).  Write accesses are then
sequential from the point of view of the vnode pointer, but random
from the point of view of the file pointer, since only the latter
is clobbered by intermediate reads.

As mentioned above, bonnie's atypical i/o pattern clobbers the file
pointer, but kernel's more typical i/o pattern for read-before-write
doesn't.

I first thought that clobbering the pointer was a bug, but now I think
it is a feature.  The i/o really is non-sequential.  Basing most i/o
sequentiality guesses on a single per-disk pointer (shared across
different partitions on the same disk) might work better than all
the separate pointers.  Accesses that are sequential at the file level
would only be considered sequential if no other physical accesses
intervene.  After getting that right, use sequentiality guesses again
to delay some physical accesses if they would intervene.

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 08:44:59 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id CA585B42
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 08:44:59 +0000 (UTC)
 (envelope-from to.my.trociny@gmail.com)
Received: from mail-la0-x233.google.com (mail-la0-x233.google.com
 [IPv6:2a00:1450:4010:c03::233])
 by mx1.freebsd.org (Postfix) with ESMTP id 5371219CF
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 08:44:59 +0000 (UTC)
Received: by mail-la0-f51.google.com with SMTP id fq12so7489521lab.10
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 01:44:58 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=vNvD5TNxquZYbGiKpfDWpMnowuHfmxgp/1BvQYFDZcc=;
 b=t4WXZ5o0CxZv9/kwfNrSqH5h5F0utLJsEh/pcNSbT1P3yo/Qh1uox58y1RVM8xua/8
 6w8o2NQOW9bb1hrSNLWdbLhCWkcEI39DBUnIGMLOlIlvRt42w6cc3iL0DIIzJS3s2j+N
 5lbgCqGwd61IY+1SKWO6OftgKDX3cnf3nvuQ/NXFgEJtZIFAHlJJ0s9YFBPZwdpnAuZE
 SfWcbcTgnrluj7Sp6tBqdY2DUYv/SCPR4hohQAnNCMojLCBH/AY+Y0q2KGPEl7LuJ87h
 NhDs3LVnLtbAbXqY/AQVucIRpyMEuxFa4lIMaLRgMLK4TZBxt8bH3aaziVaUVwzyOeeL
 rmoA==
X-Received: by 10.112.150.170 with SMTP id uj10mr5406999lbb.93.1371026697991; 
 Wed, 12 Jun 2013 01:44:57 -0700 (PDT)
Received: from localhost ([188.230.122.226])
 by mx.google.com with ESMTPSA id a3sm8737907lbg.2.2013.06.12.01.44.55
 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Wed, 12 Jun 2013 01:44:56 -0700 (PDT)
Date: Wed, 12 Jun 2013 11:44:54 +0300
From: Mikolaj Golub <to.my.trociny@gmail.com>
To: Dmitry Morozovsky <marck@rinet.ru>
Subject: Re: hast: can't restore after disk failure
Message-ID: <20130612084453.GA55502@gmail.com>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
 <20130611060741.GA42231@gmail.com>
 <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 08:44:59 -0000

On Wed, Jun 12, 2013 at 12:23:52AM +0400, Dmitry Morozovsky wrote:
> On Tue, 11 Jun 2013, Mikolaj Golub wrote:
> 
> > On Tue, Jun 11, 2013 at 12:40:08AM +0400, Dmitry Morozovsky wrote:
> > > On Mon, 10 Jun 2013, Mikolaj Golub wrote:
> > > 
> > > [snipall]
> > > 
> > > > > Jun 10 16:56:20 <console.info> cthulhu3 kernel: Jun 10 16:56:20 <daemon.err> 
> > > > > cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
> > > > > (pid=14380, exitcode=66).
> > > > > 
> > > > > Any hints? Thanks!
> > > > 
> > > > Have you run hastctl create to initialize metadata?
> > > 
> > > Yes, but did it naively:
> > > 
> > > hastctl create d1
> > 
> > No errors?
> 
> no visible, but hast instance ungracefully exits
> 
> > > and status still reported 0 as provider size...
> > 
> > I assume /dev/ada1p1 is present and readable/writable?
> > 
> > Symptoms are like if it did not exist.
> 
> nope, it does:
> 
> root@cthulhu3:/# diskinfo /dev/ada1p1
> /dev/ada1p1     512     999654686720    1952450560      0       1048576 1936954 16      63
> root@cthulhu3:/# diskinfo /dev/ada0p1
> /dev/ada0p1     512     999653638144    1952448512      0       1048576 1936952 16      63
> 

Hm, looking in the source where this error is generated:

  cthulhu3 hastd[14379]: [d1] (secondary) Unable to read metadata from /dev/ada1p1: No such file or directory.

it looks like hastd successfully read metadata from disk but failed to
parse it (did not found an entry). This usually happens when metadata
is not initialized by `hastctl create`.

Does `hastctl dump d1' not work too?

-- 
Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 08:49:54 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4B3E3C32
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 08:49:54 +0000 (UTC)
 (envelope-from se@freebsd.org)
Received: from nm8-vm1.bullet.mail.ird.yahoo.com
 (nm8-vm1.bullet.mail.ird.yahoo.com [77.238.189.198])
 by mx1.freebsd.org (Postfix) with SMTP id 392C61A06
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 08:49:53 +0000 (UTC)
Received: from [77.238.189.238] by nm8.bullet.mail.ird.yahoo.com with NNFMP;
 12 Jun 2013 08:49:51 -0000
Received: from [46.228.39.69] by tm19.bullet.mail.ird.yahoo.com with NNFMP;
 12 Jun 2013 08:49:51 -0000
Received: from [127.0.0.1] by smtp106.mail.ir2.yahoo.com with NNFMP;
 12 Jun 2013 08:49:51 -0000
X-Yahoo-Newman-Id: 458754.61133.bm@smtp106.mail.ir2.yahoo.com
X-Yahoo-Newman-Property: ymail-3
X-YMail-OSG: Bjnd4aYVM1k42RPalIBMugNHwfh2mQbDST.vBo1t969Rt0O
 g4RVTol70_AYmtxcePMxzP2nN7LWH6_OfJP6njytei1cnHwwnfeW3Ih8nL8X
 cDFiIs_ix3LL4CIKinlqNZgjPPj5T.623XaApa3sIhmm5pgZAoXttIiqZ1Rw
 ZCNX51Qgu1fb_1haxNOG_FtV4ViC5A.mIrPwxweNGAf8btyhojbxaD2HT2fF
 HhzMRcr2LDxZwwDVZbqLZur13SzDpCrg349r764O1HIA7Eja3xvOjKYqjAEq
 a6SHeqM4ReRmpM9JhqwbzQgWQ4YgR6u.fyBhPSWEJonfvTCr8W6kEyCIaJn4
 28P17JfXWwCrFLyvNACE1mQqnLGNX8lhf0khX35YQRUT.XlpDS.6LdlmNgaz
 mrrRy8qQ.aNMhbcpA_PYSZ_gTzPJ_aU0il9kc7nDm5a4liJBNxDi8dmbXJBV
 bqzaHEwYsmxNCjr3GrnK13.vgW9Ohwfij
X-Yahoo-SMTP: iDf2N9.swBDAhYEh7VHfpgq0lnq.
X-Rocket-Received: from [192.168.119.11] (se@87.158.30.195 with )
 by smtp106.mail.ir2.yahoo.com with SMTP; 12 Jun 2013 08:49:51 +0000 UTC
Message-ID: <51B8362A.4080406@freebsd.org>
Date: Wed, 12 Jun 2013 10:49:46 +0200
From: Stefan Esser <se@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
References: <51B79023.5020109@fsn.hu>
 <253074981.119060.1370985609747.JavaMail.root@erie.cs.uoguelph.ca>
 <20130611232124.GA42577@nargothrond.kdm.org>
 <20130612104903.A1146@besplex.bde.org>
In-Reply-To: <20130612104903.A1146@besplex.bde.org>
X-Enigmail-Version: 1.5.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: bde@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 08:49:54 -0000

Am 12.06.2013 03:48, schrieb Bruce Evans:> I first thought that
clobbering the pointer was a bug, but now I think
> it is a feature.  The i/o really is non-sequential.  Basing most i/o
> sequentiality guesses on a single per-disk pointer (shared across
> different partitions on the same disk) might work better than all
> the separate pointers.  Accesses that are sequential at the file level
> would only be considered sequential if no other physical accesses
> intervene.  After getting that right, use sequentiality guesses again
> to delay some physical accesses if they would intervene.

Hi Bruce,

I tend to disagree ... ;-)

Recognizing sequential reads on a per file basis hints at whether
read-ahead (delaying the next following access and the buffer
needed to keep the data) might be useful. This "knowledge" can
lead to drastically higher total throughput in situations, where
multiple processes (or network clients) read files sequentially
(e.g. a media server for many parallel streams).

If you try to recognize sequential accesses on the device level,
then you may identify cases were one reader is likely to perform
back-to-back reads. But in all other cases (and especially under
high load), you will not be able to identify the processes that
might be helper by reading larger chunks than requested (lowering
the number of seeks required and taking pressure from the storage).

So, I think you need the per file read-ahead heuristics to identify
candidates for read-ahead. And I doubt you can get the same effect
by tracking disk accesses.

Hmmm, you could keep a list of read-ahead pointers per disk, which
could be recycled in a LRU scheme. Any new read that continues a
prior read is detected and updates the corresponding pointer, which
is in a struct with a read-ahead flag or the amount to read-ahead.
Access to this list of pointers could be sped up by having a hash
table that points to them (hash key is some number of LSBs, e.g.
for 256 or 1024 buckets). That way the temporal distribution of
the accesses could be included in the heuristic: If sequential
reads are spread out over a long time, then their corresponding
pointer is lost (after e.g. 256 or 1024 non-sequential accesses
to the volume).

This could be implemented as a scheduler class in GEOM, I think
(to make it easily loadable and selectable per volume, but might
also be appropriate for productive use). That way different
strategies (with regard to read-ahead and the potential for
clustering of writes) could be tested.

Might be interesting to compare such a scheduler with the per file
heuristics as implemented in the kernel now ...

Best regards, STefan

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 09:07:46 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5C750224
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 09:07:46 +0000 (UTC)
 (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id DE70D1B0D
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 09:07:45 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r5C97hdC024879;
 Wed, 12 Jun 2013 13:07:43 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Wed, 12 Jun 2013 13:07:43 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Mikolaj Golub <to.my.trociny@gmail.com>
Subject: Re: hast: can't restore after disk failure
In-Reply-To: <20130612084453.GA55502@gmail.com>
Message-ID: <alpine.BSF.2.00.1306121306030.96502@woozle.rinet.ru>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
 <20130611060741.GA42231@gmail.com>
 <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
 <20130612084453.GA55502@gmail.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
 (woozle.rinet.ru [0.0.0.0]); Wed, 12 Jun 2013 13:07:43 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 09:07:46 -0000

On Wed, 12 Jun 2013, Mikolaj Golub wrote:

[snip a bit]

> > nope, it does:
> > 
> > root@cthulhu3:/# diskinfo /dev/ada1p1
> > /dev/ada1p1     512     999654686720    1952450560      0       1048576 1936954 16      63
> > root@cthulhu3:/# diskinfo /dev/ada0p1
> > /dev/ada0p1     512     999653638144    1952448512      0       1048576 1936952 16      63

Argh! Somehow ada1p1 got created in slightly different size (though bigger than 
necessary, and it was the source of the problem.

recreating it with gpart fixes the problem.

> Hm, looking in the source where this error is generated:
> 
>   cthulhu3 hastd[14379]: [d1] (secondary) Unable to read metadata from /dev/ada1p1: No such file or directory.
> 
> it looks like hastd successfully read metadata from disk but failed to
> parse it (did not found an entry). This usually happens when metadata
> is not initialized by `hastctl create`.

Well, error messages definitely could be improved :)

-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 09:26:20 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 287DFA16;
 Wed, 12 Jun 2013 09:26:20 +0000 (UTC) (envelope-from bra@fsn.hu)
Received: from people.fsn.hu (people.fsn.hu [195.228.252.137])
 by mx1.freebsd.org (Postfix) with ESMTP id 7BB051C6A;
 Wed, 12 Jun 2013 09:26:18 +0000 (UTC)
Received: by people.fsn.hu (Postfix, from userid 1001)
 id B21A5109600B; Wed, 12 Jun 2013 11:26:17 +0200 (CEST)
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.3
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR:
 24.4369]
X-CRM114-CacheID: sfid-20130612_11260_D61B9634 
X-CRM114-Status: Good  ( pR: 24.4369 )
X-DSPAM-Result: Whitelisted
X-DSPAM-Processed: Wed Jun 12 11:26:17 2013
X-DSPAM-Confidence: 0.9965
X-DSPAM-Probability: 0.0000
X-DSPAM-Signature: 51b83eb9254715214434796
X-DSPAM-Factors: 27, From*Attila Nagy <bra@fsn.hu>, 0.00010, >+On, 0.00059,
 FreeBSD, 0.00074, FreeBSD, 0.00074, )+>, 0.00139,
 wrote+>>, 0.00147, wrote+>, 0.00218, On+Tue, 0.00242,
 >+of, 0.00279, >+of, 0.00279, >+>>, 0.00321, 2013+at, 0.00348,
 >+>, 0.00371, >+>, 0.00371, References*fsn.hu>, 0.00371,
 the+server, 0.00397, >>+>>, 0.00428, something+like, 0.00463,
 parameters, 0.00463, >+have, 0.00463, queue, 0.00529,
 wrote, 0.00535, wrote, 0.00535, ZFS, 0.00555, ZFS, 0.00555,
 >+If, 0.00555,
X-Spambayes-Classification: ham; 0.00
Received: from japan.t-online.private (japan.t-online.co.hu [195.228.243.99])
 by people.fsn.hu (Postfix) with ESMTPSA id C30B31095FF4;
 Wed, 12 Jun 2013 11:26:06 +0200 (CEST)
Message-ID: <51B83EAE.7060603@fsn.hu>
Date: Wed, 12 Jun 2013 11:26:06 +0200
From: Attila Nagy <bra@fsn.hu>
MIME-Version: 1.0
To: "Kenneth D. Merry" <ken@freebsd.org>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
References: <51B79023.5020109@fsn.hu>
 <253074981.119060.1370985609747.JavaMail.root@erie.cs.uoguelph.ca>
 <20130611232124.GA42577@nargothrond.kdm.org>
In-Reply-To: <20130611232124.GA42577@nargothrond.kdm.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 09:26:20 -0000

Hi,

On 06/12/13 01:21, Kenneth D. Merry wrote:
> On Tue, Jun 11, 2013 at 17:20:09 -0400, Rick Macklem wrote:
>>
>> ken@ recently committed a change to the new NFS server to add file
>> handle affinity support to it. He reported that he had found that,
>> without file handle affinity, that ZFS's sequential reading heuristic
>> broke badly (or something like that, you can probably find the email
>> thread or maybe he will chime in).
> That is correct.  The problem, if the I/O is sequential, is that simultaneous
> requests for adjacent blocks in a file will get farmed out to different
The IO is pretty much random, and the files aren't so big either (mean 
size around 400k).

> threads in the NFS server.  These can easily go down into ZFS out of order,
> and make the ZFS prefetch code think that the file is not being read
> sequentially.  It blows away the zfetch stream, and you wind up with a lot
> of I/O bandwidth getting used (with a lot of prefetching done and then
> re-done), but not much performance.
I've tried disabling prefetch, without any noticeable effects.
>
> Linux clients are more likely than FreeBSD and MacOS clients to queue a lot
> of reads to the server.
The clients are also FreeBSD (8.3 and 7.2 mostly). Running NFSv3 of course.
>
>> Anyhow, you could try switching the FreeBSD 9 system to use the old
>> NFS server (assuming your clients are doing NFSv3 mounts) and see if
>> that has a significant effect. (For FreeBSD9, the old server has file
>> handle affinity, but the new server does not.)
> If using the old NFS server helps, then the FHA code for the new server
> will help as well.  Perhaps more, because the default FHA tuning parameters
> have changed somewhat and parallel writes are now possible.
>
> If you want to try out the FHA changes in stable/9, I just MFCed them,
> change 251641.
>
Sure, I will try both 251641 and the old nfsd.

Thanks,

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 09:37:01 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C5DE7BDB
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 09:37:01 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay5-d.mail.gandi.net (relay5-d.mail.gandi.net
 [217.70.183.197])
 by mx1.freebsd.org (Postfix) with ESMTP id 4E4851D01
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 09:37:00 +0000 (UTC)
Received: from mfilter10-d.gandi.net (mfilter10-d.gandi.net [217.70.178.139])
 by relay5-d.mail.gandi.net (Postfix) with ESMTP id 2350841C0A4;
 Wed, 12 Jun 2013 11:36:44 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter10-d.gandi.net
Received: from relay5-d.mail.gandi.net ([217.70.183.197])
 by mfilter10-d.gandi.net (mfilter10-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id AsLmWF+lNvDh; Wed, 12 Jun 2013 11:36:42 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay5-d.mail.gandi.net (Postfix) with ESMTPSA id 8E7BB41C0A6;
 Wed, 12 Jun 2013 11:36:41 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id A53A973A1C; Wed, 12 Jun 2013 02:36:39 -0700 (PDT)
Date: Wed, 12 Jun 2013 02:36:39 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Mikolaj Golub <to.my.trociny@gmail.com>
Subject: Re: hast: can't restore after disk failure
Message-ID: <20130612093639.GA9219@icarus.home.lan>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
 <20130611060741.GA42231@gmail.com>
 <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
 <20130612084453.GA55502@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130612084453.GA55502@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, Dmitry Morozovsky <marck@rinet.ru>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 09:37:01 -0000

On Wed, Jun 12, 2013 at 11:44:54AM +0300, Mikolaj Golub wrote:
> On Wed, Jun 12, 2013 at 12:23:52AM +0400, Dmitry Morozovsky wrote:
> > On Tue, 11 Jun 2013, Mikolaj Golub wrote:
> > 
> > > On Tue, Jun 11, 2013 at 12:40:08AM +0400, Dmitry Morozovsky wrote:
> > > > On Mon, 10 Jun 2013, Mikolaj Golub wrote:
> > > > 
> > > > [snipall]
> > > > 
> > > > > > Jun 10 16:56:20 <console.info> cthulhu3 kernel: Jun 10 16:56:20 <daemon.err> 
> > > > > > cthulhu3 hastd[765]: [d1] (secondary) Worker process exited ungracefully 
> > > > > > (pid=14380, exitcode=66).
> > > > > > 
> > > > > > Any hints? Thanks!
> > > > > 
> > > > > Have you run hastctl create to initialize metadata?
> > > > 
> > > > Yes, but did it naively:
> > > > 
> > > > hastctl create d1
> > > 
> > > No errors?
> > 
> > no visible, but hast instance ungracefully exits
> > 
> > > > and status still reported 0 as provider size...
> > > 
> > > I assume /dev/ada1p1 is present and readable/writable?
> > > 
> > > Symptoms are like if it did not exist.
> > 
> > nope, it does:
> > 
> > root@cthulhu3:/# diskinfo /dev/ada1p1
> > /dev/ada1p1     512     999654686720    1952450560      0       1048576 1936954 16      63
> > root@cthulhu3:/# diskinfo /dev/ada0p1
> > /dev/ada0p1     512     999653638144    1952448512      0       1048576 1936952 16      63
> > 
> 
> Hm, looking in the source where this error is generated:
> 
>   cthulhu3 hastd[14379]: [d1] (secondary) Unable to read metadata from /dev/ada1p1: No such file or directory.
>
> it looks like hastd successfully read metadata from disk but failed to
> parse it (did not found an entry). This usually happens when metadata
> is not initialized by `hastctl create`.
> 
> Does `hastctl dump d1' not work too?

Note up front: I have zero familiarity with hast stuff.  I'm just
looking at source code, because your comment seems to indicate that
ENOENT (errno 2; No such file or directory) is actually false/incorrect.

I did spend almost 30 minutes digging through the hastd code.  This is
hard to follow -- very specifically, the error/errno situational code.
It's a very deep rabbit hole.  Variable names are common or re-used
(legitimately due to local scope), and the actual error that gets
printed comes directly from the global errno variable.

I honestly cannot see how nv->nv_error (which is what nv_error()
returns) gets set to ENOENT within the function call stack:

- metadata_read() is what prints the error (line 152 in nv.c)
- Error printing done by pjdlog_errno(), which uses the global errno
  to print its errors
- nv = nv_ntoh(eb)
- nv_ntoh() sets nv->nv_error to 0 initially, but then calls
  nv_validate() later on which can modify nv->error
- nv_validate() explicitly sets error (which later can get assigned
  to nv->nv_error) to EINVAL in many cases, but not ENOENT.

Therefore, I am honestly not sure how ENOENT gets returned to the user
in this case.  It looks like it's a misleading errno and is probably
meant to be something else.  If it's correct, I would absolutely love
for someone to show me how/where.

The code is here:

http://svnweb.freebsd.org/base/stable/9/sbin/hastd/

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 10:03:44 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 9D1311F7;
 Wed, 12 Jun 2013 10:03:44 +0000 (UTC)
 (envelope-from to.my.trociny@gmail.com)
Received: from mail-lb0-f170.google.com (mail-lb0-f170.google.com
 [209.85.217.170])
 by mx1.freebsd.org (Postfix) with ESMTP id E8EFA1E76;
 Wed, 12 Jun 2013 10:03:43 +0000 (UTC)
Received: by mail-lb0-f170.google.com with SMTP id t13so3407885lbd.29
 for <multiple recipients>; Wed, 12 Jun 2013 03:03:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=PhSENIOZW89kxYWmWCiHOoqiJQlNeu7Eb253VXpuTYM=;
 b=NsdZGVfYlJ8EKsvNM6ThSOsVNXcfkVt1pyk8NdHNPwKws9GS/K1aoVdYh41OsOWtpn
 00Kss3TaKG9QH7mHTIw6WAYJeVYGf9TqoU5J+gjBPonkt0Wb+Mhx3SsDCE8F6QN5h0FH
 2hp+ezWq7dY3ze4C60/vyEdJTLMJPC/i4GuZMTAyoRtByPOxstYn+lNSIyp5B6redOqT
 a/9sLUE4zJMwLRI9DYYWzUKVp0yYz5nQED3MzalSlaZ1bgtGG10vUrnfO31/s0P02z8v
 BiRb/ZxvaPe7CGJDtldtbiVTAXegDAhMRAmoaiHdmXlDUNhpwdH2yvORvoG2W4Ei7A9U
 +7DQ==
X-Received: by 10.112.181.71 with SMTP id du7mr10509975lbc.24.1371031416709;
 Wed, 12 Jun 2013 03:03:36 -0700 (PDT)
Received: from localhost ([188.230.122.226])
 by mx.google.com with ESMTPSA id n3sm2301111lag.9.2013.06.12.03.03.34
 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Wed, 12 Jun 2013 03:03:35 -0700 (PDT)
Date: Wed, 12 Jun 2013 13:03:33 +0300
From: Mikolaj Golub <to.my.trociny@gmail.com>
To: Jeremy Chadwick <jdc@koitsu.org>
Subject: Re: hast: can't restore after disk failure
Message-ID: <20130612100332.GB55502@gmail.com>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
 <20130611060741.GA42231@gmail.com>
 <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
 <20130612084453.GA55502@gmail.com>
 <20130612093639.GA9219@icarus.home.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130612093639.GA9219@icarus.home.lan>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, Dmitry Morozovsky <marck@rinet.ru>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 10:03:44 -0000

On Wed, Jun 12, 2013 at 02:36:39AM -0700, Jeremy Chadwick wrote:

> I honestly cannot see how nv->nv_error (which is what nv_error()
> returns) gets set to ENOENT within the function call stack:
> 
> - metadata_read() is what prints the error (line 152 in nv.c)
> - Error printing done by pjdlog_errno(), which uses the global errno
>   to print its errors
> - nv = nv_ntoh(eb)
> - nv_ntoh() sets nv->nv_error to 0 initially, but then calls
>   nv_validate() later on which can modify nv->error
> - nv_validate() explicitly sets error (which later can get assigned
>   to nv->nv_error) to EINVAL in many cases, but not ENOENT.
> 
> Therefore, I am honestly not sure how ENOENT gets returned to the user
> in this case.  It looks like it's a misleading errno and is probably
> meant to be something else.  If it's correct, I would absolutely love
> for someone to show me how/where.

nv_find() (which is used by nv_get_* functions) sets ENOENT when it
fails.

"No such file or directory" really looks confusing in this case. I am
not sure what a code from errno.h would be better here though. ENOATTR?

-- 
Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 10:41:51 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 55D64A50;
 Wed, 12 Jun 2013 10:41:51 +0000 (UTC) (envelope-from jdc@koitsu.org)
Received: from relay5-d.mail.gandi.net (relay5-d.mail.gandi.net
 [217.70.183.197])
 by mx1.freebsd.org (Postfix) with ESMTP id D380010D6;
 Wed, 12 Jun 2013 10:41:50 +0000 (UTC)
Received: from mfilter1-d.gandi.net (mfilter1-d.gandi.net [217.70.178.130])
 by relay5-d.mail.gandi.net (Postfix) with ESMTP id CE33341C07E;
 Wed, 12 Jun 2013 12:41:39 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter1-d.gandi.net
Received: from relay5-d.mail.gandi.net ([217.70.183.197])
 by mfilter1-d.gandi.net (mfilter1-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id IQBXtwfEe7Xz; Wed, 12 Jun 2013 12:41:38 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay5-d.mail.gandi.net (Postfix) with ESMTPSA id BA6AD41C090;
 Wed, 12 Jun 2013 12:41:37 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id EA25573A1C; Wed, 12 Jun 2013 03:41:35 -0700 (PDT)
Date: Wed, 12 Jun 2013 03:41:35 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Mikolaj Golub <to.my.trociny@gmail.com>
Subject: Re: hast: can't restore after disk failure
Message-ID: <20130612104135.GA11495@icarus.home.lan>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
 <20130611060741.GA42231@gmail.com>
 <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
 <20130612084453.GA55502@gmail.com>
 <20130612093639.GA9219@icarus.home.lan>
 <20130612100332.GB55502@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130612100332.GB55502@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, Dmitry Morozovsky <marck@rinet.ru>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 10:41:51 -0000

On Wed, Jun 12, 2013 at 01:03:33PM +0300, Mikolaj Golub wrote:
> On Wed, Jun 12, 2013 at 02:36:39AM -0700, Jeremy Chadwick wrote:
> 
> > I honestly cannot see how nv->nv_error (which is what nv_error()
> > returns) gets set to ENOENT within the function call stack:
> > 
> > - metadata_read() is what prints the error (line 152 in nv.c)
> > - Error printing done by pjdlog_errno(), which uses the global errno
> >   to print its errors
> > - nv = nv_ntoh(eb)
> > - nv_ntoh() sets nv->nv_error to 0 initially, but then calls
> >   nv_validate() later on which can modify nv->error
> > - nv_validate() explicitly sets error (which later can get assigned
> >   to nv->nv_error) to EINVAL in many cases, but not ENOENT.
> > 
> > Therefore, I am honestly not sure how ENOENT gets returned to the user
> > in this case.  It looks like it's a misleading errno and is probably
> > meant to be something else.  If it's correct, I would absolutely love
> > for someone to show me how/where.
> 
> nv_find() (which is used by nv_get_* functions) sets ENOENT when it
> fails.

How wonderful -- when I reviewed the code, I thought "Oh surely those
can't be responsible...".  I did see nv_find(), but I did not think
nv_get_*() would call that.  My fault/failure.

> "No such file or directory" really looks confusing in this case. I am
> not sure what a code from errno.h would be better here though. ENOATTR?

Sorry to make this longer than it needs to be, but I'm brain dumping:

What exactly is the error condition that is happening in the above case?
All I read was that the partition size differed between nodes and that
this caused the issue?

IMO, that condition should be checked and handled elegantly, and that
the error message should not use an errno at all but instead just tell
the user about the device size mismatch between nodes (for that specific
device) -- the device sizes must match between both nodes, correct?

There must be some kind of communication protocol between the nodes that
can indicate something along those lines.

If an errno is really needed, ENOATTR isn't relevant (that's referring
to extended filesystem attributes).  See intro(2) for the official
explanation of all of them.

I would choose EIO, ENXIO, ENOSPC, EOPNOTSUPP, or EPROTO.

I have not looked at what OpenBSD and NetBSD have for errno.h.  That
might be good to do first.

Else, Linux has some errno.h entries in it which look like they might be
more relevant, such as EBADFD, EREMOTEIO, or EMEDIUMTYPE (this one might
be a bit misleading).

http://www.virtsync.com/c-error-codes-include-errno

Some of these are even part of our recent BSM audit(2) stuff; check out
include/bsm/audit_errno.h (some are Solaris specific but look like they
might help, and I see some duplicates between those and what Linux has
too).

Important: I do not know the implications of adding/enhancing errno.
POSIX is involved, thus it would be wise to ask Bruce Evans.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 11:40:34 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 39B6C4AC
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 11:40:34 +0000 (UTC)
 (envelope-from feld@feld.me)
Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com
 [66.111.4.27]) by mx1.freebsd.org (Postfix) with ESMTP id 0FD5A1383
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 11:40:33 +0000 (UTC)
Received: from compute2.internal (compute2.nyi.mail.srv.osa [10.202.2.42])
 by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id 1D7D6202B4
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 07:40:33 -0400 (EDT)
Received: from frontend1.nyi.mail.srv.osa ([10.202.2.160])
 by compute2.internal (MEProxy); Wed, 12 Jun 2013 07:40:33 -0400
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=feld.me; h=
 content-type:to:subject:references:date:mime-version
 :content-transfer-encoding:from:message-id:in-reply-to; s=
 mesmtp; bh=MqodeHbV4GO4rRE2iQzLLhCKKRo=; b=fJGym8QTZVVQzeTF88IeB
 efygSQzPvPGWgxKUyirhfujGYSTxgDXK3MKGJT60XAgbSIaqFCT5HpsURPuWo219
 WW05ch1viVxTuTSHA6/3B3qMXazCiRlA72rdxrkUmahT3s53HaFVlrw2B3nKRVEe
 M+/MVn+wlol/3/xQzgiTtA=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=
 messagingengine.com; h=content-type:to:subject:references:date
 :mime-version:content-transfer-encoding:from:message-id
 :in-reply-to; s=smtpout; bh=MqodeHbV4GO4rRE2iQzLLhCKKRo=; b=LSz2
 iGPrzqJRGBBN/hPjlirxb9zTAkpZ/C66BY0a0+ICxBhnuwb3mEZKKIoELRC0cMtR
 rrtXIqCfN0MDUYTR1qdk/YkhLSxTlxyYnG6OvxBffEmYRkFCQtV4UY839buicS1q
 71G3YVcyKDOx+3tKJUBbfCZRSUgPfphNUmehiCI=
X-Sasl-enc: LAyeIfZX08QsWlq/ru4x6uK3D9LLy5QwEfPn9Ufsd6Bb 1371037232
Received: from markf.office.supranet.net (unknown [66.170.8.18])
 by mail.messagingengine.com (Postfix) with ESMTPA id DB7B4C00E81
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 07:40:32 -0400 (EDT)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: freebsd-fs@freebsd.org
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
References: <51B79023.5020109@fsn.hu>
Date: Wed, 12 Jun 2013 06:40:32 -0500
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Mark Felder" <feld@feld.me>
Message-ID: <op.wykdduw834t2sn@markf.office.supranet.net>
In-Reply-To: <51B79023.5020109@fsn.hu>
User-Agent: Opera Mail/12.15 (FreeBSD)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 11:40:34 -0000

On Tue, 11 Jun 2013 16:01:23 -0500, Attila Nagy <bra@fsn.hu> wrote:

> BTW, the file systems are 77-78% full according to df (so ZFS holds  
> more, because UFS is -m 8).

ZFS write performance can begin to drop pretty badly when you get around  
80% full. I've not seen any benchmarks showing an improvement with a very  
fast and large ZIL or tons of memory, but I'd expect that would help  
significantly. Just note that you're right at the edge where performance  
gets impacted.

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 11:47:34 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C80D96AD
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 11:47:34 +0000 (UTC)
 (envelope-from ira@wakeful.net)
Received: from mail-ob0-x234.google.com (mail-ob0-x234.google.com
 [IPv6:2607:f8b0:4003:c01::234])
 by mx1.freebsd.org (Postfix) with ESMTP id 960FA145E
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 11:47:34 +0000 (UTC)
Received: by mail-ob0-f180.google.com with SMTP id eh20so13318750obb.11
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 04:47:34 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc:content-type:x-gm-message-state;
 bh=NEvlZ61Rl9T6Zvl9YEVZ72bKeAoSOc4bQ8uf+x6JMbE=;
 b=EvCHKBMmO5+b5fyL/xLn9mpYOLKn/liniIFHKWFUlWs8M+0saDKDUe5e8FJEZYRrpE
 EeiWqLgwcxnOkuF6Y9gjTpXSxxk85+imGDRfkyXIw+Dxk/jWqNnpK1CJ7GIlICokWq77
 v1rxBpvzAvsCE30ZYq8V+agjWvLrhyewEIlswfd+vZSxg9NxsPV1W/aCF5/UEYTwCMBo
 N5rzEnG2U2Dd4oVr8gMdLSsEzkZ+FbaY3Bc6cPXedFRuM5slTAVMnNF2+48LmPbMpCG+
 QHIqs3PL3BIWmouWm0dL4ZGiuQIq3JAeoczxrCa4+c117MZGXCzIS8HpW/jAyqegmW3e
 KrEw==
X-Received: by 10.182.237.50 with SMTP id uz18mr15126535obc.51.1371037653970; 
 Wed, 12 Jun 2013 04:47:33 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.76.154.202 with HTTP; Wed, 12 Jun 2013 04:47:13 -0700 (PDT)
In-Reply-To: <op.wykdduw834t2sn@markf.office.supranet.net>
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
From: Ira Cooper <ira@wakeful.net>
Date: Wed, 12 Jun 2013 07:47:13 -0400
Message-ID: <CAAPGDw+KLGRTvFEaanxeyrB1om21m4TbRfFd0nUQOGp9Nd97gw@mail.gmail.com>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
To: Mark Felder <feld@feld.me>
X-Gm-Message-State: ALoCoQkHJ6W87MNF1GWkWnEqkwDkartlbG3olS7l3bPYQvhpU1BzmDG1Q1K5+4e1PZUo8URAg/T0
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 11:47:34 -0000

On Wed, Jun 12, 2013 at 7:40 AM, Mark Felder <feld@feld.me> wrote:

> On Tue, 11 Jun 2013 16:01:23 -0500, Attila Nagy <bra@fsn.hu> wrote:
>
>  BTW, the file systems are 77-78% full according to df (so ZFS holds more,
>> because UFS is -m 8).
>>
>
> ZFS write performance can begin to drop pretty badly when you get around
> 80% full. I've not seen any benchmarks showing an improvement with a very
> fast and large ZIL or tons of memory, but I'd expect that would help
> significantly. Just note that you're right at the edge where performance
> gets impacted.
>
>
If it matches what illumos does.  You jump off the same cliff.

-Ira

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 11:49:59 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 7759D7E2
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 11:49:59 +0000 (UTC)
 (envelope-from jdc@koitsu.org)
Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net
 [217.70.183.195])
 by mx1.freebsd.org (Postfix) with ESMTP id 36A69148F
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 11:49:59 +0000 (UTC)
Received: from mfilter21-d.gandi.net (mfilter21-d.gandi.net [217.70.178.149])
 by relay3-d.mail.gandi.net (Postfix) with ESMTP id 7AC14A80FB;
 Wed, 12 Jun 2013 13:49:41 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter21-d.gandi.net
Received: from relay3-d.mail.gandi.net ([217.70.183.195])
 by mfilter21-d.gandi.net (mfilter21-d.gandi.net [10.0.15.180]) (amavisd-new,
 port 10024)
 with ESMTP id TugAw4oebJsM; Wed, 12 Jun 2013 13:49:39 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net
 [76.102.14.35]) (Authenticated sender: jdc@koitsu.org)
 by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id A8520A80C4;
 Wed, 12 Jun 2013 13:49:39 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
 id D0B3C73A1C; Wed, 12 Jun 2013 04:49:37 -0700 (PDT)
Date: Wed, 12 Jun 2013 04:49:37 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Mark Felder <feld@feld.me>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
Message-ID: <20130612114937.GA13688@icarus.home.lan>
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <op.wykdduw834t2sn@markf.office.supranet.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 11:49:59 -0000

On Wed, Jun 12, 2013 at 06:40:32AM -0500, Mark Felder wrote:
> On Tue, 11 Jun 2013 16:01:23 -0500, Attila Nagy <bra@fsn.hu> wrote:
> 
> >BTW, the file systems are 77-78% full according to df (so ZFS
> >holds more, because UFS is -m 8).
> 
> ZFS write performance can begin to drop pretty badly when you get
> around 80% full. I've not seen any benchmarks showing an improvement
> with a very fast and large ZIL or tons of memory, but I'd expect
> that would help significantly. Just note that you're right at the
> edge where performance gets impacted.

Mark, do you have any references for this?  I'd love to learn/read more
about this engineering/design aspect (I won't say flaw, I'll just say
aspect) to ZFS, as it's the first I've heard of it.

The reason I ask: (respectfully, not judgementally) I'm worried you
might be referring to something that has to do with SSDs and not ZFS,
specifically SSD wear-levelling performing better with lots of free
space (i.e. a small FTL map; TRIM helps with this immensely) -- where
the performance hit tends to begin around the 70-80% mark.  (I can talk
more about that if asked, but want to make sure the two things aren't
being mistaken for one another)

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 11:55:47 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id DA9B0A99;
 Wed, 12 Jun 2013 11:55:46 +0000 (UTC)
 (envelope-from to.my.trociny@gmail.com)
Received: from mail-la0-x232.google.com (mail-la0-x232.google.com
 [IPv6:2a00:1450:4010:c03::232])
 by mx1.freebsd.org (Postfix) with ESMTP id 33FB71612;
 Wed, 12 Jun 2013 11:55:46 +0000 (UTC)
Received: by mail-la0-f50.google.com with SMTP id dy20so5576074lab.37
 for <multiple recipients>; Wed, 12 Jun 2013 04:55:45 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=RfCbMrcVnFlgHYM+ob+fOEVu5PC47ppzLGMo04pTGjI=;
 b=hizyH0WYpQmELz40hHpJoGgEjGDONOc71aspR6La4jHyBLqWM2V52ybyw7kBpkZcDN
 ixaBEUhgFqlwzK83WPntKITIADXgDTKY5NnV8qJ+TPdlx/b1O3+Ux5CEXVCPMwkH3N6Q
 27BlVffSdl+z+hmD3bLYZbi+W51PDZBAO75UfolW7pP4rhlRN/cCwo+CQ9h+tF+kD7Sa
 G/rEoCh8TvEmWIkOwWLEuufJP0ZD5f4Imts7Fd02L1V2hRGFlqdnnvb2EJqdzy3DocH4
 a/OGggVCodkNmS/z6XQgpxSvDYCSf8KEIk6YAS2Nu0j94uX6Cl3pvV3hsFD282Q67lBw
 qkjA==
X-Received: by 10.152.28.66 with SMTP id z2mr3528753lag.5.1371038145172;
 Wed, 12 Jun 2013 04:55:45 -0700 (PDT)
Received: from localhost ([188.230.122.226])
 by mx.google.com with ESMTPSA id n1sm7913405lae.0.2013.06.12.04.55.43
 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Wed, 12 Jun 2013 04:55:44 -0700 (PDT)
Date: Wed, 12 Jun 2013 14:55:41 +0300
From: Mikolaj Golub <to.my.trociny@gmail.com>
To: Jeremy Chadwick <jdc@koitsu.org>
Subject: Re: hast: can't restore after disk failure
Message-ID: <20130612115540.GC55502@gmail.com>
References: <alpine.BSF.2.00.1306101700300.69113@woozle.rinet.ru>
 <20130610201650.GA2823@gmail.com>
 <alpine.BSF.2.00.1306110038010.96502@woozle.rinet.ru>
 <20130611060741.GA42231@gmail.com>
 <alpine.BSF.2.00.1306120022580.96502@woozle.rinet.ru>
 <20130612084453.GA55502@gmail.com>
 <20130612093639.GA9219@icarus.home.lan>
 <20130612100332.GB55502@gmail.com>
 <20130612104135.GA11495@icarus.home.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130612104135.GA11495@icarus.home.lan>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, Dmitry Morozovsky <marck@rinet.ru>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 11:55:47 -0000

On Wed, Jun 12, 2013 at 03:41:35AM -0700, Jeremy Chadwick wrote:
> On Wed, Jun 12, 2013 at 01:03:33PM +0300, Mikolaj Golub wrote:
> > On Wed, Jun 12, 2013 at 02:36:39AM -0700, Jeremy Chadwick wrote:
> > 
> > > I honestly cannot see how nv->nv_error (which is what nv_error()
> > > returns) gets set to ENOENT within the function call stack:
> > > 
> > > - metadata_read() is what prints the error (line 152 in nv.c)
> > > - Error printing done by pjdlog_errno(), which uses the global errno
> > >   to print its errors
> > > - nv = nv_ntoh(eb)
> > > - nv_ntoh() sets nv->nv_error to 0 initially, but then calls
> > >   nv_validate() later on which can modify nv->error
> > > - nv_validate() explicitly sets error (which later can get assigned
> > >   to nv->nv_error) to EINVAL in many cases, but not ENOENT.
> > > 
> > > Therefore, I am honestly not sure how ENOENT gets returned to the user
> > > in this case.  It looks like it's a misleading errno and is probably
> > > meant to be something else.  If it's correct, I would absolutely love
> > > for someone to show me how/where.
> > 
> > nv_find() (which is used by nv_get_* functions) sets ENOENT when it
> > fails.
> 
> How wonderful -- when I reviewed the code, I thought "Oh surely those
> can't be responsible...".  I did see nv_find(), but I did not think
> nv_get_*() would call that.  My fault/failure.
> 
> > "No such file or directory" really looks confusing in this case. I am
> > not sure what a code from errno.h would be better here though. ENOATTR?
> 
> Sorry to make this longer than it needs to be, but I'm brain dumping:
> 
> What exactly is the error condition that is happening in the above case?
> All I read was that the partition size differed between nodes and that
> this caused the issue?

As I wrote it before the error was that hastd failed to parse metadata
it had read from the local disk (failed to find some entry in metadata
structure). Usually this happens when metadata is not properly
initialized for a new disk or corrupted.

Different data sizes should trigger the error "Data size differs
between nodes ..." on primary.

Unfortunately I have not seen full logs from primary and secondary, so
it is difficult to me to guess what was going on there.

-- 
Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 12:04:01 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id DAD43E71
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 12:04:01 +0000 (UTC)
 (envelope-from feld@feld.me)
Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com
 [66.111.4.27]) by mx1.freebsd.org (Postfix) with ESMTP id AFDB4169D
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 12:04:01 +0000 (UTC)
Received: from compute4.internal (compute4.nyi.mail.srv.osa [10.202.2.44])
 by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id 7061C20E1D;
 Wed, 12 Jun 2013 08:03:59 -0400 (EDT)
Received: from frontend2.nyi.mail.srv.osa ([10.202.2.161])
 by compute4.internal (MEProxy); Wed, 12 Jun 2013 08:04:00 -0400
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=feld.me; h=
 content-type:to:cc:subject:references:date:mime-version
 :content-transfer-encoding:from:message-id:in-reply-to; s=
 mesmtp; bh=H/qxRRnrMV/D1SIGS6r2tLHnGlw=; b=Cirh4FoyepQSMn+zAtWSs
 c0SvqAaMh4QVL07s3XnKEmKbzpreUOe3UFPnVgfwXBOlufqfyKSgIkybjDhfjZlt
 Gdhnoj6hvzajR33Hvo+/bbJ+bseUPuMRC++6Q8xcsgSahe6XN0JAZZPfE+oZr/Nk
 YFzzt9lqqEQAHDVo91y/Js=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=
 messagingengine.com; h=content-type:to:cc:subject:references
 :date:mime-version:content-transfer-encoding:from:message-id
 :in-reply-to; s=smtpout; bh=H/qxRRnrMV/D1SIGS6r2tLHnGlw=; b=JyaT
 jcphGy4OyitIwiq6ndaGeRrgwY+uCePpxjKtR3KqfRZRTu5Xqvn37rBpC/K+oMOV
 nbKpAddQ8zuh3uqM+78QOp700UgBpunySYbCBH5j9YbN/39SJiGtqDo29EOnXB1x
 0PIsHxjrUsnE5St7AilZ0KFmRZwnzx21BEVBmik=
X-Sasl-enc: 2QnJhI0Rp23FCEfkf4xV6ybxSRXJBZxjnOgnLX6JljYw 1371038639
Received: from markf.office.supranet.net (unknown [66.170.8.18])
 by mail.messagingengine.com (Postfix) with ESMTPA id 5A6236801F3;
 Wed, 12 Jun 2013 08:03:59 -0400 (EDT)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: "Jeremy Chadwick" <jdc@koitsu.org>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
 <20130612114937.GA13688@icarus.home.lan>
Date: Wed, 12 Jun 2013 07:03:58 -0500
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Mark Felder" <feld@feld.me>
Message-ID: <op.wykegwri34t2sn@markf.office.supranet.net>
In-Reply-To: <20130612114937.GA13688@icarus.home.lan>
User-Agent: Opera Mail/12.15 (FreeBSD)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 12:04:01 -0000

On Wed, 12 Jun 2013 06:49:37 -0500, Jeremy Chadwick <jdc@koitsu.org> wrote:

>
> Mark, do you have any references for this?  I'd love to learn/read more
> about this engineering/design aspect (I won't say flaw, I'll just say
> aspect) to ZFS, as it's the first I've heard of it.

Firsthand experience on a couple servers, and some old Sun docs that I  
can't find anymore since Oracle broke the links. If you start googling for  
"ZFS performance 80%" you should come across similar reports. The  
recommendation was always that when you hit about 80% you need to add a  
new vdev or you'll be in serious trouble. I'd always believed that it has  
to do with the way the ZFS COW algorithm works. If my suspicion is correct  
I'd guess it probably stalls trying to find an ideal place to write --  
maybe some cost calculation? I'm reaching for straws now because I don't  
know anything about the code itself.

I'd love to hear from people who have actually touched the code and can  
give a more definitive answer because this does border on "urban legend"  
territory, but I've read it and experienced it a few times so I'm just  
passing it on.

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 14:52:44 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id D4E0E8DE
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 14:52:44 +0000 (UTC)
 (envelope-from amvandemore@gmail.com)
Received: from mail-pa0-x235.google.com (mail-pa0-x235.google.com
 [IPv6:2607:f8b0:400e:c03::235])
 by mx1.freebsd.org (Postfix) with ESMTP id B397F117F
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 14:52:44 +0000 (UTC)
Received: by mail-pa0-f53.google.com with SMTP id tj12so4953867pac.40
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 07:52:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=M2sQqeX5UwmfBtYtkhEwPwU8b2dVxUJm0BCuAvYFtz0=;
 b=OaswgK0E744eT/uWgAQIGtBy9EzaHoz8LBS0j067xfOAviYZy9JKc9IINhyhrBNK1v
 Lh3Ggl+r4tk7/t9ohtAAzKfkxfVFLy2ma5Onj+VBsBTmWwCkUCirx4WmjYbdLJ59VOvJ
 mEtgKGep89z2Sgioc1zokwHGLk+rXxDUGtttLgOlYsC/yxSW1/ZpmBZBMZumV8PriS4/
 zZ5fhWbz9PNJ9n6zj9sQWCdEv5WvBB624MRAR4e5tk/12IJxLnNjxEzzX5XvkeNFgnXb
 FdkMWN3+Yf+bSYdCRgoZmUBehUtW9hkl49iS7T8jQEMIKG2BznIiO3W03nF6/vJyiE6j
 bRaw==
MIME-Version: 1.0
X-Received: by 10.68.203.161 with SMTP id kr1mr19688590pbc.192.1371048763895; 
 Wed, 12 Jun 2013 07:52:43 -0700 (PDT)
Received: by 10.70.31.195 with HTTP; Wed, 12 Jun 2013 07:52:43 -0700 (PDT)
In-Reply-To: <20130612114937.GA13688@icarus.home.lan>
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
 <20130612114937.GA13688@icarus.home.lan>
Date: Wed, 12 Jun 2013 09:52:43 -0500
Message-ID: <CA+tpaK1kWeh8rWhtsYXJ0buRE0jbJYJ3QwBNRdWa2pTdVY+SKA@mail.gmail.com>
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
From: Adam Vande More <amvandemore@gmail.com>
To: Jeremy Chadwick <jdc@koitsu.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-fs <freebsd-fs@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 14:52:44 -0000

On Wed, Jun 12, 2013 at 6:49 AM, Jeremy Chadwick <jdc@koitsu.org> wrote:

> Mark, do you have any references for this?  I'd love to learn/read more
> about this engineering/design aspect (I won't say flaw, I'll just say
> aspect) to ZFS, as it's the first I've heard of it.


Recently, I dd'ed out the free space on a ZFS volume.  The last few MB's
took like an hour and io seemed to drop exponentially once past 80% or so.
 Nothing to do with SSD's


-- 
Adam Vande More

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 16:03:40 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 628DC28D
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 16:03:40 +0000 (UTC)
 (envelope-from borjam@sarenet.es)
Received: from proxypop03b.sare.net (proxypop03b.sare.net [194.30.0.251])
 by mx1.freebsd.org (Postfix) with ESMTP id 285761855
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 16:03:39 +0000 (UTC)
Received: from [172.16.2.2] (izaro.sarenet.es [192.148.167.11])
 by proxypop03.sare.net (Postfix) with ESMTPSA id 8B3959DD057;
 Wed, 12 Jun 2013 17:57:13 +0200 (CEST)
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
Mime-Version: 1.0 (Apple Message framework v1085)
Content-Type: text/plain; charset=us-ascii
From: Borja Marcos <borjam@sarenet.es>
In-Reply-To: <20130612114937.GA13688@icarus.home.lan>
Date: Wed, 12 Jun 2013 17:57:13 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <8FF3DAC5-ED3D-4678-B040-74829A208A86@sarenet.es>
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
 <20130612114937.GA13688@icarus.home.lan>
To: Jeremy Chadwick <jdc@koitsu.org>
X-Mailer: Apple Mail (2.1085)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 16:03:40 -0000


On Jun 12, 2013, at 1:49 PM, Jeremy Chadwick wrote:

> Mark, do you have any references for this?  I'd love to learn/read =
more
> about this engineering/design aspect (I won't say flaw, I'll just say
> aspect) to ZFS, as it's the first I've heard of it.

I have seen that behavior with standard hard disks. Once the busy space =
reached 80 % performance
dropped significantly. Just deleting some old data (it is a log storage =
system) performance went
back to normal.

Sorry I don't have graphs or anything like that. What I noticed is that =
the disks were
"busier" per the %busy column in gstat(8).


Borja.


From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 17:59:42 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id D10D55FC
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 17:59:42 +0000 (UTC)
 (envelope-from dieterbsd@gmail.com)
Received: from mail-ie0-x242.google.com (mail-ie0-x242.google.com
 [IPv6:2607:f8b0:4001:c03::242])
 by mx1.freebsd.org (Postfix) with ESMTP id AC70216FC
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 17:59:42 +0000 (UTC)
Received: by mail-ie0-f194.google.com with SMTP id 9so1834772iec.5
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 10:59:42 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=5m7cEcPBUjlr+wRt2UuWBIGRBr3N16Jvwkxs5XAVqdI=;
 b=d7Ce8tGAZejQZ0XNCNXEAs6lLQxCAbDTmgkrM/HYZyLdposbmeWlVeWIQCtW9lyuD9
 YZrIslEk0TT8ObGqKGNjhiF0fiwZCBVMX6ueo9ASxCxr8uR52q0Ut8+1EEIO/ccRat13
 t/ZSMFmL0b5+ot/xTpl4+bUbkS5DOFPFYnoFpsp83C9cGTU/SpUQ+BoOKo7pQcx/mJs8
 pbexftTIwj/lx9hPM93bT/HdRACZajRFc+jHxqpuHGfD+fNVsDEj7rItIfiFWqHYuUse
 AjpbSzbf+6OX11PL6x5SbnSGfoVXc0zeTn3DpZFZt8Dz6RY+TS3uHarN+8VxRl4fgz8w
 7s9g==
MIME-Version: 1.0
X-Received: by 10.50.83.37 with SMTP id n5mr4022003igy.44.1371059982336; Wed,
 12 Jun 2013 10:59:42 -0700 (PDT)
Received: by 10.64.139.34 with HTTP; Wed, 12 Jun 2013 10:59:42 -0700 (PDT)
Date: Wed, 12 Jun 2013 10:59:42 -0700
Message-ID: <CAA3ZYrD5sj=wr3biX0+kDZABeuZVoZcuuyQ3YvHu01G75E_8=w@mail.gmail.com>
Subject: FFS: fsck doesn't match doc
From: Dieter BSD <dieterbsd@gmail.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 17:59:42 -0000

Anyone have thoughts on bin/166499: fsck(8) behaviour does not match doc
(PARTIALLY TRUNCATED INODE)?  This PR has been sitting around for over a
year with no comments or action.  Seems to me it should appear on the
list of open fs bugs, but it doesn't?

A process is running, appending data to a file (*NOT* truncating the
file as the doc claims!).  Machine panics or otherwise goes down badly.
Fsck whines about PARTIALLY TRUNCATED INODE and that fs doesn't get
mounted until I run fsck manually. This happens nearly every time the
machine goes down. More details are in the PR.

Would it be safe to have fsck automagically fix this problem, as the
doc (incorrectly) says it does?

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 18:01:27 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 324A57C7
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 18:01:27 +0000 (UTC)
 (envelope-from nowakpl@platinum.linux.pl)
Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4])
 by mx1.freebsd.org (Postfix) with ESMTP id E86481728
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 18:01:26 +0000 (UTC)
Received: by platinum.linux.pl (Postfix, from userid 87)
 id 696175FD06; Wed, 12 Jun 2013 19:55:13 +0200 (CEST)
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl
X-Spam-Level: 
X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL
 autolearn=disabled version=3.3.2
Received: from [10.255.0.2] (c38-073.client.duna.pl [83.151.38.73])
 by platinum.linux.pl (Postfix) with ESMTPA id DD1995FD05
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 19:55:12 +0200 (CEST)
Message-ID: <51B8B5DC.2010703@platinum.linux.pl>
Date: Wed, 12 Jun 2013 19:54:36 +0200
From: Adam Nowacki <nowakpl@platinum.linux.pl>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
 <20130612114937.GA13688@icarus.home.lan>
In-Reply-To: <20130612114937.GA13688@icarus.home.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 18:01:27 -0000

On 2013-06-12 13:49, Jeremy Chadwick wrote:
> On Wed, Jun 12, 2013 at 06:40:32AM -0500, Mark Felder wrote:
>> On Tue, 11 Jun 2013 16:01:23 -0500, Attila Nagy <bra@fsn.hu> wrote:
>>
>>> BTW, the file systems are 77-78% full according to df (so ZFS
>>> holds more, because UFS is -m 8).
>>
>> ZFS write performance can begin to drop pretty badly when you get
>> around 80% full. I've not seen any benchmarks showing an improvement
>> with a very fast and large ZIL or tons of memory, but I'd expect
>> that would help significantly. Just note that you're right at the
>> edge where performance gets impacted.
>
> Mark, do you have any references for this?  I'd love to learn/read more
> about this engineering/design aspect (I won't say flaw, I'll just say
> aspect) to ZFS, as it's the first I've heard of it.
>
> The reason I ask: (respectfully, not judgementally) I'm worried you
> might be referring to something that has to do with SSDs and not ZFS,
> specifically SSD wear-levelling performing better with lots of free
> space (i.e. a small FTL map; TRIM helps with this immensely) -- where
> the performance hit tends to begin around the 70-80% mark.  (I can talk
> more about that if asked, but want to make sure the two things aren't
> being mistaken for one another)
>

So I went hunting for some evidence and created this:
http://tepeserwery.pl/nowak/fillingzfs.png

Columns are groups of sectors, new row is created every time a FLUSH 
command is sent to a disk. Percentage is the amount of filled space in 
the pool. Red means a write happened there, Pool is 1GB with writes of 
50MB between black lines.

It looks like past 80% there simply isn't enough continuous disk space 
and writes are becoming more and more random. For some unknown to me 
reason there is also a lot more flushing which certainly doesn't help 
for performance. There is also this odd hole left untouched by any 
write, reserved space of some sort?


From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 18:26:02 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2E371AE3
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 18:26:02 +0000 (UTC)
 (envelope-from feld@feld.me)
Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com
 [66.111.4.27]) by mx1.freebsd.org (Postfix) with ESMTP id 03C9E1935
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 18:26:01 +0000 (UTC)
Received: from compute3.internal (compute3.nyi.mail.srv.osa [10.202.2.43])
 by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id 05FC120D9D;
 Wed, 12 Jun 2013 14:26:01 -0400 (EDT)
Received: from frontend1.nyi.mail.srv.osa ([10.202.2.160])
 by compute3.internal (MEProxy); Wed, 12 Jun 2013 14:26:01 -0400
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=feld.me; h=
 content-type:to:subject:references:date:mime-version
 :content-transfer-encoding:from:message-id:in-reply-to; s=
 mesmtp; bh=u/0kpl7t3ihQj1wittijxKv4Gbw=; b=oJjaWMOIxAJ1MbBj+aWpg
 lfNl3PmiM1sJSOEq7YLrDZWmrPvftNLdVXcUC7sOBgHfs6kp2NfMqsB3g9220Hq7
 RnUddj7Y9PFhAs2e9iBI9DTzbldZ0PLBxSiWT3dl0Tv7BjDtbCZangLMrktalZ0A
 cWk8M92TGAbz51k1lenhjg=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=
 messagingengine.com; h=content-type:to:subject:references:date
 :mime-version:content-transfer-encoding:from:message-id
 :in-reply-to; s=smtpout; bh=u/0kpl7t3ihQj1wittijxKv4Gbw=; b=f7Vi
 HCTCj/+5+zuQDirlpHrvVWD8UchLYUR3UzcPoH4r5fNPSXGMb0Nr9A7cfcZi/fyI
 f4GtgGN7wzROiChqvtP+8OdnlxY0eDK94q08XhpEw/cry3YOMn5p9JTXDmQnaXE9
 xrQUeahIfZyBjJraFuXHTae4Gpq4s66ASfDSFvA=
X-Sasl-enc: yDd1pRBUgMhlL5fFnSaS1iMdVyzjpNH4XWMWYKt9Uzwx 1371061560
Received: from markf.office.supranet.net (unknown [66.170.8.18])
 by mail.messagingengine.com (Postfix) with ESMTPA id BDE01C00E89;
 Wed, 12 Jun 2013 14:26:00 -0400 (EDT)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: freebsd-fs@freebsd.org, "Dieter BSD" <dieterbsd@gmail.com>
Subject: Re: FFS: fsck doesn't match doc
References: <CAA3ZYrD5sj=wr3biX0+kDZABeuZVoZcuuyQ3YvHu01G75E_8=w@mail.gmail.com>
Date: Wed, 12 Jun 2013 13:26:00 -0500
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Mark Felder" <feld@feld.me>
Message-ID: <op.wykv5mkk34t2sn@markf.office.supranet.net>
In-Reply-To: <CAA3ZYrD5sj=wr3biX0+kDZABeuZVoZcuuyQ3YvHu01G75E_8=w@mail.gmail.com>
User-Agent: Opera Mail/12.15 (FreeBSD)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 18:26:02 -0000

On Wed, 12 Jun 2013 12:59:42 -0500, Dieter BSD <dieterbsd@gmail.com> wrote:

>
> Would it be safe to have fsck automagically fix this problem, as the
> doc (incorrectly) says it does?

What happens if you add to /etc/rc.conf:

fsck_y_enable="YES"
background_fsck="NO"

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 19:33:05 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id B91FD5E7
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 19:33:05 +0000 (UTC)
 (envelope-from dieterbsd@gmail.com)
Received: from mail-ie0-x241.google.com (mail-ie0-x241.google.com
 [IPv6:2607:f8b0:4001:c03::241])
 by mx1.freebsd.org (Postfix) with ESMTP id 948451D06
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 19:33:05 +0000 (UTC)
Received: by mail-ie0-f193.google.com with SMTP id s9so2965825iec.0
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 12:33:05 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=UpoVG02InOy9kZbG+fYfCilx6D5auDSVDyaKh2AUPkI=;
 b=kPuycVQX4CwURidkKjEmiHq3FRZb3kAlY03QXbbG898Uto4MBzPaT0EMPsNDlf9R1U
 AK2aT+mZtjxHp/Di6jUtski2OjAF/BvFNAgmSsDdNQrDU8CV+lxdNdYztRK8bVT/snT1
 IPtq5612UP5+oZzEsSn4glkD2VSElh5/8bXmWjm6CAYeWJnoi19quV73yZlBkFX9lRTP
 13hDSmOESPUc2wXnSrP7Hx+/NLcS0MwscvXINHpHLeoIcrXZhOg1d/1TQMrdSKyAR+lv
 rwWKWXVEgDI2/RaY/m2QgV6Tn46xqGnxjqls4Kr+OLQEc0WgxvbxikhmZZ7QuFhsVBCz
 ivnw==
MIME-Version: 1.0
X-Received: by 10.50.23.108 with SMTP id l12mr4063818igf.45.1371065585296;
 Wed, 12 Jun 2013 12:33:05 -0700 (PDT)
Received: by 10.64.139.34 with HTTP; Wed, 12 Jun 2013 12:33:04 -0700 (PDT)
Date: Wed, 12 Jun 2013 12:33:04 -0700
Message-ID: <CAA3ZYrAWR=uuw668q4ZEcZqGUG_9jzRqyQVM-2i3COy4dQmipA@mail.gmail.com>
Subject: Re: FFS: fsck doesn't match doc
From: Dieter BSD <dieterbsd@gmail.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 19:33:05 -0000

>> Would it be safe to have fsck automagically fix this problem, as the
>> doc (incorrectly) says it does?
>
> What happens if you add to /etc/rc.conf:
>
> fsck_y_enable="YES"
> background_fsck="NO"

Fsck -y is not safe.  :-(

Would it be *safe* to have "fsck -p" automagically fix this problem, as the
doc (incorrectly) says it does?

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 23:15:05 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 1B9E644D
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 23:15:05 +0000 (UTC)
 (envelope-from wmn@siberianet.ru)
Received: from mail.siberianet.ru (mail.siberianet.ru [89.105.136.7])
 by mx1.freebsd.org (Postfix) with ESMTP id BFAD619D0
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 23:15:04 +0000 (UTC)
Received: from book.localnet (wmn.siberianet.ru [89.105.137.12])
 by mail.siberianet.ru (Postfix) with ESMTP id D4FD612FB34;
 Thu, 13 Jun 2013 07:05:27 +0800 (KRAT)
From: Sergey Lobanov <wmn@siberianet.ru>
Organization: ISP "SiberiaNet"
To: freebsd-fs@freebsd.org
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
Date: Thu, 13 Jun 2013 07:05:26 +0800
User-Agent: KMail/1.13.7 (FreeBSD/9.0-RELEASE-p3; KDE/4.7.3; amd64; ; )
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
 <20130612114937.GA13688@icarus.home.lan>
In-Reply-To: <20130612114937.GA13688@icarus.home.lan>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-Id: <201306130705.26895.wmn@siberianet.ru>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 23:15:05 -0000

On Wednesday 12 June 2013, Jeremy Chadwick wrote:
> On Wed, Jun 12, 2013 at 06:40:32AM -0500, Mark Felder wrote:
> > On Tue, 11 Jun 2013 16:01:23 -0500, Attila Nagy <bra@fsn.hu> wrote:
> > >BTW, the file systems are 77-78% full according to df (so ZFS
> > >holds more, because UFS is -m 8).
> > 
> > ZFS write performance can begin to drop pretty badly when you get
> > around 80% full. I've not seen any benchmarks showing an improvement
> > with a very fast and large ZIL or tons of memory, but I'd expect
> > that would help significantly. Just note that you're right at the
> > edge where performance gets impacted.
> 
> Mark, do you have any references for this?  I'd love to learn/read more
> about this engineering/design aspect (I won't say flaw, I'll just say
> aspect) to ZFS, as it's the first I've heard of it.
> 
> The reason I ask: (respectfully, not judgementally) I'm worried you
> might be referring to something that has to do with SSDs and not ZFS,
> specifically SSD wear-levelling performing better with lots of free
> space (i.e. a small FTL map; TRIM helps with this immensely) -- where
> the performance hit tends to begin around the 70-80% mark.  (I can talk
> more about that if asked, but want to make sure the two things aren't
> being mistaken for one another)

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016834.html

CC'd mm@.

-- 
ISP "SiberiaNet"
System and Network Administrator

From owner-freebsd-fs@FreeBSD.ORG  Wed Jun 12 23:40:51 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 094878B6
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 23:40:51 +0000 (UTC)
 (envelope-from jonaschuman@gmail.com)
Received: from mail-bk0-x22b.google.com (mail-bk0-x22b.google.com
 [IPv6:2a00:1450:4008:c01::22b])
 by mx1.freebsd.org (Postfix) with ESMTP id 960C31AFC
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 23:40:50 +0000 (UTC)
Received: by mail-bk0-f43.google.com with SMTP id jm2so3640819bkc.16
 for <freebsd-fs@freebsd.org>; Wed, 12 Jun 2013 16:40:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=sjzNeJDHJ2jFZdqY2ERfpAd5B3KOog3MDMLHCiDgPGQ=;
 b=DKRPF+8BYHb+LzDVsRuH6WbHg419K6S0tiibMUzcB53x1JQxZXtfzxPQENBTDlpyrK
 mxzEUEPXv6SdcziNOsok1/TCwZ2Dl23dzq4m7u1oPxxnVmKlTuTNd/gW1Y9D0NXICgW6
 HbrRH+iw1l1st7KPmpuBmisVSn5MjTwo+e+6nmcF8UTZYHnUy1vbC7mKKsfqPXgmizNO
 zQJThxU0HrgWDgQpuJxtilM0goUdV7/H4+vFx34CGU9AId1f4AMxlQ3xKs3R/toPHWo2
 DBOtN7Prh+jxM2wcK4z8QHEounL49RiFAqzYZgBEU/W3DM7otIu1ImqlZE8wFl9RiUoi
 6s8g==
MIME-Version: 1.0
X-Received: by 10.204.65.69 with SMTP id h5mr3506797bki.59.1371080449628; Wed,
 12 Jun 2013 16:40:49 -0700 (PDT)
Received: by 10.205.125.145 with HTTP; Wed, 12 Jun 2013 16:40:49 -0700 (PDT)
Date: Wed, 12 Jun 2013 19:40:49 -0400
Message-ID: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
Subject: zfs send/recv dies when transferring large-ish dataset
From: Jona Schuman <jonaschuman@gmail.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Jun 2013 23:40:51 -0000

Hi,

I'm getting some strange behavior from zfs send/recv and I'm hoping
someone may be able to provide some insight. I have two identical
machines running 9.0-RELEASE-p3, each having a ZFS pool (zfs 5, zpool
28) for storage. I want to use zfs send/recv for replication between
the two machines. For the most part, this has worked as expected.
However, send/recv fails when transferring the largest dataset (both
in actual size and in terms of number of files) on either machine.
With these datasets, issuing:

machine2# nc -d -l 9999 | zfs recv -d storagepool
machine1# zfs send dataset@snap | nc machine2 9999

terminates early on the sending side without any error messages. The
receiving end continues on as expected, cleaning up the partial data
received so far and reverting to its initial state. (I've tried using
mbuffer instead of nc, or just using ssh, both with similar results.)
Oddly, zfs send dies slightly differently depending on how the two
machines are connected. When connected through the racktop switch, zfs
send dies quietly without any indication that the transfer has failed.
When connected directly using a crossover cable, zfs send dies quietly
and machine1 becomes unresponsive (no network, no keyboard, hard reset
required). In both cases, no messages are printed to screen or to
anything in /var/log/.


I can transfer the same datasets successfully if I send/recv to/from file:

machine1# zfs send dataset@snap > /tmp/dump
machine1# scp /tmp/dump machine2:/tmp/dump
machine2# zfs recv -d storagepool < /tmp/dump

so I don't think the datasets themselves are the issue. I've also
successfully tried send/recv over the network using different network
interfaces (10GbE ixgbe cards instead of the 1GbE igb links), which
would suggest the issue is with the 1GbE links.

Might there be some buffering parameter that I'm neglecting to tune,
which is essential on the 1GbE links but may be less important on the
faster links? Are there any known issues with the igb driver that
might be the culprit here? Any other suggestions?

Thanks,
Jona

From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 07:57:47 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 3D2C2F43
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 07:57:47 +0000 (UTC)
 (envelope-from Ivailo.Tanusheff@skrill.com)
Received: from ch1outboundpool.messaging.microsoft.com
 (ch1ehsobe002.messaging.microsoft.com [216.32.181.182])
 by mx1.freebsd.org (Postfix) with ESMTP id E43CD1E47
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 07:57:46 +0000 (UTC)
Received: from mail161-ch1-R.bigfish.com (10.43.68.231) by
 CH1EHSOBE018.bigfish.com (10.43.70.68) with Microsoft SMTP Server id
 14.1.225.23; Thu, 13 Jun 2013 07:42:30 +0000
Received: from mail161-ch1 (localhost [127.0.0.1])	by
 mail161-ch1-R.bigfish.com (Postfix) with ESMTP id 929231C01BD;	Thu, 13 Jun
 2013 07:42:30 +0000 (UTC)
X-Forefront-Antispam-Report: CIP:157.56.249.213; KIP:(null); UIP:(null);
 IPV:NLI; H:AM2PRD0710HT004.eurprd07.prod.outlook.com; RD:none; EFVD:NLI
X-SpamScore: -1
X-BigFish: PS-1(z54eehz9371I542I4015Izz1f42h1ee6h1de0h1fdah1202h1e76h1d1ah1d2ah1fc6hzz17326ah8275dhz2fh2a8h668h839h944hd24hf0ah1220h1288h12a5h12a9h12bdh137ah13b6h1441h1504h1537h153bh162dh1631h1758h18e1h1946h19b5h19ceh1ad9h1b0ah1d07h1d0ch1d2eh1d3fh1de9h1dfeh1dffh1e1dh9a9j1155h)
Received-SPF: pass (mail161-ch1: domain of skrill.com designates
 157.56.249.213 as permitted sender) client-ip=157.56.249.213;
 envelope-from=Ivailo.Tanusheff@skrill.com;
 helo=AM2PRD0710HT004.eurprd07.prod.outlook.com ; .outlook.com ; 
X-Forefront-Antispam-Report-Untrusted: SFV:SKI; SFS:; DIR:OUT; SFP:; SCL:-1;
 SRVR:DB3PR07MB057; H:DB3PR07MB059.eurprd07.prod.outlook.com; LANG:en; 
Received: from mail161-ch1 (localhost.localdomain [127.0.0.1]) by mail161-ch1
 (MessageSwitch) id 1371109347953425_25617;
 Thu, 13 Jun 2013 07:42:27 +0000 (UTC)
Received: from CH1EHSMHS035.bigfish.com (snatpool1.int.messaging.microsoft.com
 [10.43.68.242])	by mail161-ch1.bigfish.com (Postfix) with ESMTP id
 DAD6420004D;	Thu, 13 Jun 2013 07:42:27 +0000 (UTC)
Received: from AM2PRD0710HT004.eurprd07.prod.outlook.com (157.56.249.213) by
 CH1EHSMHS035.bigfish.com (10.43.70.35) with Microsoft SMTP Server (TLS) id
 14.1.225.23; Thu, 13 Jun 2013 07:42:27 +0000
Received: from DB3PR07MB057.eurprd07.prod.outlook.com (10.242.137.144) by
 AM2PRD0710HT004.eurprd07.prod.outlook.com (10.255.165.39) with Microsoft SMTP
 Server (TLS) id 14.16.324.0; Thu, 13 Jun 2013 07:42:12 +0000
Received: from DB3PR07MB059.eurprd07.prod.outlook.com (10.242.137.149) by
 DB3PR07MB057.eurprd07.prod.outlook.com (10.242.137.144) with Microsoft SMTP
 Server (TLS) id 15.0.702.21; Thu, 13 Jun 2013 07:42:11 +0000
Received: from DB3PR07MB059.eurprd07.prod.outlook.com ([169.254.2.14]) by
 DB3PR07MB059.eurprd07.prod.outlook.com ([169.254.2.14]) with mapi id
 15.00.0702.005; Thu, 13 Jun 2013 07:42:11 +0000
From: Ivailo Tanusheff <Ivailo.Tanusheff@skrill.com>
To: Jona Schuman <jonaschuman@gmail.com>, "freebsd-fs@freebsd.org"
 <freebsd-fs@freebsd.org>
Subject: RE: zfs send/recv dies when transferring large-ish dataset
Thread-Topic: zfs send/recv dies when transferring large-ish dataset
Thread-Index: AQHOZ8ZVUN+hFJBhLk6aHw9omdejRZkzQxIg
Date: Thu, 13 Jun 2013 07:42:11 +0000
Message-ID: <57e0551229684b69bc27476b8a08fb91@DB3PR07MB059.eurprd07.prod.outlook.com>
References: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
In-Reply-To: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [217.18.249.148]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-OriginatorOrg: skrill.com
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 07:57:47 -0000

Hi,

Can you try send/recv with the -v or with -vP swiches, so you can see more =
verbose information?

Regards,
Ivailo Tanusheff

-----Original Message-----
From: owner-freebsd-fs@freebsd.org [mailto:owner-freebsd-fs@freebsd.org] On=
 Behalf Of Jona Schuman
Sent: Thursday, June 13, 2013 2:41 AM
To: freebsd-fs@freebsd.org
Subject: zfs send/recv dies when transferring large-ish dataset

Hi,

I'm getting some strange behavior from zfs send/recv and I'm hoping someone=
 may be able to provide some insight. I have two identical machines running=
 9.0-RELEASE-p3, each having a ZFS pool (zfs 5, zpool
28) for storage. I want to use zfs send/recv for replication between the tw=
o machines. For the most part, this has worked as expected.
However, send/recv fails when transferring the largest dataset (both in act=
ual size and in terms of number of files) on either machine.
With these datasets, issuing:

machine2# nc -d -l 9999 | zfs recv -d storagepool machine1# zfs send datase=
t@snap | nc machine2 9999

terminates early on the sending side without any error messages. The receiv=
ing end continues on as expected, cleaning up the partial data received so =
far and reverting to its initial state. (I've tried using mbuffer instead o=
f nc, or just using ssh, both with similar results.) Oddly, zfs send dies s=
lightly differently depending on how the two machines are connected. When c=
onnected through the racktop switch, zfs send dies quietly without any indi=
cation that the transfer has failed.
When connected directly using a crossover cable, zfs send dies quietly and =
machine1 becomes unresponsive (no network, no keyboard, hard reset required=
). In both cases, no messages are printed to screen or to anything in /var/=
log/.


I can transfer the same datasets successfully if I send/recv to/from file:

machine1# zfs send dataset@snap > /tmp/dump machine1# scp /tmp/dump machine=
2:/tmp/dump machine2# zfs recv -d storagepool < /tmp/dump

so I don't think the datasets themselves are the issue. I've also successfu=
lly tried send/recv over the network using different network interfaces (10=
GbE ixgbe cards instead of the 1GbE igb links), which would suggest the iss=
ue is with the 1GbE links.

Might there be some buffering parameter that I'm neglecting to tune, which =
is essential on the 1GbE links but may be less important on the faster link=
s? Are there any known issues with the igb driver that might be the culprit=
 here? Any other suggestions?

Thanks,
Jona
_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 08:28:33 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4EAE5BF1
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 08:28:33 +0000 (UTC)
 (envelope-from rs@bytecamp.net)
Received: from mail.bytecamp.net (mail.bytecamp.net [212.204.60.9])
 by mx1.freebsd.org (Postfix) with ESMTP id DC9671F60
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 08:28:32 +0000 (UTC)
Received: (qmail 41990 invoked by uid 89); 13 Jun 2013 10:28:24 +0200
Received: from stella.bytecamp.net (HELO ?212.204.60.37?)
 (rs%bytecamp.net@212.204.60.37)
 by mail.bytecamp.net with CAMELLIA256-SHA encrypted SMTP;
 13 Jun 2013 10:28:24 +0200
Message-ID: <51B982A8.10605@bytecamp.net>
Date: Thu, 13 Jun 2013 10:28:24 +0200
From: Robert Schulze <rs@bytecamp.net>
Organization: bytecamp GmbH
User-Agent: Mozilla/5.0 (X11; Linux i686;
 rv:17.0) Gecko/20130330 Thunderbird/17.0.5
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: An order of magnitude higher IOPS needed with ZFS than UFS
References: <51B79023.5020109@fsn.hu>
 <op.wykdduw834t2sn@markf.office.supranet.net>
 <20130612114937.GA13688@icarus.home.lan>
In-Reply-To: <20130612114937.GA13688@icarus.home.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 08:28:33 -0000

Hi,

Am 12.06.2013 13:49, schrieb Jeremy Chadwick:
> On Wed, Jun 12, 2013 at 06:40:32AM -0500, Mark Felder wrote:
>> On Tue, 11 Jun 2013 16:01:23 -0500, Attila Nagy <bra@fsn.hu> wrote:
>>
>> ZFS write performance can begin to drop pretty badly when you get
>> around 80% full. I've not seen any benchmarks showing an improvement
>> with a very fast and large ZIL or tons of memory, but I'd expect
>> that would help significantly. Just note that you're right at the
>> edge where performance gets impacted.
>
> Mark, do you have any references for this?  I'd love to learn/read more
> about this engineering/design aspect (I won't say flaw, I'll just say
> aspect) to ZFS, as it's the first I've heard of it.

this is even true when getting near a quota limit on a zfs, although 
there are e.g. 10/16 TB free in the pool.

Just create a filesystem and set quota=1G, then do sequential 
invocations of dd to fill the fs with 100M files. You will see a sharp 
slowdown when the last twenty files are beeing created.

Here are the results from the following short test:

for i in `jot - 0 99`
     do
     dd if=/dev/zero of=/pool/quota-test/10M.$i bs=1M count=10
     done

0..80:	< 0.4 s
80	0.27 s
81	0.77 s
82	0.50 s
83	0.51 s
84	0.22 s
85	0.87 s
86	0.52 s
87	1.13 s
88	0.91 s
90	0.39 s
91	1.04 s
92	0.80 s
93	1.94 s
94	1.27 s
95	1.36 s
96	1.76 s
97	2.13 s
98	3.28 s
99	4.07 s

of course, there are some small values beyond 80% utilisation, but I 
think the trend is clearly visible.

In my opinion, hitting a quota limit should not give these results 
unless enough free physical disk space is available in the pool. This is 
a bug or a design flaw and creating serious problems when exporting 
quota'ed zfs over nfs.

with kind regards,
Robert Schulze

-- 
/7\ bytecamp GmbH
Geschwister-Scholl-Str. 10, 14776 Brandenburg a.d. Havel
HRB15752, Amtsgericht Potsdam, Geschaeftsfuehrer:
Bjoern Barnekow, Frank Rosenbaum, Sirko Zidlewitz
tel +49 3381 79637-0 werktags 10-12,13-17 Uhr, fax +49 3381 79637-20
mail rs@bytecamp.net, web http://bytecamp.net/

From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 11:41:02 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id DE90C3D0
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 11:41:02 +0000 (UTC)
 (envelope-from sdenic@intech.co.rs)
Received: from sam.nabble.com (sam.nabble.com [216.139.236.26])
 by mx1.freebsd.org (Postfix) with ESMTP id C402317D5
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 11:41:02 +0000 (UTC)
Received: from [192.168.236.26] (helo=sam.nabble.com)
 by sam.nabble.com with esmtp (Exim 4.72)
 (envelope-from <sdenic@intech.co.rs>) id 1Un5sT-0001wn-53
 for freebsd-fs@freebsd.org; Thu, 13 Jun 2013 04:39:41 -0700
Date: Thu, 13 Jun 2013 04:39:41 -0700 (PDT)
From: intech <sdenic@intech.co.rs>
To: freebsd-fs@freebsd.org
Message-ID: <1371123581091-5819759.post@n5.nabble.com>
In-Reply-To: <4C6BDBB9.3020007@gibfest.dk>
References: <4C61CF4D.4060009@gibfest.dk> <4C651B7E.5000805@gibfest.dk>
 <4C6B08BD.9080206@gibfest.dk> <20100818110655.GA2177@garage.freebsd.pl>
 <4C6BC0BA.9030303@gibfest.dk> <4C6BC35B.9040000@gibfest.dk>
 <20100818121133.GC2177@garage.freebsd.pl> <4C6BD521.1060807@gibfest.dk>
 <20100818125856.GE2177@garage.freebsd.pl> <4C6BDBB9.3020007@gibfest.dk>
Subject: Re: HAST initial sync speed
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 11:41:02 -0000

Thought this threat is almost 3 years old, I want to ask if this
MAX_SEND_SIZE adopted in freebsd 8.3 and even fbsd9.1?
Indeed I have the same issue on 1Gb network - nodes performing sync at only
10MBytes/sec ?! and I can't figure out what is happening as network itself
is not the problem, I tested it.
And just one question fullsync is only option for HAST replication at time
of writing, so could HAST perform at 100MB/sec in this mode, and  when we
expect memsync and async to be released?


--
View this message in context: http://freebsd.1045724.n5.nabble.com/HAST-initial-sync-speed-tp4027033p5819759.html
Sent from the freebsd-fs mailing list archive at Nabble.com.

From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 15:56:28 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id C19F6322
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 15:56:28 +0000 (UTC)
 (envelope-from jonaschuman@gmail.com)
Received: from mail-vb0-x22b.google.com (mail-vb0-x22b.google.com
 [IPv6:2607:f8b0:400c:c02::22b])
 by mx1.freebsd.org (Postfix) with ESMTP id 86CDA139E
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 15:56:28 +0000 (UTC)
Received: by mail-vb0-f43.google.com with SMTP id e12so4591892vbg.16
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 08:56:28 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type:content-transfer-encoding;
 bh=AAtGmjKDJY8sF4Bjd8gnnrxn31GiYQuwEwmLIDY/D90=;
 b=LpzmG2065qoDwx+HjDRJTxHJm95aDjdofjFLQshxX205ltLgKpQaGZX362sJuDtAfu
 CTFOecO0w5WsTMNyknxj8PBqAIt+OuQUtpIDPA3UP4BA6cfl+FOm2oaXnuuC7O1twj18
 76pSWfj/F2iIBD/svCTI1A4s8IyMpyiTrjbvOU+MVqzSq743uaJ/fTlD9S0WcX220IvR
 g3uHBt1G55KZbEeySmN8URCOu7vtbCZDo3mIEso9yrymVJexn6MP1metlBYUfrbAljGN
 hWPWJSteoJhMhQZo46Bc3MJxQX/E27nCPgeIrvFKNWRuQAQAaSDkdvTnFhWkZ6IQW9lO
 luTQ==
MIME-Version: 1.0
X-Received: by 10.52.22.78 with SMTP id b14mr545924vdf.27.1371138988025; Thu,
 13 Jun 2013 08:56:28 -0700 (PDT)
Received: by 10.220.167.73 with HTTP; Thu, 13 Jun 2013 08:56:27 -0700 (PDT)
In-Reply-To: <57e0551229684b69bc27476b8a08fb91@DB3PR07MB059.eurprd07.prod.outlook.com>
References: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
 <57e0551229684b69bc27476b8a08fb91@DB3PR07MB059.eurprd07.prod.outlook.com>
Date: Thu, 13 Jun 2013 11:56:27 -0400
Message-ID: <CAC-LZTajW0SO_dH9ZtUH80zX628vcosL_vOzwkwB1JF1PZy0qA@mail.gmail.com>
Subject: Re: zfs send/recv dies when transferring large-ish dataset
From: Jona Schuman <jonaschuman@gmail.com>
To: Ivailo Tanusheff <Ivailo.Tanusheff@skrill.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 15:56:28 -0000

machine2# nc -d -l 9999 | zfs receive -v -F -d storagepool
machine1# zfs send -v -R dataset@snap | nc machine2 9999

machine1-output: sending from @ to dataset@snap
machine2-output: receiving full stream of dataset@snap into
storagepool/dataset@snap
machine1-output: warning: cannot send 'dataset@snap': Broken pipe
machine1-output: Broken pipe


On Thu, Jun 13, 2013 at 3:42 AM, Ivailo Tanusheff
<Ivailo.Tanusheff@skrill.com> wrote:
> Hi,
>
> Can you try send/recv with the -v or with -vP swiches, so you can see mor=
e verbose information?
>
> Regards,
> Ivailo Tanusheff
>
> -----Original Message-----
> From: owner-freebsd-fs@freebsd.org [mailto:owner-freebsd-fs@freebsd.org] =
On Behalf Of Jona Schuman
> Sent: Thursday, June 13, 2013 2:41 AM
> To: freebsd-fs@freebsd.org
> Subject: zfs send/recv dies when transferring large-ish dataset
>
> Hi,
>
> I'm getting some strange behavior from zfs send/recv and I'm hoping someo=
ne may be able to provide some insight. I have two identical machines runni=
ng 9.0-RELEASE-p3, each having a ZFS pool (zfs 5, zpool
> 28) for storage. I want to use zfs send/recv for replication between the =
two machines. For the most part, this has worked as expected.
> However, send/recv fails when transferring the largest dataset (both in a=
ctual size and in terms of number of files) on either machine.
> With these datasets, issuing:
>
> machine2# nc -d -l 9999 | zfs recv -d storagepool machine1# zfs send data=
set@snap | nc machine2 9999
>
> terminates early on the sending side without any error messages. The rece=
iving end continues on as expected, cleaning up the partial data received s=
o far and reverting to its initial state. (I've tried using mbuffer instead=
 of nc, or just using ssh, both with similar results.) Oddly, zfs send dies=
 slightly differently depending on how the two machines are connected. When=
 connected through the racktop switch, zfs send dies quietly without any in=
dication that the transfer has failed.
> When connected directly using a crossover cable, zfs send dies quietly an=
d machine1 becomes unresponsive (no network, no keyboard, hard reset requir=
ed). In both cases, no messages are printed to screen or to anything in /va=
r/log/.
>
>
> I can transfer the same datasets successfully if I send/recv to/from file=
:
>
> machine1# zfs send dataset@snap > /tmp/dump machine1# scp /tmp/dump machi=
ne2:/tmp/dump machine2# zfs recv -d storagepool < /tmp/dump
>
> so I don't think the datasets themselves are the issue. I've also success=
fully tried send/recv over the network using different network interfaces (=
10GbE ixgbe cards instead of the 1GbE igb links), which would suggest the i=
ssue is with the 1GbE links.
>
> Might there be some buffering parameter that I'm neglecting to tune, whic=
h is essential on the 1GbE links but may be less important on the faster li=
nks? Are there any known issues with the igb driver that might be the culpr=
it here? Any other suggestions?
>
> Thanks,
> Jona
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
>

From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 16:06:59 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id A829363F
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 16:06:59 +0000 (UTC)
 (envelope-from ler@lerctr.org)
Received: from thebighonker.lerctr.org
 (lrosenman-1-pt.tunnel.tserv8.dal1.ipv6.he.net [IPv6:2001:470:1f0e:3ad::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 65AC115C9
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 16:06:59 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lerctr.org;
 s=lerami; 
 h=Message-ID:References:In-Reply-To:Subject:To:From:Date:Content-Transfer-Encoding:Content-Type:MIME-Version;
 bh=2EbPCYEag64lSc/JF9mqcvzjg6mzkpzwh5L54Q7twlE=; 
 b=KcINUEWUVyo2/TTIyDOUkauzakrfzx+Q8YxwZdEeByIbrORQFn4Dqnle7jabo7quf6c3kptLoABGaoaO5ixrEEc6sGrdIopR+8dRZG5+Yt9R9w7YWFOiibWuPtdyhtW0at0AIlp/3Z8uMfgCVqpHekqLFoCBYvhACzJBtudXHAA=;
Received: from localhost.lerctr.org ([127.0.0.1]:36432 helo=webmail.lerctr.org)
 by thebighonker.lerctr.org with esmtpa (Exim 4.80.1 (FreeBSD))
 (envelope-from <ler@lerctr.org>) id 1UnA37-0000ep-Th
 for freebsd-fs@freebsd.org; Thu, 13 Jun 2013 11:06:59 -0500
Received: from [32.97.110.58] by webmail.lerctr.org
 with HTTP (HTTP/1.1 POST); Thu, 13 Jun 2013 11:06:57 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8;
 format=flowed
Content-Transfer-Encoding: 7bit
Date: Thu, 13 Jun 2013 11:06:57 -0500
From: Larry Rosenman <ler@lerctr.org>
To: freebsd-fs@freebsd.org
Subject: Re: zfs send/recv dies when transferring large-ish dataset
In-Reply-To: <CAC-LZTajW0SO_dH9ZtUH80zX628vcosL_vOzwkwB1JF1PZy0qA@mail.gmail.com>
References: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
 <57e0551229684b69bc27476b8a08fb91@DB3PR07MB059.eurprd07.prod.outlook.com>
 <CAC-LZTajW0SO_dH9ZtUH80zX628vcosL_vOzwkwB1JF1PZy0qA@mail.gmail.com>
Message-ID: <7dbb4a3d84381d923e22ec5ed77ea15e@webmail.lerctr.org>
X-Sender: ler@lerctr.org
User-Agent: Roundcube Webmail/0.9.1
X-Spam-Score: -3.3 (---)
X-LERCTR-Spam-Score: -3.3 (---)
X-Spam-Report: SpamScore (-3.3/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9,
 RP_MATCHES_RCVD=-0.392
X-LERCTR-Spam-Report: SpamScore (-3.3/5.0) ALL_TRUSTED=-1, BAYES_00=-1.9,
 RP_MATCHES_RCVD=-0.392
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 16:06:59 -0000

This may be related to the grief I'm seeing in general with send/recv.

I try to send streams and it breaks with invalid datastream or similar 
to yours.

I've posted many times, and offered up my machine(s) to debug but no 
takers.


On 2013-06-13 10:56, Jona Schuman wrote:
> machine2# nc -d -l 9999 | zfs receive -v -F -d storagepool
> machine1# zfs send -v -R dataset@snap | nc machine2 9999
> 
> machine1-output: sending from @ to dataset@snap
> machine2-output: receiving full stream of dataset@snap into
> storagepool/dataset@snap
> machine1-output: warning: cannot send 'dataset@snap': Broken pipe
> machine1-output: Broken pipe
> 
> 
> On Thu, Jun 13, 2013 at 3:42 AM, Ivailo Tanusheff
> <Ivailo.Tanusheff@skrill.com> wrote:
>> Hi,
>> 
>> Can you try send/recv with the -v or with -vP swiches, so you can see 
>> more verbose information?
>> 
>> Regards,
>> Ivailo Tanusheff
>> 
>> -----Original Message-----
>> From: owner-freebsd-fs@freebsd.org 
>> [mailto:owner-freebsd-fs@freebsd.org] On Behalf Of Jona Schuman
>> Sent: Thursday, June 13, 2013 2:41 AM
>> To: freebsd-fs@freebsd.org
>> Subject: zfs send/recv dies when transferring large-ish dataset
>> 
>> Hi,
>> 
>> I'm getting some strange behavior from zfs send/recv and I'm hoping 
>> someone may be able to provide some insight. I have two identical 
>> machines running 9.0-RELEASE-p3, each having a ZFS pool (zfs 5, zpool
>> 28) for storage. I want to use zfs send/recv for replication between 
>> the two machines. For the most part, this has worked as expected.
>> However, send/recv fails when transferring the largest dataset (both 
>> in actual size and in terms of number of files) on either machine.
>> With these datasets, issuing:
>> 
>> machine2# nc -d -l 9999 | zfs recv -d storagepool machine1# zfs send 
>> dataset@snap | nc machine2 9999
>> 
>> terminates early on the sending side without any error messages. The 
>> receiving end continues on as expected, cleaning up the partial data 
>> received so far and reverting to its initial state. (I've tried using 
>> mbuffer instead of nc, or just using ssh, both with similar results.) 
>> Oddly, zfs send dies slightly differently depending on how the two 
>> machines are connected. When connected through the racktop switch, zfs 
>> send dies quietly without any indication that the transfer has failed.
>> When connected directly using a crossover cable, zfs send dies quietly 
>> and machine1 becomes unresponsive (no network, no keyboard, hard reset 
>> required). In both cases, no messages are printed to screen or to 
>> anything in /var/log/.
>> 
>> 
>> I can transfer the same datasets successfully if I send/recv to/from 
>> file:
>> 
>> machine1# zfs send dataset@snap > /tmp/dump machine1# scp /tmp/dump 
>> machine2:/tmp/dump machine2# zfs recv -d storagepool < /tmp/dump
>> 
>> so I don't think the datasets themselves are the issue. I've also 
>> successfully tried send/recv over the network using different network 
>> interfaces (10GbE ixgbe cards instead of the 1GbE igb links), which 
>> would suggest the issue is with the 1GbE links.
>> 
>> Might there be some buffering parameter that I'm neglecting to tune, 
>> which is essential on the 1GbE links but may be less important on the 
>> faster links? Are there any known issues with the igb driver that 
>> might be the culprit here? Any other suggestions?
>> 
>> Thanks,
>> Jona
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>> 
>> 
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640 (c)     E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893

From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 20:53:50 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 6881DCCD
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 20:53:50 +0000 (UTC)
 (envelope-from to.my.trociny@gmail.com)
Received: from mail-ea0-x231.google.com (mail-ea0-x231.google.com
 [IPv6:2a00:1450:4013:c01::231])
 by mx1.freebsd.org (Postfix) with ESMTP id F2F48122E
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 20:53:49 +0000 (UTC)
Received: by mail-ea0-f177.google.com with SMTP id j14so7982628eak.22
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 13:53:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=sender:date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=4c+0qwysDJICPu0sbZVEk/2hJJQIIqsV6lcAkVV2hjU=;
 b=HDBDrQUGhhmF84/wn9oHjQlBFQ3MR1H7LPUc5ejnRnr684zbchTFAZ2UvdLKUhn1Ny
 XGcb5c/WrNGdu+YjVbjnwxeUcGwrj1IUpRPvC7uJgLENgithDODaOt81Y+JUF2SeiKTQ
 pEL7s0L248ohOK66NngXGkk2R1cmXpipBHCMU1hPtyqhsQdz5Z9+TZp5B2ChqXoIzBdt
 3SYWB0mdYipxJ+Zawyg/bglX6/RW3RIklLf1s2139TEDwzH5S42t+b3K7YbsDkVHJMwx
 c9z5gYuBBWE2FmO4sdr74EBp7l7I0B7XqCqfqiEaz6rJ8s/cANhqMYQPDDjmYmcCnSFd
 KyYg==
X-Received: by 10.15.99.2 with SMTP id bk2mr3204097eeb.76.1371156828366;
 Thu, 13 Jun 2013 13:53:48 -0700 (PDT)
Received: from localhost ([178.150.115.244])
 by mx.google.com with ESMTPSA id y10sm46209983eev.3.2013.06.13.13.53.46
 for <multiple recipients>
 (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
 Thu, 13 Jun 2013 13:53:47 -0700 (PDT)
Sender: Mikolaj Golub <to.my.trociny@gmail.com>
Date: Thu, 13 Jun 2013 23:53:45 +0300
From: Mikolaj Golub <trociny@FreeBSD.org>
To: intech <sdenic@intech.co.rs>
Subject: Re: HAST initial sync speed
Message-ID: <20130613205344.GB8732@gmail.com>
References: <4C651B7E.5000805@gibfest.dk> <4C6B08BD.9080206@gibfest.dk>
 <20100818110655.GA2177@garage.freebsd.pl>
 <4C6BC0BA.9030303@gibfest.dk> <4C6BC35B.9040000@gibfest.dk>
 <20100818121133.GC2177@garage.freebsd.pl>
 <4C6BD521.1060807@gibfest.dk>
 <20100818125856.GE2177@garage.freebsd.pl>
 <4C6BDBB9.3020007@gibfest.dk>
 <1371123581091-5819759.post@n5.nabble.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1371123581091-5819759.post@n5.nabble.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 20:53:50 -0000

On Thu, Jun 13, 2013 at 04:39:41AM -0700, intech wrote:
> Thought this threat is almost 3 years old, I want to ask if this
> MAX_SEND_SIZE adopted in freebsd 8.3 and even fbsd9.1?
> Indeed I have the same issue on 1Gb network - nodes performing sync at only
> 10MBytes/sec ?! and I can't figure out what is happening as network itself
> is not the problem, I tested it.

What version are you running? There have been several changes since
that thread was started related to the synchronization speed issue
(MAX_SEND_SIZE among them). It is recommended to use recent versions.

> And just one question fullsync is only option for HAST replication at time
> of writing, so could HAST perform at 100MB/sec in this mode, and  when we
> expect memsync and async to be released?

Synchronization is run in background by synchronization thread and
hardly depends on replication mode. Anyway, HAST from CURRENT,
STABLE/9 and STABLE/8 supports all three modes. The async mode was
merged to stable branches in Jan 2012, and reached 8.4 and 9.1 (I am
not sure about the later, one needs to check). The memsync mode was
merged in Apr 2013, after 8.4 freeze, so there is no release that
would contain it yet.

-- 
Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 21:29:19 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 77CB9EA;
 Thu, 13 Jun 2013 21:29:19 +0000 (UTC)
 (envelope-from sinisa.denic@intech.co.rs)
Received: from exchange.peacebellservers.com (exchange.peacebellservers.com
 [46.22.146.98]) by mx1.freebsd.org (Postfix) with ESMTP id 2C1421447;
 Thu, 13 Jun 2013 21:29:18 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by exchange.peacebellservers.com (Postfix) with ESMTP id 2CFF248CF3;
 Thu, 13 Jun 2013 23:20:41 +0200 (CEST)
Received: from exchange.peacebellservers.com ([127.0.0.1])
 by localhost (exchange.peacebellservers.com [127.0.0.1]) (amavisd-new,
 port 10032)
 with ESMTP id pOSVSjR_z_uY; Thu, 13 Jun 2013 23:20:40 +0200 (CEST)
Received: from localhost (localhost [127.0.0.1])
 by exchange.peacebellservers.com (Postfix) with ESMTP id 1BC7F48CF8;
 Thu, 13 Jun 2013 23:20:40 +0200 (CEST)
X-Virus-Scanned: amavisd-new at exchange.peacebellservers.com
Received: from exchange.peacebellservers.com ([127.0.0.1])
 by localhost (exchange.peacebellservers.com [127.0.0.1]) (amavisd-new,
 port 10026)
 with ESMTP id jEGXt3Jqhlga; Thu, 13 Jun 2013 23:20:39 +0200 (CEST)
Received: from exchange.peacebellservers.com (exchange.peacebellservers.com
 [46.22.146.98])
 by exchange.peacebellservers.com (Postfix) with ESMTP id C4EDA48CF3;
 Thu, 13 Jun 2013 23:20:39 +0200 (CEST)
Date: Thu, 13 Jun 2013 21:20:39 +0000 (UTC)
From: =?utf-8?Q?Sini=C5=A1a_Deni=C4=87?= <sinisa.denic@intech.co.rs>
To: Mikolaj Golub <trociny@FreeBSD.org>
Message-ID: <511373126.6399.1371158439628.JavaMail.root@intech.co.rs>
In-Reply-To: <20130613205344.GB8732@gmail.com>
References: <4C651B7E.5000805@gibfest.dk> <4C6BC35B.9040000@gibfest.dk>
 <20100818121133.GC2177@garage.freebsd.pl> <4C6BD521.1060807@gibfest.dk>
 <20100818125856.GE2177@garage.freebsd.pl> <4C6BDBB9.3020007@gibfest.dk>
 <1371123581091-5819759.post@n5.nabble.com> <20130613205344.GB8732@gmail.com>
Subject: Re: HAST initial sync speed
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Mailer: Zimbra 8.0.3_GA_5664 (ZimbraWebClient - GC27 (Win)/8.0.3_GA_5664)
Thread-Topic: HAST initial sync speed
Thread-Index: gqlsJVGtxB1in2A16T57Iz2DTryePQ==
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 21:29:19 -0000

>There have been several changes since
>that thread was started related to the synchronization speed issue
That's what happening to me, syncing 37GB takes more then 30min over 1Gb ne=
twork, I have two Fujitsu Siemens Esprimo nodes with two seagate sata2 150M=
B/s running freebsd-9.1-amd64 version.


Sini=C5=A1a Deni=C4=87=20
INTECH DOO=20
www.intech.co.rs=20


----- Original Message -----
From: "Mikolaj Golub" <trociny@FreeBSD.org>
To: "intech" <sdenic@intech.co.rs>
Cc: freebsd-fs@freebsd.org
Sent: Thursday, June 13, 2013 10:53:45 PM
Subject: Re: HAST initial sync speed

On Thu, Jun 13, 2013 at 04:39:41AM -0700, intech wrote:
> Thought this threat is almost 3 years old, I want to ask if this
> MAX_SEND_SIZE adopted in freebsd 8.3 and even fbsd9.1?
> Indeed I have the same issue on 1Gb network - nodes performing sync at on=
ly
> 10MBytes/sec ?! and I can't figure out what is happening as network itsel=
f
> is not the problem, I tested it.

What version are you running? There have been several changes since
that thread was started related to the synchronization speed issue
(MAX_SEND_SIZE among them). It is recommended to use recent versions.

> And just one question fullsync is only option for HAST replication at tim=
e
> of writing, so could HAST perform at 100MB/sec in this mode, and  when we
> expect memsync and async to be released?

Synchronization is run in background by synchronization thread and
hardly depends on replication mode. Anyway, HAST from CURRENT,
STABLE/9 and STABLE/8 supports all three modes. The async mode was
merged to stable branches in Jan 2012, and reached 8.4 and 9.1 (I am
not sure about the later, one needs to check). The memsync mode was
merged in Apr 2013, after 8.4 freeze, so there is no release that
would contain it yet.

--=20
Mikolaj Golub

From owner-freebsd-fs@FreeBSD.ORG  Thu Jun 13 21:52:28 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 2D4B776C
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 21:52:28 +0000 (UTC)
 (envelope-from jonaschuman@gmail.com)
Received: from mail-vb0-x22f.google.com (mail-vb0-x22f.google.com
 [IPv6:2607:f8b0:400c:c02::22f])
 by mx1.freebsd.org (Postfix) with ESMTP id E1586164F
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 21:52:27 +0000 (UTC)
Received: by mail-vb0-f47.google.com with SMTP id x14so7412231vbb.34
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 14:52:27 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :content-type; bh=EvleN+1NPw0UgGpcc5woKnSZrp1buutcCAtN3I34PRg=;
 b=z+VKGNX7RnP/naWbUfq+RgHonUSYp/JsD28SH8dXfazUo+HCHubMmP5/EhlP09X2b6
 3nHwXcVU0We0+Qm/4/5zIejKvVVQ1fv6P0JQWxi7bk17EjpquCEx/OEtTxfg1YfKcnGO
 SVgky1hRzvq6DWyGtkvgn1Z/l5yNawfoA6S403k5Sr466MC9c2NNQhvomv9VAh17Wedt
 6Ia8/kM8kV3ZG78QYbg85YaJ8S/sfJe5lCpYFeJ4nyk1mqz3czU6m+rJY78djthmF0B5
 vJKTUZW+IbyQ7zdfTCTE76DMLsvbvAM7eIoduXXXezbnVXocVZOBuBgysEpsZxjSWpLI
 YAlw==
MIME-Version: 1.0
X-Received: by 10.220.11.143 with SMTP id t15mr1181436vct.68.1371160347436;
 Thu, 13 Jun 2013 14:52:27 -0700 (PDT)
Received: by 10.220.167.73 with HTTP; Thu, 13 Jun 2013 14:52:27 -0700 (PDT)
In-Reply-To: <CAL2e+nek5GqWUJTJDe0pbWia+sY2QgSJWe_6XieGPmdX8KGjZg@mail.gmail.com>
References: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
 <57e0551229684b69bc27476b8a08fb91@DB3PR07MB059.eurprd07.prod.outlook.com>
 <CAC-LZTajW0SO_dH9ZtUH80zX628vcosL_vOzwkwB1JF1PZy0qA@mail.gmail.com>
 <CAL2e+nek5GqWUJTJDe0pbWia+sY2QgSJWe_6XieGPmdX8KGjZg@mail.gmail.com>
Date: Thu, 13 Jun 2013 17:52:27 -0400
Message-ID: <CAC-LZTb4w2S2wrb3tXJsvC73h6ky_w_iPPhoeOWAsoZ-sd0Chw@mail.gmail.com>
Subject: Re: zfs send/recv dies when transferring large-ish dataset
From: Jona Schuman <jonaschuman@gmail.com>
To: abhay trivedi <abhaytrivedi3@gmail.com>, 
 "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Jun 2013 21:52:28 -0000

atime is off on both origin and destination datasets


On Thu, Jun 13, 2013 at 5:30 PM, abhay trivedi <abhaytrivedi3@gmail.com> wrote:
> Can you set atime off on Destination file system and try again?
>
>
>
> On Thu, Jun 13, 2013 at 9:26 PM, Jona Schuman <jonaschuman@gmail.com> wrote:
>>
>> machine2# nc -d -l 9999 | zfs receive -v -F -d storagepool
>> machine1# zfs send -v -R dataset@snap | nc machine2 9999
>>
>> machine1-output: sending from @ to dataset@snap
>> machine2-output: receiving full stream of dataset@snap into
>> storagepool/dataset@snap
>> machine1-output: warning: cannot send 'dataset@snap': Broken pipe
>> machine1-output: Broken pipe
>>
>>
>> On Thu, Jun 13, 2013 at 3:42 AM, Ivailo Tanusheff
>> <Ivailo.Tanusheff@skrill.com> wrote:
>> > Hi,
>> >
>> > Can you try send/recv with the -v or with -vP swiches, so you can see
>> > more verbose information?
>> >
>> > Regards,
>> > Ivailo Tanusheff
>> >
>> > -----Original Message-----
>> > From: owner-freebsd-fs@freebsd.org [mailto:owner-freebsd-fs@freebsd.org]
>> > On Behalf Of Jona Schuman
>> > Sent: Thursday, June 13, 2013 2:41 AM
>> > To: freebsd-fs@freebsd.org
>> > Subject: zfs send/recv dies when transferring large-ish dataset
>> >
>> > Hi,
>> >
>> > I'm getting some strange behavior from zfs send/recv and I'm hoping
>> > someone may be able to provide some insight. I have two identical machines
>> > running 9.0-RELEASE-p3, each having a ZFS pool (zfs 5, zpool
>> > 28) for storage. I want to use zfs send/recv for replication between the
>> > two machines. For the most part, this has worked as expected.
>> > However, send/recv fails when transferring the largest dataset (both in
>> > actual size and in terms of number of files) on either machine.
>> > With these datasets, issuing:
>> >
>> > machine2# nc -d -l 9999 | zfs recv -d storagepool machine1# zfs send
>> > dataset@snap | nc machine2 9999
>> >
>> > terminates early on the sending side without any error messages. The
>> > receiving end continues on as expected, cleaning up the partial data
>> > received so far and reverting to its initial state. (I've tried using
>> > mbuffer instead of nc, or just using ssh, both with similar results.) Oddly,
>> > zfs send dies slightly differently depending on how the two machines are
>> > connected. When connected through the racktop switch, zfs send dies quietly
>> > without any indication that the transfer has failed.
>> > When connected directly using a crossover cable, zfs send dies quietly
>> > and machine1 becomes unresponsive (no network, no keyboard, hard reset
>> > required). In both cases, no messages are printed to screen or to anything
>> > in /var/log/.
>> >
>> >
>> > I can transfer the same datasets successfully if I send/recv to/from
>> > file:
>> >
>> > machine1# zfs send dataset@snap > /tmp/dump machine1# scp /tmp/dump
>> > machine2:/tmp/dump machine2# zfs recv -d storagepool < /tmp/dump
>> >
>> > so I don't think the datasets themselves are the issue. I've also
>> > successfully tried send/recv over the network using different network
>> > interfaces (10GbE ixgbe cards instead of the 1GbE igb links), which would
>> > suggest the issue is with the 1GbE links.
>> >
>> > Might there be some buffering parameter that I'm neglecting to tune,
>> > which is essential on the 1GbE links but may be less important on the faster
>> > links? Are there any known issues with the igb driver that might be the
>> > culprit here? Any other suggestions?
>> >
>> > Thanks,
>> > Jona
>> > _______________________________________________
>> > freebsd-fs@freebsd.org mailing list
>> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>> >
>> >
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
>
>
>
> --
> T@J

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 00:21:15 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id E08A778A;
 Fri, 14 Jun 2013 00:21:15 +0000 (UTC)
 (envelope-from rmacklem@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id B9D561371;
 Fri, 14 Jun 2013 00:21:15 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r5E0LFYn034797;
 Fri, 14 Jun 2013 00:21:15 GMT
 (envelope-from rmacklem@freefall.freebsd.org)
Received: (from rmacklem@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r5E0LFgE034796;
 Fri, 14 Jun 2013 00:21:15 GMT (envelope-from rmacklem)
Date: Fri, 14 Jun 2013 00:21:15 GMT
Message-Id: <201306140021.r5E0LFgE034796@freefall.freebsd.org>
To: izrodix@gmail.com, rmacklem@FreeBSD.org, freebsd-fs@FreeBSD.org
From: rmacklem@FreeBSD.org
Subject: Re: kern/177335: [nfs] [panic] Sleeping on "vmopar" with the
 following non-sleepable locks held: exclusive sleep mutex NFSnode lock
 (NFSnode lock)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 00:21:16 -0000

Synopsis: [nfs] [panic] Sleeping on "vmopar" with the following non-sleepable locks held: exclusive sleep mutex NFSnode lock (NFSnode lock)

State-Changed-From-To: feedback->closed
State-Changed-By: rmacklem
State-Changed-When: Fri Jun 14 00:19:54 UTC 2013
State-Changed-Why: 

The patch that stops this crash (int head as r251089) has
been MFC'd to stable/8 as r251719.

http://www.freebsd.org/cgi/query-pr.cgi?pr=177335

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 00:42:25 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 8B74ADB4
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 00:42:25 +0000 (UTC)
 (envelope-from amvandemore@gmail.com)
Received: from mail-pa0-x22d.google.com (mail-pa0-x22d.google.com
 [IPv6:2607:f8b0:400e:c03::22d])
 by mx1.freebsd.org (Postfix) with ESMTP id 6876E15CC
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 00:42:25 +0000 (UTC)
Received: by mail-pa0-f45.google.com with SMTP id bi5so87413pad.18
 for <freebsd-fs@freebsd.org>; Thu, 13 Jun 2013 17:42:25 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=B90ze03cnmichfN4uIIxOEM0cqrOGsjdCaUr6J02k+g=;
 b=cG1hv6J1grieALBkXtupcw+5xAcXT5E7i4wmbjawhLEgsDXrF8hGiiht8sJVEXbNl+
 h4GcAT1M7OHPSKYiL+uNExaXDnk4v+PcwsG/eGnJy29Y6MFcyf1EwXwuEb88UCJLP7yL
 xWoiwiTIF5uHUJrbdcAzDoYS3DJe5EtNXmg3PqUf2JY+pmG8/cot/FNpt9fOqtOnR+Sg
 GFxwYzXhmd9hbE1LHigOCTE40By+6M+s7snAy7h99q19ITaOwVMOby8Pq0BFUNpc9U5W
 UmyXMthsLMtGfjI+fA0ZFxeiblvJLguvWIwdj+WErZjiYYpgn6Xoy3X5a1P1n9Pr1Vc+
 PtOg==
MIME-Version: 1.0
X-Received: by 10.68.203.161 with SMTP id kr1mr426271pbc.192.1371170545166;
 Thu, 13 Jun 2013 17:42:25 -0700 (PDT)
Received: by 10.70.31.195 with HTTP; Thu, 13 Jun 2013 17:42:25 -0700 (PDT)
In-Reply-To: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
References: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
Date: Thu, 13 Jun 2013 19:42:25 -0500
Message-ID: <CA+tpaK16ziiBAAOGJvvmxv1BKRHHqvLf6-8oyBQMCgU5WUYimg@mail.gmail.com>
Subject: Re: zfs send/recv dies when transferring large-ish dataset
From: Adam Vande More <amvandemore@gmail.com>
To: Jona Schuman <jonaschuman@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-fs <freebsd-fs@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 00:42:25 -0000

On Wed, Jun 12, 2013 at 6:40 PM, Jona Schuman <jonaschuman@gmail.com> wrote:

> Might there be some buffering parameter that I'm neglecting to tune,
> which is essential on the 1GbE links but may be less important on the
> faster links? Are there any known issues with the igb driver that
> might be the culprit here? Any other suggestions?
>

ZFS borks on low memory/high io situations.  Thinks have improved a lot
since your version.  The first thing I would try to do is upgrade and 9.0
isn't supported anymore regardless of the ZFS issue.  Migrating to STABLE
is probably your best chance of success, but 9.1 probably would work too.
 You also didn't indicate amd64/i386 or any other system specs.  IIRC,
vm.kmem_size
had to set higher even on AMD64 for that era.

-- 
Adam Vande More

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 00:52:26 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 0CBE3F5D
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 00:52:26 +0000 (UTC)
 (envelope-from beastie@tardisi.com)
Received: from mho-02-ewr.mailhop.org (mho-02-ewr.mailhop.org [204.13.248.72])
 by mx1.freebsd.org (Postfix) with ESMTP id C629015FF
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 00:52:25 +0000 (UTC)
Received: from ip70-179-144-108.fv.ks.cox.net ([70.179.144.108]
 helo=zen.lhaven.homeip.net)
 by mho-02-ewr.mailhop.org with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.72) (envelope-from <beastie@tardisi.com>) id 1UnIFX-00005g-9C
 for freebsd-fs@freebsd.org; Fri, 14 Jun 2013 00:52:19 +0000
X-Mail-Handler: Dyn Standard SMTP by Dyn
X-Originating-IP: 70.179.144.108
X-Report-Abuse-To: abuse@dyndns.com (see
 http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse
 reporting information)
X-MHO-User: U2FsdGVkX1/4JoORhAPmozf83kVKfb7aoFG/Mby/QUU=
Message-ID: <51BA6941.7040909@tardisi.com>
Date: Thu, 13 Jun 2013 19:52:17 -0500
From: The BSD Dreamer <beastie@tardisi.com>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/20130516 Thunderbird/17.0.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: ZFS triggered 9-STABLE r246646 panic "vdrop: holdcnt 0"
References: <513E8E95.6010802@freebsd.org>
In-Reply-To: <513E8E95.6010802@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 00:52:26 -0000


On 03/11/2013 21:10, Lawrence Stewart wrote:
> Hi all,
> 
> I got this panic yesterday. I haven't seen it before (or since), but I
> have the crashdump and kernel here if there's additional information I
> can provide that would be useful in finding the cause.
> 
> The machine runs ZFS exclusively and was under quite heavy CPU and IO
> load at the time of the crash as I was compiling in a VirtualBox VM and
> on the host itself, as well as running a full KDE desktop environment.
> I'm fairly certain the machine was not swapping at the time of the crash.
> 
> lstewart@lstewart> uname -a
> FreeBSD lstewart 9.1-STABLE FreeBSD 9.1-STABLE #8 r246646M: Mon Feb 11
> 14:57:13 EST 2013
> root@lstewart:/usr/obj/usr/src/sys/LSTEWART-DESKTOP  amd64
> 
> lstewart@lstewart> sudo kgdb /boot/kernel/kernel /var/crash/vmcore.0
> 
> [...]
> 
> (kgdb) bt
> #0  doadump (textdump=<value optimized out>) at pcpu.h:229
> #1  0xffffffff808e5824 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:448
> #2  0xffffffff808e5d27 in panic (fmt=0x1 <Address 0x1 out of bounds>) at
> /usr/src/sys/kern/kern_shutdown.c:636
> #3  0xffffffff8097a71e in vdropl (vp=<value optimized out>) at
> /usr/src/sys/kern/vfs_subr.c:2465
> #4  0xffffffff80b4da2b in vm_page_alloc (object=0xffffffff8132c000,
> pindex=143696, req=32) at /usr/src/sys/vm/vm_page.c:1569
> #5  0xffffffff80b3f312 in kmem_back (map=0xfffffe00020000e8,
> addr=18446743524542296064, size=131072, flags=705200752)
>     at /usr/src/sys/vm/vm_kern.c:361

I just came home to find that my system had panic'd (around
11:30am)....and this was the only FreeBSD 9 'panic: vdrop: holdcnt: 0'
that I found.

The machine runs ZFS exclusively as well....CPU would be busy, since I
run BOINC and distributed.net (go Team FreeBSD :)  And, IO load would be
high from BackupPC_nightly running...out of the box this job starts at
1am, but I had moved it to run at 11am so that it doesn't run into all
things that get scheduled in cron around this time, along with all the
backups that I'm running...  as well as out of the way when I'm checking
email and such first thing in the morning over coffee before heading
into work.  And, it takes a few hours to grind through the 7.2TB zpool...

Its possible that this was happening when it was set to 1am, but I never
had a crash dump when it had happened and no indication that a panic was
why.  Though I did later find out that recollindex cleans itself up when
something goes wrong by sending TERM to its pgid....and running
recollindex as root from cron during this time....means its sending TERM
to init.  And, not running it anymore seems to have solved that.... and
there didn't seem to be any reason to move BackupPC_nightly back.

Plus the other problem would have me wake up to find the machine with
console screen in single user mode.  With this, I came home to gnome
login screen....

So, my system is:

lchen@zen:~ 102> uname -a
FreeBSD zen.lhaven.homeip.net 9.1-RELEASE-p3 FreeBSD 9.1-RELEASE-p3 #0:
Mon Apr 29 18:27:25 UTC 2013
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

but, when I try to look at the dump:


lchen@zen:~ 103> sudo kgdb /boot/kernel/kernel /var/crash/vmcore.0
Password:
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging
symbols found)...
Attempt to extract a component of a value that is not a structure pointer.
Attempt to extract a component of a value that is not a structure pointer.
#0  0xffffffff808e9ecb in doadump ()
(kgdb)

There's no kernel.symbols either.  The only one that is, is the backup
of my 9.0 kernel.  Is that because I've been using freebsd-update to update?

Here's the info.0 file....

lchen@zen:~ 104> sudo cat /var/crash/info.0
Dump header from device /dev/gpt/swap0
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 9172926464B (8747 MB)
  Blocksize: 512
  Dumptime: Thu Jun 13 11:31:10 2013
  Hostname: zen.lhaven.homeip.net
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 9.1-RELEASE-p3 #0: Mon Apr 29 18:27:25 UTC 2013
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
  Panic String: vdrop: holdcnt 0
  Dump Parity: 4285100545
  Bounds: 0
  Dump Status: good

So, just to see if anything meaningful might result....I move my
/etc/make.conf aside and do a "make buildkernel", and tried a

kgdb /usr/obj/usr/src/sys/generic/kernel.debug /var/crash/vmcore.0

which get's me this...

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: vdrop: holdcnt 0
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208d6 at kdb_backtrace+0x66
#1 0xffffffff808ea8ee at panic+0x1ce
#2 0xffffffff8097fa86 at vdropl+0x366
#3 0xffffffff80b522ab at vm_page_alloc+0x28b
#4 0xffffffff80bd9096 at uma_small_alloc+0x66
#5 0xffffffff80b3b5fa at keg_alloc_slab+0x9a
#6 0xffffffff80b3bb72 at keg_fetch_slab+0xb2
#7 0xffffffff80b3bede at zone_fetch_slab+0x3e
#8 0xffffffff80b3b229 at zone_alloc_item+0x59
#9 0xffffffff80b3b431 at uma_large_malloc+0x31
#10 0xffffffff808d5a99 at malloc+0xd9
#11 0xffffffff815b28ee at zio_write_bp_init+0x1fe
#12 0xffffffff815b2063 at zio_execute+0xc3
#13 0xffffffff815b3fad at zio_ready+0x17d
#14 0xffffffff815b2063 at zio_execute+0xc3
#15 0xffffffff8092cf85 at taskqueue_run_locked+0x85
#16 0xffffffff8092df06 at taskqueue_thread_loop+0x46
#17 0xffffffff808bba1f at fork_exit+0x11f
Uptime: 15d13h35m36s
Dumping 8747 out of 16308
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/nullfs.ko...done.
Loaded symbols for /boot/kernel/nullfs.ko
Reading symbols from /boot/kernel/zfs.ko...done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/if_tap.ko...done.
Loaded symbols for /boot/kernel/if_tap.ko
Reading symbols from /boot/kernel/aio.ko...done.
Loaded symbols for /boot/kernel/aio.ko
Reading symbols from /boot/kernel/accf_data.ko...done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_http.ko...done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/coretemp.ko...done.
Loaded symbols for /boot/kernel/coretemp.ko
Reading symbols from /boot/kernel/cpuctl.ko...done.
Loaded symbols for /boot/kernel/cpuctl.ko
Reading symbols from /boot/kernel/sem.ko...done.
Loaded symbols for /boot/kernel/sem.ko
Reading symbols from /boot/modules/cuse4bsd.ko...done.
Loaded symbols for /boot/modules/cuse4bsd.ko
Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/kernel/linux.ko...done.
Loaded symbols for /boot/kernel/linux.ko
Reading symbols from /boot/kernel/libiconv.ko...done.
Loaded symbols for /boot/kernel/libiconv.ko
Reading symbols from /boot/kernel/libmchain.ko...done.
Loaded symbols for /boot/kernel/libmchain.ko
Reading symbols from /boot/kernel/cd9660_iconv.ko...done.
Loaded symbols for /boot/kernel/cd9660_iconv.ko
Reading symbols from /boot/kernel/msdosfs_iconv.ko...done.
Loaded symbols for /boot/kernel/msdosfs_iconv.ko
Reading symbols from /boot/kernel/ichwd.ko...done.
Loaded symbols for /boot/kernel/ichwd.ko
Reading symbols from /boot/kernel/fdescfs.ko...done.
Loaded symbols for /boot/kernel/fdescfs.ko
Reading symbols from /boot/kernel/ipl.ko...done.
Loaded symbols for /boot/kernel/ipl.ko
Reading symbols from /boot/modules/vboxnetflt.ko...done.
Loaded symbols for /boot/modules/vboxnetflt.ko
Reading symbols from /boot/kernel/netgraph.ko...done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_ether.ko...done.
Loaded symbols for /boot/kernel/ng_ether.ko
Reading symbols from /boot/modules/vboxnetadp.ko...done.
Loaded symbols for /boot/modules/vboxnetadp.ko
Reading symbols from /usr/local/modules/fuse.ko...done.
Loaded symbols for /usr/local/modules/fuse.ko
Reading symbols from /boot/kernel/linprocfs.ko...done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/linsysfs.ko...done.
Loaded symbols for /boot/kernel/linsysfs.ko
Reading symbols from /usr/local/libexec/linux_adobe/linux_adobe.ko...done.
Loaded symbols for /usr/local/libexec/linux_adobe/linux_adobe.ko
Reading symbols from /usr/local/modules/rtc.ko...done.
Loaded symbols for /usr/local/modules/rtc.ko
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224		__asm("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1  0xffffffff808ea3d1 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:448
#2  0xffffffff808ea8c7 in panic (fmt=0x1 <Address 0x1 out of bounds>)
    at /usr/src/sys/kern/kern_shutdown.c:636
#3  0xffffffff8097fa86 in vdropl (vp=Variable "vp" is not available.
) at /usr/src/sys/kern/vfs_subr.c:2400
#4  0xffffffff80b522ab in vm_page_alloc (object=0x0, pindex=0, req=32)
    at /usr/src/sys/vm/vm_page.c:1537
#5  0xffffffff80bd9096 in uma_small_alloc (zone=Variable "zone" is not
available.
)
    at /usr/src/sys/amd64/amd64/uma_machdep.c:58
#6  0xffffffff80b3b5fa in keg_alloc_slab (keg=0xfffffe043ffef0e0,
    zone=0xfffffe043ffee000, wait=258) at /usr/src/sys/vm/uma_core.c:844
#7  0xffffffff80b3bb72 in keg_fetch_slab (keg=0xfffffe043ffef0e0,
    zone=0xfffffe043ffee000, flags=2) at /usr/src/sys/vm/uma_core.c:2173
#8  0xffffffff80b3bede in zone_fetch_slab (zone=0xfffffe043ffee000,
    keg=0xfffffe043ffef0e0, flags=2) at /usr/src/sys/vm/uma_core.c:2233
#9  0xffffffff80b3b229 in zone_alloc_item (zone=0xfffffe043ffee000,
udata=0x0,
    flags=2) at /usr/src/sys/vm/uma_core.c:2490
#10 0xffffffff80b3b431 in uma_large_malloc (size=16384, wait=2)
    at /usr/src/sys/vm/uma_core.c:3064
#11 0xffffffff808d5a99 in malloc (size=16384, mtp=0xffffffff81734c20,
flags=2)
    at /usr/src/sys/kern/kern_malloc.c:492
#12 0xffffffff815b28ee in zio_write_bp_init () from /boot/kernel/zfs.ko
---Type <return> to continue, or q <return> to quit---
#13 0x0000000000000010 in ?? ()
#14 0xfffffe022b9726e0 in ?? ()
#15 0xfffffe03c81a2a50 in ?? ()
#16 0xffffff801b78e880 in ?? ()
#17 0xfffffe000e99e000 in ?? ()
#18 0xffffff8471d93ae0 in ?? ()
#19 0xffffffff815b2063 in zio_execute () from /boot/kernel/zfs.ko
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000000 in ?? ()
#22 0xfffffe03c81a2a50 in ?? ()
#23 0xffffff801b78e880 in ?? ()
#24 0xfffffe000e99e000 in ?? ()
#25 0xffffff8471d93b10 in ?? ()
#26 0xffffffff815b3fad in zio_ready () from /boot/kernel/zfs.ko
#27 0xfffffe03c81a2a50 in ?? ()
#28 0x0000000000000006 in ?? ()
#29 0x0000000000000006 in ?? ()
#30 0xffffff8471d93b50 in ?? ()
#31 0xffffffff815b2063 in zio_execute () from /boot/kernel/zfs.ko
#32 0xfffffe0013c79800 in ?? ()
#33 0xfffffe03c81a2d90 in ?? ()
#34 0xfffffe0013c70000 in ?? ()
#35 0x0000000000000001 in ?? ()
---Type <return> to continue, or q <return> to quit---
#36 0xfffffe0013c70000 in ?? ()
#37 0xffffff8471d93bc0 in ?? ()
#38 0xffffffff8092cf85 in taskqueue_run_locked (queue=0xffffff800904e380)
    at /usr/src/sys/kern/subr_taskqueue.c:308
Previous frame inner to this frame (corrupt stack?)
(kgdb) l *0xffffffff8097fa86
0xffffffff8097fa86 is at /usr/src/sys/kern/vfs_subr.c:2400.
2395		int active;
2396	
2397		ASSERT_VI_LOCKED(vp, "vdropl");
2398		CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
2399		if (vp->v_holdcnt <= 0)
2400			panic("vdrop: holdcnt %d", vp->v_holdcnt);
2401		vp->v_holdcnt--;
2402		if (vp->v_holdcnt > 0) {
2403			VI_UNLOCK(vp);
2404			return;

so, it seems to work, but beyond the fact that it says to panic if
vp->v_holdcnt is <= 0...don't know how to look to see why this variable
had come to be 0, when it thinks it shouldn't have.

I have periodic (about twice a year) scrubs enabled on my system, and
the zpool for backuppc was last scrubbed on May 24th (it took 47h57m -
repaired 0 with 0 errors.)

-- 
  Name: Lawrence "The Dreamer" Chen      Email: beastie@tardisi.com
 Snail: 1530 College Ave, A5              Blog: http://lawrencechen.net
        Manhattan, KS 66502-2768         Phone: 785-789-4132

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 03:39:07 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 355ECA02;
 Fri, 14 Jun 2013 03:39:07 +0000 (UTC) (envelope-from pfg@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 100EC1CB9;
 Fri, 14 Jun 2013 03:39:07 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r5E3d6dI073333;
 Fri, 14 Jun 2013 03:39:06 GMT
 (envelope-from pfg@freefall.freebsd.org)
Received: (from pfg@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r5E3d5We073332;
 Fri, 14 Jun 2013 03:39:06 GMT (envelope-from pfg)
Date: Fri, 14 Jun 2013 03:39:06 GMT
Message-Id: <201306140339.r5E3d5We073332@freefall.freebsd.org>
To: cederom@tlen.pl, pfg@FreeBSD.org, freebsd-fs@FreeBSD.org
From: pfg@FreeBSD.org
Subject: Re: kern/174060: [ext2fs] Ext2FS system crashes (buffer overflow?)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 03:39:07 -0000

Synopsis: [ext2fs] Ext2FS system crashes (buffer overflow?)

State-Changed-From-To: open->closed
State-Changed-By: pfg
State-Changed-When: Fri Jun 14 03:34:15 UTC 2013
State-Changed-Why: 
Testing with fsx revealed issues that have been worked around by
disabling reallocation in r245817 (MFC'd).

Without reallocation the filesystem appears to be stable.

http://www.freebsd.org/cgi/query-pr.cgi?pr=174060

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 07:04:28 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 88D4F8E6
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 07:04:28 +0000 (UTC)
 (envelope-from Ivailo.Tanusheff@skrill.com)
Received: from ch1outboundpool.messaging.microsoft.com
 (ch1ehsobe003.messaging.microsoft.com [216.32.181.183])
 by mx1.freebsd.org (Postfix) with ESMTP id 3872016D4
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 07:04:27 +0000 (UTC)
Received: from mail97-ch1-R.bigfish.com (10.43.68.225) by
 CH1EHSOBE018.bigfish.com (10.43.70.68) with Microsoft SMTP Server id
 14.1.225.23; Fri, 14 Jun 2013 07:04:26 +0000
Received: from mail97-ch1 (localhost [127.0.0.1])	by mail97-ch1-R.bigfish.com
 (Postfix) with ESMTP id 168152E033F;
 Fri, 14 Jun 2013 07:04:26 +0000 (UTC)
X-Forefront-Antispam-Report: CIP:157.56.249.213; KIP:(null); UIP:(null);
 IPV:NLI; H:AM2PRD0710HT001.eurprd07.prod.outlook.com; RD:none; EFVD:NLI
X-SpamScore: -3
X-BigFish: PS-3(z54eehz98dI9371I542I1432I4015Izz1f42h1ee6h1de0h1fdah1202h1e76h1d1ah1d2ah1fc6hzz17326ah8275bh8275dhz2fh2a8h668h839h944hd24hf0ah1220h1288h12a5h12a9h12bdh137ah13b6h1441h1504h1537h153bh162dh1631h1758h18e1h1946h19b5h19ceh1ad9h1b0ah1d07h1d0ch1d2eh1d3fh1de9h1dfeh1dffh1e1dh9a9j1155h)
Received-SPF: pass (mail97-ch1: domain of skrill.com designates 157.56.249.213
 as permitted sender) client-ip=157.56.249.213;
 envelope-from=Ivailo.Tanusheff@skrill.com;
 helo=AM2PRD0710HT001.eurprd07.prod.outlook.com ; .outlook.com ; 
X-Forefront-Antispam-Report-Untrusted: SFV:SKI; SFS:; DIR:OUT; SFP:; SCL:-1;
 SRVR:DBXPR07MB062; H:DBXPR07MB064.eurprd07.prod.outlook.com; LANG:en; 
Received: from mail97-ch1 (localhost.localdomain [127.0.0.1]) by mail97-ch1
 (MessageSwitch) id 1371193463719231_26787; Fri, 14 Jun 2013 07:04:23 +0000
 (UTC)
Received: from CH1EHSMHS026.bigfish.com (snatpool2.int.messaging.microsoft.com
 [10.43.68.231])	by mail97-ch1.bigfish.com (Postfix) with ESMTP id
 A3C94460244;	Fri, 14 Jun 2013 07:04:23 +0000 (UTC)
Received: from AM2PRD0710HT001.eurprd07.prod.outlook.com (157.56.249.213) by
 CH1EHSMHS026.bigfish.com (10.43.70.26) with Microsoft SMTP Server (TLS) id
 14.1.225.23; Fri, 14 Jun 2013 07:04:22 +0000
Received: from DBXPR07MB062.eurprd07.prod.outlook.com (10.242.147.20) by
 AM2PRD0710HT001.eurprd07.prod.outlook.com (10.255.165.36) with Microsoft SMTP
 Server (TLS) id 14.16.324.0; Fri, 14 Jun 2013 07:04:12 +0000
Received: from DBXPR07MB064.eurprd07.prod.outlook.com (10.242.147.24) by
 DBXPR07MB062.eurprd07.prod.outlook.com (10.242.147.20) with Microsoft SMTP
 Server (TLS) id 15.0.702.21; Fri, 14 Jun 2013 07:04:11 +0000
Received: from DBXPR07MB064.eurprd07.prod.outlook.com ([169.254.7.13]) by
 DBXPR07MB064.eurprd07.prod.outlook.com ([169.254.7.13]) with mapi id
 15.00.0702.005; Fri, 14 Jun 2013 07:04:11 +0000
From: Ivailo Tanusheff <Ivailo.Tanusheff@skrill.com>
To: Jona Schuman <jonaschuman@gmail.com>, abhay trivedi
 <abhaytrivedi3@gmail.com>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: RE: zfs send/recv dies when transferring large-ish dataset
Thread-Topic: zfs send/recv dies when transferring large-ish dataset
Thread-Index: AQHOZ8ZVUN+hFJBhLk6aHw9omdejRZkzQxIggACKgICAAGOU04AAmc2g
Date: Fri, 14 Jun 2013 07:04:10 +0000
Message-ID: <97ca7eedc13f4b2b809945d067a732b6@DBXPR07MB064.eurprd07.prod.outlook.com>
References: <CAC-LZTYLzFPTvA6S4CN0xTd-E_x9c3kxYwQoFed5LkVBrwVk0Q@mail.gmail.com>
 <57e0551229684b69bc27476b8a08fb91@DB3PR07MB059.eurprd07.prod.outlook.com>
 <CAC-LZTajW0SO_dH9ZtUH80zX628vcosL_vOzwkwB1JF1PZy0qA@mail.gmail.com>
 <CAL2e+nek5GqWUJTJDe0pbWia+sY2QgSJWe_6XieGPmdX8KGjZg@mail.gmail.com>
 <CAC-LZTb4w2S2wrb3tXJsvC73h6ky_w_iPPhoeOWAsoZ-sd0Chw@mail.gmail.com>
In-Reply-To: <CAC-LZTb4w2S2wrb3tXJsvC73h6ky_w_iPPhoeOWAsoZ-sd0Chw@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [217.18.249.148]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-OriginatorOrg: skrill.com
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 07:04:28 -0000

This sounds to me as a communicational problem or system getting out of mem=
ory ...

-----Original Message-----
From: owner-freebsd-fs@freebsd.org [mailto:owner-freebsd-fs@freebsd.org] On=
 Behalf Of Jona Schuman
Sent: Friday, June 14, 2013 12:52 AM
To: abhay trivedi; freebsd-fs@freebsd.org
Subject: Re: zfs send/recv dies when transferring large-ish dataset

atime is off on both origin and destination datasets


On Thu, Jun 13, 2013 at 5:30 PM, abhay trivedi <abhaytrivedi3@gmail.com> wr=
ote:
> Can you set atime off on Destination file system and try again?
>
>
>
> On Thu, Jun 13, 2013 at 9:26 PM, Jona Schuman <jonaschuman@gmail.com> wro=
te:
>>
>> machine2# nc -d -l 9999 | zfs receive -v -F -d storagepool machine1#=20
>> zfs send -v -R dataset@snap | nc machine2 9999
>>
>> machine1-output: sending from @ to dataset@snap
>> machine2-output: receiving full stream of dataset@snap into=20
>> storagepool/dataset@snap
>> machine1-output: warning: cannot send 'dataset@snap': Broken pipe
>> machine1-output: Broken pipe
>>
>>
>> On Thu, Jun 13, 2013 at 3:42 AM, Ivailo Tanusheff=20
>> <Ivailo.Tanusheff@skrill.com> wrote:
>> > Hi,
>> >
>> > Can you try send/recv with the -v or with -vP swiches, so you can=20
>> > see more verbose information?
>> >
>> > Regards,
>> > Ivailo Tanusheff
>> >
>> > -----Original Message-----
>> > From: owner-freebsd-fs@freebsd.org=20
>> > [mailto:owner-freebsd-fs@freebsd.org]
>> > On Behalf Of Jona Schuman
>> > Sent: Thursday, June 13, 2013 2:41 AM
>> > To: freebsd-fs@freebsd.org
>> > Subject: zfs send/recv dies when transferring large-ish dataset
>> >
>> > Hi,
>> >
>> > I'm getting some strange behavior from zfs send/recv and I'm hoping=20
>> > someone may be able to provide some insight. I have two identical=20
>> > machines running 9.0-RELEASE-p3, each having a ZFS pool (zfs 5,=20
>> > zpool
>> > 28) for storage. I want to use zfs send/recv for replication=20
>> > between the two machines. For the most part, this has worked as expect=
ed.
>> > However, send/recv fails when transferring the largest dataset=20
>> > (both in actual size and in terms of number of files) on either machin=
e.
>> > With these datasets, issuing:
>> >
>> > machine2# nc -d -l 9999 | zfs recv -d storagepool machine1# zfs=20
>> > send dataset@snap | nc machine2 9999
>> >
>> > terminates early on the sending side without any error messages.=20
>> > The receiving end continues on as expected, cleaning up the partial=20
>> > data received so far and reverting to its initial state. (I've=20
>> > tried using mbuffer instead of nc, or just using ssh, both with=20
>> > similar results.) Oddly, zfs send dies slightly differently=20
>> > depending on how the two machines are connected. When connected=20
>> > through the racktop switch, zfs send dies quietly without any indicati=
on that the transfer has failed.
>> > When connected directly using a crossover cable, zfs send dies=20
>> > quietly and machine1 becomes unresponsive (no network, no keyboard,=20
>> > hard reset required). In both cases, no messages are printed to=20
>> > screen or to anything in /var/log/.
>> >
>> >
>> > I can transfer the same datasets successfully if I send/recv=20
>> > to/from
>> > file:
>> >
>> > machine1# zfs send dataset@snap > /tmp/dump machine1# scp /tmp/dump=20
>> > machine2:/tmp/dump machine2# zfs recv -d storagepool < /tmp/dump
>> >
>> > so I don't think the datasets themselves are the issue. I've also=20
>> > successfully tried send/recv over the network using different=20
>> > network interfaces (10GbE ixgbe cards instead of the 1GbE igb=20
>> > links), which would suggest the issue is with the 1GbE links.
>> >
>> > Might there be some buffering parameter that I'm neglecting to=20
>> > tune, which is essential on the 1GbE links but may be less=20
>> > important on the faster links? Are there any known issues with the=20
>> > igb driver that might be the culprit here? Any other suggestions?
>> >
>> > Thanks,
>> > Jona
>> > _______________________________________________
>> > freebsd-fs@freebsd.org mailing list=20
>> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>> >
>> >
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
>
>
>
> --
> T@J
_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 09:14:07 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id A5E553FA;
 Fri, 14 Jun 2013 09:14:07 +0000 (UTC)
 (envelope-from tomek.cedro@gmail.com)
Received: from mail-pa0-x22f.google.com (mail-pa0-x22f.google.com
 [IPv6:2607:f8b0:400e:c03::22f])
 by mx1.freebsd.org (Postfix) with ESMTP id 821091F32;
 Fri, 14 Jun 2013 09:14:07 +0000 (UTC)
Received: by mail-pa0-f47.google.com with SMTP id kl14so469141pab.6
 for <multiple recipients>; Fri, 14 Jun 2013 02:14:07 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=nm45a9YZG9Z/fzxGO9cvGaUAVHMcAHc+lYbaM1Fem5A=;
 b=cq6kStlnY5tCEBhazCDUqATZQfJv6e1pwcu8aPHv4lwqNffGN+M9LEO7wBLRNfL+vq
 ZlLdTXIZTrSFc3KIF1dyJlyUFYhUpMzbYbeueHMttNSScWMGGbC8lnfmFyQxL4XrOqx6
 SlGe+AcbVcZT/inu5CdMY6hrznRJ3bEBiq7msexl123NjcfLYXKtrRlzUgNNeQtV/I9/
 CUXW4UV9VaJ+44acIGpFjXaI+N/qNrCodfjegpfX0JkVdepSGGUe9+wMWVtYzp/oYX1F
 H0vqa5QkMCuj4dgm1bAu7+O7yaWpTmwjhqEMgn7wWkZt2kFVP/aE0zOPkykvylt8jI0x
 DZRg==
MIME-Version: 1.0
X-Received: by 10.66.150.40 with SMTP id uf8mr1649975pab.66.1371201247308;
 Fri, 14 Jun 2013 02:14:07 -0700 (PDT)
Sender: tomek.cedro@gmail.com
Received: by 10.68.112.4 with HTTP; Fri, 14 Jun 2013 02:14:07 -0700 (PDT)
In-Reply-To: <201306140339.r5E3d5We073332@freefall.freebsd.org>
References: <201306140339.r5E3d5We073332@freefall.freebsd.org>
Date: Fri, 14 Jun 2013 11:14:07 +0200
X-Google-Sender-Auth: sta4BRjpf0KXaNq7C8G3S40LkX4
Message-ID: <CAFYkXjkJ+GR5TA+DQjq5mQz5+v6Oem=VAzYMONM9--=Z7YSYNQ@mail.gmail.com>
Subject: Re: kern/174060: [ext2fs] Ext2FS system crashes (buffer overflow?)
From: CeDeROM <cederom@tlen.pl>
To: pfg@freebsd.org
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 09:14:07 -0000

Thank you!! :-)

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 09:55:31 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 0DCBA51B;
 Fri, 14 Jun 2013 09:55:31 +0000 (UTC)
 (envelope-from josefkarthauser@gmail.com)
Received: from mail-we0-x22a.google.com (mail-we0-x22a.google.com
 [IPv6:2a00:1450:400c:c03::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id 7514D1320;
 Fri, 14 Jun 2013 09:55:30 +0000 (UTC)
Received: by mail-we0-f170.google.com with SMTP id w57so309923wes.1
 for <multiple recipients>; Fri, 14 Jun 2013 02:55:29 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=from:content-type:message-id:mime-version:date:subject:cc:to
 :x-mailer; bh=nN0SSje1VyLm4dtAbnSdrhpxlSPscJ35WGIH2IUbT58=;
 b=UFEWAIK2WQ9Tf9t4RtU9eO7zOOHqj40xe3sAkN3dY2FqMpeUG6/mHD9LDz91pIf/iv
 +Y44XkFL3q8IqpdoXOmd6x0l9H99OHrBrIuE0ZXdc2oIaL9L3BJALjhCG8mdOyfLto4/
 cjayiZCtAOi6Nk19j0ZbZ+fUxJ4pUG+e/0pZYlKpCTbiXeq9wDap86C7F+/5QNenEp/r
 wYjax2otkCwKIbPThrIDTwPWC++7cSJrBb0wZ/bxy2YuGDoSb+oYgZrt5t9X7ANVqtP0
 rOpdV5KNDUxlDGhyUs0ka+qnnz3tMZ3ogFIcPwaQcmAz6EXy8xVkN5LGxsiLdA+sVC0C
 lCXA==
X-Received: by 10.180.39.136 with SMTP id p8mr877971wik.11.1371203729585;
 Fri, 14 Jun 2013 02:55:29 -0700 (PDT)
Received: from phoenix.fritz.box ([81.187.183.70])
 by mx.google.com with ESMTPSA id o14sm1977777wiv.3.2013.06.14.02.55.28
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Fri, 14 Jun 2013 02:55:28 -0700 (PDT)
From: Dr Josef Karthauser <josefkarthauser@gmail.com>
Message-Id: <301B4131-F677-4B8D-ABF6-A6D269FE604E@gmail.com>
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Date: Fri, 14 Jun 2013 10:55:29 +0100
Subject: Help! :( ZFS panic on boot, importing pool after server crash.
To: fs@freebsd.org
X-Mailer: Apple Mail (2.1503)
Content-Type: text/plain;
	charset=us-ascii
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 09:55:31 -0000

Hi, I'm a bit at the end of my tether.

We had a ZFS panic last night on a machine that hosts all my mail and =
web; it was rebooted and it now panics mounting the ZFS root filesystem.

The call stack info is:

	solaris assert: ss =3D=3D NULL, file: =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensource/uts/common/fs/zfs/s=
pace_map.c, line: 109

	kdb_backtrace
	panic
	space_map_add
	space_map_load
	metaslab_activate
	metaslab_allocate
	zio_dva_allocate=09
	zio_execute
	taskqueue_run_locked
	taskqueue_thread_loop
	fork_exit
	fork_trampoline

I can boot from the live DVD filesystem, but I can only mount the pool =
read-only without getting the same kernel panic.  This is with FreeBSD =
9.0.

The machine is remote, and I don't have access other than through a DRAC =
console port (so I can't cut and paste; sorry for the poor stack trace).

Is anyone here in the position to advice me how I might process to get =
this machine mounting and running again in multi-user mode?

Thanks so much.
Joe

p.s. the config, btw, is a ZFS mirror on two ad devices. It's got a ZFS =
root file system.=

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 11:00:57 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 726CE80B;
 Fri, 14 Jun 2013 11:00:57 +0000 (UTC)
 (envelope-from c.kworr@gmail.com)
Received: from mail-lb0-f176.google.com (mail-lb0-f176.google.com
 [209.85.217.176])
 by mx1.freebsd.org (Postfix) with ESMTP id C181F10D3;
 Fri, 14 Jun 2013 11:00:56 +0000 (UTC)
Received: by mail-lb0-f176.google.com with SMTP id z5so453531lbh.35
 for <multiple recipients>; Fri, 14 Jun 2013 04:00:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:cc:subject
 :references:in-reply-to:content-type:content-transfer-encoding;
 bh=jCKtUJQG1oIn/J1GPhD2GxSSIaMiTzdi43gmUk5mmOY=;
 b=DWPhNWoUPjDVfYX0tXEHzWpA5lOaKcJI+c1KxE13qJjq8tjV9NNYvHn7iiITkv06ut
 u/aNFyxss0EZ0BRdzLivS7YDdZx+pzHKVoBuJ+ecgZBVTgR0dYW1kuwhhz7pAEqnLC/Z
 pHYIHJ4VARUVXQPgZAyKIeu/N+YxHwYBxOp1gndK5/CXy3lrlusPZqRmmUeD3pWmLCMn
 siWATlGRlNu6lm/zQ5Wvmo8TPiwRAXPHT4Gbv18funeRvwiRT1D4kl/zXMug/rNDb7iE
 o0DdJuiXJNekNFljpKEHbXCjEkYoab+u9iZFsoYBRe/a2hxZVLa+R6q9dL9UAyeT5KhR
 wrRQ==
X-Received: by 10.152.20.66 with SMTP id l2mr920749lae.30.1371207655326;
 Fri, 14 Jun 2013 04:00:55 -0700 (PDT)
Received: from [192.168.1.125] (mau.donbass.com. [92.242.127.250])
 by mx.google.com with ESMTPSA id p20sm663301lbb.17.2013.06.14.04.00.52
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Fri, 14 Jun 2013 04:00:54 -0700 (PDT)
Message-ID: <51BAF7E3.4020401@gmail.com>
Date: Fri, 14 Jun 2013 14:00:51 +0300
From: Volodymyr Kostyrko <c.kworr@gmail.com>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:20.0) Gecko/20100101 Firefox/20.0 SeaMonkey/2.17.1
MIME-Version: 1.0
To: Dr Josef Karthauser <josefkarthauser@gmail.com>, fs@freebsd.org
Subject: Re: Help! :( ZFS panic on boot, importing pool after server crash.
References: <301B4131-F677-4B8D-ABF6-A6D269FE604E@gmail.com>
In-Reply-To: <301B4131-F677-4B8D-ABF6-A6D269FE604E@gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 11:00:57 -0000

14.06.2013 12:55, Dr Josef Karthauser:
> Hi, I'm a bit at the end of my tether.
>
> We had a ZFS panic last night on a machine that hosts all my mail and web; it was rebooted and it now panics mounting the ZFS root filesystem.
>
> The call stack info is:
>
> 	solaris assert: ss == NULL, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensource/uts/common/fs/zfs/space_map.c, line: 109
>
> 	kdb_backtrace
> 	panic
> 	space_map_add
> 	space_map_load
> 	metaslab_activate
> 	metaslab_allocate
> 	zio_dva_allocate	
> 	zio_execute
> 	taskqueue_run_locked
> 	taskqueue_thread_loop
> 	fork_exit
> 	fork_trampoline
>
> I can boot from the live DVD filesystem, but I can only mount the pool read-only without getting the same kernel panic.  This is with FreeBSD 9.0.
>
> The machine is remote, and I don't have access other than through a DRAC console port (so I can't cut and paste; sorry for the poor stack trace).
>
> Is anyone here in the position to advice me how I might process to get this machine mounting and running again in multi-user mode?

There's no official way.

> p.s. the config, btw, is a ZFS mirror on two ad devices. It's got a ZFS root file system.

If you are fairly sure about your devices you can:

1. Remove second disk from pool or create another pool on top of it.

2. Recreate all FS structure on the second disk. You can dump al your FS 
with something like:

zfs list -Ho name | xargs -n1 zfs get -H all | awk 
'BEGIN{shard="";output=""}{if(shard!=$1 && shard!=""){output="zfs 
create";for(param in params)output=output" -o "param"="params[param];print
output" "shard;delete 
params;shard=""}}$4~/local/{params[$2]=$3;shard=$1;next}$2~/type/{shard=$1}END{output="zfs 
create";for(param in params)output=output" -o 
"param"="params[param];print output" "shard;}'

Be sure to rename the pool and change the first line.

3. Rsync all data to the second disk.

4. Try to boot from the second disk.

If everything worked you are free to attach first disk to second one to 
create a mirror again.

-- 
Sphinx of black quartz, judge my vow.

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 11:04:31 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 8116ABD8;
 Fri, 14 Jun 2013 11:04:31 +0000 (UTC)
 (envelope-from tomek.cedro@gmail.com)
Received: from mail-pa0-x22e.google.com (mail-pa0-x22e.google.com
 [IPv6:2607:f8b0:400e:c03::22e])
 by mx1.freebsd.org (Postfix) with ESMTP id 5BF721174;
 Fri, 14 Jun 2013 11:04:31 +0000 (UTC)
Received: by mail-pa0-f46.google.com with SMTP id fa11so544830pad.5
 for <multiple recipients>; Fri, 14 Jun 2013 04:04:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=Vq7tzr7emA/C3NoeNL7IVbRQmqubNAWOXN3lItraZHI=;
 b=XzrnTAT+rWRXRh0T4rb9nOqnjI98DatQaG/haddByGRv1NOdPrMrFNERxelApMIB0g
 OcGcxBLxne4PxxkATShw+UPuySWzSN8vW6JEp4JxjCdfiA0XFcGGaMWuM2dH2zxckwn9
 2GvRLRw6BMHINpx9N2k21iFzd08V4DxbMHYZO3v6HpF4OhKFWompVU+ImhPFc/i0dLW0
 WDeobGeOwW1aIR8vWOfrNYg3jbzJkQgwmSSkENniKCyt0PODHoQLhBArE//5EppWLQ3s
 sdZTuJdjX7rxIV3dR1E4qqHxxzz9lO0v+dB+fSjPC0vKL1+sjdASEezgHKdssgF/EQoA
 KYSg==
MIME-Version: 1.0
X-Received: by 10.68.247.69 with SMTP id yc5mr2036094pbc.66.1371207871048;
 Fri, 14 Jun 2013 04:04:31 -0700 (PDT)
Sender: tomek.cedro@gmail.com
Received: by 10.68.112.4 with HTTP; Fri, 14 Jun 2013 04:04:30 -0700 (PDT)
In-Reply-To: <51BAF609.9040201@FreeBSD.org>
References: <201306140339.r5E3d5We073332@freefall.freebsd.org>
 <CAFYkXjkJ+GR5TA+DQjq5mQz5+v6Oem=VAzYMONM9--=Z7YSYNQ@mail.gmail.com>
 <51BAF609.9040201@FreeBSD.org>
Date: Fri, 14 Jun 2013 13:04:30 +0200
X-Google-Sender-Auth: lW-obJ5CSj0Kr0sTqi-DV0T8J2s
Message-ID: <CAFYkXj=S7HY4Mt=t7BTsqHGd0Pp3GMFchACgEzX4ZhHnT=wNhA@mail.gmail.com>
Subject: Re: kern/174060: [ext2fs] Ext2FS system crashes (buffer overflow?)
From: CeDeROM <cederom@tlen.pl>
To: Pedro Giffuni <pfg@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 11:04:31 -0000

On Fri, Jun 14, 2013 at 12:52 PM, Pedro Giffuni <pfg@freebsd.org> wrote:
> On 14.06.2013 04:14, CeDeROM wrote:
>> Thank you!! :-)
>> --
>> CeDeROM, SQ7MHZ, http://www.tomek.cedro.info
>
> Thank you for the report...
> I am still working on a fix for the reallocblk issue, which
> happens to be a can of worms.
> Pedro.

Wow, I can imagine, I keep my fingers crossed, good luck!! :-)

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 11:53:52 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 1BD131E8
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 11:53:52 +0000 (UTC)
 (envelope-from pfg@FreeBSD.org)
Received: from nm8-vm8.bullet.mail.gq1.yahoo.com
 (nm8-vm8.bullet.mail.gq1.yahoo.com [98.136.218.231])
 by mx1.freebsd.org (Postfix) with ESMTP id DAECE1950
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 11:53:51 +0000 (UTC)
Received: from [98.137.12.191] by nm8.bullet.mail.gq1.yahoo.com with NNFMP;
 14 Jun 2013 10:53:00 -0000
Received: from [208.71.42.191] by tm12.bullet.mail.gq1.yahoo.com with NNFMP;
 14 Jun 2013 10:53:00 -0000
Received: from [127.0.0.1] by smtp202.mail.gq1.yahoo.com with NNFMP;
 14 Jun 2013 10:53:00 -0000
X-Yahoo-Newman-Id: 78476.65236.bm@smtp202.mail.gq1.yahoo.com
X-Yahoo-Newman-Property: ymail-3
X-YMail-OSG: OMT0QVMVM1nfVLyOrfQdEgwqXhc5LQZxozAGEiHr7aYnbOR
 83QeuAUPCAwx77JCQenFY9_c9YdMPP6aMMpw8V6_z2cszQi.Zw.LdBrJTLBR
 R6BFZf5fxGv62hszvqwfg2LWVGMEWYMJZQ_1mt1ewP5jQ80LBITt7lN.dyGj
 uPqU8zP7LJ_NgMhYD4i9FUm3_raJP8CiWp2Fru3evdmodZcSFH0xLuwWS4sH
 AK.fNr1fEGvOWWF21.KFhaZxL35CDjD4KTP1csm5gWJ.02ELJvFzm_b2pXBP
 SFa9tt_56E3dfw8SRLw27odDNusgpIFk8X7SBjNtcKY0dScYpIlBvl9CWIyr
 7wiMW.jpGHgire2esR1OUgSJzQZqqWeFn39odhqpjXj6ljDkkP3C9ZwYT53d
 itpCDPx_12VHwFqt5QylvY_BKFIFfLSg7jQjmkd6ZgSP28CH1ae2t2ure46B
 neQfb4BTISl9_CXky.Zpe05Tis2SvUkygd8aPzVLIuG7E3.e6Td.tPXLihfZ
 PmBaan4SDCamSMs8J_8_OWun2QO5fJ4Dls0XkO.GZzJtz3y7UAjuRnvMdD4C
 EHsHrFNVV7dRNpiOz8q0-
X-Yahoo-SMTP: xcjD0guswBAZaPPIbxpWwLcp9Unf
X-Rocket-Received: from [192.168.0.102] (pfg@190.157.126.109 with )
 by smtp202.mail.gq1.yahoo.com with SMTP; 14 Jun 2013 03:53:00 -0700 PDT
Message-ID: <51BAF609.9040201@FreeBSD.org>
Date: Fri, 14 Jun 2013 05:52:57 -0500
From: Pedro Giffuni <pfg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/20130407 Thunderbird/17.0.5
MIME-Version: 1.0
To: CeDeROM <cederom@tlen.pl>
Subject: Re: kern/174060: [ext2fs] Ext2FS system crashes (buffer overflow?)
References: <201306140339.r5E3d5We073332@freefall.freebsd.org>
 <CAFYkXjkJ+GR5TA+DQjq5mQz5+v6Oem=VAzYMONM9--=Z7YSYNQ@mail.gmail.com>
In-Reply-To: <CAFYkXjkJ+GR5TA+DQjq5mQz5+v6Oem=VAzYMONM9--=Z7YSYNQ@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 11:53:52 -0000

On 14.06.2013 04:14, CeDeROM wrote:
> Thank you!! :-)
>
> --
> CeDeROM, SQ7MHZ, http://www.tomek.cedro.info

Thank you for the report...

I am still working on a fix for the reallocblk issue, which
happens to be a can of worms.

Pedro.

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 12:51:44 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 7E8ACD3F;
 Fri, 14 Jun 2013 12:51:44 +0000 (UTC)
 (envelope-from josefkarthauser@gmail.com)
Received: from mail-wg0-x231.google.com (mail-wg0-x231.google.com
 [IPv6:2a00:1450:400c:c00::231])
 by mx1.freebsd.org (Postfix) with ESMTP id E52521D45;
 Fri, 14 Jun 2013 12:51:43 +0000 (UTC)
Received: by mail-wg0-f49.google.com with SMTP id a12so461576wgh.28
 for <multiple recipients>; Fri, 14 Jun 2013 05:51:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=subject:mime-version:content-type:from:in-reply-to:date:cc
 :content-transfer-encoding:message-id:references:to:x-mailer;
 bh=WXoR6eZeOIXmv9TKhiys0rQMPHQV+SdWKECKdmhBQ/M=;
 b=RIsPWINv+dXnITu0ky8QQj+VjColWoba5hlA+bUB1kApkL4UmN1Xhbhm/r2jtivvtV
 rn0IfYFAHFwFCJIUMId9Bv1biS3XM6P5R9HQ2kj+vej6a5Zi2eujgj1MgacFzikfMy8A
 5wcw7mZ/pr/Gl9K8fsyUBl3m0sK0NbHYHPFuv5/mVtbJ0915yrri8X9dYujp1Ry4ExSh
 YMyeU7eNq8MecuZJglFaX8yVStNLWwQMaqroF5WlbJwBKoUylNmDC48FeFMS7E9Nmv7T
 q39BynSUmaX0tACHJEFpSjgLKIRG39Q8QgJt9hfxbLnQbcSNRY05SLrI16SqA/dp3bda
 Ldgw==
X-Received: by 10.194.242.136 with SMTP id wq8mr1367543wjc.60.1371214303095;
 Fri, 14 Jun 2013 05:51:43 -0700 (PDT)
Received: from ?IPv6:2001:8b0:3a3::ad3c:495b:5c7b:32f1?
 ([2001:8b0:3a3:0:ad3c:495b:5c7b:32f1])
 by mx.google.com with ESMTPSA id fo10sm3008607wib.8.2013.06.14.05.51.40
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Fri, 14 Jun 2013 05:51:42 -0700 (PDT)
Subject: Re: Help! :( ZFS panic on boot, importing pool after server crash.
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Content-Type: text/plain; charset=us-ascii
From: Dr Josef Karthauser <josefkarthauser@gmail.com>
In-Reply-To: <51BAF7E3.4020401@gmail.com>
Date: Fri, 14 Jun 2013 13:51:40 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <9AF22029-B753-4D74-A798-11C0A1C55D88@gmail.com>
References: <301B4131-F677-4B8D-ABF6-A6D269FE604E@gmail.com>
 <51BAF7E3.4020401@gmail.com>
To: Volodymyr Kostyrko <c.kworr@gmail.com>
X-Mailer: Apple Mail (2.1503)
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>, fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 12:51:44 -0000

On 14 Jun 2013, at 12:00, Volodymyr Kostyrko <c.kworr@gmail.com> wrote:

> 14.06.2013 12:55, Dr Josef Karthauser:
>> Hi, I'm a bit at the end of my tether.

>> p.s. the config, btw, is a ZFS mirror on two ad devices. It's got a =
ZFS root file system.
>=20
> If you are fairly sure about your devices you can:
>=20
> 1. Remove second disk from pool or create another pool on top of it.
>=20
> 2. Recreate all FS structure on the second disk. You can dump al your =
FS with something like:
>=20

Great. Thanks for that.

Have you got a hint as to how I can get access to the root file system? =
It's currently set to have a legacy mount point.  Which means that when =
I import the pool:

	# zfs import -o readonly=3Don -o altroot=3D/tmp/zfs -f poolname

the root filesystem is missing.  Then if I try and set the mount point:

	#zfs set mountpoint=3D/tmp/zfs2 poolname

it just sits there; probably because the command is blocking on the R/O =
pool, or something.

How do I temporarily remount the root filesystem so that I can get =
access to the files?

Thanks,
Joe


From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 13:06:42 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id EFC17928;
 Fri, 14 Jun 2013 13:06:42 +0000 (UTC)
 (envelope-from c.kworr@gmail.com)
Received: from mail-lb0-f174.google.com (mail-lb0-f174.google.com
 [209.85.217.174])
 by mx1.freebsd.org (Postfix) with ESMTP id 48E001E20;
 Fri, 14 Jun 2013 13:06:42 +0000 (UTC)
Received: by mail-lb0-f174.google.com with SMTP id x10so577997lbi.33
 for <multiple recipients>; Fri, 14 Jun 2013 06:06:35 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:cc:subject
 :references:in-reply-to:content-type:content-transfer-encoding;
 bh=/61i0b7zhMma+Wf2YiJn0SDe+BaC8XnCfd2m+7eHpJU=;
 b=CoGk7IcI0PcqPOmd6thaf6JWHGMG7oNO6x9lSjlDgdpjvGFprMgnXWmcdxMZVANF1v
 v4b6q97GIpDCfgAQgyEywwEfKb4xbnNBpf7WBanzIAig0ltnvSaRBVMdRGha7oUvx9FK
 bgmqRlJjozhEz8Rbf8jt1a1XM1/UaWeg6HlUV+nGz0TOXz/cK2IINOzCgTIwfcylcAf4
 YCHEh/e7uJhcYppq7JGBG+6rCW+R1cn3WcRkzwUQSstl5WVp553aDmLAJgH/6kjVbTlf
 KO2YcDFdWRZYb9dOc4iFNBslpXK5ZQEbC9mAq+CrsQfz37GnMAX4aFAW7tiQGZMnIILd
 FCXQ==
X-Received: by 10.112.144.6 with SMTP id si6mr1090609lbb.61.1371215195200;
 Fri, 14 Jun 2013 06:06:35 -0700 (PDT)
Received: from [192.168.1.125] (mau.donbass.com. [92.242.127.250])
 by mx.google.com with ESMTPSA id p20sm842159lbb.17.2013.06.14.06.06.34
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Fri, 14 Jun 2013 06:06:34 -0700 (PDT)
Message-ID: <51BB1559.7050803@gmail.com>
Date: Fri, 14 Jun 2013 16:06:33 +0300
From: Volodymyr Kostyrko <c.kworr@gmail.com>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:20.0) Gecko/20100101 Firefox/20.0 SeaMonkey/2.17.1
MIME-Version: 1.0
To: Dr Josef Karthauser <josefkarthauser@gmail.com>
Subject: Re: Help! :( ZFS panic on boot, importing pool after server crash.
References: <301B4131-F677-4B8D-ABF6-A6D269FE604E@gmail.com>
 <51BAF7E3.4020401@gmail.com> <9AF22029-B753-4D74-A798-11C0A1C55D88@gmail.com>
In-Reply-To: <9AF22029-B753-4D74-A798-11C0A1C55D88@gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>, fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 13:06:43 -0000

14.06.2013 15:51, Dr Josef Karthauser:
> On 14 Jun 2013, at 12:00, Volodymyr Kostyrko <c.kworr@gmail.com> wrote:
>
>> 14.06.2013 12:55, Dr Josef Karthauser:
>>> Hi, I'm a bit at the end of my tether.
>
>>> p.s. the config, btw, is a ZFS mirror on two ad devices. It's got a ZFS root file system.
>>
>> If you are fairly sure about your devices you can:
>>
>> 1. Remove second disk from pool or create another pool on top of it.
>>
>> 2. Recreate all FS structure on the second disk. You can dump al your FS with something like:
>>
>
> Great. Thanks for that.
>
> Have you got a hint as to how I can get access to the root file system? It's currently set to have a legacy mount point.  Which means that when I import the pool:
>
> 	# zfs import -o readonly=on -o altroot=/tmp/zfs -f poolname
>
> the root filesystem is missing.  Then if I try and set the mount point:
>
> 	#zfs set mountpoint=/tmp/zfs2 poolname
>
> it just sits there; probably because the command is blocking on the R/O pool, or something.
>
> How do I temporarily remount the root filesystem so that I can get access to the files?

mount -t zfs <pool-name> <mountpoint>

Personally when I need to work with such pools I first import the pool 
with -N (nomount) option, then I mount root fs by hand and after that 
goes `zfs mount -a` which handles everything else.

-- 
Sphinx of black quartz, judge my vow.

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 17:36:33 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id BBD30904;
 Fri, 14 Jun 2013 17:36:33 +0000 (UTC)
 (envelope-from Devin.Teske@fisglobal.com)
Received: from mx1.fisglobal.com (mx1.fisglobal.com [199.200.24.190])
 by mx1.freebsd.org (Postfix) with ESMTP id 8C39B1F14;
 Fri, 14 Jun 2013 17:36:33 +0000 (UTC)
Received: from smtp.fisglobal.com ([10.132.206.31])
 by ltcfislmsgpa06.fnfis.com (8.14.5/8.14.5) with ESMTP id r5EHaWlF029633
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT);
 Fri, 14 Jun 2013 12:36:32 -0500
Received: from LTCFISWMSGMB21.FNFIS.com ([10.132.99.23]) by
 LTCFISWMSGHT03.FNFIS.com ([10.132.206.31]) with mapi id 14.02.0309.002; Fri,
 14 Jun 2013 12:36:06 -0500
From: "Teske, Devin" <Devin.Teske@fisglobal.com>
To: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: ZFS Union
Thread-Topic: ZFS Union
Thread-Index: AQHOaSWmuit0fMTVvU+5jica75CJmg==
Date: Fri, 14 Jun 2013 17:36:05 +0000
Message-ID: <13CA24D6AB415D428143D44749F57D7201F81804@ltcfiswmsgmb21>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.132.253.126]
Content-Type: text/plain; charset="Windows-1252"
Content-ID: <875536F73867504C99A7BE0F16643FDE@fisglobal.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.10.8794, 1.0.431,
 0.0.0000
 definitions=2013-06-14_06:2013-06-14,2013-06-14,1970-01-01 signatures=0
Cc: Devin Teske <dteske@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Devin Teske <dteske@freebsd.org>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 17:36:33 -0000

Hi List,

I had an idea recently that I thought might be worth chasing.

I've thought for a long time that it would be really great if I could have =
(for the purpose of jails):

+ ZFS filesystem /vm/master
+ ZFS filesystem /vm/unit1
+ ZFS filesystem /vm/unit2
+ ZFS filesystem /vm/unit3

Unlike a ZFS snapshot/clone system, where changes in /vm/master made beyond=
 the snapshot for which /vm/unit{1,2,3} were cloned-from do not effect /vm/=
unit{1,2,3}=85

What if there was a way to layer /vm/unit{1,2,3} in a union manner to be th=
e top-layer above /vm/master.

I believe that UnionFS isn't of help here specifically because I believe it=
 to not support the following use-case example:

Step 1. touch /vm/master/foo

ASIDE: /vm/unit1/foo does not exist prior to Step 1

Step 2. See that now /vm/unit{1,2,3}/foo exists

(that was the litmus test for any layering filesystem, now comes the part t=
hat I believe UnionFS fails)

Step 3. Now, rm /vm/unit1/foo

Step 4. See that /vm/master/foo is still there, but /vm/unit1/foo remains g=
one

Step 5. Counter to Step 4 above, See that /vm/unit2/foo and /vm/unit3/foo s=
till exist

In other words=85 the filesystem should be able to keep track of unlinked f=
iles as a black-list.

Enhancing ZFS to support union is quite sexy, because when you go down the =
rabbit hole of Steps 3-5 above, you realize you may want to (as an administ=
rator) "reclaim" files from a lower layer. This could perhaps be tacked ont=
o the "zfs" utility (whereas if enhancing UnionFS, a new utility would need=
 to be born).

I would imagine a "zfs reclaim" (hypothetical fictional command) to allow t=
he path from lower layers to become visible again, so-long-as a lower-layer=
 hasn't black-listed it from an even lower-layer.

The end-run production use of this would be to allow jails to "inherit" fil=
es from a lower layer but unlike a snapshot system, continue to inherit per=
petually in realtime. Going into a lower layer and making a change would im=
mediately percolate that change to all the jails layered on top.

It would also mean nice lean deltas. Layering "/foo" on top of "/" would be=
 a quick and dirty way of cloning your base system into a new jail where al=
l the writes go off to that directory (something existing UnionFS technolog=
ies already do) and -- something not done by existing UnionFS technologies =
-- unlinked files will not appear (giving the idea that, while chroot'd or =
jail'd into that directory, you have more control over your universe becaus=
e you "rm" a file and it goes away; but of course [hypothetically] an admin=
istrator in parent host to the jail can "reclaim" it for you from a lower l=
ayer if you want him/her to do-so).

Thoughts?
--=20
Devin

_____________
The information contained in this message is proprietary and/or confidentia=
l. If you are not the intended recipient, please: (i) delete the message an=
d all copies; (ii) do not disclose, distribute or use the message in any ma=
nner; and (iii) notify the sender immediately. In addition, please be aware=
 that any message addressed to our domain is subject to archiving and revie=
w by persons other than the intended recipient. Thank you.

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 18:00:34 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 39262BC8
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 18:00:34 +0000 (UTC)
 (envelope-from feld@feld.me)
Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com
 [66.111.4.27]) by mx1.freebsd.org (Postfix) with ESMTP id 0DAF01FF9
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 18:00:33 +0000 (UTC)
Received: from compute4.internal (compute4.nyi.mail.srv.osa [10.202.2.44])
 by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id 07AB9207D3
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 14:00:33 -0400 (EDT)
Received: from frontend1.nyi.mail.srv.osa ([10.202.2.160])
 by compute4.internal (MEProxy); Fri, 14 Jun 2013 14:00:33 -0400
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=feld.me; h=
 content-type:to:subject:references:date:mime-version
 :content-transfer-encoding:from:message-id:in-reply-to; s=
 mesmtp; bh=XrNUxlD/NZOwpUt1+J/0OMoNnik=; b=i8w9k6KU2/mOpGk+LK9Cp
 8o5nUlQ07kLo2OwDRTAbIkk6FhYj0Od4RrmZHsQ2CNYNgXrtTp+Y7RbvMT2R35l9
 ttY0lcSp/zxG0y6U+cNszygBzSMwtg7xgeqvJdFtyi2wHLDjtyiZld+AGXN9hea5
 XtNCT6vgMSm85GmD+CDo/g=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=
 messagingengine.com; h=content-type:to:subject:references:date
 :mime-version:content-transfer-encoding:from:message-id
 :in-reply-to; s=smtpout; bh=XrNUxlD/NZOwpUt1+J/0OMoNnik=; b=iZPq
 5L9Y7OBbx8u7URG8SlGR/p2TFsJfHE86SnaQgGSHU1U3XE42LYKP0N0Z3YZUDO4S
 Q+Q3NmR7txfFI4nXkNHwkB6GRUkb6h1+BIia0naOI6lDimcTGhEz84ROTWCXsQMe
 4XzxQUKJ1D7VDi4orxOx0iktSBSt+/lvTrdqmFk=
X-Sasl-enc: pTBzqYqoLI/ys6ynXX6XQuCtVNu9uCadcZBQjxd+uSxn 1371232832
Received: from tech304.office.supranet.net (unknown [66.170.8.18])
 by mail.messagingengine.com (Postfix) with ESMTPA id C3477C00E84
 for <freebsd-fs@freebsd.org>; Fri, 14 Jun 2013 14:00:32 -0400 (EDT)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Union
References: <13CA24D6AB415D428143D44749F57D7201F81804@ltcfiswmsgmb21>
Date: Fri, 14 Jun 2013 13:00:32 -0500
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Mark Felder" <feld@feld.me>
Message-ID: <op.wyoka6l534t2sn@tech304.office.supranet.net>
In-Reply-To: <13CA24D6AB415D428143D44749F57D7201F81804@ltcfiswmsgmb21>
User-Agent: Opera Mail/12.15 (FreeBSD)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 18:00:34 -0000

On Fri, 14 Jun 2013 12:36:05 -0500, Teske, Devin  
<Devin.Teske@fisglobal.com> wrote:

> Thoughts?

Yes. A unanimous yes.

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 18:51:51 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 4E5B1B09;
 Fri, 14 Jun 2013 18:51:51 +0000 (UTC)
 (envelope-from mckusick@mckusick.com)
Received: from chez.mckusick.com (chez.mckusick.com
 [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452])
 by mx1.freebsd.org (Postfix) with ESMTP id 2D809123E;
 Fri, 14 Jun 2013 18:51:51 +0000 (UTC)
Received: from chez.mckusick.com (localhost [127.0.0.1])
 by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id r5EIpkl2054401;
 Fri, 14 Jun 2013 11:51:46 -0700 (PDT)
 (envelope-from mckusick@chez.mckusick.com)
Message-Id: <201306141851.r5EIpkl2054401@chez.mckusick.com>
To: Devin Teske <dteske@freebsd.org>,
 "Teske, Devin" <Devin.Teske@fisglobal.com>
Subject: Re: ZFS Union 
In-reply-to: <13CA24D6AB415D428143D44749F57D7201F81804@ltcfiswmsgmb21> 
Date: Fri, 14 Jun 2013 11:51:46 -0700
From: Kirk McKusick <mckusick@mckusick.com>
X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY
 autolearn=failed version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 18:51:51 -0000

The union filesystem uses "whiteout" to remove files that appear
in a lower layer. In your example, when you `rm /vm/unit1/foo'
what happens is that a whiteout entry gets created for /vm/unit1/foo.
(Whiteout is implemented by creating a name with inode number 1;
Inode 1 is the "anti-inode" which when combined with any other inode
disappears in a cloud of greasy smoke.) Thus /vm/master/foo continues
to exist and is visible as /vm/unit2/foo and /vm/unit3/foo. You can
"recover" /vm/unit1/foo using `rm -W /vm/unit1/foo' which will remove
the whiteout entry causing /vm/master/foo to once again be visible
as /vm/unit1/foo. In short, I believe that the existing union filesystem
will do what you want to do.

	Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 14 19:01:24 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 56110E25;
 Fri, 14 Jun 2013 19:01:24 +0000 (UTC)
 (envelope-from Devin.Teske@fisglobal.com)
Received: from mx1.fisglobal.com (mx1.fisglobal.com [199.200.24.190])
 by mx1.freebsd.org (Postfix) with ESMTP id 24AD812B2;
 Fri, 14 Jun 2013 19:01:23 +0000 (UTC)
Received: from smtp.fisglobal.com ([10.132.206.17])
 by ltcfislmsgpa03.fnfis.com (8.14.5/8.14.5) with ESMTP id r5EJ1Ej6015037
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT);
 Fri, 14 Jun 2013 14:01:16 -0500
Received: from LTCFISWMSGMB21.FNFIS.com ([10.132.99.23]) by
 LTCFISWMSGHT06.FNFIS.com ([10.132.206.17]) with mapi id 14.02.0309.002; Fri,
 14 Jun 2013 14:01:11 -0500
From: "Teske, Devin" <Devin.Teske@fisglobal.com>
To: Kirk McKusick <mckusick@mckusick.com>
Subject: Re: ZFS Union 
Thread-Topic: ZFS Union 
Thread-Index: AQHOaTA7lOdSGgtNJkChFq/RufGZx5k15IEA
Date: Fri, 14 Jun 2013 19:01:11 +0000
Message-ID: <13CA24D6AB415D428143D44749F57D7201F81A60@ltcfiswmsgmb21>
References: <201306141851.r5EIpkl2054401@chez.mckusick.com>
In-Reply-To: <201306141851.r5EIpkl2054401@chez.mckusick.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.132.253.126]
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.10.8794, 1.0.431,
 0.0.0000
 definitions=2013-06-14_07:2013-06-14,2013-06-14,1970-01-01 signatures=0
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>,
 Devin Teske <dteske@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Devin Teske <dteske@freebsd.org>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Jun 2013 19:01:24 -0000


On Jun 14, 2013, at 11:51 AM, Kirk McKusick wrote:

The union filesystem uses "whiteout" to remove files that appear
in a lower layer. In your example, when you `rm /vm/unit1/foo'
what happens is that a whiteout entry gets created for /vm/unit1/foo.
(Whiteout is implemented by creating a name with inode number 1;
Inode 1 is the "anti-inode" which when combined with any other inode
disappears in a cloud of greasy smoke.)

WINO=85 yes=85 just as your response came in, I was finding the code=85

http://svnweb.freebsd.org/base/head/sys/ufs/ufs/ufs_lookup.c?r1=3D156418&r2=
=3D160269

  if (ep->d_ino =3D=3D 0 ||
-     (ep->d_ino =3D=3D WINO &&
+     (ep->d_ino =3D=3D WINO && namlen =3D=3D dirp->d_namlen &&
       bcmp(ep->d_name, dirp->d_name, dirp->d_namlen) =3D=3D 0)) {


Thus /vm/master/foo continues
to exist and is visible as /vm/unit2/foo and /vm/unit3/foo. You can
"recover" /vm/unit1/foo using `rm -W /vm/unit1/foo' which will remove
the whiteout entry causing /vm/master/foo to once again be visible
as /vm/unit1/foo.

Beautiful=85 that was my next consternation after seeing that it was in the=
 filesystem layer (how to reset the value from WINO to something that will =
allow the lower layer to bleed through).


In short, I believe that the existing union filesystem
will do what you want to do.

Kirk McKusick

Absolutely right=85 thank you much Sir!

I didn't know about "rm -W" until today.
--
Devin

_____________
The information contained in this message is proprietary and/or confidentia=
l. If you are not the intended recipient, please: (i) delete the message an=
d all copies; (ii) do not disclose, distribute or use the message in any ma=
nner; and (iii) notify the sender immediately. In addition, please be aware=
 that any message addressed to our domain is subject to archiving and revie=
w by persons other than the intended recipient. Thank you.

From owner-freebsd-fs@FreeBSD.ORG  Sat Jun 15 14:52:45 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 23FF62AD
 for <freebsd-fs@freebsd.org>; Sat, 15 Jun 2013 14:52:45 +0000 (UTC)
 (envelope-from break19@gmail.com)
Received: from mail-yh0-x232.google.com (mail-yh0-x232.google.com
 [IPv6:2607:f8b0:4002:c01::232])
 by mx1.freebsd.org (Postfix) with ESMTP id DDEBB1A8E
 for <freebsd-fs@freebsd.org>; Sat, 15 Jun 2013 14:52:44 +0000 (UTC)
Received: by mail-yh0-f50.google.com with SMTP id i72so525901yha.37
 for <freebsd-fs@freebsd.org>; Sat, 15 Jun 2013 07:52:44 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:subject:references
 :in-reply-to:content-type:content-transfer-encoding;
 bh=F+SWI9D/WnONJLM30NukGnOlFceyXrTv7xvmqREmrBU=;
 b=fkyIM4EuGsnZEiXJJhg3JxGHuFyrAIEk4GIv2OHMjKjO8qZvWZKQrlZpL9x3kpLUi5
 gcCu4mRrZN0zsy5WkYPCgzW0f/bXuZdY7fd5tfIlkEqTyN0tzmv0ghuZSdbD0CKS+PGJ
 ymzTYMZpApwxtweUyxioHSKEctC85ZjkU53whMf4UNT5A0C8TfKdbtbJ5RNZjctZF5YP
 ulS7h+9rJpWW6v3YXKdKNEe4Rg69yu+HIy25PSbvmZRJYdM4i34RkbBcfltFd44eKbi5
 TMzj53oW1kj6gq2ffLbhip5evdF1VTttxsLICoQUOTsiDeeGfYfY2AIHraHdSYx3j2Bv
 lz1g==
X-Received: by 10.236.17.165 with SMTP id j25mr4094624yhj.89.1371307964174;
 Sat, 15 Jun 2013 07:52:44 -0700 (PDT)
Received: from [192.168.2.16] (173-17-218-61.client.mchsi.com. [173.17.218.61])
 by mx.google.com with ESMTPSA id a62sm10842574yhk.4.2013.06.15.07.52.42
 for <freebsd-fs@freebsd.org>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Sat, 15 Jun 2013 07:52:43 -0700 (PDT)
Message-ID: <51BC7FB1.8040000@gmail.com>
Date: Sat, 15 Jun 2013 09:52:33 -0500
From: Chuck Burns <break19@gmail.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: Changing the default for ZFS atime to off?
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <20130608213331.GB18201@icarus.home.lan>
 <01719722FD8A41B4A4366611972A703A@multiplay.co.uk>
 <20130609001532.GA21540@icarus.home.lan>
 <459E2FCADB4E40079066E4ABDBE47AFE@multiplay.co.uk>
In-Reply-To: <459E2FCADB4E40079066E4ABDBE47AFE@multiplay.co.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 15 Jun 2013 14:52:45 -0000

On 6/8/2013 7:48 PM, Steven Hartland wrote:
> ----- Original Message ----- From: "Jeremy Chadwick" <jdc@koitsu.org>
>
>>> To clarify when I say "by default" this only effect newly created
>>> pools / volumes, it would not effect any existing volumes and hence
>>> couldn't break existing installs.
>>>
>>> As I mentioned there are apps, mainly mail focused ones, which rely
>>> on on atime, but thats easy to keep working by ensuring these are
>>> stored on volumes which do have atime=on.
>>
>> The problem is that your proposed change (to set atime=off as the
>> default) means the administrator:
>>
>> 1. Has to be aware that the default is now atime=off going forward,
>> and thus,
>>
>> 2. Must manually set atime=on on filesystems where it matters, which may
>> also mean creating a separate filesystem just for certain
>> purposes/tasks (which may not be possible with UFS after-the-fact).
>>
>> The reality of #1, I'm sorry to say, is that barring some kind of mass
>> announcement on every single FreeBSD mailing list (I don't mean just
>> -announce, I mean EVERY LIST) to inform people of this change, as well
>> as some gigantic 72pt font text on www.freebsd.org telling people, most
>> people are not going to know about it.  I know that reality doesn't work
>> in your favour, but it's how things are.  A single line in the Release
>> Notes is going to be overlooked.
>>
>> I cannot even begin to cover all the situations/cases of #2, so I'll
>> just do a brain dump as I think:
>>
>> i) ZFS: You might think this is as easy as creating a separate
>> filesystem that's for /var/mail -- it is not that simple.  Many people
>> have their mail delivered to mboxes within $HOME, i.e. ~user/Mail, and
>> /var/mail never gets used.  It worsens when you consider people are
>> being insane with ZFS filesystems, such as creating a separate
>> filesystem for every single user on the system.
>>
>> ii) With UFS, you might think it's as easy as removing noatime from
>> /etc/fstab for /var, but it isn't -- same situation as (i).
>>
>> iii) There is the situation with UFS and bsdinstall where you can choose
>> the "quick and easy" partitioning/filesystem setu results in one big /
>> and that's all.  Now the admin has to remove noatime from /etc/fstab and
>> basically loses any benefit noatime provided per your proposal.
>
> The initial question was for ZFS, with UFS being secondary, but yes
> UFS isn't as easy as UFS.
>
>> iv) It is very common for setups to have two separate places for mail
>> storage, i.e. the default is /var/mail/username, but users with a
>> .forward and/or .procmailrc may be siphoning mail to $HOME/Mail/folder
>> instead.  So now you have two filesystems where atime needs to be
>> enabled.
>
> Could that not be covered by: /var /home for the common case at least?
>
>> v) Non-mail-related stuff, meaning there may actually be users and
>> administrators who rely upon access times to indicate something.
>>
>> None of these touche base on what Bruce Evans stated too: that atime=on
>> by default is a requirement to be POSIX-compliant.  That's also
>> confirmed here at Wikipedia WRT stat(2) (which also mentions some other
>> software that relies on atime too):
>>
>> http://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime
>
> So yes others think its a less than stellar idea ;-)
>
>>> The messaging and changes to installers which support ZFS root
>>> installs, such as mfsbsd, would need to be included in this but
>>> I don't see that as a blocker.
>>
>> See above -- I think you are assuming mail always gets stored on one
>> filesystem, which quite often not the case.
>
> Its still seems simple to fix, see above.
>
>>> I suggesting this now as it seems like its time to consider that
>>> the vast majority of systems don't need this option for all volumes
>>> and the performance and reliability of systems are in question if
>>> we don't consider it.
>>
>> My personal feeling is that this is extremely hasty -- do we have any
>> idea how much software relies on atime?  Because I certainly don't.
>
> Hasty no, just opening the idea up for discussion ;-)
>
>> Sorry for sounding rude (I don't mean to be, I just can't be bothered to
>> phrase it differently), but: were you yourself even aware that atime was
>> relied upon/used for classic UNIX mailboxes?  I get the impression you
>> weren't, which just strengthens my point.
>
> Yes I am aware, which is why I mentioned mail in my original post.
>
>> For example, I use atime everywhere, simply because I do not know what
>> might break/stop working reliably if atime was disabled on some
>> filesystems.  I do not know the internals of every single daemon and
>> program on a system (does anyone?), so I must take the stance of
>> choosing stability/reliability.
>
> I did already mention, we set atime=off on everything and have never had
> an issue, there's been similar mentions on the illumos list too.
>
> Now that doesn't mean its suitable for everthing, mail has already been
> mentioned, but thats still seems like a small set of use cases where its
> required.
>
> I guess where I'm coming from is making better for the vast majority.
>
> I believe there's no point in configuring for a rare case by default
> when it will make the much more common case worse.
>
>> All said and done: I do appreciate having this discussion, particularly
>> publicly on a list.  Too many "key changes" in FreeBSD in the past few
>> years have been results of closed-door meetings of sorts (private mail
>> or in-person *con meetings), so the fact this is public is good.
>
> Everyone has their different uses of any OS, different experience etc,
> so things like this need open discussion IMO.
>
>    Regards
>    Steve
>
> ================================================
> This e.mail is private and confidential between Multiplay (UK) Ltd. 
> and the person or entity to whom it is addressed. In the event of 
> misdirection, the recipient is prohibited from using, copying, 
> printing or otherwise disseminating it or any information contained in 
> it.
> In the event of misdirection, illegible or incomplete transmission 
> please telephone +44 845 868 1337
> or return the E.mail to postmaster@multiplay.co.uk.
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Plainly put, FreeBSD almost -never- changes the defaults.  They're 
steady, reliable, and pretty much set in stone.  It's the entire point.

If you want something where the defaults can change from day to day, 
then perhaps FreeBSD is not for you.  Personally, I don't mind having 
these defaults, as long as I can change them.

I mean, seriously, it really isn't all that hard to type "zfs set 
atime=off /some/pool/some/where"

-- 
Chuck Burns <break19@gmail.com>


From owner-freebsd-fs@FreeBSD.ORG  Sat Jun 15 14:54:54 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 3B2C233C
 for <freebsd-fs@freebsd.org>; Sat, 15 Jun 2013 14:54:54 +0000 (UTC)
 (envelope-from break19@gmail.com)
Received: from mail-yh0-x233.google.com (mail-yh0-x233.google.com
 [IPv6:2607:f8b0:4002:c01::233])
 by mx1.freebsd.org (Postfix) with ESMTP id 03F751AA7
 for <freebsd-fs@freebsd.org>; Sat, 15 Jun 2013 14:54:53 +0000 (UTC)
Received: by mail-yh0-f51.google.com with SMTP id l109so517498yhq.10
 for <freebsd-fs@freebsd.org>; Sat, 15 Jun 2013 07:54:53 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:subject:references
 :in-reply-to:content-type:content-transfer-encoding;
 bh=9NcFW0mxfLLv+IAQJAEamNkxZ2UTEBK1ZaTwOa81GH0=;
 b=ON601YwhjctUNQXRU1fIBVqH/dF01aFzWVVjjgfyPLwMgndtJnYacM+JRrU292gBtQ
 YXkcnTXjaWxENZORu1/MUIbxkkg5zQk/92m6DLIOLWd7AbZ4rD/Pnjevzac0Nokkx3uy
 xEKNN2872MycEFC1kAfxFTp3Bch9u1/F2M+DQC2Rw3Qz/fa/ntHFYKaYY0c7YD1Fmn54
 u42+U2cVSwrPELoxXRfB3V5lE7LflcsVrQr+NLYtX7atKT9D2S7ozcZF+yEjad0LgtfB
 A0ECGqeIWVIHGBlJoRRH8NMKJJAHYO6AZyQp5Y3kkRoX7W8Vb0oaKDTX24geQB/LIhzp
 Ey7Q==
X-Received: by 10.236.139.75 with SMTP id b51mr4202234yhj.6.1371308093592;
 Sat, 15 Jun 2013 07:54:53 -0700 (PDT)
Received: from [192.168.2.16] (173-17-218-61.client.mchsi.com. [173.17.218.61])
 by mx.google.com with ESMTPSA id y24sm10759026yhn.20.2013.06.15.07.54.52
 for <freebsd-fs@freebsd.org>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Sat, 15 Jun 2013 07:54:53 -0700 (PDT)
Message-ID: <51BC8033.6020605@gmail.com>
Date: Sat, 15 Jun 2013 09:54:43 -0500
From: Chuck Burns <break19@gmail.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: Changing the default for ZFS atime to off?
References: <16FEF774EE8E4100AD2CAEC65276A49D@multiplay.co.uk>
 <2AC5E8F4-3AF1-4EA5-975D-741506AC70A5@my.gd>
In-Reply-To: <2AC5E8F4-3AF1-4EA5-975D-741506AC70A5@my.gd>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 15 Jun 2013 14:54:54 -0000

On 6/9/2013 5:39 AM, Damien Fleuriot wrote:
> On 8 Jun 2013, at 20:54, "Steven Hartland" <smh@freebsd.org> wrote:
>
>> One of the first changes we make here when installing machines
>> here to changing atime=off on all ZFS pool roots.
>>
>> I know there are a few apps which can rely on atime updates
>> such as qmail and possibly postfix, but those seem like special
>> cases for which admins should enable atime instead of the other
>> way round.
>>
>> This is going to of particular interest for flash based storage
>> which should avoid unnessacary writes to reduce wear, but it will
>> also help improve performance in general.
>>
>> So what do people think is it worth considering changing the
>> default from atime=on to atime=off moving forward?
>>
>> If so what about UFS, same change?
>>
>
> I strongly oppose the change for reasons already raised by many people regarding the mbox file.
>
> Besides, if atime should default to off on 2 filesystems and on on all others, that would definitely create confusion.
>
> Last, I believe it should be the admin's decision to turn atime off, just like it is his decision to turn compression on.
>
> Don't mistake me, we turn atime=off on every box, every filesystem, even on Mac's HFS.
> Yet I believe defaulting it to off is a mistake.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

+1 here.  I, too, usually turn it off, and doing so isn't especially 
difficult.  Changing DEFAULTS is only good when the defaults actually 
break stuff.

-- 
Chuck Burns <break19@gmail.com>