From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 05:12:41 2013
From: Michael DeMan
To: FreeBSD Filesystems
Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD
Date: Sat, 23 Feb 2013 21:02:36 -0800
References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es>

I have not heard a word on this topic in a while and still think it is a good idea.

How can we move forward? How can I help? Would it be useful to have a sharable space somewhere to discuss things, so that a reliable best practices document ends up available for all instead of the secret few?

I am willing to put some effort in.

Thanks,

- Mike

On Jan 22, 2013, at 5:27 PM, Michael DeMan wrote:

> I think this would be awesome. Googling around it is extremely difficult to know what to do and which practices are current or obsolete, etc.
From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 10:55:04 2013
From: Radio młodych bandytów
To: freebsd-fs@freebsd.org, "Ronald Klop"
Subject: Re: Some filesystem thoughts
Date: Sun, 24 Feb 2013 11:54:34 +0100
Message-ID: <5129F16A.6020505@o2.pl>

"Ronald Klop" wrote:
> Creative ideas.
> Part of what you want is in fusefs (mounting of files to edit their
> content).
Mhm. Could you give some link or details in another form?

> And part is implemented in e.g. KDE (integrated support for
> various file types in fulltext search and tagging of files/metadata, etc.).
Well, I view it as not much different from implementing a TC / MC / VIM plugin. Anybody can benefit, but they have to implement the right API (And there are several programs that use TC plugins). It's interesting as a way of getting some of these benefits though.

> The chances of having all these complex libraries integrated in the
> FreeBSD OS are close to zero I presume. But I am not in a position to
> decide about that.
Frankly, I haven't expected anything different. My thoughts did jump to implementation issues, and I see numerous ones, but I think the idea itself is not sufficiently mature, so I decided to skip them in the first post.

> I think you can't expect the OS to serve everybody's detailed wishes.
I don't expect it. I just wanted to discuss an idea that seemed to have potential.

> The OS serves files and user programs know what to do with them.
Unfortunately, far too often programs don't know it. Files are often not simple and a single program is unable to deal with them. The only way to deal with such cases ATM that I see is to manually remove layers obfuscating the meaningful sources. In some way, it resembles piping data through multiple programs, except that pipes transport bytes, not files, and therefore the transformation has to be performed step by step.
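To make the "layers have to be removed one at a time" point concrete, here is a minimal sketch; the file names (backup.tar, docs.zip, report.odt) are made up, and the exact tools obviously depend on what the layers really are:

    tar -xOf backup.tar docs.zip > docs.zip        # peel off the tar layer first
    unzip docs.zip report.odt                      # then the zip layer
    unzip -p report.odt content.xml | less         # an .odt is itself a zip container

Each step has to produce a real file (or stream) before the next tool can even see the inner object, which is what distinguishes this from a plain byte-oriented pipeline.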
-- 
Twoje radio

From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 21:45:13 2013
From: "Ronald Klop"
To: freebsd-fs@freebsd.org, Radio młodych bandytów
Subject: Re: Some filesystem thoughts
Date: Sun, 24 Feb 2013 22:45:03 +0100
In-Reply-To: <5129F16A.6020505@o2.pl>
References: <5129F16A.6020505@o2.pl>

On Sun, 24 Feb 2013 11:54:34 +0100, Radio młodych bandytów wrote:

> "Ronald Klop" wrote:
>> Creative ideas.
>> Part of what you want is in fusefs (mounting of files to edit their
>> content).
> Mhm. Could you give some link or details in another form?

Just google for 'fusefs'. Filesystems based on FUSE:
http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems

>> And part is implemented in e.g. KDE (integrated support for
>> various file types in fulltext search and tagging of files/metadata,
>> etc.).
> Well, I view it as not much different from implementing a TC / MC / VIM
> plugin. Anybody can benefit, but they have to implement the right API
> (And there are several programs that use TC plugins).
> It's interesting as a way of getting some of these benefits though.
>
>> The chances of having all these complex libraries integrated in the
>> FreeBSD OS are close to zero I presume. But I am not in a position to
>> decide about that.
> Frankly, I haven't expected anything different. My thoughts did jump to
> implementation issues, and I see numerous ones, but I think the idea
> itself is not sufficiently mature, so I decided to skip them in the
> first post.
>
>> I think you can't expect the OS to serve everybody's detailed wishes.
> I don't expect it. I just wanted to discuss an idea that seemed to have
> potential.
>> The OS serves files and user programs know what to do with them.
> Unfortunately, far too often programs don't know it. Files are often not
> simple and a single program is unable to deal with them. The only way to
> deal with such cases ATM that I see is to manually remove layers
> obfuscating the meaningful sources.
> In some way, it resembles piping
> data through multiple programs, except that pipes transport bytes, not
> files and therefore the transformation has to be performed step by step.

Well. It is probably me, but I don't really get what you're trying to say here.

Regards,
Ronald.

From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 22:41:07 2013
From: linimon@FreeBSD.org
To: linimon@FreeBSD.org, freebsd-fs@FreeBSD.org, rmacklem@FreeBSD.org
Subject: Re: kern/165923: [nfs] Writing to NFS-backed mmapped files fails if flushed automatically
Date: Sun, 24 Feb 2013 22:41:06 GMT
Message-Id: <201302242241.r1OMf60k003794@freefall.freebsd.org>

Synopsis: [nfs] Writing to NFS-backed mmapped files fails if flushed automatically

Responsible-Changed-From-To: freebsd-fs->rmacklem
Responsible-Changed-By: linimon
Responsible-Changed-When: Sun Feb 24 22:40:07 UTC 2013
Responsible-Changed-Why: Over to committer for possible MFC.

http://www.freebsd.org/cgi/query-pr.cgi?pr=165923

From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 23:32:14 2013
From: "Ronald F. Guilmette"
To: freebsd-fs@freebsd.org
Subject: Hard drive device names... Serial Numbers?
Date: Sun, 24 Feb 2013 15:32:10 -0800
Message-ID: <2511.1361748730@server1.tristatelogic.com>

Today I am diddling with a system of mine that already... before today... contained three SATA drives. I just now added to this system one old PATA drive I had lying around which I plan to use as a swap drive. The motherboard for the system in question has two of the older PATA ports (supporting up to four devices) and then also four SATA ports.
It appears that the BIOS numbers the PATA devices first, and that FreeBSD just follows suit. Thus, for FreeBSD, whatever drive is the (PATA) primary master gets the name ada0, the primary slave ada1, the secondary master ada2, the secondary slave ada3, and then the SATA ports get names ada4, ada5, etc.

So anyway, adding the PATA drive to this system of course rendered everything I had previously had in my /etc/fstab suddenly incorrect. Fortunately, I anticipated this and was prepared to boot FreeBSD Live from a CD, and then go in and edit my /etc/fstab as necessary to adjust everything for the new hard drive numbers.

This isn't the first time I've had to go through this process. It is always an annoyance.

Up until today I was only dimly aware of the different approach described here:

http://www.freebsd.org/doc/handbook/geom-glabel.html

but today I was finally motivated to seek out and read the above page, which I have now done.

Having now read all about temporary labels, permanent labels, and ufsids, and having noted the obvious drawbacks to each (including but not limited to the fact that these techniques generally only appear to be applicable exclusively to UFS file systems _and_ only recently created ones at that) I thought that I would take a second and ask about the general idea of using built-in hard drive serial numbers as a filesystem-independent and interface-independent way of identifying specific hard drive devices (and/or their sub-parts) e.g. within /etc/fstab.

This idea seems so obvious that I am forced to assume that I'm probably far from the first person to have suggested and/or asked about it.

So what gives? Why can't we have something like /dev/hdsn/ (hdsn == Hard Drive Serial Number) where a set of device nodes would automagically be created within that directory, all of whose names correspond to the actual hardware serial numbers of all currently attached hard drive type devices? (If just the serial numbers are not seen as being unique enough, I can imagine other unique or semi-unique properties of the drive being concatenated with the serial numbers.)

It is also easy to envision obvious extensions to such a scheme. For example, a node named /dev/hdsn/1mgbhxed.s1a might represent the BSD "a" partition of MBR slice number 1 within the drive whose serial number is "1mgbhxed".

Anyway, the whole point here is to have a naming convention that would work across both UFS and non-UFS filesystems, and also/even across both recently created UFS file systems which include the recently introduced ufsids as well as older pre-existing UFS filesystems.

So, um, has any idea along these lines been discussed previously? If so, what were the arguments for and against?
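For comparison, the closest thing that exists today is a glabel(8) permanent label referenced from /etc/fstab; a rough sketch (the device name ada3 and the label name swapdisk are only placeholders):

    # glabel label -v swapdisk /dev/ada3
    # grep swapdisk /etc/fstab
    /dev/label/swapdisk    none    swap    sw    0    0

The /dev/label/swapdisk node keeps pointing at the same physical drive no matter how the ada numbering shuffles, though it is a name you assign rather than the drive's own serial number.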
Regards,
rfg

From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 06:45:04 2013
From: Jeremy Chadwick
To: rfg@tristatelogic.com
Cc: freebsd-fs@freebsd.org
Subject: Re: Hard drive device names... Serial Numbers?
Date: Sun, 24 Feb 2013 22:45:02 -0800
Message-ID: <20130225064502.GA26208@icarus.home.lan>

(Please keep me CC'd as I am not subscribed to freebsd-fs@)

This topic has been discussed at length before, and recently, particularly between Warren Block and myself. The thread, which you can read time permitting (kind of scattered between two lists, sorry):

http://lists.freebsd.org/pipermail/freebsd-fs/2013-January/016237.html
http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071900.html

The answer -- and I am hard set on this and will not bend, so anyone considering arguing with me on it should just save their breath -- is to use the "wiring down" or "wired down" capability of CAM(4) to ensure you get static device numbers for your disks (across multiple controllers too). You can then add/remove whatever you want and the numbers will remain the same/however you declared them in /boot/loader.conf.

How to do that (references):

http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071851.html
http://lists.freebsd.org/pipermail/freebsd-fs/2011-March/011036.html
http://lists.freebsd.org/pipermail/freebsd-fs/2012-June/014522.html

Also see the CAM(4) man page for some details. It becomes a little more tricky depending on what controllers you have. All you have to do is spend some time paying very close attention to the dmesg output and working it out. Some reboots later you'll have it, and you won't have to touch it/change it.
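For illustration only, wired-down entries in /boot/loader.conf look roughly like this; the controller names (ata0, ahcich0) and unit numbers below are placeholders that have to be matched against your own dmesg, per cam(4):

    hint.scbus.0.at="ata0"        # pin CAM bus 0 to the legacy PATA channel
    hint.scbus.1.at="ahcich0"     # pin CAM bus 1 to the first AHCI SATA port
    hint.ada.0.at="scbus0"        # the PATA disk is then always ada0
    hint.ada.1.at="scbus1"        # the first SATA disk is then always ada1

camcontrol devlist -v shows the current bus assignments, which is a convenient starting point for writing the hints.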
It's a one-time deal, and saves you all the pain and idiocy that labels introduce (I explain what those are in the initially-mentioned thread).

Footnote: I have tried mailing you 3 separate times in the past about separate subjects and your mail server (server1.tristatelogic.com) intentionally rejects mail (550 5.7.1) from Comcast's SMTP servers. I gave up trying to contact you after repeated attempts. Example:

> Reporting-MTA: dns; qmta01.emeryville.ca.mail.comcast.net [76.96.30.16]
> Received-From-MTA: dns; omta05.emeryville.ca.mail.comcast.net [76.96.30.43]
> Arrival-Date: Fri, 07 Dec 2012 15:07:11 +0000
> Final-recipient: rfc822; rfg@tristatelogic.com
> Action: failed
> Status: 5.1.1
> Diagnostic-Code: smtp; 550 5.7.1 : Client host rejected: emeryville.ca.mail.comcast.net is BLACKLISTED - Use http://www.tristatelogic.com/contact.html
> Last-attempt-Date: Fri, 07 Dec 2012 15:07:13 +0000

If you have a problem with Comcast's mail servers, I can refer you to lots of different people on the Comcast side who can help with that; I'd be happy to talk to you off-list about it (but you'd have to release that blockage to actually see my responses to you, naturally). If this is a side effect of using DNSBLs and you need a DNSWL (whitelist), you might look into dnswl.org. I stopped using them in 2012 given some changes of theirs which I did not agree with, but those reasons were my own and were of an "administrative annoyance" nature.

-- 
| Jeremy Chadwick                                  jdc@koitsu.org |
| UNIX Systems Administrator               http://jdc.koitsu.org/ |
| Mountain View, CA, US                                           |
| Making life hard for others since 1977.            PGP 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 11:03:34 2013
From: Fabian Keil
To: "Ronald F. Guilmette"
Cc: freebsd-fs@freebsd.org
Subject: Re: Hard drive device names... Serial Numbers?
Date: Mon, 25 Feb 2013 12:02:34 +0100
Message-ID: <20130225120234.66dd1b36@fabiankeil.de>
In-Reply-To: <2511.1361748730@server1.tristatelogic.com>
References: <2511.1361748730@server1.tristatelogic.com>

"Ronald F. Guilmette" wrote:

> So anyway, adding the PATA drive to this system of course rendered
> everything I had previously had in my /etc/fstab suddenly incorrect.
> Fortunately, I anticipated this and was prepared to boot FreeBSD Live
> from a CD, and then go in and edit my /etc/fstab as necessary to
> adjust everything for the new hard drive numbers.
>
> This isn't the first time I've had to go through this process. It is
> always an annoyance.
>
> Up until today I was only dimly aware of the different approach described
> here:
>
> http://www.freebsd.org/doc/handbook/geom-glabel.html
>
> but today I was finally motivated to seek out and read the above page,
> which I have now done.
>
> Having now read all about temporary labels, permanent labels, and ufsids,
> and having noted the obvious drawbacks to each (including but not limited
> to the fact that these techniques generally only appear to be applicable
> exclusively to UFS file systems _and_ only recently created ones at that)

Only ufsids are limited to UFS; temporary and permanent labels are generic.

> I thought that I would take a second and ask about the general idea of
> using built-in hard drive serial numbers as a filesystem-independent
> and interface-independent way of identifying specific hard drive devices
> (and/or their sub-parts) e.g. within /etc/fstab.

Note that you can already do that manually by putting the serial number in the permanent glabel label. If you are using GPT headers you need to be careful with this, though; for details see gpart(8).

> This idea seems so obvious that I am forced to assume that I'm probably
> far from the first person to have suggested and/or asked about it.
>
> So what gives? Why can't we have something like /dev/hdsn/ (hdsn ==
> Hard Drive Serial Number) where a set of device nodes would automagically
> be created within that directory, all of whose names correspond to the
> actual hardware serial numbers of all currently attached hard drive
> type devices? (If just the serial numbers are not seen as being unique
> enough, I can imagine other unique or semi-unique properties of the drive
> being concatenated with the serial numbers.)

I believe the main reason is that so far nobody cared enough about this to provide patches. It has been suggested before and I don't remember anyone being against it. DragonFly BSD already supports this and maybe parts of the code could be ported.
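A minimal sketch of that manual approach (the serial number shown is obviously a placeholder; read the real one off the drive first, and mind the GPT caveat above if the disk is partitioned):

    # camcontrol identify ada0 | grep -i 'serial number'
    # glabel label -v WD-WCAV51234567 /dev/ada0
    # ls /dev/label/
    WD-WCAV51234567

After that, /etc/fstab (or a zpool) can refer to /dev/label/WD-WCAV51234567 instead of a bare ada number.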
Fabian --Sig_/JTNJXq7cwkjlmXwwtAuC3.o Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlErRM0ACgkQBYqIVf93VJ1BlACfRJHG0tl5WbexidOqr/KD4edI 6ucAn0nrZZkLp2Flk/LnCJ2w+wUznAel =oA+O -----END PGP SIGNATURE----- --Sig_/JTNJXq7cwkjlmXwwtAuC3.o-- From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 11:06:46 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A0400112 for ; Mon, 25 Feb 2013 11:06:46 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 93722E6A for ; Mon, 25 Feb 2013 11:06:46 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1PB6kuD066574 for ; Mon, 25 Feb 2013 11:06:46 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1PB6kwL066572 for freebsd-fs@FreeBSD.org; Mon, 25 Feb 2013 11:06:46 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 25 Feb 2013 11:06:46 GMT Message-Id: <201302251106.r1PB6kwL066572@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 11:06:46 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o bin/176253 fs zpool(8): zfs pool indentation is misleading/wrong o kern/176141 fs [zfs] sharesmb=on makes errors for sharenfs, and still o kern/175950 fs [zfs] Possible deadlock in zfs after long uptime o kern/175897 fs [zfs] operations on readonly zpool hang o kern/175179 fs [zfs] ZFS may attach wrong device on move o kern/175071 fs [ufs] [panic] softdep_deallocate_dependencies: unrecov o kern/174372 fs [zfs] Pagefault appears to be related to ZFS o kern/174315 fs [zfs] chflags uchg not supported o kern/174310 fs [zfs] root point mounting broken on CURRENT with multi o kern/174279 fs [ufs] UFS2-SU+J journal and filesystem corruption o kern/174060 fs [ext2fs] Ext2FS system crashes (buffer overflow?) 
o kern/173830 fs [zfs] Brain-dead simple change to ZFS error descriptio o kern/173718 fs [zfs] phantom directory in zraid2 pool f kern/173657 fs [nfs] strange UID map with nfsuserd o kern/173363 fs [zfs] [panic] Panic on 'zpool replace' on readonly poo o kern/173136 fs [unionfs] mounting above the NFS read-only share panic o kern/172348 fs [unionfs] umount -f of filesystem in use with readonly o kern/172334 fs [unionfs] unionfs permits recursive union mounts; caus o kern/171626 fs [tmpfs] tmpfs should be noisier when the requested siz o kern/171415 fs [zfs] zfs recv fails with "cannot receive incremental o kern/170945 fs [gpt] disk layout not portable between direct connect o bin/170778 fs [zfs] [panic] FreeBSD panics randomly o kern/170680 fs [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA o kern/170497 fs [xfs][panic] kernel will panic whenever I ls a mounted o kern/169945 fs [zfs] [panic] Kernel panic while importing zpool (afte o kern/169480 fs [zfs] ZFS stalls on heavy I/O o kern/169398 fs [zfs] Can't remove file with permanent error o kern/169339 fs panic while " : > /etc/123" o kern/169319 fs [zfs] zfs resilver can't complete o kern/168947 fs [nfs] [zfs] .zfs/snapshot directory is messed up when o kern/168942 fs [nfs] [hang] nfsd hangs after being restarted (not -HU o kern/168158 fs [zfs] incorrect parsing of sharenfs options in zfs (fs o kern/167979 fs [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste o kern/167977 fs [smbfs] mount_smbfs results are differ when utf-8 or U o kern/167688 fs [fusefs] Incorrect signal handling with direct_io o kern/167685 fs [zfs] ZFS on USB drive prevents shutdown / reboot o kern/167612 fs [portalfs] The portal file system gets stuck inside po o kern/167272 fs [zfs] ZFS Disks reordering causes ZFS to pick the wron o kern/167260 fs [msdosfs] msdosfs disk was mounted the second time whe o kern/167109 fs [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene o kern/167105 fs [nfs] mount_nfs can not handle source exports wiht mor o kern/167067 fs [zfs] [panic] ZFS panics the server o kern/167065 fs [zfs] boot fails when a spare is the boot disk o kern/167048 fs [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF o kern/166912 fs [ufs] [panic] Panic after converting Softupdates to jo o kern/166851 fs [zfs] [hang] Copying directory from the mounted UFS di o kern/166477 fs [nfs] NFS data corruption. 
o kern/165950 fs [ffs] SU+J and fsck problem o kern/165521 fs [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31 o kern/165392 fs Multiple mkdir/rmdir fails with errno 31 o kern/165087 fs [unionfs] lock violation in unionfs o kern/164472 fs [ufs] fsck -B panics on particular data inconsistency o kern/164370 fs [zfs] zfs destroy for snapshot fails on i386 and sparc o kern/164261 fs [nullfs] [patch] fix panic with NFS served from NULLFS o kern/164256 fs [zfs] device entry for volume is not created after zfs o kern/164184 fs [ufs] [panic] Kernel panic with ufs_makeinode o kern/163801 fs [md] [request] allow mfsBSD legacy installed in 'swap' o kern/163770 fs [zfs] [hang] LOR between zfs&syncer + vnlru leading to o kern/163501 fs [nfs] NFS exporting a dir and a subdir in that dir to o kern/162944 fs [coda] Coda file system module looks broken in 9.0 o kern/162860 fs [zfs] Cannot share ZFS filesystem to hosts with a hyph o kern/162751 fs [zfs] [panic] kernel panics during file operations o kern/162591 fs [nullfs] cross-filesystem nullfs does not work as expe o kern/162519 fs [zfs] "zpool import" relies on buggy realpath() behavi o kern/162362 fs [snapshots] [panic] ufs with snapshot(s) panics when g o kern/161968 fs [zfs] [hang] renaming snapshot with -r including a zvo o kern/161864 fs [ufs] removing journaling from UFS partition fails on o bin/161807 fs [patch] add option for explicitly specifying metadata o kern/161579 fs [smbfs] FreeBSD sometimes panics when an smb share is o kern/161533 fs [zfs] [panic] zfs receive panic: system ioctl returnin o kern/161438 fs [zfs] [panic] recursed on non-recursive spa_namespace_ o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou o kern/161280 fs [zfs] Stack overflow in gptzfsboot o kern/161205 fs [nfs] [pfsync] [regression] [build] Bug report freebsd o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic o kern/160860 fs [ufs] Random UFS root filesystem corruption with SU+J o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists o kern/160591 fs [zfs] Fail to boot on zfs root with degraded raidz2 [r o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha o kern/159930 fs [ufs] [panic] kernel core o kern/159402 fs [zfs][loader] symlinks cause I/O errors o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by- o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs() o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option o kern/159077 fs [zfs] Can't cd .. with latest zfs version o kern/159048 fs [smbfs] smb mount corrupts large files o kern/159045 fs [zfs] [hang] ZFS scrub freezes system o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs amd(8) ICMP storm and unkillable process. 
o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs p kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o bin/153142 fs [zfs] ls -l outputs `ls: ./.zfs: Operation not support o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o 
kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an f bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis p kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121072 fs [smbfs] mount_smbfs(8) cannot 
normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118318 fs [nfs] NFS server hangs under special circumstances o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 299 problems total. From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 15:15:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5B60D1F6 for ; Mon, 25 Feb 2013 15:15:15 +0000 (UTC) (envelope-from olivier@gid0.org) Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id ECAA2287 for ; Mon, 25 Feb 2013 15:15:14 +0000 (UTC) Received: by mail-ee0-f54.google.com with SMTP id c41so1469516eek.13 for ; Mon, 25 Feb 2013 07:15:13 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type:content-transfer-encoding :x-gm-message-state; bh=axMB73Iym45D+8oJaODrh1sE9cKF1PuBN96Hi3mrEpA=; b=na1GewtnAFYeDAXW7phuCGGKhhodoikuBzgahFWrDmB3gon6yYgZh9W82w0FwX/4y1 jCBNBDDbNtWXH7yoH0dROQsseS7U2jgiM0GvsBlEUM+1ZedaAfTPsjlWs3jgP1NhEd0T ri5XOr7hrajWNmd5Da10WG/Bwhi4pARmZNljipdlSWQcZy8Le9sgrjLvWyT5uO6GSQZa WyhL7sVmNpQwSOzFIcoM4FGk68jse4CQu24USQ+wCWP3YicyyWiQoUMGO/8W8vgAN0NX T8BmsPaKYDyBYgkLOhzGSEuDU/sZ1OKHhBBIf1jTZ649O5pqvKMN+FKZTXKjTUufW7AL WgfQ== MIME-Version: 1.0 X-Received: by 10.14.219.129 with SMTP id m1mr39716245eep.16.1361805313804; Mon, 25 Feb 2013 07:15:13 -0800 (PST) Received: by 10.14.179.65 with HTTP; Mon, 25 Feb 2013 07:15:13 -0800 (PST) In-Reply-To: References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> Date: Mon, 25 Feb 2013 16:15:13 +0100 Message-ID: Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD From: Olivier Smedts To: Kevin Day Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Gm-Message-State: ALoCoQlaNU+QTv0D9Je4dsQAw4v739eiKs64TXM4Pv6g9NdnB+UwByAmC8GhLCPmm/7Asy9HWF7Y Cc: FreeBSD Filesystems , Scott Long , wblock@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 15:15:15 -0000 Hi, 2013/1/22 Kevin Day : > I run ftpmirror.your.org, which is a 72 x 3TB drive ZFS server. It's a ve= ry busy server. It currently houses the only off-site backup of all of the = Wikimedia projects(121TB), a full FreeBSD FTP mirror(1T), a full CentOS mir= ror, all of FreeBSD-Archive(1.5TB), FreeBSD-CVS, etc. 
It's usually running between 100 and 1500mbps of ethernet traffic in/out of it. There are usually around 15 FTP connections, 20-50 HTTP connections, 10 rsync connections and 1 or 2 CVS connections.
>
> The only changes we've made that are ZFS specific are atime=off and sync=disabled. Nothing we do uses atimes so disabling that cuts down on a ton of unnecessary writes. Disabling sync is okay here too - we're just mirroring stuff that's available elsewhere, so there's no threat of data loss. Other than some TCP tuning in sysctl.conf, this is running a totally stock kernel with no special settings.

If your workload is mostly made of reads (you're a mirror after all, you should only write when you're syncing with upstream servers) why use sync=disabled? It shouldn't make a big difference for such a workload.
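(For reference, both of those knobs are ordinary per-dataset ZFS properties; a rough sketch, with "tank" standing in for whatever the pool or dataset is actually called:

    zfs set atime=off tank
    zfs set sync=disabled tank
    zfs get atime,sync tank

sync=disabled acknowledges synchronous writes without committing them to the ZIL first, so it only matters for workloads that actually issue sync writes in the first place.)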
> I've looked at using an SSD for meta-data only caching, but it appears that we've got far more than 256GB of metadata here that's being accessed regularly (nearly every file is being stat'ed when rsync runs) so I'm guessing it's not going to be incredibly effective unless I buy a seriously large SSD.
>
> If you have any specific questions I'm happy to answer though.
>
> -- Kevin

-- 
Olivier Smedts _ ASCII ribbon campaign ( ) e-mail: olivier@gid0.org - against HTML email & vCards X www: http://www.gid0.org - against proprietary attachments / \
"There are only 10 kinds of people in the world: those who understand binary, and those who don't."

From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 17:00:11 2013
From: Kevin Day
To: Andriy Gapon
Cc: FreeBSD Filesystems
Subject: Re: Improving ZFS performance for large directories
Date: Mon, 25 Feb 2013 11:00:10 -0600
Message-Id: <237DCD81-5CAB-466B-8BF4-543D195FA545@dragondata.com>
In-Reply-To: <5124AC69.6010709@FreeBSD.org>

On Feb 20, 2013, at 4:58 AM, Andriy Gapon wrote:

> on 19/02/2013 22:10 Kevin Day said the following:
>> Timing doing an "ls" in large directories 20 times, the first is the slowest,
>> then all subsequent listings are roughly the same. There doesn't appear to be any
>> gain after 20 repetitions
>
> I think that the above could be related to the below
>
>> vfs.zfs.arc_meta_limit 16398159872
>> vfs.zfs.arc_meta_used 16398120264
>

Doing some more testing... After a fresh reboot, without the SSD cache, an ls(1) in a large directory is pretty fast. After we've been running for an hour or so, the speed gets progressively worse. I can kill all other activity on the system, and it's still bad. I reboot, and it's back to normal.

On an idle system, I watched gstat(8); during the ls(1) the drives are basically at 100% busy while it's running, reading far more data than I'd think necessary to read a directory. top(1) is showing that the "zfskern" kernel process is burning a lot of CPU during that time too. Is there a possibility there's a bug/sub-optimal access pattern we're hitting when the arc_meta_limit is hit? Something akin to: if something that was just read doesn't get put into the arc_meta cache, it's having to re-read the same data many times just to iterate through the directory?

I've been hesitating to increase the arc size because we've only got 64GB of memory here and I can't add any further. The processes running on the system themselves need a fair chunk of ram, so I'm trying to figure out how we can either upgrade this motherboard to something newer or reduce our memory usage. I've got a feeling I'm going to need to do this, but since this is a non-commercial project it's kinda hard to spend that much money on it. :)

-- Kevin

From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 17:34:05 2013
From: Daniel Kalchev
To: freebsd-fs@freebsd.org
Subject: Re: Improving ZFS performance for large directories
Date: Mon, 25 Feb 2013 19:33:54 +0200
Message-ID: <512BA082.3070605@digsys.bg>
In-Reply-To: <237DCD81-5CAB-466B-8BF4-543D195FA545@dragondata.com>

On 25.02.13 19:00, Kevin Day wrote:

> I've been hesitating to increase the arc size because we've only got 64GB of
> memory here and I can't add any further. The processes running on the system
> themselves need a fair chunk of ram, so I'm trying to figure out how we can
> either upgrade this motherboard to something newer or reduce our memory usage.
> I've got a feeling I'm going to need to do this, but since this is a
> non-commercial project it's kinda hard to spend that much money on it. :)

Just make vfs.zfs.arc_meta_limit as big as arc_max. This is safe. By default it is 25% of arc_max, I believe. In your case, you are better off caching more metadata than file data anyway.

Daniel
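A minimal sketch of how that suggestion might look as loader tunables; the sizes are placeholders and should be matched to the arc_max actually in use:

    # /boot/loader.conf
    vfs.zfs.arc_max="48G"
    vfs.zfs.arc_meta_limit="48G"

Both take effect at the next boot; the current values can be checked with sysctl vfs.zfs.arc_max vfs.zfs.arc_meta_limit.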
From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 23:38:39 2013
From: "Ronald F. Guilmette"
To: freebsd-fs@freebsd.org
Cc: Jeremy Chadwick
Subject: Hard drive device names... Serial Numbers?
Date: Mon, 25 Feb 2013 15:38:17 -0800
Message-ID: <10096.1361835497@server1.tristatelogic.com>

Firstly, I want to apologize to Jeremy Chadwick, and also to anyone else who, either recently or in the past, has been kind enough to try to contact me (to send me something non-spamish) and who has been tripped up by my ham-fisted local spam filtering. Believe me, it isn't personal, and I regret that you had trouble reaching me.

I could go on at great length about my personal philosophy regarding spam, and spam filtering, but this is neither the time nor the place. Still, I feel compelled to say just a few words on this topic now.

For now I'll just briefly say that unlike 99.999% of all Internet users, I personally have never come around to the belief that the existence of spam is inevitable. Rather, I believe that 100% of it is due to either incompetence or greed on the part of the folks responsible for overseeing the machines at the IP addresses from which it emanates, with incompetence being responsible for the majority of it.

In the case of spam coming out of the likes of Comcast and other "public access" Internet Service Providers that are neither knowingly nor willingly supporting spam, however, I am of the belief... not widely shared... that they could put a stop to virtually 100% of the spam coming out of their networks simply by requiring all local senders to authenticate and by limiting per-day outbound mail flow on a per-account basis to something modest (e.g. 100 messages) except in special cases (and by special request on the part of the user in question). But very few ISPs do this, because they are, by and large, too cheap/greedy to be willing to spend the money to implement and support any such simple and effective system for curtailing their own outbound spam flow. Comcast is no exception to this general rule.
As a result, I have previously locally blocked all Comcast sub-domains that have spammed me in the past, specifically: mn.comcast.net ny.comcast.net in.comcast.net fl.comcast.net ma.comcast.net co.comcast.net de.comcast.net ga.comcast.net va.comcast.net mi.comcast.net tx.comcast.net wa.comcast.net ut.comcast.net pa.comcast.net nj.comcast.net md.comcast.net sc.comcast.net ms.comcast.net or.comcast.net tn.comcast.net al.comcast.net ct.comcast.net la.comcast.net dc.comcast.net ar.comcast.net nh.comcast.net ks.comcast.net wv.comcast.net nm.comcast.net vt.comcast.net But I had made a special exception for ca.comcast.net, because I needed to do so in order to be able to receive e-mail from one California-resident relative. Unfortunately, it appears that Comcast recently snafued my special California exception by changing their DNS naming scheme so that now, mail coming out of their California mail servers arrives from nodes within the ca.mail.comcast.net subdomain, rather than nodes within the mail.ca.comcast.net domain, as previously. Predictably, and shortly thereafter, I got spammed from a node within the ca.mail.comcast.net subdomain, thus causing that domain to end up in the local blacklist. (And that in turn caused e-mail from both Jeremy and my California relative to start bouncing.) I thank Jeremy for his earnest offer to put me in touch with "people on the Comcast side" and I do accept that offer, but I feel sure that despite any amount of haranguing and/or cajoling I might subject any such contacts to, Comcast corporate, like so many other ISPs, has long ago made the irrevocable decision NOT to do what it takes to stop their network from leaking massive amounts of spam on a daily basis, because to do otherwise might subtract some paltry number of bucks from the corporate bottom line. "Our problems are manmade, therefore, they can be solved by man." -- John Fitzgerald Kennedy My apologies to all for the lengthy off-topic digression. Jeremy, I've readjusted my local blacklists now, so you should be able to e-mail me directly. Back on topic... I've tried to plow through the references Jeremy gave regarding the "wired down" capability of CAM(4). I think that I may sort-of understand it. It does appear to be _a_ solution. I'm not yet 100% persuaded that it is the _best_ solution, and the idea of using serial numbers (or WWN numbers) is still appealing, at least to me. But I'm not going to advocate for that, mostly because I don't feel that I fully understand this "wired down" stuff yet. I need to look into that more before I say anything else. At least I come away with the satisfaction of knowing that (a) I am indeed not the first person to have either thought of or suggested using drive serial numbers and also (b) that this idea _has_ already been well and truly discussed, apparently by and among better minds than mine. Regards, rfg P.S. I confess that I've only skimmed the material on the "wired down" capability of CAM(4). Perhaps the answer to this question is buried in there someplace, but I'd just like to ask: What are the rules, if any, regarding what I can rename a given controller channel to within the /boot/loader.conf file? Could I rename one to, for example /dev/foobar707 if I wanted to? If not, then what are the rules?
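My tentative reading so far - and I am happy to be corrected - is that the wiring is done with hints rather than with arbitrary names, along these lines (controller and unit numbers below are invented purely for illustration, assuming an ahci(4) controller):

   # /boot/loader.conf
   hint.scbus.0.at="ahcich0"   # pin SCSI bus 0 to the first AHCI channel
   hint.scbus.1.at="ahcich1"
   hint.ada.0.at="scbus0"      # so the disk on that channel always attaches as ada0
   hint.ada.1.at="scbus1"

In other words, one apparently gets to choose which unit number a given channel attaches as, but the name itself stays the driver's own (ada, da, ...), so something like /dev/foobar707 would presumably not be possible; glabel(8) or GPT labels seem to be the way to get free-form names.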
From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 05:59:26 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8385D3F4 for ; Tue, 26 Feb 2013 05:59:26 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-ve0-f181.google.com (mail-ve0-f181.google.com [209.85.128.181]) by mx1.freebsd.org (Postfix) with ESMTP id 48DBCEA2 for ; Tue, 26 Feb 2013 05:59:26 +0000 (UTC) Received: by mail-ve0-f181.google.com with SMTP id d10so2970638vea.40 for ; Mon, 25 Feb 2013 21:59:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=gnaumDMx2vkbvEwuoJwziDUcrG5di7rBiUFs0x1Xy3g=; b=vn7/Zmqu4A8+xhkl+KwqyrvKFHiiqI9To2LmX9xCExZ/NHqLvuQ3lKw2w5qHZXk5Hl HqVqXcMUCVnVjVBQ31DaOmS5uUc6cqqu0A5nIp0E0I9Dj8s/AwuLvyS6/T32PeNbm6l5 QGK5i8r5TYeaWTYMwITsnQOdofwHjIs2oSW5AVmRpfXvbLoBfGDRPAl9T4LT+zZ6gSds eF5ehD5saoXiYd7S6w4jmX3BY7YOWl2qOmB7hiyXGBJQQAGc9sIac6jF8j5Rsuo7+geR AsE8Odzr61HW3NPu4Y1BJOB+Ejcl4+QIZ+2XUqpha1FV6a4GZD6OJqv2goOe0a/Rtmrm 9dzw== MIME-Version: 1.0 X-Received: by 10.220.222.8 with SMTP id ie8mr11099897vcb.27.1361858365523; Mon, 25 Feb 2013 21:59:25 -0800 (PST) Received: by 10.220.232.6 with HTTP; Mon, 25 Feb 2013 21:59:25 -0800 (PST) In-Reply-To: <20130123111852.GM30633@server.rulingia.com> References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> <20130123111852.GM30633@server.rulingia.com> Date: Tue, 26 Feb 2013 00:59:25 -0500 Message-ID: Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD From: Zaphod Beeblebrox To: Peter Jeremy Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 05:59:26 -0000 On Wed, Jan 23, 2013 at 6:18 AM, Peter Jeremy wrote: > On 2013-Jan-22 17:27:13 -0800, Michael DeMan wrote: > > >#2. Ensure a little extra space is left on the drive since if the whole > drive is used, a replacement may be a tiny bit smaller and will not work. > > As someone else has mentioned, recent ZFS allows some slop here. But > I still think it's worthwhile carving out some space to allow for a > marginally smaller replacement disk. > I'm somewhat interested in this point. Not that we should miss a few meg on a multi-terrabyte disk, but in my recent experience, all the drive manufacturers seem to "agree" on the number of sectors for a certain size of disk. I'm just not sure we need to leave for the allowance of a smaller disk. larger (than required) disks already work anyways. 
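That said, if one does want the allowance, the place I've seen it handled is at partitioning time rather than in ZFS itself - roughly like the sketch below (sizes and labels are only an example, for a nominal 3TB drive):

   gpart create -s gpt da0
   gpart add -t freebsd-zfs -a 1m -l disk0 -s 2794G da0   # ask for a bit less than the raw capacity
   zpool create tank mirror gpt/disk0 gpt/disk1

so that a marginally smaller replacement still fits the partition. Whether that is worth the bother is exactly the question.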
From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 09:14:45 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D223B366 for ; Tue, 26 Feb 2013 09:14:45 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.78]) by mx1.freebsd.org (Postfix) with ESMTP id 958F57C8 for ; Tue, 26 Feb 2013 09:14:45 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1UAGcU-00033k-5Y; Tue, 26 Feb 2013 10:14:43 +0100 Received: from [81.21.138.17] (helo=ronaldradial.versatec.local) by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1UAGcU-0004ok-6f; Tue, 26 Feb 2013 10:14:42 +0100 Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org, "Ronald F. Guilmette" Subject: Re: Hard drive device names... Serial Numbers? References: <10096.1361835497@server1.tristatelogic.com> Date: Tue, 26 Feb 2013 10:14:42 +0100 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Ronald Klop" Message-ID: In-Reply-To: <10096.1361835497@server1.tristatelogic.com> User-Agent: Opera Mail/12.14 (Win32) X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: / X-Spam-Score: 0.8 X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.1 X-Scan-Signature: e462de357cb394d64966911c06262bc8 Cc: Jeremy Chadwick X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 09:14:45 -0000 On Tue, 26 Feb 2013 00:38:17 +0100, Ronald F. Guilmette wrote: > [snip some talk about spam from comcast.net] > > I've tried to plow through the references Jeremy gave regading the "wired > down" capability of CAM(4). I think that I may sort-of understand it. > It does appear to be _a_ solution. I'm not yet 100% persuaded that it > is the _best_ solution, and the idea of using serial numbers (or WWN > numbers) is still appealing, at least to me. But I'm not going to > advocate for that, mostly because I don't feel that I fully understand > this "wired down" stuff yet. I need to look into that more before I > say anything else. > > At least I come away with the the satisfaction of knowing that (a) I am > indeed not the first person to have either thought of or suggested using > drive serial numbers and also (b) that this idea _has_ already been well > and truly discussed, apparently by and among better minds than mine. > > > Regards, > rfg > > > P.S. I confess that I've only skimmed the material on the "wired down" > capability of CAM(4). Perhaps the answer to this question is burried in > there someplace, but I'd just like to ask: What are the rules, if any > regarding what I can rename a given controller channel to within the > /boot/loader.conf file? Could I rename one to, for example > /dev/foobar707 > if I wanted to? If not, then what are the rules? This cam wiring can be very good for complex setups with dedicated sysadmins, but for a lot of FreeBSD users mounting on serial number makes administrating their servers really easy. I would think both ways can exist together. 
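As a rough illustration of the serial-number style I mean (assuming the GEOM disk_ident labels, which as far as I know expose the drive serial under /dev/diskid/ once enabled; the serial numbers below are invented):

   # /boot/loader.conf
   kern.geom.label.disk_ident.enable="1"

   zpool create tank mirror /dev/diskid/DISK-WD-WCAW0123456 /dev/diskid/DISK-WD-WCAW0654321

The name then follows the physical drive around, whichever port it happens to be plugged into.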
Ronald, From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 20:42:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 821) id 8F1FF7C3; Tue, 26 Feb 2013 20:42:58 +0000 (UTC) Date: Tue, 26 Feb 2013 20:42:58 +0000 From: John To: Zaphod Beeblebrox Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD Message-ID: <20130226204258.GA62875@FreeBSD.org> References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> <20130123111852.GM30633@server.rulingia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 20:42:58 -0000 ----- Zaphod Beeblebrox's Original Message ----- > On Wed, Jan 23, 2013 at 6:18 AM, Peter Jeremy wrote: > > > On 2013-Jan-22 17:27:13 -0800, Michael DeMan wrote: > > > >#2. Ensure a little extra space is left on the drive since if the whole > > drive is used, a replacement may be a tiny bit smaller and will not work. > > > > As someone else has mentioned, recent ZFS allows some slop here. But > > I still think it's worthwhile carving out some space to allow for a > > marginally smaller replacement disk. > > I'm somewhat interested in this point. Not that we should miss a few meg > on a multi-terrabyte disk, but in my recent experience, all the drive > manufacturers seem to "agree" on the number of sectors for a certain size > of disk. I'm just not sure we need to leave for the allowance of a smaller > disk. larger (than required) disks already work anyways. From the zpool manpage: disk A block device, typically located under /dev. ZFS can use individual slices or partitions, though the recommended mode of operation is to use whole disks. A disk can be specified by a full path to the device or the geom(4) provider name. When given a whole disk, ZFS automatically labels the disk, if necessary. ... For pools to be portable, you must give the zpool command whole disks, not just slices, so that ZFS can label the disks with portable EFI labels. Otherwise, disk drivers on platforms of different endianness will not recognize the disks. And of course, if you look through the source, you'll see where ZFS makes a distinction between slices & whole disks. I have not debugged through it recently to see how much of it is currently in use. If you use dual-channel SAS drives with geom multipath, you need to be clear whether your meta-data on disk from the different geoms collide... Regardless of how the best practices document is put together, make sure folks are aware of the limitations/caveats of the different choices. YMMV Cheers!
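For the multipath case specifically, the pattern I'd expect (provider names invented here) is to label the path first and hand ZFS only the multipath provider:

   gmultipath label -v SAS0 /dev/da0      # gmultipath metadata goes in the provider's last sector
   zpool create tank multipath/SAS0       # ZFS then labels the multipath device, not the raw paths

If ZFS is instead given the raw da0/da4 paths of the same physical drive, the two sets of on-disk metadata are exactly the sort of collision to watch for.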
John From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 21:39:30 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D38F358E for ; Tue, 26 Feb 2013 21:39:30 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from tur.go2.pl (tur.go2.pl [193.17.41.50]) by mx1.freebsd.org (Postfix) with ESMTP id 6195B226 for ; Tue, 26 Feb 2013 21:39:30 +0000 (UTC) Received: from moh1-ve2.go2.pl (moh1-ve2.go2.pl [193.17.41.132]) by tur.go2.pl (Postfix) with ESMTP id BF29415A080F for ; Tue, 26 Feb 2013 22:39:22 +0100 (CET) Received: from moh1-ve2.go2.pl (unknown [10.0.0.132]) by moh1-ve2.go2.pl (Postfix) with ESMTP id 27DD7104401F for ; Tue, 26 Feb 2013 22:38:53 +0100 (CET) Received: from unknown (unknown [10.0.0.74]) by moh1-ve2.go2.pl (Postfix) with SMTP for ; Tue, 26 Feb 2013 22:38:53 +0100 (CET) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id ESXQdj; Tue, 26 Feb 2013 22:38:52 +0100 Message-ID: <512D2B6C.4010009@o2.pl> Date: Tue, 26 Feb 2013 22:38:52 +0100 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130201 Thunderbird/17.0.2 MIME-Version: 1.0 To: Ronald Klop Subject: Re: Some filesystem thoughts References: <5129F16A.6020505@o2.pl> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-O2-Trust: 1, 38 X-O2-SPF: neutral Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 21:39:30 -0000 On 24/02/2013 22:45, Ronald Klop wrote: > On Sun, 24 Feb 2013 11:54:34 +0100, Radio młodych bandytów > wrote: > >> "Ronald Klop" wrote: >>> Creative ideas. >>> Part of what you want is in fusefs (mounting of files to edit their >>> content). >> Mhm. Could you give some link or details in another form? > > Just google for 'fusefs'. > Filesystems based on FUSE: > http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems OK, so you meant zipfs-like things. I thought about file being its own mountpoint. >> And part is implemented in e.g. KDE (integrated support for >>> various file types in fulltext search and tagging of files/metadata, >>> etc.). >> Well, I view it as not much different from implementing a TC / MC / >> VIM plugin. Anybody can benefit, but they have to implement the right >> API (And there are several programs that use TC plugins). >> It's interesting as a way of getting some of these benefits though. >> >>> The chances of having all these complex libraries integrated in the >>> FreeBSD OS are close to zero I presume. But I am not in a position to >>> decide about that. >> Frankly, I haven't expected anything different. My thoughts did jump >> to implementation issues and I see them numerous, but I think the idea >> itself is not sufficiently mature, so I decided to skip them in the >> first post. >> >>> I think you can't expect the OS to serve everybody's detailed wishes. >> I don't expect it. I just wanted to discuss an idea that seemed to >> have a potential. >>> The OS serves files and user programs know what to do with them. >> Unfortunately, far too often programs don't know it. Files are often >> not simple and a single program is unable to deal with them. 
The only >> way to deal with such cases ATM that I see is to manually remove >> layers obfuscating the meaningful sources. In some way, it resembles >> piping data through multiple programs, except that pipes transport >> bytes, not files and therefore the transformation has to be performed >> step by step. > > Well. It is probably me, but I don't really get what you're trying to > say here. We have grepmail, mboxgrep, pdfgrep, zgrep. They exist solely because grep doesn't know how to deal with some kinds of data. Adding tools doesn't scale, as one needs number_of_formats * number_of_tools for full coverage. Moving it to another layer would reduce it to number_of_formats + number_of_tools. The approach is not only redundant, but also insufficient, because such tools don't let you grep e.g. pdfs in gzipped email attachments despite having all necessary parts in place. When we look at the data flow that's necessary to perform such a task, it's unmbox (extracts a list of emails) -> unmail (separates individual emails into text and attachments) -> ungzip (unzips gzipped attachments) -> unpdf (extracts texts) -> grep (greps). If mailboxes contained at most 1 email and emails at most 1 attachment, this could be performed as a pipe job. That's why I say it's similar to piping, yet impossible to implement this way. -- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 13:50:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 356BA6C3; Wed, 27 Feb 2013 13:50:56 +0000 (UTC) (envelope-from borjam@sarenet.es) Received: from proxypop03b.sare.net (proxypop03b.sare.net [194.30.0.251]) by mx1.freebsd.org (Postfix) with ESMTP id ED4551B2; Wed, 27 Feb 2013 13:50:55 +0000 (UTC) Received: from [172.16.1.163] (izaro.sarenet.es [192.148.167.11]) by proxypop03.sare.net (Postfix) with ESMTPSA id E26899DD406; Wed, 27 Feb 2013 14:44:15 +0100 (CET) Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: Borja Marcos In-Reply-To: <20130226204258.GA62875@FreeBSD.org> Date: Wed, 27 Feb 2013 14:44:10 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> <20130123111852.GM30633@server.rulingia.com> <20130226204258.GA62875@FreeBSD.org> To: John X-Mailer: Apple Mail (2.1085) Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 13:50:56 -0000 On Feb 26, 2013, at 9:42 PM, John wrote: > And of course, if you look through the source, you'll see where ZFS > makes a distinction between slices & whole disks. I have not debugged > through it recently to see how much of it is currently in use. > > If you use dual-channel SAS drives with geom multipath, you need to be > clear whether your meta-data on disk from the different geoms collide... > > Regardless of how the best practices is put together, make sure > folks are aware of the limitations/caveats of the different choices. Exactly.
Anyway, as far as I know, both FreeBSD and Solaris should be able to work with GPT slices instead of whole disks. In the past at least (and, despite the lore one can read here and there) Solaris refused to use the disks' cache if the vdevs were made of slices instead of whole disks, but maybe that has changed since. As far as I know, however, FreeBSD doesn't show that behavior. Borja. From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 17:01:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 61B3750E for ; Wed, 27 Feb 2013 17:01:57 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) by mx1.freebsd.org (Postfix) with ESMTP id 19869ED2 for ; Wed, 27 Feb 2013 17:01:57 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1UAkOO-00021R-EL for freebsd-fs@freebsd.org; Wed, 27 Feb 2013 18:02:08 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 27 Feb 2013 18:02:08 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 27 Feb 2013 18:02:08 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Subject: Re: Some filesystem thoughts Date: Wed, 27 Feb 2013 18:01:27 +0100 Lines: 70 Message-ID: References: <51252372.1040001@o2.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigF51AC19A28A4F755E7D2BDE5" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:14.0) Gecko/20120812 Thunderbird/14.0 In-Reply-To: <51252372.1040001@o2.pl> X-Enigmail-Version: 1.4.3 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 17:01:57 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigF51AC19A28A4F755E7D2BDE5 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 20/02/2013 20:26, Radio młodych bandytów wrote: > The way I see it is not to treat files as streams of bytes. That's not > what they are, files have meanings and there are tools that bring them > out. A picture is a stored emotion. OK, there are no tools for that yet. > But it is also an array of pixels. And a container with exif data. And > may be a container with an encrypted archive. And, a stream of bytes too. > They have multiple facets. > I think that it would be useful to somehow expose them to applications. > Wouldn't it be useful to be able to grep through pdfs in your email > attachments? I think the problem is presentation - offering just the "grep" function is a waste of effort since those using GUIs will generally not use grep. What you're talking about is something like what Google tried to do with Android (and, probably, failed): a unified search interface across all applications and their data. Actually, modern smartphones & tablets are slowly moving in the direction that there are no "files" and no "filesystems" on your device, but rather just your "data" and "apps" which both are managed by the system (and possibly reside in a "cloud").
It may be that the "hierarchical filesystem" idea is just not so useful or efficient any more (but OTOH, I don't see it going away any time soon). > Mass-edit music tags with sed? Manually edit with your favourite text > editor instead of the sucky one-liner provided by your favourite music > player? > How about video players being able to play videos by reading them in > decoded form directly from the filesystem instead of having to integrate > a significant number of complex libraries to provide sufficient format > coverage? All those things already exist (or will exist soon) in modern GUI desktop environments, and especially on handheld-enabled OSes. The way they are achieved is to introduce a Grand Unified Interface (or several of them, as it happens), which severely abstract the low-level libraries, even to the point where the (GUI) application doesn't know whether it's dealing with actual files or something completely different. If you're more concerned about the technical aspects, then learning to write filesystems in FUSE would be a good starting point for you. --------------enigF51AC19A28A4F755E7D2BDE5 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlEuO+cACgkQ/QjVBj3/HSy6HgCfSt+PSRDWzubuIY4WdOyG1C+z VNcAniNDeHoT2gIk3w66cItjOh71Lg4f =xdTa -----END PGP SIGNATURE----- --------------enigF51AC19A28A4F755E7D2BDE5-- From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 19:30:01 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EA5E6E79 for ; Wed, 27 Feb 2013 19:30:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id DB791999 for ; Wed, 27 Feb 2013 19:30:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1RJU1Mk060466 for ; Wed, 27 Feb 2013 19:30:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1RJU15a060465; Wed, 27 Feb 2013 19:30:01 GMT (envelope-from gnats) Date: Wed, 27 Feb 2013 19:30:01 GMT Message-Id: <201302271930.r1RJU15a060465@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: dfilter@FreeBSD.ORG (dfilter service) Subject: Re: kern/175897: commit references a PR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: dfilter service List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 19:30:02 -0000 The following reply was made to PR kern/175897; it has been noted by GNATS.
From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/175897: commit references a PR Date: Wed, 27 Feb 2013 19:22:54 +0000 (UTC) Author: mm Date: Wed Feb 27 19:22:27 2013 New Revision: 247407 URL: http://svnweb.freebsd.org/changeset/base/247407 Log: MFC r246631,246651,246666,246675,246678,246688: Merge various ZFS bugfixes MFC r246631: Import vendor bugfixes Illumos ZFS issues: 3422 zpool create/syseventd race yield non-importable pool 3425 first write to a new zvol can fail with EFBIG MFC r246651: Import minor type change in refcount.h header from vendor (illumos). MFC r246666: Import vendor ZFS bugfix fixing a problem in arc_read(). Illumos ZFS issues: 3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt) MFC r246675: Add tunable to allow block allocation on degraded vdevs. Illumos ZFS issues: 3507 Tunable to allow block allocation even on degraded vdevs MFC r246678: Import vendor bugfixes regarding SA rounding, header size and layout. This was already partially fixed by avg. Illumos ZFS issues: 3512 rounding discrepancy in sa_find_sizes() 3513 mismatch between SA header size and layout MFC r246688 [1]: Merge zfs_ioctl.c code that should have been merged together with ZFS v28. Fixes several problems if working with read-only pools. Changed code originaly introduced in onnv-gate 13061:bda0decf867b Contains changes up to illumos-gate 13700:4bc0783f6064 PR: kern/175897 [1] Suggested by: avg [1] Modified: stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Directory Properties: stable/8/cddl/contrib/opensolaris/ (props changed) stable/8/cddl/contrib/opensolaris/lib/libzfs/ (props changed) stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:20:50 2013 (r247406) +++ 
stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:22:27 2013 (r247407) @@ -983,7 +983,7 @@ visit_indirect(spa_t *spa, const dnode_p arc_buf_t *buf; uint64_t fill = 0; - err = arc_read_nolock(NULL, spa, bp, arc_getbuf_func, &buf, + err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -2001,9 +2001,8 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t * bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0); } -/* ARGSUSED */ static int -zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { zdb_cb_t *zcb = arg; @@ -2410,7 +2409,7 @@ typedef struct zdb_ddt_entry { /* ARGSUSED */ static int zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { avl_tree_t *t = arg; avl_index_t where; Modified: stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:22:27 2013 (r247407) @@ -526,13 +526,12 @@ get_configs(libzfs_handle_t *hdl, pool_l * version * pool guid * name - * pool txg (if available) * comment (if available) * pool state * hostid (if available) * hostname (if available) */ - uint64_t state, version, pool_txg; + uint64_t state, version; char *comment = NULL; version = fnvlist_lookup_uint64(tmp, @@ -548,11 +547,6 @@ get_configs(libzfs_handle_t *hdl, pool_l fnvlist_add_string(config, ZPOOL_CONFIG_POOL_NAME, name); - if (nvlist_lookup_uint64(tmp, - ZPOOL_CONFIG_POOL_TXG, &pool_txg) == 0) - fnvlist_add_uint64(config, - ZPOOL_CONFIG_POOL_TXG, pool_txg); - if (nvlist_lookup_string(tmp, ZPOOL_CONFIG_COMMENT, &comment) == 0) fnvlist_add_string(config, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:22:27 2013 (r247407) @@ -940,7 +940,6 @@ buf_cons(void *vbuf, void *unused, int k bzero(buf, sizeof (arc_buf_t)); mutex_init(&buf->b_evict_lock, NULL, MUTEX_DEFAULT, NULL); - rw_init(&buf->b_data_lock, NULL, RW_DEFAULT, NULL); arc_space_consume(sizeof (arc_buf_t), ARC_SPACE_HDRS); return (0); @@ -970,7 +969,6 @@ buf_dest(void *vbuf, void *unused) arc_buf_t *buf = vbuf; mutex_destroy(&buf->b_evict_lock); - rw_destroy(&buf->b_data_lock); arc_space_return(sizeof (arc_buf_t), ARC_SPACE_HDRS); } @@ -2968,42 +2966,11 @@ arc_read_done(zio_t *zio) * * arc_read_done() will invoke all the requested "done" functions * for readers of this block. - * - * Normal callers should use arc_read and pass the arc buffer and offset - * for the bp. But if you know you don't need locking, you can use - * arc_read_nolock. 
*/ int -arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - int err; - - if (pbuf == NULL) { - /* - * XXX This happens from traverse callback funcs, for - * the objset_phys_t block. - */ - return (arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb)); - } - - ASSERT(!refcount_is_zero(&pbuf->b_hdr->b_refcnt)); - ASSERT3U((char *)bp - (char *)pbuf->b_data, <, pbuf->b_hdr->b_size); - rw_enter(&pbuf->b_data_lock, RW_READER); - - err = arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb); - rw_exit(&pbuf->b_data_lock); - - return (err); -} - -int -arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) +arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, + void *private, int priority, int zio_flags, uint32_t *arc_flags, + const zbookmark_t *zb) { arc_buf_hdr_t *hdr; arc_buf_t *buf; @@ -3482,19 +3449,6 @@ arc_release(arc_buf_t *buf, void *tag) } } -/* - * Release this buffer. If it does not match the provided BP, fill it - * with that block's contents. - */ -/* ARGSUSED */ -int -arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb) -{ - arc_release(buf, tag); - return (0); -} - int arc_released(arc_buf_t *buf) { Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:22:27 2013 (r247407) @@ -135,7 +135,7 @@ bptree_add(objset_t *os, uint64_t obj, b /* ARGSUSED */ static int -bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { int err; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:22:27 2013 (r247407) @@ -513,7 +513,6 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t spa_t *spa; zbookmark_t zb; uint32_t aflags = ARC_NOWAIT; - arc_buf_t *pbuf; DB_DNODE_ENTER(db); dn = DB_DNODE(db); @@ -575,14 +574,8 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t db->db.db_object, db->db_level, db->db_blkid); dbuf_add_ref(db, NULL); - /* ZIO_FLAG_CANFAIL callers have to check the parent zio's error */ - if (db->db_parent) - pbuf = db->db_parent->db_buf; - else - pbuf = db->db_objset->os_phys_buf; - - (void) dsl_read(zio, spa, db->db_blkptr, pbuf, + (void) arc_read(zio, spa, db->db_blkptr, dbuf_read_done, db, ZIO_PRIORITY_SYNC_READ, (*flags & DB_RF_CANFAIL) ? 
ZIO_FLAG_CANFAIL : ZIO_FLAG_MUSTSUCCEED, &aflags, &zb); @@ -982,7 +975,6 @@ void dbuf_release_bp(dmu_buf_impl_t *db) { objset_t *os; - zbookmark_t zb; DB_GET_OBJSET(&os, db); ASSERT(dsl_pool_sync_context(dmu_objset_pool(os))); @@ -990,13 +982,7 @@ dbuf_release_bp(dmu_buf_impl_t *db) list_link_active(&os->os_dsl_dataset->ds_synced_link)); ASSERT(db->db_parent == NULL || arc_released(db->db_parent->db_buf)); - zb.zb_objset = os->os_dsl_dataset ? - os->os_dsl_dataset->ds_object : 0; - zb.zb_object = db->db.db_object; - zb.zb_level = db->db_level; - zb.zb_blkid = db->db_blkid; - (void) arc_release_bp(db->db_buf, db, - db->db_blkptr, os->os_spa, &zb); + (void) arc_release(db->db_buf, db); } dbuf_dirty_record_t * @@ -1831,7 +1817,6 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki if (bp && !BP_IS_HOLE(bp)) { int priority = dn->dn_type == DMU_OT_DDT_ZAP ? ZIO_PRIORITY_DDT_PREFETCH : ZIO_PRIORITY_ASYNC_READ; - arc_buf_t *pbuf; dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; zbookmark_t zb; @@ -1839,13 +1824,8 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki SET_BOOKMARK(&zb, ds ? ds->ds_object : DMU_META_OBJSET, dn->dn_object, 0, blkid); - if (db) - pbuf = db->db_buf; - else - pbuf = dn->dn_objset->os_phys_buf; - - (void) dsl_read(NULL, dn->dn_objset->os_spa, - bp, pbuf, NULL, NULL, priority, + (void) arc_read(NULL, dn->dn_objset->os_spa, + bp, NULL, NULL, priority, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, &zb); } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:22:27 2013 (r247407) @@ -128,7 +128,7 @@ report_dnode(struct diffarg *da, uint64_ /* ARGSUSED */ static int -diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct diffarg *da = arg; @@ -155,9 +155,9 @@ diff_cb(spa_t *spa, zilog_t *zilog, cons int blksz = BP_GET_LSIZE(bp); int i; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:22:27 2013 (r247407) @@ -276,12 +276,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dat aflags |= ARC_L2CACHE; dprintf_bp(os->os_rootbp, "reading %s", ""); - /* - * XXX when bprewrite scrub can change the bp, - * and this is called from dmu_objset_open_ds_os, the bp - * could change, and we'll need a lock. - */ - err = dsl_read_nolock(NULL, spa, os->os_rootbp, + err = arc_read(NULL, spa, os->os_rootbp, arc_getbuf_func, &os->os_phys_buf, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL, &aflags, &zb); if (err) { @@ -1124,8 +1119,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio SET_BOOKMARK(&zb, os->os_dsl_dataset ? 
os->os_dsl_dataset->ds_object : DMU_META_OBJSET, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - VERIFY3U(0, ==, arc_release_bp(os->os_phys_buf, &os->os_phys_buf, - os->os_rootbp, os->os_spa, &zb)); + arc_release(os->os_phys_buf, &os->os_phys_buf); dmu_write_policy(os, NULL, 0, 0, &zp); @@ -1764,7 +1758,7 @@ dmu_objset_prefetch(const char *name, vo SET_BOOKMARK(&zb, ds->ds_object, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) dsl_read_nolock(NULL, dsl_dataset_get_spa(ds), + (void) arc_read(NULL, dsl_dataset_get_spa(ds), &ds->ds_phys->ds_bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:22:27 2013 (r247407) @@ -317,7 +317,7 @@ dump_dnode(dmu_sendarg_t *dsp, uint64_t /* ARGSUSED */ static int -backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { dmu_sendarg_t *dsp = arg; @@ -346,9 +346,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co uint32_t aflags = ARC_WAIT; arc_buf_t *abuf; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; @@ -365,9 +365,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (arc_read_nolock(NULL, spa, bp, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); err = dump_spill(dsp, zb->zb_object, blksz, abuf->b_data); @@ -377,9 +377,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) { + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) { if (zfs_send_corrupt_data) { /* Send a block filled with 0x"zfs badd bloc" */ abuf = arc_buf_alloc(spa, blksz, &abuf, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:22:27 2013 (r247407) @@ -62,9 +62,9 @@ typedef struct traverse_data { } traverse_data_t; static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static int traverse_zil_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg) @@ -81,7 +81,7 @@ traverse_zil_block(zilog_t *zilog, blkpt SET_BOOKMARK(&zb, 
td->td_objset, ZB_ZIL_OBJECT, ZB_ZIL_LEVEL, bp->blk_cksum.zc_word[ZIL_ZC_SEQ]); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, td->td_arg); + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); return (0); } @@ -105,7 +105,7 @@ traverse_zil_record(zilog_t *zilog, lr_t SET_BOOKMARK(&zb, td->td_objset, lr->lr_foid, ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp)); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); } return (0); @@ -182,7 +182,7 @@ traverse_pause(traverse_data_t *td, cons static void traverse_prefetch_metadata(traverse_data_t *td, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { uint32_t flags = ARC_NOWAIT | ARC_PREFETCH; @@ -200,14 +200,13 @@ traverse_prefetch_metadata(traverse_data if (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE) return; - (void) arc_read(NULL, td->td_spa, bp, - pbuf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &flags, zb); + (void) arc_read(NULL, td->td_spa, bp, NULL, NULL, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); } static int traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { zbookmark_t czb; int err = 0, lasterr = 0; @@ -228,8 +227,7 @@ traverse_visitbp(traverse_data_t *td, co } if (BP_IS_HOLE(bp)) { - err = td->td_func(td->td_spa, NULL, NULL, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, NULL, zb, dnp, td->td_arg); return (err); } @@ -249,7 +247,7 @@ traverse_visitbp(traverse_data_t *td, co } if (td->td_flags & TRAVERSE_PRE) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == TRAVERSE_VISIT_NO_CHILDREN) return (0); @@ -265,8 +263,7 @@ traverse_visitbp(traverse_data_t *td, co blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -276,7 +273,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - traverse_prefetch_metadata(td, buf, &cbp[i], &czb); + traverse_prefetch_metadata(td, &cbp[i], &czb); } /* recursively visitbp() blocks below this */ @@ -284,7 +281,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - err = traverse_visitbp(td, dnp, buf, &cbp[i], &czb); + err = traverse_visitbp(td, dnp, &cbp[i], &czb); if (err) { if (!hard) break; @@ -296,21 +293,20 @@ traverse_visitbp(traverse_data_t *td, co int i; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); dnp = buf->b_data; for (i = 0; i < epb; i++) { - prefetch_dnode_metadata(td, &dnp[i], buf, zb->zb_objset, + prefetch_dnode_metadata(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); } /* recursively visitbp() blocks below this */ for (i = 0; i < epb; i++) { - err = traverse_dnode(td, &dnp[i], buf, zb->zb_objset, + err = traverse_dnode(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); if 
(err) { if (!hard) @@ -323,24 +319,23 @@ traverse_visitbp(traverse_data_t *td, co objset_phys_t *osp; dnode_phys_t *dnp; - err = dsl_read_nolock(NULL, td->td_spa, bp, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); osp = buf->b_data; dnp = &osp->os_meta_dnode; - prefetch_dnode_metadata(td, dnp, buf, zb->zb_objset, + prefetch_dnode_metadata(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (arc_buf_size(buf) >= sizeof (objset_phys_t)) { prefetch_dnode_metadata(td, &osp->os_userused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); prefetch_dnode_metadata(td, &osp->os_groupused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); } - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (err && hard) { lasterr = err; @@ -348,7 +343,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_userused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_USERUSED_OBJECT); } if (err && hard) { @@ -357,7 +352,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_groupused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_GROUPUSED_OBJECT); } } @@ -367,8 +362,7 @@ traverse_visitbp(traverse_data_t *td, co post: if (err == 0 && lasterr == 0 && (td->td_flags & TRAVERSE_POST)) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == ERESTART) pause = B_TRUE; } @@ -384,25 +378,25 @@ post: static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j; zbookmark_t czb; for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - traverse_prefetch_metadata(td, buf, &dnp->dn_blkptr[j], &czb); + traverse_prefetch_metadata(td, &dnp->dn_blkptr[j], &czb); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - traverse_prefetch_metadata(td, buf, &dnp->dn_spill, &czb); + traverse_prefetch_metadata(td, &dnp->dn_spill, &czb); } } static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j, err = 0, lasterr = 0; zbookmark_t czb; @@ -410,7 +404,7 @@ traverse_dnode(traverse_data_t *td, cons for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_blkptr[j], &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_blkptr[j], &czb); if (err) { if (!hard) break; @@ -420,7 +414,7 @@ traverse_dnode(traverse_data_t *td, cons if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_spill, &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_spill, &czb); if (err) { if (!hard) return (err); @@ -433,8 +427,7 @@ traverse_dnode(traverse_data_t *td, cons /* ARGSUSED */ static int traverse_prefetcher(spa_t *spa, zilog_t *zilog, const blkptr_t 
*bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, - void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { prefetch_data_t *pfd = arg; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; @@ -455,10 +448,8 @@ traverse_prefetcher(spa_t *spa, zilog_t cv_broadcast(&pfd->pd_cv); mutex_exit(&pfd->pd_mtx); - (void) dsl_read(NULL, spa, bp, pbuf, NULL, NULL, - ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, - &aflags, zb); + (void) arc_read(NULL, spa, bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, zb); return (0); } @@ -476,7 +467,7 @@ traverse_prefetch_thread(void *arg) SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) traverse_visitbp(&td, NULL, NULL, td.td_rootbp, &czb); + (void) traverse_visitbp(&td, NULL, td.td_rootbp, &czb); mutex_enter(&td_main->td_pfd->pd_mtx); td_main->td_pfd->pd_exited = B_TRUE; @@ -540,7 +531,7 @@ traverse_impl(spa_t *spa, dsl_dataset_t SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - err = traverse_visitbp(&td, NULL, NULL, rootbp, &czb); + err = traverse_visitbp(&td, NULL, rootbp, &czb); mutex_enter(&pd.pd_mtx); pd.pd_cancel = B_TRUE; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:22:27 2013 (r247407) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include @@ -284,6 +284,7 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u delta = P2NPHASE(off, dn->dn_datablksz); } + min_ibs = max_ibs = dn->dn_indblkshift; if (dn->dn_maxblkid > 0) { /* * The blocksize can't change, @@ -291,13 +292,6 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u */ ASSERT(dn->dn_datablkshift != 0); min_bs = max_bs = dn->dn_datablkshift; - min_ibs = max_ibs = dn->dn_indblkshift; - } else if (dn->dn_indblkshift > max_ibs) { - /* - * This ensures that if we reduce DN_MAX_INDBLKSHIFT, - * the code will still work correctly on older pools. 
- */ - min_ibs = max_ibs = dn->dn_indblkshift; } /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1308,7 +1308,7 @@ struct killarg { /* ARGSUSED */ static int -kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct killarg *ka = arg; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:22:27 2013 (r247407) @@ -396,24 +396,6 @@ dsl_free_sync(zio_t *pio, dsl_pool_t *dp zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, pio->io_flags)); } -int -dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read(pio, spa, bpp, pbuf, done, private, - priority, zio_flags, arc_flags, zb)); -} - -int -dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read_nolock(pio, spa, bpp, done, private, - priority, zio_flags, arc_flags, zb)); -} - static uint64_t dsl_scan_ds_maxtxg(dsl_dataset_t *ds) { @@ -584,12 +566,8 @@ dsl_scan_prefetch(dsl_scan_t *scn, arc_b SET_BOOKMARK(&czb, objset, object, BP_GET_LEVEL(bp), blkid); - /* - * XXX need to make sure all of these arc_read() prefetches are - * done before setting xlateall (similar to dsl_read()) - */ (void) arc_read(scn->scn_zio_root, scn->scn_dp->dp_spa, bp, - buf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD, &flags, &czb); } @@ -647,8 +625,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -670,8 +647,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da } else if (BP_GET_TYPE(bp) == DMU_OT_USERGROUP_USED) { uint32_t flags = ARC_WAIT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -683,8 +659,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da int i, j; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -706,8 +681,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da uint32_t flags = ARC_WAIT; objset_phys_t *osp; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = 
arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:22:27 2013 (r247407) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #include @@ -97,6 +98,15 @@ int metaslab_prefetch_limit = SPA_DVAS_P int metaslab_smo_bonus_pct = 150; /* + * Should we be willing to write data to degraded vdevs? + */ +boolean_t zfs_write_to_degraded = B_FALSE; +SYSCTL_INT(_vfs_zfs, OID_AUTO, write_to_degraded, CTLFLAG_RW, + &zfs_write_to_degraded, 0, + "Allow writing data to degraded vdevs"); +TUNABLE_INT("vfs.zfs.write_to_degraded", &zfs_write_to_degraded); + +/* * ========================================================================== * Metaslab classes * ========================================================================== @@ -1383,10 +1393,13 @@ top: /* * Avoid writing single-copy data to a failing vdev + * unless the user instructs us that it is okay. */ if ((vd->vdev_stat.vs_write_errors > 0 || vd->vdev_state < VDEV_STATE_HEALTHY) && - d == 0 && dshift == 3) { + d == 0 && dshift == 3 && + !(zfs_write_to_degraded && vd->vdev_state == + VDEV_STATE_DEGRADED)) { all_zero = B_FALSE; goto next; } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:22:27 2013 (r247407) @@ -553,6 +553,7 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ { int var_size = 0; int i; + int j = -1; int full_space; int hdrsize; boolean_t done = B_FALSE; @@ -574,11 +575,13 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ sizeof (sa_hdr_phys_t); full_space = (buftype == SA_BONUS) ? DN_MAX_BONUSLEN : db->db_size; + ASSERT(IS_P2ALIGNED(full_space, 8)); for (i = 0; i != attr_count; i++) { boolean_t is_var_sz; - *total += P2ROUNDUP(attr_desc[i].sa_length, 8); + *total = P2ROUNDUP(*total, 8); + *total += attr_desc[i].sa_length; if (done) goto next; @@ -590,7 +593,14 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ if (is_var_sz && var_size > 1) { if (P2ROUNDUP(hdrsize + sizeof (uint16_t), 8) + *total < full_space) { + /* + * Account for header space used by array of + * optional sizes of variable-length attributes. + * Record the index in case this increase needs + * to be reversed due to spill-over. + */ hdrsize += sizeof (uint16_t); + j = i; } else { done = B_TRUE; *index = i; @@ -619,6 +629,14 @@ next: *will_spill = B_TRUE; } + /* + * j holds the index of the last variable-sized attribute for + * which hdrsize was increased. Reverse the increase if that + * attribute will be relocated to the spill block. 
+ */ + if (*will_spill && j == *index) + hdrsize -= sizeof (uint16_t); + hdrsize = P2ROUNDUP(hdrsize, 8); return (hdrsize); } @@ -709,6 +727,8 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu for (i = 0, len_idx = 0, hash = -1ULL; i != attr_count; i++) { uint16_t length; + ASSERT(IS_P2ALIGNED(data_start, 8)); + ASSERT(IS_P2ALIGNED(buf_space, 8)); attrs[i] = attr_desc[i].sa_attr; length = SA_REGISTERED_LEN(sa, attrs[i]); if (length == 0) @@ -717,6 +737,7 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu VERIFY(length == attr_desc[i].sa_length); if (buf_space < length) { /* switch to spill buffer */ + VERIFY(spilling); VERIFY(bonustype == DMU_OT_SA); if (buftype == SA_BONUS && !sa->sa_force_spill) { sa_find_layout(hdl->sa_os, hash, attrs_start, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1764,7 +1764,7 @@ spa_load_verify_done(zio_t *zio) /*ARGSUSED*/ static int spa_load_verify_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { if (bp != NULL) { zio_t *rio = arg; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:22:27 2013 (r247407) @@ -49,7 +49,6 @@ struct arc_buf { arc_buf_hdr_t *b_hdr; arc_buf_t *b_next; kmutex_t b_evict_lock; - krwlock_t b_data_lock; void *b_data; arc_evict_func_t *b_efunc; void *b_private; @@ -93,8 +92,6 @@ void arc_buf_add_ref(arc_buf_t *buf, voi int arc_buf_remove_ref(arc_buf_t *buf, void *tag); int arc_buf_size(arc_buf_t *buf); void arc_release(arc_buf_t *buf, void *tag); -int arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb); int arc_released(arc_buf_t *buf); int arc_has_callback(arc_buf_t *buf); void arc_buf_freeze(arc_buf_t *buf); @@ -103,10 +100,7 @@ void arc_buf_thaw(arc_buf_t *buf); int arc_referenced(arc_buf_t *buf); #endif -int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, +int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, void *priv, int priority, int flags, uint32_t *arc_flags, const zbookmark_t *zb); zio_t *arc_write(zio_t *pio, spa_t *spa, uint64_t txg, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:22:27 2013 (r247407) @@ -40,8 +40,7 @@ struct zilog; struct arc_buf; typedef int (blkptr_cb_t)(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - struct arc_buf *pbuf, const zbookmark_t *zb, const struct dnode_phys *dnp, - void *arg); + const zbookmark_t 
*zb, const struct dnode_phys *dnp, void *arg); #define TRAVERSE_PRE (1<<0) #define TRAVERSE_POST (1<<1) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:22:27 2013 (r247407) @@ -134,12 +134,6 @@ void dsl_pool_willuse_space(dsl_pool_t * void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); void dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); -int dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); void dsl_pool_create_origin(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_clones(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_dir_clones(dsl_pool_t *dp, dmu_tx_t *tx); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:22:27 2013 (r247407) @@ -20,6 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012 by Delphix. All rights reserved. */ #ifndef _SYS_REFCOUNT_H @@ -54,8 +55,8 @@ typedef struct refcount { kmutex_t rc_mtx; list_t rc_list; list_t rc_removed; - int64_t rc_count; - int64_t rc_removed_count; + uint64_t rc_count; + uint64_t rc_removed_count; } refcount_t; /* Note: refcount_t must be initialized with refcount_create() */ Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1328,7 +1328,8 @@ vdev_validate(vdev_t *vd, boolean_t stri if (vd->vdev_ops->vdev_op_leaf && vdev_readable(vd)) { uint64_t aux_guid = 0; nvlist_t *nvl; - uint64_t txg = strict ? spa->spa_config_txg : -1ULL; + uint64_t txg = spa_last_synced_txg(spa) != 0 ? 
+ spa_last_synced_txg(spa) : -1ULL; if ((label = vdev_label_read_config(vd, txg)) == NULL) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, @@ -1512,7 +1513,7 @@ vdev_reopen(vdev_t *vd) !l2arc_vdev_present(vd)) l2arc_add_vdev(spa, vd); } else { - (void) vdev_validate(vd, spa_last_synced_txg(spa)); + (void) vdev_validate(vd, B_TRUE); } /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:22:27 2013 (r247407) @@ -106,12 +106,18 @@ typedef enum { DATASET_NAME } zfs_ioc_namecheck_t; +typedef enum { + POOL_CHECK_NONE = 1 << 0, + POOL_CHECK_SUSPENDED = 1 << 1, + POOL_CHECK_READONLY = 1 << 2 +} zfs_ioc_poolcheck_t; + typedef struct zfs_ioc_vec { zfs_ioc_func_t *zvec_func; zfs_secpolicy_func_t *zvec_secpolicy; zfs_ioc_namecheck_t zvec_namecheck; boolean_t zvec_his_log; - boolean_t zvec_pool_check; + zfs_ioc_poolcheck_t zvec_pool_check; } zfs_ioc_vec_t; /* This array is indexed by zfs_userquota_prop_t */ @@ -5033,138 +5039,155 @@ zfs_ioc_unjail(zfs_cmd_t *zc) static zfs_ioc_vec_t zfs_ioc_vec[] = { { zfs_ioc_pool_create, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_destroy, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_import, zfs_secpolicy_config, POOL_NAME, B_TRUE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_export, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 19:30:02 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D3596E7A for ; Wed, 27 Feb 2013 19:30:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id C488899A for ; Wed, 27 Feb 2013 19:30:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1RJU2n3060472 for ; Wed, 27 Feb 2013 19:30:02 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1RJU2bx060471; Wed, 27 Feb 2013 19:30:02 GMT (envelope-from gnats) Date: Wed, 27 Feb 2013 19:30:02 GMT Message-Id: <201302271930.r1RJU2bx060471@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: dfilter@FreeBSD.ORG (dfilter service) Subject: Re: kern/175897: commit references a PR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: dfilter service List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 19:30:02 -0000 The following reply was made to PR kern/175897; it has been noted by GNATS. 
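The forwarded commit below includes the MFC of Illumos 3507: its metaslab.c hunk registers a read-write sysctl and loader tunable named vfs.zfs.write_to_degraded (default off) that allows the allocator to place single-copy data on DEGRADED vdevs. A minimal usage sketch, assuming a stable/8 or stable/9 kernel that already carries this merge (the commands are illustrative and not part of the original mail):

    # enable at runtime; the diff marks the sysctl CTLFLAG_RW, so a live system accepts it
    sysctl vfs.zfs.write_to_degraded=1

    # or set it from the loader so it is applied at every boot
    echo 'vfs.zfs.write_to_degraded=1' >> /boot/loader.conf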
From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/175897: commit references a PR Date: Wed, 27 Feb 2013 19:21:12 +0000 (UTC) Author: mm Date: Wed Feb 27 19:20:50 2013 New Revision: 247406 URL: http://svnweb.freebsd.org/changeset/base/247406 Log: MFC r246631,246651,246666,246675,246678,246688: Merge various ZFS bugfixes MFC r246631: Import vendor bugfixes Illumos ZFS issues: 3422 zpool create/syseventd race yield non-importable pool 3425 first write to a new zvol can fail with EFBIG MFC r246651: Import minor type change in refcount.h header from vendor (illumos). MFC r246666: Import vendor ZFS bugfix fixing a problem in arc_read(). Illumos ZFS issues: 3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt) MFC r246675: Add tunable to allow block allocation on degraded vdevs. Illumos ZFS issues: 3507 Tunable to allow block allocation even on degraded vdevs MFC r246678: Import vendor bugfixes regarding SA rounding, header size and layout. This was already partially fixed by avg. Illumos ZFS issues: 3512 rounding discrepancy in sa_find_sizes() 3513 mismatch between SA header size and layout MFC r246688 [1]: Merge zfs_ioctl.c code that should have been merged together with ZFS v28. Fixes several problems if working with read-only pools. Changed code originaly introduced in onnv-gate 13061:bda0decf867b Contains changes up to illumos-gate 13700:4bc0783f6064 PR: kern/175897 [1] Suggested by: avg [1] Modified: stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Directory Properties: stable/9/cddl/contrib/opensolaris/ (props changed) stable/9/cddl/contrib/opensolaris/lib/libzfs/ (props changed) stable/9/sys/ (props changed) stable/9/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed 
Feb 27 19:20:50 2013 (r247406) @@ -983,7 +983,7 @@ visit_indirect(spa_t *spa, const dnode_p arc_buf_t *buf; uint64_t fill = 0; - err = arc_read_nolock(NULL, spa, bp, arc_getbuf_func, &buf, + err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -2001,9 +2001,8 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t * bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0); } -/* ARGSUSED */ static int -zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { zdb_cb_t *zcb = arg; @@ -2410,7 +2409,7 @@ typedef struct zdb_ddt_entry { /* ARGSUSED */ static int zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { avl_tree_t *t = arg; avl_index_t where; Modified: stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c ============================================================================== --- stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:20:50 2013 (r247406) @@ -526,13 +526,12 @@ get_configs(libzfs_handle_t *hdl, pool_l * version * pool guid * name - * pool txg (if available) * comment (if available) * pool state * hostid (if available) * hostname (if available) */ - uint64_t state, version, pool_txg; + uint64_t state, version; char *comment = NULL; version = fnvlist_lookup_uint64(tmp, @@ -548,11 +547,6 @@ get_configs(libzfs_handle_t *hdl, pool_l fnvlist_add_string(config, ZPOOL_CONFIG_POOL_NAME, name); - if (nvlist_lookup_uint64(tmp, - ZPOOL_CONFIG_POOL_TXG, &pool_txg) == 0) - fnvlist_add_uint64(config, - ZPOOL_CONFIG_POOL_TXG, pool_txg); - if (nvlist_lookup_string(tmp, ZPOOL_CONFIG_COMMENT, &comment) == 0) fnvlist_add_string(config, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:20:50 2013 (r247406) @@ -940,7 +940,6 @@ buf_cons(void *vbuf, void *unused, int k bzero(buf, sizeof (arc_buf_t)); mutex_init(&buf->b_evict_lock, NULL, MUTEX_DEFAULT, NULL); - rw_init(&buf->b_data_lock, NULL, RW_DEFAULT, NULL); arc_space_consume(sizeof (arc_buf_t), ARC_SPACE_HDRS); return (0); @@ -970,7 +969,6 @@ buf_dest(void *vbuf, void *unused) arc_buf_t *buf = vbuf; mutex_destroy(&buf->b_evict_lock); - rw_destroy(&buf->b_data_lock); arc_space_return(sizeof (arc_buf_t), ARC_SPACE_HDRS); } @@ -2968,42 +2966,11 @@ arc_read_done(zio_t *zio) * * arc_read_done() will invoke all the requested "done" functions * for readers of this block. - * - * Normal callers should use arc_read and pass the arc buffer and offset - * for the bp. But if you know you don't need locking, you can use - * arc_read_nolock. */ int -arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - int err; - - if (pbuf == NULL) { - /* - * XXX This happens from traverse callback funcs, for - * the objset_phys_t block. 
- */ - return (arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb)); - } - - ASSERT(!refcount_is_zero(&pbuf->b_hdr->b_refcnt)); - ASSERT3U((char *)bp - (char *)pbuf->b_data, <, pbuf->b_hdr->b_size); - rw_enter(&pbuf->b_data_lock, RW_READER); - - err = arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb); - rw_exit(&pbuf->b_data_lock); - - return (err); -} - -int -arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) +arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, + void *private, int priority, int zio_flags, uint32_t *arc_flags, + const zbookmark_t *zb) { arc_buf_hdr_t *hdr; arc_buf_t *buf; @@ -3482,19 +3449,6 @@ arc_release(arc_buf_t *buf, void *tag) } } -/* - * Release this buffer. If it does not match the provided BP, fill it - * with that block's contents. - */ -/* ARGSUSED */ -int -arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb) -{ - arc_release(buf, tag); - return (0); -} - int arc_released(arc_buf_t *buf) { Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:20:50 2013 (r247406) @@ -135,7 +135,7 @@ bptree_add(objset_t *os, uint64_t obj, b /* ARGSUSED */ static int -bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { int err; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:20:50 2013 (r247406) @@ -513,7 +513,6 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t spa_t *spa; zbookmark_t zb; uint32_t aflags = ARC_NOWAIT; - arc_buf_t *pbuf; DB_DNODE_ENTER(db); dn = DB_DNODE(db); @@ -575,14 +574,8 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t db->db.db_object, db->db_level, db->db_blkid); dbuf_add_ref(db, NULL); - /* ZIO_FLAG_CANFAIL callers have to check the parent zio's error */ - if (db->db_parent) - pbuf = db->db_parent->db_buf; - else - pbuf = db->db_objset->os_phys_buf; - - (void) dsl_read(zio, spa, db->db_blkptr, pbuf, + (void) arc_read(zio, spa, db->db_blkptr, dbuf_read_done, db, ZIO_PRIORITY_SYNC_READ, (*flags & DB_RF_CANFAIL) ? ZIO_FLAG_CANFAIL : ZIO_FLAG_MUSTSUCCEED, &aflags, &zb); @@ -982,7 +975,6 @@ void dbuf_release_bp(dmu_buf_impl_t *db) { objset_t *os; - zbookmark_t zb; DB_GET_OBJSET(&os, db); ASSERT(dsl_pool_sync_context(dmu_objset_pool(os))); @@ -990,13 +982,7 @@ dbuf_release_bp(dmu_buf_impl_t *db) list_link_active(&os->os_dsl_dataset->ds_synced_link)); ASSERT(db->db_parent == NULL || arc_released(db->db_parent->db_buf)); - zb.zb_objset = os->os_dsl_dataset ? 
- os->os_dsl_dataset->ds_object : 0; - zb.zb_object = db->db.db_object; - zb.zb_level = db->db_level; - zb.zb_blkid = db->db_blkid; - (void) arc_release_bp(db->db_buf, db, - db->db_blkptr, os->os_spa, &zb); + (void) arc_release(db->db_buf, db); } dbuf_dirty_record_t * @@ -1831,7 +1817,6 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki if (bp && !BP_IS_HOLE(bp)) { int priority = dn->dn_type == DMU_OT_DDT_ZAP ? ZIO_PRIORITY_DDT_PREFETCH : ZIO_PRIORITY_ASYNC_READ; - arc_buf_t *pbuf; dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; zbookmark_t zb; @@ -1839,13 +1824,8 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki SET_BOOKMARK(&zb, ds ? ds->ds_object : DMU_META_OBJSET, dn->dn_object, 0, blkid); - if (db) - pbuf = db->db_buf; - else - pbuf = dn->dn_objset->os_phys_buf; - - (void) dsl_read(NULL, dn->dn_objset->os_spa, - bp, pbuf, NULL, NULL, priority, + (void) arc_read(NULL, dn->dn_objset->os_spa, + bp, NULL, NULL, priority, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, &zb); } Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:20:50 2013 (r247406) @@ -128,7 +128,7 @@ report_dnode(struct diffarg *da, uint64_ /* ARGSUSED */ static int -diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct diffarg *da = arg; @@ -155,9 +155,9 @@ diff_cb(spa_t *spa, zilog_t *zilog, cons int blksz = BP_GET_LSIZE(bp); int i; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:20:50 2013 (r247406) @@ -276,12 +276,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dat aflags |= ARC_L2CACHE; dprintf_bp(os->os_rootbp, "reading %s", ""); - /* - * XXX when bprewrite scrub can change the bp, - * and this is called from dmu_objset_open_ds_os, the bp - * could change, and we'll need a lock. - */ - err = dsl_read_nolock(NULL, spa, os->os_rootbp, + err = arc_read(NULL, spa, os->os_rootbp, arc_getbuf_func, &os->os_phys_buf, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL, &aflags, &zb); if (err) { @@ -1124,8 +1119,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio SET_BOOKMARK(&zb, os->os_dsl_dataset ? 
os->os_dsl_dataset->ds_object : DMU_META_OBJSET, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - VERIFY3U(0, ==, arc_release_bp(os->os_phys_buf, &os->os_phys_buf, - os->os_rootbp, os->os_spa, &zb)); + arc_release(os->os_phys_buf, &os->os_phys_buf); dmu_write_policy(os, NULL, 0, 0, &zp); @@ -1764,7 +1758,7 @@ dmu_objset_prefetch(const char *name, vo SET_BOOKMARK(&zb, ds->ds_object, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) dsl_read_nolock(NULL, dsl_dataset_get_spa(ds), + (void) arc_read(NULL, dsl_dataset_get_spa(ds), &ds->ds_phys->ds_bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:20:50 2013 (r247406) @@ -317,7 +317,7 @@ dump_dnode(dmu_sendarg_t *dsp, uint64_t /* ARGSUSED */ static int -backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { dmu_sendarg_t *dsp = arg; @@ -346,9 +346,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co uint32_t aflags = ARC_WAIT; arc_buf_t *abuf; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; @@ -365,9 +365,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (arc_read_nolock(NULL, spa, bp, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); err = dump_spill(dsp, zb->zb_object, blksz, abuf->b_data); @@ -377,9 +377,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) { + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) { if (zfs_send_corrupt_data) { /* Send a block filled with 0x"zfs badd bloc" */ abuf = arc_buf_alloc(spa, blksz, &abuf, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:20:50 2013 (r247406) @@ -62,9 +62,9 @@ typedef struct traverse_data { } traverse_data_t; static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static int traverse_zil_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg) @@ -81,7 +81,7 @@ traverse_zil_block(zilog_t *zilog, blkpt SET_BOOKMARK(&zb, 
td->td_objset, ZB_ZIL_OBJECT, ZB_ZIL_LEVEL, bp->blk_cksum.zc_word[ZIL_ZC_SEQ]); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, td->td_arg); + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); return (0); } @@ -105,7 +105,7 @@ traverse_zil_record(zilog_t *zilog, lr_t SET_BOOKMARK(&zb, td->td_objset, lr->lr_foid, ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp)); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); } return (0); @@ -182,7 +182,7 @@ traverse_pause(traverse_data_t *td, cons static void traverse_prefetch_metadata(traverse_data_t *td, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { uint32_t flags = ARC_NOWAIT | ARC_PREFETCH; @@ -200,14 +200,13 @@ traverse_prefetch_metadata(traverse_data if (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE) return; - (void) arc_read(NULL, td->td_spa, bp, - pbuf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &flags, zb); + (void) arc_read(NULL, td->td_spa, bp, NULL, NULL, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); } static int traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { zbookmark_t czb; int err = 0, lasterr = 0; @@ -228,8 +227,7 @@ traverse_visitbp(traverse_data_t *td, co } if (BP_IS_HOLE(bp)) { - err = td->td_func(td->td_spa, NULL, NULL, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, NULL, zb, dnp, td->td_arg); return (err); } @@ -249,7 +247,7 @@ traverse_visitbp(traverse_data_t *td, co } if (td->td_flags & TRAVERSE_PRE) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == TRAVERSE_VISIT_NO_CHILDREN) return (0); @@ -265,8 +263,7 @@ traverse_visitbp(traverse_data_t *td, co blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -276,7 +273,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - traverse_prefetch_metadata(td, buf, &cbp[i], &czb); + traverse_prefetch_metadata(td, &cbp[i], &czb); } /* recursively visitbp() blocks below this */ @@ -284,7 +281,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - err = traverse_visitbp(td, dnp, buf, &cbp[i], &czb); + err = traverse_visitbp(td, dnp, &cbp[i], &czb); if (err) { if (!hard) break; @@ -296,21 +293,20 @@ traverse_visitbp(traverse_data_t *td, co int i; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); dnp = buf->b_data; for (i = 0; i < epb; i++) { - prefetch_dnode_metadata(td, &dnp[i], buf, zb->zb_objset, + prefetch_dnode_metadata(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); } /* recursively visitbp() blocks below this */ for (i = 0; i < epb; i++) { - err = traverse_dnode(td, &dnp[i], buf, zb->zb_objset, + err = traverse_dnode(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); if 
(err) { if (!hard) @@ -323,24 +319,23 @@ traverse_visitbp(traverse_data_t *td, co objset_phys_t *osp; dnode_phys_t *dnp; - err = dsl_read_nolock(NULL, td->td_spa, bp, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); osp = buf->b_data; dnp = &osp->os_meta_dnode; - prefetch_dnode_metadata(td, dnp, buf, zb->zb_objset, + prefetch_dnode_metadata(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (arc_buf_size(buf) >= sizeof (objset_phys_t)) { prefetch_dnode_metadata(td, &osp->os_userused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); prefetch_dnode_metadata(td, &osp->os_groupused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); } - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (err && hard) { lasterr = err; @@ -348,7 +343,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_userused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_USERUSED_OBJECT); } if (err && hard) { @@ -357,7 +352,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_groupused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_GROUPUSED_OBJECT); } } @@ -367,8 +362,7 @@ traverse_visitbp(traverse_data_t *td, co post: if (err == 0 && lasterr == 0 && (td->td_flags & TRAVERSE_POST)) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == ERESTART) pause = B_TRUE; } @@ -384,25 +378,25 @@ post: static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j; zbookmark_t czb; for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - traverse_prefetch_metadata(td, buf, &dnp->dn_blkptr[j], &czb); + traverse_prefetch_metadata(td, &dnp->dn_blkptr[j], &czb); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - traverse_prefetch_metadata(td, buf, &dnp->dn_spill, &czb); + traverse_prefetch_metadata(td, &dnp->dn_spill, &czb); } } static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j, err = 0, lasterr = 0; zbookmark_t czb; @@ -410,7 +404,7 @@ traverse_dnode(traverse_data_t *td, cons for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_blkptr[j], &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_blkptr[j], &czb); if (err) { if (!hard) break; @@ -420,7 +414,7 @@ traverse_dnode(traverse_data_t *td, cons if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_spill, &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_spill, &czb); if (err) { if (!hard) return (err); @@ -433,8 +427,7 @@ traverse_dnode(traverse_data_t *td, cons /* ARGSUSED */ static int traverse_prefetcher(spa_t *spa, zilog_t *zilog, const blkptr_t 
*bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, - void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { prefetch_data_t *pfd = arg; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; @@ -455,10 +448,8 @@ traverse_prefetcher(spa_t *spa, zilog_t cv_broadcast(&pfd->pd_cv); mutex_exit(&pfd->pd_mtx); - (void) dsl_read(NULL, spa, bp, pbuf, NULL, NULL, - ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, - &aflags, zb); + (void) arc_read(NULL, spa, bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, zb); return (0); } @@ -476,7 +467,7 @@ traverse_prefetch_thread(void *arg) SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) traverse_visitbp(&td, NULL, NULL, td.td_rootbp, &czb); + (void) traverse_visitbp(&td, NULL, td.td_rootbp, &czb); mutex_enter(&td_main->td_pfd->pd_mtx); td_main->td_pfd->pd_exited = B_TRUE; @@ -540,7 +531,7 @@ traverse_impl(spa_t *spa, dsl_dataset_t SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - err = traverse_visitbp(&td, NULL, NULL, rootbp, &czb); + err = traverse_visitbp(&td, NULL, rootbp, &czb); mutex_enter(&pd.pd_mtx); pd.pd_cancel = B_TRUE; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:20:50 2013 (r247406) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include @@ -284,6 +284,7 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u delta = P2NPHASE(off, dn->dn_datablksz); } + min_ibs = max_ibs = dn->dn_indblkshift; if (dn->dn_maxblkid > 0) { /* * The blocksize can't change, @@ -291,13 +292,6 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u */ ASSERT(dn->dn_datablkshift != 0); min_bs = max_bs = dn->dn_datablkshift; - min_ibs = max_ibs = dn->dn_indblkshift; - } else if (dn->dn_indblkshift > max_ibs) { - /* - * This ensures that if we reduce DN_MAX_INDBLKSHIFT, - * the code will still work correctly on older pools. 
- */ - min_ibs = max_ibs = dn->dn_indblkshift; } /* Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1308,7 +1308,7 @@ struct killarg { /* ARGSUSED */ static int -kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct killarg *ka = arg; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:20:50 2013 (r247406) @@ -396,24 +396,6 @@ dsl_free_sync(zio_t *pio, dsl_pool_t *dp zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, pio->io_flags)); } -int -dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read(pio, spa, bpp, pbuf, done, private, - priority, zio_flags, arc_flags, zb)); -} - -int -dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read_nolock(pio, spa, bpp, done, private, - priority, zio_flags, arc_flags, zb)); -} - static uint64_t dsl_scan_ds_maxtxg(dsl_dataset_t *ds) { @@ -584,12 +566,8 @@ dsl_scan_prefetch(dsl_scan_t *scn, arc_b SET_BOOKMARK(&czb, objset, object, BP_GET_LEVEL(bp), blkid); - /* - * XXX need to make sure all of these arc_read() prefetches are - * done before setting xlateall (similar to dsl_read()) - */ (void) arc_read(scn->scn_zio_root, scn->scn_dp->dp_spa, bp, - buf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD, &flags, &czb); } @@ -647,8 +625,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -670,8 +647,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da } else if (BP_GET_TYPE(bp) == DMU_OT_USERGROUP_USED) { uint32_t flags = ARC_WAIT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -683,8 +659,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da int i, j; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -706,8 +681,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da uint32_t flags = ARC_WAIT; objset_phys_t *osp; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = 
arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:20:50 2013 (r247406) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #include @@ -97,6 +98,15 @@ int metaslab_prefetch_limit = SPA_DVAS_P int metaslab_smo_bonus_pct = 150; /* + * Should we be willing to write data to degraded vdevs? + */ +boolean_t zfs_write_to_degraded = B_FALSE; +SYSCTL_INT(_vfs_zfs, OID_AUTO, write_to_degraded, CTLFLAG_RW, + &zfs_write_to_degraded, 0, + "Allow writing data to degraded vdevs"); +TUNABLE_INT("vfs.zfs.write_to_degraded", &zfs_write_to_degraded); + +/* * ========================================================================== * Metaslab classes * ========================================================================== @@ -1383,10 +1393,13 @@ top: /* * Avoid writing single-copy data to a failing vdev + * unless the user instructs us that it is okay. */ if ((vd->vdev_stat.vs_write_errors > 0 || vd->vdev_state < VDEV_STATE_HEALTHY) && - d == 0 && dshift == 3) { + d == 0 && dshift == 3 && + !(zfs_write_to_degraded && vd->vdev_state == + VDEV_STATE_DEGRADED)) { all_zero = B_FALSE; goto next; } Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:20:50 2013 (r247406) @@ -553,6 +553,7 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ { int var_size = 0; int i; + int j = -1; int full_space; int hdrsize; boolean_t done = B_FALSE; @@ -574,11 +575,13 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ sizeof (sa_hdr_phys_t); full_space = (buftype == SA_BONUS) ? DN_MAX_BONUSLEN : db->db_size; + ASSERT(IS_P2ALIGNED(full_space, 8)); for (i = 0; i != attr_count; i++) { boolean_t is_var_sz; - *total += P2ROUNDUP(attr_desc[i].sa_length, 8); + *total = P2ROUNDUP(*total, 8); + *total += attr_desc[i].sa_length; if (done) goto next; @@ -590,7 +593,14 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ if (is_var_sz && var_size > 1) { if (P2ROUNDUP(hdrsize + sizeof (uint16_t), 8) + *total < full_space) { + /* + * Account for header space used by array of + * optional sizes of variable-length attributes. + * Record the index in case this increase needs + * to be reversed due to spill-over. + */ hdrsize += sizeof (uint16_t); + j = i; } else { done = B_TRUE; *index = i; @@ -619,6 +629,14 @@ next: *will_spill = B_TRUE; } + /* + * j holds the index of the last variable-sized attribute for + * which hdrsize was increased. Reverse the increase if that + * attribute will be relocated to the spill block. 
+ */ + if (*will_spill && j == *index) + hdrsize -= sizeof (uint16_t); + hdrsize = P2ROUNDUP(hdrsize, 8); return (hdrsize); } @@ -709,6 +727,8 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu for (i = 0, len_idx = 0, hash = -1ULL; i != attr_count; i++) { uint16_t length; + ASSERT(IS_P2ALIGNED(data_start, 8)); + ASSERT(IS_P2ALIGNED(buf_space, 8)); attrs[i] = attr_desc[i].sa_attr; length = SA_REGISTERED_LEN(sa, attrs[i]); if (length == 0) @@ -717,6 +737,7 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu VERIFY(length == attr_desc[i].sa_length); if (buf_space < length) { /* switch to spill buffer */ + VERIFY(spilling); VERIFY(bonustype == DMU_OT_SA); if (buftype == SA_BONUS && !sa->sa_force_spill) { sa_find_layout(hdl->sa_os, hash, attrs_start, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1764,7 +1764,7 @@ spa_load_verify_done(zio_t *zio) /*ARGSUSED*/ static int spa_load_verify_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { if (bp != NULL) { zio_t *rio = arg; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:20:50 2013 (r247406) @@ -49,7 +49,6 @@ struct arc_buf { arc_buf_hdr_t *b_hdr; arc_buf_t *b_next; kmutex_t b_evict_lock; - krwlock_t b_data_lock; void *b_data; arc_evict_func_t *b_efunc; void *b_private; @@ -93,8 +92,6 @@ void arc_buf_add_ref(arc_buf_t *buf, voi int arc_buf_remove_ref(arc_buf_t *buf, void *tag); int arc_buf_size(arc_buf_t *buf); void arc_release(arc_buf_t *buf, void *tag); -int arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb); int arc_released(arc_buf_t *buf); int arc_has_callback(arc_buf_t *buf); void arc_buf_freeze(arc_buf_t *buf); @@ -103,10 +100,7 @@ void arc_buf_thaw(arc_buf_t *buf); int arc_referenced(arc_buf_t *buf); #endif -int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, +int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, void *priv, int priority, int flags, uint32_t *arc_flags, const zbookmark_t *zb); zio_t *arc_write(zio_t *pio, spa_t *spa, uint64_t txg, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:20:50 2013 (r247406) @@ -40,8 +40,7 @@ struct zilog; struct arc_buf; typedef int (blkptr_cb_t)(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - struct arc_buf *pbuf, const zbookmark_t *zb, const struct dnode_phys *dnp, - void *arg); + const zbookmark_t 
*zb, const struct dnode_phys *dnp, void *arg); #define TRAVERSE_PRE (1<<0) #define TRAVERSE_POST (1<<1) Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:20:50 2013 (r247406) @@ -134,12 +134,6 @@ void dsl_pool_willuse_space(dsl_pool_t * void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); void dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); -int dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); void dsl_pool_create_origin(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_clones(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_dir_clones(dsl_pool_t *dp, dmu_tx_t *tx); Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:20:50 2013 (r247406) @@ -20,6 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012 by Delphix. All rights reserved. */ #ifndef _SYS_REFCOUNT_H @@ -54,8 +55,8 @@ typedef struct refcount { kmutex_t rc_mtx; list_t rc_list; list_t rc_removed; - int64_t rc_count; - int64_t rc_removed_count; + uint64_t rc_count; + uint64_t rc_removed_count; } refcount_t; /* Note: refcount_t must be initialized with refcount_create() */ Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1328,7 +1328,8 @@ vdev_validate(vdev_t *vd, boolean_t stri if (vd->vdev_ops->vdev_op_leaf && vdev_readable(vd)) { uint64_t aux_guid = 0; nvlist_t *nvl; - uint64_t txg = strict ? spa->spa_config_txg : -1ULL; + uint64_t txg = spa_last_synced_txg(spa) != 0 ? 
+ spa_last_synced_txg(spa) : -1ULL; if ((label = vdev_label_read_config(vd, txg)) == NULL) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, @@ -1512,7 +1513,7 @@ vdev_reopen(vdev_t *vd) !l2arc_vdev_present(vd)) l2arc_add_vdev(spa, vd); } else { - (void) vdev_validate(vd, spa_last_synced_txg(spa)); + (void) vdev_validate(vd, B_TRUE); } /* Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:20:50 2013 (r247406) @@ -106,12 +106,18 @@ typedef enum { DATASET_NAME } zfs_ioc_namecheck_t; +typedef enum { + POOL_CHECK_NONE = 1 << 0, + POOL_CHECK_SUSPENDED = 1 << 1, + POOL_CHECK_READONLY = 1 << 2 +} zfs_ioc_poolcheck_t; + typedef struct zfs_ioc_vec { zfs_ioc_func_t *zvec_func; zfs_secpolicy_func_t *zvec_secpolicy; zfs_ioc_namecheck_t zvec_namecheck; boolean_t zvec_his_log; - boolean_t zvec_pool_check; + zfs_ioc_poolcheck_t zvec_pool_check; } zfs_ioc_vec_t; /* This array is indexed by zfs_userquota_prop_t */ @@ -5033,138 +5039,155 @@ zfs_ioc_unjail(zfs_cmd_t *zc) static zfs_ioc_vec_t zfs_ioc_vec[] = { { zfs_ioc_pool_create, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_destroy, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_import, zfs_secpolicy_config, POOL_NAME, B_TRUE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_export, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 02:59:29 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 58506410; Thu, 28 Feb 2013 02:59:29 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id E84E73A4; Thu, 28 Feb 2013 02:59:28 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEACTHLlGDaFvO/2dsb2JhbABFhk+4e4JlgRNzgiYjVkQZAgRVBogmrweSZ45gGRsHgi2BEwOIaoY8hxuJY4cHgyaCCQ X-IronPort-AV: E=Sophos;i="4.84,752,1355115600"; d="scan'208";a="18641552" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 27 Feb 2013 21:59:22 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 08FFCB3F0D; Wed, 27 Feb 2013 21:59:22 -0500 (EST) Date: Wed, 27 Feb 2013 21:59:22 -0500 (EST) From: Rick Macklem To: FreeBSD Filesystems Message-ID: <707174204.3391839.1362020362019.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <860349954.3391816.1362020304865.JavaMail.root@erie.cs.uoguelph.ca> Subject: should vn_fullpath1() ever return a path with "." in it? 
MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_3391838_1284162422.1362020362017" X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Sergey Kandaurov , Kostik Belousov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 02:59:29 -0000 ------=_Part_3391838_1284162422.1362020362017 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Hi, Sergey Kandaurov reported a problem where getcwd() returns a path with "/./" imbedded in it for an NFSv4 mount. This is caused by a mount point crossing on the server when at the server's root because vn_fullpath1() uses VV_ROOT to spot mount point crossings. The current workaround is to use the sysctls: debug.disablegetcwd=1 debug.disablefullpath=1 However, it would be nice to fix this when vn_fullpath1() is being used. A simple fix is to have vn_fullpath1() fail when it finds "." as a directory match in the path. When vn_fullpath1() fails, the syscalls fail and that allows the libc algorithm to be used (which works for this case because it doesn't depend on VV_ROOT being set, etc). So, I am wondering if a patch (I have attached one) that makes vn_fullpath1() fail when it matches "." will break anything else? (I don't think so, since the code checks for VV_ROOT in the loop above the check for a match of ".", but I am not sure?) Thanks for any input w.r.t. this, rick ------=_Part_3391838_1284162422.1362020362017 Content-Type: text/x-patch; name=getcwd.patch Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename=getcwd.patch LS0tIGtlcm4vdmZzX2NhY2hlLmMuc2F2CTIwMTMtMDItMjcgMjA6NDQ6NDIuMDAwMDAwMDAwIC0w NTAwCisrKyBrZXJuL3Zmc19jYWNoZS5jCTIwMTMtMDItMjcgMjE6MTA6MzkuMDAwMDAwMDAwIC0w NTAwCkBAIC0xMzMzLDYgKzEzMzMsMjAgQEAgdm5fZnVsbHBhdGgxKHN0cnVjdCB0aHJlYWQgKnRk LCBzdHJ1Y3QgdgogCQkJICAgIHN0YXJ0dnAsIE5VTEwsIDAsIDApOwogCQkJYnJlYWs7CiAJCX0K KwkJaWYgKGJ1ZltidWZsZW5dID09ICcuJyAmJiAoYnVmW2J1ZmxlbiArIDFdID09ICdcMCcgfHwK KwkJICAgIGJ1ZltidWZsZW4gKyAxXSA9PSAnLycpKSB7CisJCQkvKgorCQkJICogRmFpbCBpZiBp dCBtYXRjaGVkICIuIi4gVGhpcyBzaG91bGQgb25seSBoYXBwZW4KKwkJCSAqIGZvciBORlN2NCBt b3VudHMgdGhhdCBjcm9zcyBzZXJ2ZXIgbW91bnQgcG9pbnRzLgorCQkJICovCisJCQlDQUNIRV9S VU5MT0NLKCk7CisJCQl2cmVsZSh2cCk7CisJCQludW1mdWxscGF0aGZhaWwxKys7CisJCQllcnJv ciA9IEVOT0VOVDsKKwkJCVNEVF9QUk9CRSh2ZnMsIG5hbWVjYWNoZSwgZnVsbHBhdGgsIHJldHVy biwKKwkJCSAgICBlcnJvciwgdnAsIE5VTEwsIDAsIDApOworCQkJYnJlYWs7CisJCX0KIAkJYnVm Wy0tYnVmbGVuXSA9ICcvJzsKIAkJc2xhc2hfcHJlZml4ZWQgPSAxOwogCX0K ------=_Part_3391838_1284162422.1362020362017-- From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 06:51:19 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4E32F731 for ; Thu, 28 Feb 2013 06:51:19 +0000 (UTC) (envelope-from it.helpdesk@mab.ae) Received: from mail.mab.ae (mail2.mab.ae [94.56.15.51]) by mx1.freebsd.org (Postfix) with ESMTP id 75AF8E50 for ; Thu, 28 Feb 2013 06:51:17 +0000 (UTC) Received: from DXBHUB02.MAB.PRD (Not Verified[172.16.5.126]) by mail.mab.ae with MailMarshal (v6, 8, 4, 9558) id ; Thu, 28 Feb 2013 10:35:19 +0400 Received: from DXBMBX01.MAB.PRD ([fe80::e0ca:10ea:97a8:27d0]) by dxbhub02 ([172.16.5.124]) with mapi id 14.01.0355.002; Thu, 28 Feb 2013 10:35:19 +0400 From: IT Helpdesk To: "freebsd-fs@freebsd.org" Subject: 
Re: Policy Breaches, Malformed, Spam Type - Zero Day, Routing, Encryption, Undetermined, Spam, Malformed Mime, Spam Type - Phish, Spam Type - Pornographic, Suspect Summary Digest: 1 Messages Thread-Topic: Policy Breaches, Malformed, Spam Type - Zero Day, Routing, Encryption, Undetermined, Spam, Malformed Mime, Spam Type - Phish, Spam Type - Pornographic, Suspect Summary Digest: 1 Messages Thread-Index: Ac4VfcC0L3C5MuOiTf6fGI5idyf1Bw== Date: Thu, 28 Feb 2013 06:35:18 +0000 Message-ID: <238EE51378AEA748BD52DD92AF2450F234CD665E@DXBMBX01.MAB.PRD> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.6.147] MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 06:51:19 -0000 ##################################################################################### Disclaimer: This message is for the named person's use only. It may contain confidential, proprietary or legally privileged information. No confidentiality or privilege is waived or lost by any incorrect transmission. If you receive this message in error, please immediately delete it and all copies of it from your system, destroy any hard copies of it and notify the sender.You must not, directly or indirectly, use, disclose, distribute, print, or copy any part of this message if you are not the intended recipient. Email transmission cannot be guaranteed to be secure or error-free as information may be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. Therefore, we do not accept any liability, for any error or omission in this email or for any resulting loss or damage suffered as a result of email transmission. Any views expressed in this message are those of the individual sender, except where the message states otherwise and the sender is authorized to state them to be the views of any such entity. 
MAB Facilities Management = L.L.C and any of its subsidiaries each reserve the right to monitor all e= -mail communications through its networks.=20 #########################################################################= ############ From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 07:05:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id AC218B0A; Thu, 28 Feb 2013 07:05:23 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 06DC9ED8; Thu, 28 Feb 2013 07:05:22 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.6/8.14.6) with ESMTP id r1S75FXk088416; Thu, 28 Feb 2013 09:05:15 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.0 kib.kiev.ua r1S75FXk088416 Received: (from kostik@localhost) by tom.home (8.14.6/8.14.6/Submit) id r1S75FiS088414; Thu, 28 Feb 2013 09:05:15 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 28 Feb 2013 09:05:15 +0200 From: Konstantin Belousov To: Rick Macklem Subject: Re: should vn_fullpath1() ever return a path with "." in it? Message-ID: <20130228070515.GK2454@kib.kiev.ua> References: <860349954.3391816.1362020304865.JavaMail.root@erie.cs.uoguelph.ca> <707174204.3391839.1362020362019.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="TOcCsfss/f1fJPnO" Content-Disposition: inline In-Reply-To: <707174204.3391839.1362020362019.JavaMail.root@erie.cs.uoguelph.ca> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: FreeBSD Filesystems , Sergey Kandaurov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 07:05:23 -0000 --TOcCsfss/f1fJPnO Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > Hi, >=20 > Sergey Kandaurov reported a problem where getcwd() returns a > path with "/./" imbedded in it for an NFSv4 mount. This is > caused by a mount point crossing on the server when at the > server's root because vn_fullpath1() uses VV_ROOT to spot > mount point crossings. >=20 > The current workaround is to use the sysctls: > debug.disablegetcwd=3D1 > debug.disablefullpath=3D1 >=20 > However, it would be nice to fix this when vn_fullpath1() > is being used. >=20 > A simple fix is to have vn_fullpath1() fail when it finds > "." as a directory match in the path. When vn_fullpath1() > fails, the syscalls fail and that allows the libc algorithm > to be used (which works for this case because it doesn't > depend on VV_ROOT being set, etc). >=20 > So, I am wondering if a patch (I have attached one) that > makes vn_fullpath1() fail when it matches "." will break > anything else? (I don't think so, since the code checks > for VV_ROOT in the loop above the check for a match of > ".", but I am not sure?) 
>=20 > Thanks for any input w.r.t. this, rick > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > startvp, NULL, 0, 0); > break; > } > + if (buf[buflen] =3D=3D '.' && (buf[buflen + 1] =3D=3D '\0' || > + buf[buflen + 1] =3D=3D '/')) { > + /* > + * Fail if it matched ".". This should only happen > + * for NFSv4 mounts that cross server mount points. > + */ > + CACHE_RUNLOCK(); > + vrele(vp); > + numfullpathfail1++; > + error =3D ENOENT; > + SDT_PROBE(vfs, namecache, fullpath, return, > + error, vp, NULL, 0, 0); > + break; > + } > buf[--buflen] =3D '/'; > slash_prefixed =3D 1; > } I do not quite understand this. Did the dvp (parent) vnode returned by VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? It must be, for the correct operation, but also it should cause the almost infinite loop in the vn_fullpath1(). The loop is not really infinite due to a limited size of the buffer where the infinite amount of "./" is placed. Anyway, I think we should do better than this patch, even if it is legitimate. I think that the better place to check the condition is the default implementation of VOP_VPTOCNP(). Am I right that this is where it broke for you ? diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c index 00d064e..1dd0185 100644 --- a/sys/kern/vfs_default.c +++ b/sys/kern/vfs_default.c @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) error =3D ENOMEM; goto out; } - bcopy(dp->d_name, buf + i, dp->d_namlen); - error =3D 0; + if (dp->d_namlen =3D=3D 1 && dp->d_name[0] =3D=3D '.') { + error =3D ENOENT; + } else { + bcopy(dp->d_name, buf + i, dp->d_namlen); + error =3D 0; + } goto out; } } while (len > 0 || !eofflag); --TOcCsfss/f1fJPnO Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJRLwGqAAoJEJDCuSvBvK1BDWAP/2btss4VP6rNctXFRP+Sg89v HyhDquJdAUqhSCbQgRMTUrUQWf/K/O3RJAZ+/6S062JQH7vYfvGkB7YFnMVk0oml 2Lho0Qie4lMM2zwH/otWpJ3L0FxRed5dG3vB0jmBYXFTizzGiFPx0jgr2X40vuVE n6cdICidbApt4hbuSSBE3V2c1XqpufbOWYp3uKrqdQ/twMdR6nsEOGnMeGCNaqm3 tDv2LNLJIz+6MYwerCeELkNuxQpPZRMCHL54t72WeIbhGcC5aK225txpyw7sJPhG UDqLDGEuwSj5xbwqt9ISEkd2HqumzhRuUhmTX/popF+TDaJP6uQAEVwwS4UxdMCt y+qzn+zO4xlFljHwGGaxf+8abrfZ/31+w3riSOd3HI3MPVbEkHH9Z05LX7bjabeO Xs0L3Yeh1LDL6/adwkpZYdUHcWwNhywzXt0oduVXO9NAhPvlxDRs2o9yjkvJCzob axjcwY6H8QEnHLwlTsGO/lWYQLtSzXr4EmQta2EjjmMMby7J7dg6NdS+d1DfViZc eZCXx8sl3LyPRYXUwZ72522csR974TVirHFWh0dqsUkcVLxLQdQE8Pdkx2UWlyJA KaWmSsf8b6YN3bVmQbQwSsrM97r7o/W06OgscZXwxqZUPNy5WlRl+VfxrjPGspnd JrIyI8FEx0oKz63uan5Z =d3e2 -----END PGP SIGNATURE----- --TOcCsfss/f1fJPnO-- From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 08:06:46 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E9B976C2; Thu, 28 Feb 2013 08:06:46 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id B22961111; Thu, 28 Feb 2013 08:06:46 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id BCF1B4AC57; Thu, 28 Feb 2013 12:06:35 +0400 (MSK) Date: Thu, 28 Feb 2013 12:06:30 +0400 From: Lev Serebryakov Organization: FreeBSD 
X-Priority: 3 (Normal) Message-ID: <1796551389.20130228120630@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 08:06:47 -0000 Hello, Freebsd-fs. My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. It crashed a several minutes ago (I don't know reason yet) and fsck says "Unexpected SU+J inconsistency" (Inode mode/directory tyme mismatch) and requested full check (which will take more than hour on such FS). All drives are perfectly healthy according to SMART, it is SATA WD20EARS/EARX mix. In my experience, SU/SU+J fsck never completes successful on this FS :( Does SU+J work at all? Here was topic in closed mailing list about it, started as topic about using CURRENT on FreeBSD's cluster, but it was shifted to ZFS discussion without changing "Subject" line after several iterations without any conclusion. Could I do something to help debug this problem? Please, don't give advices like "Convert to ZFS". ZFS is great, but, I think, we should have robust "native" and simple FS too. -- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 08:33:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 45C925F6; Thu, 28 Feb 2013 08:33:32 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 06A9912C2; Thu, 28 Feb 2013 08:33:31 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id C90034AC57; Thu, 28 Feb 2013 12:33:30 +0400 (MSK) Date: Thu, 28 Feb 2013 12:33:25 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1238720635.20130228123325@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: <1796551389.20130228120630@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 08:33:32 -0000 Hello, Lev. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 12:06= :30: LS> Hello, Freebsd-fs. LS> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. LS> It crashed a several minutes ago (I don't know reason yet) and fsck LS> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme LS> mismatch) and requested full check (which will take more than hour on LS> such FS). Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps running.= .. 
--=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 09:07:42 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 29F7FF98; Thu, 28 Feb 2013 09:07:42 +0000 (UTC) (envelope-from yerenkow@gmail.com) Received: from mail-da0-f46.google.com (mail-da0-f46.google.com [209.85.210.46]) by mx1.freebsd.org (Postfix) with ESMTP id C90031611; Thu, 28 Feb 2013 09:07:41 +0000 (UTC) Received: by mail-da0-f46.google.com with SMTP id z8so763294dad.5 for ; Thu, 28 Feb 2013 01:07:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=1aba5cCFpfKeW8SjF9cANZJnEbpLg/20UUidbpyvFEk=; b=GVlFmIuD4i9UcfhuRKz5vYOPfeBo/T/EtW5mddH1YZAlGgWdl8rfmxd88J6YJqooaz F5SZkGffUTE6zRdEV5+c4D0kB4H63QIBLHqnyW1ECsOcAF9xa4DDuGKZCiy2NVfSe7nv 1sCAL12O2O/NV1KPI3JwUYzlCeCqINLW8wNLl1FWjsmv8raKaChAWpN3rF6HooT0HJw7 PoMMCl1aSK0MNSTQXszA0ORvM4OqEizwj3p/iBpAtFvTgE0gFax4Spv6nTgLk4U5hfV1 uDSMKyOtW2MEdxukbjLBMjthqByDlT9T558yCyaFizNyrrpVaruLlbWxDZESwL8Fms0k bDsg== MIME-Version: 1.0 X-Received: by 10.68.134.3 with SMTP id pg3mr8171454pbb.51.1362042455545; Thu, 28 Feb 2013 01:07:35 -0800 (PST) Received: by 10.68.36.69 with HTTP; Thu, 28 Feb 2013 01:07:35 -0800 (PST) In-Reply-To: <1238720635.20130228123325@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> Date: Thu, 28 Feb 2013 11:07:35 +0200 Message-ID: Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! From: Alexander Yerenkow To: lev@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 09:07:42 -0000 How about tell us 9.1-STABLE from which date you run? Do you use any dumps/snapshots in this FS? In past, that could broke things. -- Regards, Alexander Yerenkow From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 09:11:08 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 29071466; Thu, 28 Feb 2013 09:11:08 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id DEE65164F; Thu, 28 Feb 2013 09:11:07 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 5B5614AC57; Thu, 28 Feb 2013 13:11:00 +0400 (MSK) Date: Thu, 28 Feb 2013 13:10:55 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <376897409.20130228131055@serebryakov.spb.ru> To: Alexander Yerenkow Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! 
In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 09:11:08 -0000 Hello, Alexander. You wrote 28 =F4=E5=E2=F0=E0=EB=FF 2013 =E3., 13:07:35: AY> How about tell us 9.1-STABLE from which date you run? r244957 AY> Do you use any dumps/snapshots in this FS? Nope.=20 --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:13:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 953552D8; Thu, 28 Feb 2013 10:13:31 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 3E6C4192F; Thu, 28 Feb 2013 10:13:30 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 254904AC57; Thu, 28 Feb 2013 14:13:29 +0400 (MSK) Date: Thu, 28 Feb 2013 14:13:23 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1158712592.20130228141323@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: <1238720635.20130228123325@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:13:31 -0000 Hello, Lev. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 12:33= :25: LS>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. LS>> It crashed a several minutes ago (I don't know reason yet) and fsck LS>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme LS>> mismatch) and requested full check (which will take more than hour on LS>> such FS). LS> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps runn= ing... full fsck reconnected about 1000 files, which was written in time of crash. Really, sever crashed when SVN mirror seed was been unpacking on this FS, so there was massive file creation at this time. 
--=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:23:07 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 39F2B655; Thu, 28 Feb 2013 10:23:07 +0000 (UTC) (envelope-from yerenkow@gmail.com) Received: from mail-ve0-f172.google.com (mail-ve0-f172.google.com [209.85.128.172]) by mx1.freebsd.org (Postfix) with ESMTP id CD1001999; Thu, 28 Feb 2013 10:23:06 +0000 (UTC) Received: by mail-ve0-f172.google.com with SMTP id cz11so1625961veb.17 for ; Thu, 28 Feb 2013 02:23:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=U1qrMep/B2zkWGFqSiR44X5W1nJtQfi9GHLLAWZEmeM=; b=rgD5KWyyblHVQlxiBqYY39KXGc7UwVdQwNvA8oXrP8miDJEstNaoy366mErGEpGnk8 kTF6KRZ/ixkwAKjWCNFvEZ9kFsf4YMTJFcxE0g9Kl//85PA/QuEeEYrAVmtahpU2mM72 3Lewu1McNDTGyb9L870rTbhbXoBCwLYz9sQAAZ6tMxGNyDwbc1CcNtIkNJ8/NNDWtond fvAxvKvdQaxZNoafZBBfIkcahPqoLS22CekkZpryrPw3xa3J188v2e08K9DxsMi4xXUN Tgi3tPav9Lu5b6KisdullGbF4lO0cSSYxLgilxtaYvj92QmY7ebW36fIIHvurU2dlf4h 7a1g== MIME-Version: 1.0 X-Received: by 10.52.96.163 with SMTP id dt3mr2042152vdb.11.1362046980062; Thu, 28 Feb 2013 02:23:00 -0800 (PST) Received: by 10.52.228.163 with HTTP; Thu, 28 Feb 2013 02:22:59 -0800 (PST) In-Reply-To: <1158712592.20130228141323@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> Date: Thu, 28 Feb 2013 12:22:59 +0200 Message-ID: Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! From: Alexander Yerenkow To: lev@freebsd.org Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:23:07 -0000 2013/2/28 Lev Serebryakov > Hello, Lev. > You wrote 28 =C6=C5=D7=D2=C1=CC=D1 2013 =C7., 12:33:25: > > LS>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. > LS>> It crashed a several minutes ago (I don't know reason yet) and fsck > LS>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme > LS>> mismatch) and requested full check (which will take more than hour o= n > LS>> such FS). > LS> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps > running... > full fsck reconnected about 1000 files, which was written in time of > crash. > Really, sever crashed when SVN mirror seed was been unpacking on > this FS, so there was massive file creation at this time. > > Could you afford reproducing this? :) Also, would be nice to know how look your setup (CPUs, how much disks, how they connected, is it hw raid, etc). 
> -- > // Black Lion AKA Lev Serebryakov > > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org= " > --=20 Regards, Alexander Yerenkow From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:31:36 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 925509C8; Thu, 28 Feb 2013 10:31:36 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 3782E1A07; Thu, 28 Feb 2013 10:31:36 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id AF6BF4AC57; Thu, 28 Feb 2013 14:31:34 +0400 (MSK) Date: Thu, 28 Feb 2013 14:31:29 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <583012022.20130228143129@serebryakov.spb.ru> To: Alexander Yerenkow Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:31:36 -0000 Hello, Alexander. You wrote 28 =C6=C5=D7=D2=C1=CC=D1 2013 =C7., 14:22:59: AY> Could you afford reproducing this? :) After half a day of memtest86+ :) I want to be sure, that it is not memory problem first. AY> Also, would be nice to know how look your setup (CPUs, how much disks, = how AY> they connected, is it hw raid, etc). Simple E4500 CPU on Q35-based desktop (ASUS) MoBo, 6GiB memory (under test now!), Samsung 500GiB SATA HDD for system, 5x2Tb WD Green (4xWD20EARS, 1xWD20EARX which replace failed WD20EARS), all disks are connected to 6 SATA ports of chipset (no RAID controller), WD disks are in software RAID5 with geom_raid5 (from ports, but I'm active maintainer of it). Disks are in "Default" configuration: WC and NCQ are enabled. I know, that FS guys could blame geom_raid5, as it could delay real write up to 15 seconds, but it never "lies" about writes (it doesn't mark BIOs complete till they are really sent to disk) and I could not reproduce any problems with it on many hours tests on VMs (and I don't want to experiment a lot on real hardware, as it contains my real data). Maybe, it is subtile interference between raid5 implementation and SU+J, but in such case I want to understand what does raid5 do wrong. 
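Since the disks here run with their on-drive write caches enabled, a minimal sketch of how to check what a drive itself reports, and how to take the drive cache out of the picture while debugging, may be useful; the device name ada1 and the tunable shown are assumptions taken from ada(4), not details of this particular machine:

camcontrol identify ada1 | grep -i "write cache"
# While testing a suspected cache/reordering problem, ada(4) documents a
# per-device knob that can be set in /boot/loader.conf, for example:
#   kern.cam.ada.1.write_cache="0"
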
--=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:43:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9000BCD4 for ; Thu, 28 Feb 2013 10:43:57 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from moh3-ve1.go2.pl (moh3-ve1.go2.pl [193.17.41.30]) by mx1.freebsd.org (Postfix) with ESMTP id 1E6F71A79 for ; Thu, 28 Feb 2013 10:43:56 +0000 (UTC) Received: from moh3-ve1.go2.pl (unknown [10.0.0.117]) by moh3-ve1.go2.pl (Postfix) with ESMTP id 0E376A6A02B for ; Thu, 28 Feb 2013 11:43:56 +0100 (CET) Received: from unknown (unknown [10.0.0.42]) by moh3-ve1.go2.pl (Postfix) with SMTP for ; Thu, 28 Feb 2013 11:43:56 +0100 (CET) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id XhYMnM; Thu, 28 Feb 2013 11:43:56 +0100 Message-ID: <512F34E7.40602@o2.pl> Date: Thu, 28 Feb 2013 11:43:51 +0100 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130201 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: freebsd-fs Digest, Vol 506, Issue 4 References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 1, 33 X-O2-SPF: neutral X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:43:57 -0000 On 28/02/2013 10:07, freebsd-fs-request@freebsd.org wrote: > Message: 2 > Date: Wed, 27 Feb 2013 18:01:27 +0100 > From: Ivan Voras > To:freebsd-fs@freebsd.org > Subject: Re: Some filesystem thoughts > Message-ID: > Content-Type: text/plain; charset="utf-8" > > On 20/02/2013 20:26, Radio m?odych bandyt?w wrote: > >> >The way I see it is not to treat files as streams of bytes. That's not >> >what they are, files have meanings and there are tools that bring them >> >out. A picture is a stored emotion. OK, there are no tools for that yet. >> >But it is also an array of pixels. And a container with exif data. And >> >may be a container with an encrypted archive. And, a stream of bytes too. >> >They have multiple facets. >> >I think that it would be useful to somehow expose them to applications. >> >Wouldn't it be useful to be able to grep through pdfs in your email >> >attachments? > I think the problem is presentation - offering just the "grep" function > is waste of effort since those using GUIs will generally not use grep. > What you're talking about is something like google tried to do with > android (and, probably, failed): a unified search interface across all > applications and their data. Not really. grep was just an example of a more general thing; tools having access to not directly visible properties of files. Another example: I download a src.7z from some project because I want to see how do they do a particular thing. To browse it with my file manager, I need to mount it or extract it unless my file manager supports .7z files. It does, I can just step in. Phew, I saved my time...but only until I get to a file that I want to view with my text editor - then I have to do either of this things 'cause there's no path that FM can pass to the editor. 
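A minimal sketch of the kind of glue being described, using a FUSE archive mounter so the editor gets a real path to open; the tool name (archivemount), the paths, and whether a given build can read .7z via libarchive are all illustrative assumptions rather than anything settled in this thread:

# expose the archive members as ordinary files, then any program can open them by path
archivemount src.7z /mnt/src
vim /mnt/src/project/main.c
umount /mnt/src
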
> > Actually, modern smartphones & tablets are slowly moving into the > direction that there are no "files" and no "filesystems" on your device, > but rather jost your "data" and "apps" which both are managed by the > system (and possibly reside in a "cloud"). It may be that the > "hierarhical filesystem" idea has just not so useful or efficient any > more (but OTOH, I don't see it going away any time soon). I checked "hierarchical filesystem" search term. After a quick look I see that it's a thing to read into somewhat deeper. > >> >Mass-edit music tags with sed? Manually edit with your favourite text >> >editor instead of the sucky one-liner provided by your favourite music >> >player? >> >How about video players being able to play videos by reading them in >> >decoded form directly from the filesystem instead of having to integrate >> >a significant number of complex libraries to provide sufficient format >> >coverage? > All those things already exist (or will exist soon) in modern GUI > desktop environments, and especially on handheld-enabled OSes. The way > they are achieved is to introduce a Grand Unified Interface (or several > of them, as it happens), which severly abstract the low-level libraries, > even to the point where the (GUI) application doesn't know it's dealing > with actual files or something completely different. That's not really the same. I don't know if there's any Turing-complete mass-tagger in Unix and for sure that's not a norm. Advanced text edition features like caps corrections are useful in tagging sometimes. More than once I wanted to run jpegtran and similar tools on artwork embedded in music files...the list could go on. Yet why would media managers be supposed to implement such things? There are text editors and scripting languages designed precisely for such jobs, they just don't have access to the file properties needed. They could have, but why would all tools be supposed to implement dozens of interfaces to handle narrow special cases? We already have a Grand Unified Interface interface that almost all programs implement - a filesystem interface. > > If you're more concerned about the technical aspects, then learning to > write filesystems in FUSE would be a good starting point for you. Thanks, but for now I prefer concepts. I know I can learn technicalities, but I don't see myself implementing such thing any time soon and very possibly - ever. -- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 12:48:36 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 80F7832A; Thu, 28 Feb 2013 12:48:36 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2AD4E214; Thu, 28 Feb 2013 12:48:36 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id C3CFF4AC57; Thu, 28 Feb 2013 16:48:26 +0400 (MSK) Date: Thu, 28 Feb 2013 16:48:21 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1698593972.20130228164821@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) 
In-Reply-To: <1158712592.20130228141323@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 12:48:36 -0000 Hello, Lev. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 14:13= :23: LS>>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. LS>>> It crashed a several minutes ago (I don't know reason yet) and fsck LS>>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme LS>>> mismatch) and requested full check (which will take more than hour on LS>>> such FS). LS>> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps run= ning... LS> full fsck reconnected about 1000 files, which was written in time of LS> crash. LS> Really, sever crashed when SVN mirror seed was been unpacking on LS> this FS, so there was massive file creation at this time. Ok, I've checked memory, and now I have booted system with crashlog (!) Here it is (please note, that panic() was called by ffs_valloc): #0 doadump (textdump=3D) at pcpu.h:229 229 pcpu.h: No such file or directory. in pcpu.h (kgdb) #0 doadump (textdump=3D) at pcpu.h:229 #1 0xffffffff80431494 in kern_reboot (howto=3D260) at /usr/src/sys/kern/kern_shutdown.c:448 #2 0xffffffff80431997 in panic (fmt=3D0x1
) at /usr/src/sys/kern/kern_shutdown.c:636 #3 0xffffffff80573d8c in ffs_valloc (pvp=0xfffffe0024d68000, mode=33204, cred=0xfffffe0023d52700, vpp=0xffffff81c35586b8) at /usr/src/sys/ufs/ffs/ffs_alloc.c:995 #4 0xffffffff805aa126 in ufs_makeinode (mode=33204, dvp=0xfffffe0024d68000, vpp=0xffffff81c3558a10, cnp=0xffffff81c3558a38) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2614 #5 0xffffffff80634391 in VOP_CREATE_APV (vop=, a=0xffffff81c3558920) at vnode_if.c:252 #6 0xffffffff804d389a in vn_open_cred (ndp=0xffffff81c35589d0, flagp=0xffffff81c35589cc, cmode=, vn_open_flags=, cred=0xfffffe0023d52700, fp=0xfffffe00ae9cf370) at vnode_if.h:109 #7 0xffffffff804cc0d9 in kern_openat (td=0xfffffe012d095000, fd=-100, path=0x801c951e0
, pathseg=3DUIO_USERSPACE, flags=3D2562, mode=3D) at /usr/src/sys/kern/vfs_syscalls.c:1132 #8 0xffffffff805f1400 in amd64_syscall (td=3D0xfffffe012d095000, traced=3D= 0) at subr_syscall.c:135 #9 0xffffffff805dbfc7 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:387 #10 0x000000080177ce5c in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) Full textdump: http://lev.serebryakov.spb.ru/crashes/core-ffs-crash.txt.1 Please note, that FS was loaded by torrent client (40Mbit/s outbound traffic) and unpacking of svnmirror-base-r238500.tar.xz from this FS to itself. So, it was really high multistream load. I'll try to reproduce this on SINGLE disk, without geom_radi5 :) --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 14:57:02 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 74FFFB0E; Thu, 28 Feb 2013 14:57:02 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 1ABF19A1; Thu, 28 Feb 2013 14:57:02 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 858794AC57; Thu, 28 Feb 2013 18:56:53 +0400 (MSK) Date: Thu, 28 Feb 2013 18:56:47 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1502041051.20130228185647@serebryakov.spb.ru> To: Ivan Voras Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 14:57:02 -0000 Hello, Ivan. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 18:19= :38: >> Maybe, it is subtile interference between raid5 implementation and >> SU+J, but in such case I want to understand what does raid5 do >> wrong. IV> You guessed correctly, I was going to blame geom_raid5 :) It is not first time :( But every time such discussion ends without any practical results. One time, Kirk say, that delayed writes are Ok for SU until bottom layer doesn't lie about operation completeness. geom_raid5 could delay writes (in hope that next writes will combine nicely and allow not to do read-calculate-write cycle for read alone), but it never mark BIO complete until it is really completed (layers down to geom_raid5 returns completion). So, every BIO in wait queue is "in flight" from GEOM/VFS point of view. Maybe, it is fatal for journal :( And want I really want to see is "SYNC" flag for BIO and that all journal-related writes will be marked with it. Also all commits originated with fsync() MUST be marked in same way, really. 
Alexander Motin (ahci driver author) assured me, that he'll add support for such flag in driver to flush drive cache too, if it will be introduced. IMHO, lack of this (or similar) flag is bad idea even without geom_raid5 with its optimistic behavior. There was commit r246876, but I don't understand exactly what it means, as no real FS or driver's code was touched. But I'm writing about this idea for 3rd or 4th time without any results :( And I don't mean, that it should be implemented ASAP by someone, I mean I didn't see any support from FS guys (Kirk and somebody else, I don't remember exactly participants of these old thread, but he was not you) like "go ahead and send your patch". All these threads was very defensive from FS guru side, like "we don't need it, fix hardware, disable caches". IV> Is this a production setup you have? Can you afford to destroy it and IV> re-create it for the purpose of testing, this time with geom_raid3 IV> (which should be synchronous with respect to writes)? Unfortunately, it is production setup and I don't have any spare hardware for second one :( I've posted panic stacktrace -- and it is FFS-related too -- and now preparing setup with only one HDD and same high load to try reproduce it without geom_raid5. But I don't have enough hardware (3 spare HDDs at least!) to reproduce it with geom_raid3 or other copy of geiom_radi5. --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 15:00:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A5BAFCFA; Thu, 28 Feb 2013 15:00:57 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 67FD79EA; Thu, 28 Feb 2013 15:00:57 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 06C084AC58; Thu, 28 Feb 2013 19:00:54 +0400 (MSK) Date: Thu, 28 Feb 2013 19:00:49 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1843530475.20130228190049@serebryakov.spb.ru> To: Ivan Voras Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: <1502041051.20130228185647@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> <1502041051.20130228185647@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 15:00:57 -0000 Hello, Ivan. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 18:56= :47: LS> There was commit r246876, but I don't understand exactly what it LS> means, as no real FS or driver's code was touched. 
And, yes, barriers are much stronger than "sync writes", as they should flush all previous writes, even that is not related to journal or metadata and could wait more (simple file data could be fixed on plates out of order without destroying filesystem structure). --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 15:28:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 49CD088A; Thu, 28 Feb 2013 15:28:10 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id DF1EEB6A; Thu, 28 Feb 2013 15:28:09 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqEEADd1L1GDaFvO/2dsb2JhbABFhk+5AYJcgRBzgh8BAQQBIwRSBRYOCgICDRkCWQaIIAavWJIXgSOMKoETNAeCLYETA4hqjVeJY4cHgyaBSz4 X-IronPort-AV: E=Sophos;i="4.84,755,1355115600"; d="scan'208";a="16276035" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 28 Feb 2013 10:28:03 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 15CCAB3F18; Thu, 28 Feb 2013 10:28:03 -0500 (EST) Date: Thu, 28 Feb 2013 10:28:03 -0500 (EST) From: Rick Macklem To: Konstantin Belousov Message-ID: <664298325.3403590.1362065283063.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20130228070515.GK2454@kib.kiev.ua> Subject: Re: should vn_fullpath1() ever return a path with "." in it? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: FreeBSD Filesystems , Sergey Kandaurov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 15:28:10 -0000 Konstantin Belousov wrote: > On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > > Hi, > > > > Sergey Kandaurov reported a problem where getcwd() returns a > > path with "/./" imbedded in it for an NFSv4 mount. This is > > caused by a mount point crossing on the server when at the > > server's root because vn_fullpath1() uses VV_ROOT to spot > > mount point crossings. > > > > The current workaround is to use the sysctls: > > debug.disablegetcwd=1 > > debug.disablefullpath=1 > > > > However, it would be nice to fix this when vn_fullpath1() > > is being used. > > > > A simple fix is to have vn_fullpath1() fail when it finds > > "." as a directory match in the path. When vn_fullpath1() > > fails, the syscalls fail and that allows the libc algorithm > > to be used (which works for this case because it doesn't > > depend on VV_ROOT being set, etc). > > > > So, I am wondering if a patch (I have attached one) that > > makes vn_fullpath1() fail when it matches "." will break > > anything else? (I don't think so, since the code checks > > for VV_ROOT in the loop above the check for a match of > > ".", but I am not sure?) > > > > Thanks for any input w.r.t. 
this, rick > > > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > > startvp, NULL, 0, 0); > > break; > > } > > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || > > + buf[buflen + 1] == '/')) { > > + /* > > + * Fail if it matched ".". This should only happen > > + * for NFSv4 mounts that cross server mount points. > > + */ > > + CACHE_RUNLOCK(); > > + vrele(vp); > > + numfullpathfail1++; > > + error = ENOENT; > > + SDT_PROBE(vfs, namecache, fullpath, return, > > + error, vp, NULL, 0, 0); > > + break; > > + } > > buf[--buflen] = '/'; > > slash_prefixed = 1; > > } > > I do not quite understand this. Did the dvp (parent) vnode returned by > VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? Well, the vnodes aren't the same, but the fileid (think NFS i-node#) is the value for "." and ".." (2 for a UFS exported fs). The vnodes are based on the file handles and dvp will be for the mount point in the other file system on the server. NFSv4 has 2 attributes for a server mount point directory: fileid - which is the fileid# for the root (2 for UFS) mounted_on_fileid - which is the fileid of the directory in the parent file system The parent file system has a different fsid, which becomes the st_dev and, as such, the userland algorithm in getcwd() works. The case I test is where the server mount point is one directory level below the local mount point in the client. For example: /mnt is the local mount point and /mnt/sub1 is a server mount point (different file system than /mnt). - when vn_fullpath1() gets up to /mnt/sub1 (which doesn't have VV_ROOT set on it), vn_vptocnp_locked() matches "." for the fileno. I think there is code in vn_vptocnp_locked() that avoids a match for ".." or that could match too. - then it does /mnt, which does have VV_ROOT set and it works. > It must be, for the correct operation, but also it should cause the > almost > infinite loop in the vn_fullpath1(). The loop is not really infinite > due > to a limited size of the buffer where the infinite amount of "./" is > placed. > As noted above, I think this loop is avoided by dvp != vp. Within the NFSv4 mount, there can be multiple instances of a fileid (st_ino), but the have different fsids (st_dev) and different vnodes. > Anyway, I think we should do better than this patch, even if it is > legitimate. I think that the better place to check the condition is > the > default implementation of VOP_VPTOCNP(). Am I right that this is where > it broke for you ? > Yep. I wasn't sure what the implications of putting the fix further down were. (I was planning to ask if the patch should go in a lower level function, but forgot to ask;-) I'll test this patch and let you know if it works. 
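To make the st_dev point concrete, here is an illustrative sketch (written for this discussion, not the actual libc source) of how a userland getcwd() fallback names one path component: it scans the parent directory for the entry whose (st_dev, st_ino) pair matches the child, so a "." entry belonging to the other file system on the server can never be mistaken for the child directory.

#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: find the name of the directory described by *child inside
 * the directory at parent, by matching device and inode numbers.
 */
static int
name_in_parent(const char *parent, const struct stat *child,
    char *name, size_t namelen)
{
	char path[1024];
	struct dirent *de;
	struct stat sb;
	DIR *dir;

	if ((dir = opendir(parent)) == NULL)
		return (-1);
	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		(void)snprintf(path, sizeof(path), "%s/%s", parent, de->d_name);
		if (lstat(path, &sb) == 0 && sb.st_dev == child->st_dev &&
		    sb.st_ino == child->st_ino) {
			(void)strlcpy(name, de->d_name, namelen);
			closedir(dir);
			return (0);
		}
	}
	closedir(dir);
	return (-1);
}
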
Thanks, rick > diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c > index 00d064e..1dd0185 100644 > --- a/sys/kern/vfs_default.c > +++ b/sys/kern/vfs_default.c > @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) > error = ENOMEM; > goto out; > } > - bcopy(dp->d_name, buf + i, dp->d_namlen); > - error = 0; > + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { > + error = ENOENT; > + } else { > + bcopy(dp->d_name, buf + i, dp->d_namlen); > + error = 0; > + } > goto out; > } > } while (len > 0 || !eofflag); From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 17:32:35 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 89D31B93; Thu, 28 Feb 2013 17:32:35 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 44C183FC; Thu, 28 Feb 2013 17:32:35 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 64B434AC57; Thu, 28 Feb 2013 21:32:23 +0400 (MSK) Date: Thu, 28 Feb 2013 21:32:17 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <127827160.20130228213217@serebryakov.spb.ru> To: Ivan Voras Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> <1502041051.20130228185647@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 17:32:35 -0000 Hello, Ivan. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 21:01= :46: >> One time, Kirk say, that delayed writes are Ok for SU until bottom >> layer doesn't lie about operation completeness. geom_raid5 could >> delay writes (in hope that next writes will combine nicely and allow >> not to do read-calculate-write cycle for read alone), but it never >> mark BIO complete until it is really completed (layers down to >> geom_raid5 returns completion). So, every BIO in wait queue is "in >> flight" from GEOM/VFS point of view. Maybe, it is fatal for journal :( IV> It shouldn't be - it could be a bug. I'll try to reproduce it on VM, but it could be hard, as virtual storage have very different (really -- much simpler) characteristics and behavior. >> And want I really want to see is "SYNC" flag for BIO and that all >> journal-related writes will be marked with it. Also all commits >> originated with fsync() MUST be marked in same way, really. Alexander >> Motin (ahci driver author) assured me, that he'll add support for >> such flag in driver to flush drive cache too, if it will be >> introduced. IV> Hmmm, once upon a time I actually tried to add it: IV> http://people.freebsd.org/~ivoras/diffs/fsync_flush.patch I have almost the same patch here :) IV> This is from 2011, and was never really reviewed. 
Kirk said it was a IV> good idea (meaning the implementation could be wrong, YMMV) :) It will be great to see this idea committed, really! Could I help somehow? IV> I don't know whether it's significant, but ffs_softdep.c contains 6 IV> bawrite() calls (meaning buf async write), in softdep_process_journal(), IV> softdep_journal_freeblocks(), softdep_fsync_mountdev(), sync_cgs(), and IV> flush_deplist(). As far as I understand (I've examined this code when try to understand how to add this BIO_SYNC flag), ASYNC/SYNC here means something different. It is only about does caller want to sleep till operation is completed, and doesn't mean sync or async write... So, I'm not sure, which of these calls should be marked for flushing (or should be marked with new ORDERED/BARRIER flag at least). SU code is complicated enough without journal, and with journal it is much more complicated to simply understand it :( --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 23:49:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2D90E50F for ; Thu, 28 Feb 2013 23:49:27 +0000 (UTC) (envelope-from allan@physics.umn.edu) Received: from mail.physics.umn.edu (smtp.spa.umn.edu [128.101.220.4]) by mx1.freebsd.org (Postfix) with ESMTP id 082FB9D2 for ; Thu, 28 Feb 2013 23:49:26 +0000 (UTC) Received: from spa-sysadm-01.spa.umn.edu ([134.84.199.8]) by mail.physics.umn.edu with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.77 (FreeBSD)) (envelope-from ) id 1UBCr5-000Bts-VB for freebsd-fs@freebsd.org; Thu, 28 Feb 2013 17:25:52 -0600 Message-ID: <512FE773.3060903@physics.umn.edu> Date: Thu, 28 Feb 2013 17:25:39 -0600 From: Graham Allan User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on mrmachenry.spa.umn.edu X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50, TW_ZF,T_RP_MATCHES_RCVD autolearn=no version=3.3.2 Subject: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers X-SA-Exim-Version: 4.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 23:49:27 -0000 Sorry to come in late on this thread but I've been struggling with thinking about the same issue, from a different perspective. Several months ago we created our first "large" ZFS storage system, using 42 drives plus a few SSDs in one of the oft-used Supermicro 45-drive chassis. It has been working really nicely but has led to some puzzling over the best way to do some things when we build more. We made our pool using geom drive labels. Ever since, I've been wondering if this really gives any advantage - at least for this type of system. If you need to replace a drive, you don't really know which enclosure slot any given da device is, and so our answer has been to dig around using sg3_utils commands wrapped in a bit of perl, to try and correlate the da device to the slot via the drive serial number. 
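A minimal sketch of that correlation step done with base-system tools, where the device name, the partitioning choice and the label text are assumptions rather than the actual setup being described: pull the serial number with camcontrol, then encode either the physical location or the serial itself in a GPT label so the pool member's name says where the disk lives.

camcontrol inquiry da12 -S                    # print the drive's serial number
gpart create -s gpt da12
gpart add -t freebsd-zfs -l enc0-A-3 da12     # label encodes enclosure/column/row
# The disk can then be handed to "zpool create" or "zpool replace" as
# /dev/gpt/enc0-A-3, so the vdev name itself identifies the slot.
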
At this point, having a geom label just seems like an extra bit of indirection to increase my confusion :-) Although setting the geom label to the drive serial number might be a serious improvement... We're about to add a couple more of these shelves to the system, giving a total of 135 drives (although each shelf would be a separate pool), and given that they will be standard consumer grade drives, some frequency of replacement is a given. Does anyone have any good tips on how to manage a large number of drives in a zfs pool like this? Thanks, Graham -- ------------------------------------------------------------------------- Graham Allan School of Physics and Astronomy - University of Minnesota ------------------------------------------------------------------------- From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 00:58:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 13E9F878; Fri, 1 Mar 2013 00:58:54 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id C0372CF7; Fri, 1 Mar 2013 00:58:53 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAC/8L1GDaFvO/2dsb2JhbABFhk+5CYJcgRRzgh8BAQUjBFIbDgoCAg0ZAlkGiCavMZIhgSOMKoETNAeCLYETA4hqjVeJY4cHgyaBSz4 X-IronPort-AV: E=Sophos;i="4.84,758,1355115600"; d="scan'208";a="16387626" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 28 Feb 2013 19:58:51 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 76EBDB3F7D; Thu, 28 Feb 2013 19:58:51 -0500 (EST) Date: Thu, 28 Feb 2013 19:58:51 -0500 (EST) From: Rick Macklem To: Konstantin Belousov Message-ID: <1208475167.3432384.1362099531469.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20130228070515.GK2454@kib.kiev.ua> Subject: Re: should vn_fullpath1() ever return a path with "." in it? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: FreeBSD Filesystems , Sergey Kandaurov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 00:58:54 -0000 Kostik Belousov wrote: > On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > > Hi, > > > > Sergey Kandaurov reported a problem where getcwd() returns a > > path with "/./" imbedded in it for an NFSv4 mount. This is > > caused by a mount point crossing on the server when at the > > server's root because vn_fullpath1() uses VV_ROOT to spot > > mount point crossings. > > > > The current workaround is to use the sysctls: > > debug.disablegetcwd=1 > > debug.disablefullpath=1 > > > > However, it would be nice to fix this when vn_fullpath1() > > is being used. > > > > A simple fix is to have vn_fullpath1() fail when it finds > > "." as a directory match in the path. When vn_fullpath1() > > fails, the syscalls fail and that allows the libc algorithm > > to be used (which works for this case because it doesn't > > depend on VV_ROOT being set, etc). 
> > > > So, I am wondering if a patch (I have attached one) that > > makes vn_fullpath1() fail when it matches "." will break > > anything else? (I don't think so, since the code checks > > for VV_ROOT in the loop above the check for a match of > > ".", but I am not sure?) > > > > Thanks for any input w.r.t. this, rick > > > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > > startvp, NULL, 0, 0); > > break; > > } > > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || > > + buf[buflen + 1] == '/')) { > > + /* > > + * Fail if it matched ".". This should only happen > > + * for NFSv4 mounts that cross server mount points. > > + */ > > + CACHE_RUNLOCK(); > > + vrele(vp); > > + numfullpathfail1++; > > + error = ENOENT; > > + SDT_PROBE(vfs, namecache, fullpath, return, > > + error, vp, NULL, 0, 0); > > + break; > > + } > > buf[--buflen] = '/'; > > slash_prefixed = 1; > > } > > I do not quite understand this. Did the dvp (parent) vnode returned by > VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? > It must be, for the correct operation, but also it should cause the > almost > infinite loop in the vn_fullpath1(). The loop is not really infinite > due > to a limited size of the buffer where the infinite amount of "./" is > placed. > > Anyway, I think we should do better than this patch, even if it is > legitimate. I think that the better place to check the condition is > the > default implementation of VOP_VPTOCNP(). Am I right that this is where > it broke for you ? > > diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c > index 00d064e..1dd0185 100644 > --- a/sys/kern/vfs_default.c > +++ b/sys/kern/vfs_default.c > @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) > error = ENOMEM; > goto out; > } > - bcopy(dp->d_name, buf + i, dp->d_namlen); > - error = 0; > + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { > + error = ENOENT; > + } else { > + bcopy(dp->d_name, buf + i, dp->d_namlen); > + error = 0; > + } > goto out; > } > } while (len > 0 || !eofflag); Yes, this patch fixes the problem too. If you think it is safe to do this, I can commit the patch in mid-April. Maybe Sergey can test it? 
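A throw-away test along these lines could confirm it; the program below is a sketch written for this exchange, not part of the patch, and it simply reports whether "/./" still leaks into the path when run with the current directory below a server-side mount point crossing on the NFSv4 mount:

#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	char buf[PATH_MAX];

	if (getcwd(buf, sizeof(buf)) == NULL) {
		perror("getcwd");
		return (2);
	}
	printf("%s\n", buf);
	/* Non-zero exit status if the bogus "." component is still present. */
	return (strstr(buf, "/./") != NULL);
}

Comparing its behaviour with and without the debug.disablegetcwd / debug.disablefullpath workaround should also show whether the kernel lookup now fails cleanly and lets the libc fallback take over.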
Thanks yet again, rick From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 03:30:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A954ECDD for ; Fri, 1 Mar 2013 03:30:32 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com [209.85.216.48]) by mx1.freebsd.org (Postfix) with ESMTP id 707212F3 for ; Fri, 1 Mar 2013 03:30:32 +0000 (UTC) Received: by mail-qa0-f48.google.com with SMTP id j8so1904613qah.14 for ; Thu, 28 Feb 2013 19:30:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=bxTE2Siu5yBzT8clYam6Sc3Ucvct8ao/QL0OoX+9Mcc=; b=r7GyQ3GC0+uiSENmxWSiWepwsVkQIttMkBZsU7a9oySlHDkSL+DMttwyX1F2l2xNe1 LTuO0uvbBAjAYvtPwyJakF6xZ4IusShSC+/nr9AKq+KcZiGAb/hmRVAFNf8rWfiUeIEl Az0o1Qdzps+b+lICnJsyW1XlktoY4zASbqmt/lMN2qFfmPVurElhuSiVe1DlX/rbi2Ro Qv5ayY0ZakOjGoi/fjubS2MUuDWofh97Ibn83tauFer2Az91Jkp3ZxAU7WZ/Y1vJyk+D ceNdsYbDJb9RBb38AfMWmAXupHyqjW/Biwf6BffIKhxG5sCFhF8Ka3NluKNPDSyK42T5 4Y4w== MIME-Version: 1.0 X-Received: by 10.49.128.170 with SMTP id np10mr2041434qeb.37.1362108625286; Thu, 28 Feb 2013 19:30:25 -0800 (PST) Received: by 10.49.106.233 with HTTP; Thu, 28 Feb 2013 19:30:25 -0800 (PST) Received: by 10.49.106.233 with HTTP; Thu, 28 Feb 2013 19:30:25 -0800 (PST) In-Reply-To: <512FE773.3060903@physics.umn.edu> References: <512FE773.3060903@physics.umn.edu> Date: Thu, 28 Feb 2013 19:30:25 -0800 Message-ID: Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers From: Freddie Cash To: Graham Allan Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 03:30:32 -0000 You label the drive with something that tells you: - enclosure - column - row IOW, something that definitively tells you where the drive is located, without having to pull the drive to find it. To do so, you have to install 1 drive at a time, and label it at that point. For example, we use the following pattern: encX-A-# Where X tells you which enclosure it's in, A tells you which column it's in (letters start at A increasing to the right), and # tells you the disk in the column, numbered top-down. Whether you label the entire drive using glabel or just a GPT partition is up to you. We use GPT labels. On 2013-02-28 3:49 PM, "Graham Allan" wrote: > Sorry to come in late on this thread but I've been struggling with > thinking about the same issue, from a different perspective. > > Several months ago we created our first "large" ZFS storage system, using > 42 drives plus a few SSDs in one of the oft-used Supermicro 45-drive > chassis. It has been working really nicely but has led to some puzzling > over the best way to do some things when we build more. > > We made our pool using geom drive labels. Ever since, I've been wondering > if this really gives any advantage - at least for this type of system. 
If > you need to replace a drive, you don't really know which enclosure slot any > given da device is, and so our answer has been to dig around using > sg3_utils commands wrapped in a bit of perl, to try and correlate the da > device to the slot via the drive serial number. > > At this point, having a geom label just seems like an extra bit of > indirection to increase my confusion :-) Although setting the geom label to > the drive serial number might be a serious improvement... > > We're about to add a couple more of these shelves to the system, giving a > total of 135 drives (although each shelf would be a separate pool), and > given that they will be standard consumer grade drives, some frequency of > replacement is a given. > > Does anyone have any good tips on how to manage a large number of drives > in a zfs pool like this? > > Thanks, > > Graham > -- > ------------------------------**------------------------------** > ------------- > Graham Allan > School of Physics and Astronomy - University of Minnesota > ------------------------------**------------------------------** > ------------- > ______________________________**_________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/**mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org > " > From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 05:11:47 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A524AB8A; Fri, 1 Mar 2013 05:11:47 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (gw.catspoiler.org [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 6F80D7FC; Fri, 1 Mar 2013 05:11:47 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id r215BWoU092532; Thu, 28 Feb 2013 21:11:36 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <201303010511.r215BWoU092532@gw.catspoiler.org> Date: Thu, 28 Feb 2013 21:11:32 -0800 (PST) From: Don Lewis Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) To: lev@FreeBSD.org In-Reply-To: <1698593972.20130228164821@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=iso-8859-5 Content-Transfer-Encoding: 8BIT Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 05:11:47 -0000 On 28 Feb, Lev Serebryakov wrote: > Hello, Lev. > You wrote 28 äÕÒàÐÛï 2013 Ó., 14:13:23: > > LS>>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. > LS>>> It crashed a several minutes ago (I don't know reason yet) and fsck > LS>>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme > LS>>> mismatch) and requested full check (which will take more than hour on > LS>>> such FS). > LS>> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps running... > LS> full fsck reconnected about 1000 files, which was written in time of > LS> crash. > LS> Really, sever crashed when SVN mirror seed was been unpacking on > LS> this FS, so there was massive file creation at this time. > Ok, I've checked memory, and now I have booted system with crashlog > (!) 
> > Here it is (please note, that panic() was called by ffs_valloc): > > #0 doadump (textdump=) at pcpu.h:229 > 229 pcpu.h: No such file or directory. > in pcpu.h > (kgdb) #0 doadump (textdump=) at pcpu.h:229 > #1 0xffffffff80431494 in kern_reboot (howto=260) > at /usr/src/sys/kern/kern_shutdown.c:448 > #2 0xffffffff80431997 in panic (fmt=0x1
) > at /usr/src/sys/kern/kern_shutdown.c:636 > #3 0xffffffff80573d8c in ffs_valloc (pvp=0xfffffe0024d68000, mode=33204, > cred=0xfffffe0023d52700, vpp=0xffffff81c35586b8) > at /usr/src/sys/ufs/ffs/ffs_alloc.c:995 > #4 0xffffffff805aa126 in ufs_makeinode (mode=33204, dvp=0xfffffe0024d68000, > vpp=0xffffff81c3558a10, cnp=0xffffff81c3558a38) > at /usr/src/sys/ufs/ufs/ufs_vnops.c:2614 > #5 0xffffffff80634391 in VOP_CREATE_APV (vop=, > a=0xffffff81c3558920) at vnode_if.c:252 > #6 0xffffffff804d389a in vn_open_cred (ndp=0xffffff81c35589d0, > flagp=0xffffff81c35589cc, cmode=, > vn_open_flags=, cred=0xfffffe0023d52700, > fp=0xfffffe00ae9cf370) at vnode_if.h:109 > #7 0xffffffff804cc0d9 in kern_openat (td=0xfffffe012d095000, fd=-100, > path=0x801c951e0
, > pathseg=UIO_USERSPACE, flags=2562, mode=) > at /usr/src/sys/kern/vfs_syscalls.c:1132 > #8 0xffffffff805f1400 in amd64_syscall (td=0xfffffe012d095000, traced=0) > at subr_syscall.c:135 > #9 0xffffffff805dbfc7 in Xfast_syscall () > at /usr/src/sys/amd64/amd64/exception.S:387 > #10 0x000000080177ce5c in ?? () > Previous frame inner to this frame (corrupt stack?) > (kgdb) > > Full textdump: http://lev.serebryakov.spb.ru/crashes/core-ffs-crash.txt.1 > > Please note, that FS was loaded by torrent client (40Mbit/s outbound > traffic) and unpacking of svnmirror-base-r238500.tar.xz from this FS > to itself. So, it was really high multistream load. > > I'll try to reproduce this on SINGLE disk, without geom_radi5 :) The fact that the filesystem code called panic() indicates that the filesystem was already corrupt by that point. That's a likely reason for fsck complaining about the unexpected SU+J inconsistency. Incorrect write ordering that allowed the filesystem to become inconsistent because some pending writes were lost because of the panic might not be necessary, but this might have allowed an earlier crash where a full fsck was skipped to leave the filesystem in this state. This panic might also be a result of the bug fixed in 246877, but I have my doubts about that. From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 06:22:51 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7AACA312; Fri, 1 Mar 2013 06:22:51 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2501C9DE; Fri, 1 Mar 2013 06:22:51 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id E0E064AC57; Fri, 1 Mar 2013 10:22:43 +0400 (MSK) Date: Fri, 1 Mar 2013 10:22:37 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <352538988.20130301102237@serebryakov.spb.ru> To: Don Lewis Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) In-Reply-To: <201303010511.r215BWoU092532@gw.catspoiler.org> References: <1698593972.20130228164821@serebryakov.spb.ru> <201303010511.r215BWoU092532@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-5 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 06:22:51 -0000 Hello, Don. You wrote 1 =DC=D0=E0=E2=D0 2013 =D3., 9:11:32: DL> The fact that the filesystem code called panic() indicates that the DL> filesystem was already corrupt by that point. That's a likely reason DL> for fsck complaining about the unexpected SU+J inconsistency. DL> Incorrect write ordering that allowed the filesystem to become DL> inconsistent because some pending writes were lost because of the panic DL> might not be necessary, but this might have allowed an earlier crash DL> where a full fsck was skipped to leave the filesystem in this state. 
As far, as I understand, if this theory is right (file system corruption which left unnoticed by "standard" fsck), it is bug in FFS SU+J too, as it should not be corrupted by reordered writes (if writes is properly reported as completed even if they were reordered). DL> This panic might also be a result of the bug fixed in 246877, but I have DL> my doubts about that. It was not MFCed :( --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 08:03:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 373B8503 for ; Fri, 1 Mar 2013 08:03:01 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-wg0-f45.google.com (mail-wg0-f45.google.com [74.125.82.45]) by mx1.freebsd.org (Postfix) with ESMTP id B8954E67 for ; Fri, 1 Mar 2013 08:03:00 +0000 (UTC) Received: by mail-wg0-f45.google.com with SMTP id dq12so2237675wgb.24 for ; Fri, 01 Mar 2013 00:02:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=WWkYj5OlnVmugdkk5dqssw9/rcfRXLKBJ2ZY9cAlloU=; b=wDDpN4Dmnnqg5LfOP7yEDnAe3E8XGxj6MaII/JsiwZ3/huh+OVjee2xC92jaOG0eSb Nwsix297bM/IuQYEU6yppEFYd2S+7+8YpMIe4RU5O5lBUVQUiCLxCss/EUOnAl1LGY4E v8zUp19IU0bcmlOlMGVbld0WrkY6smEN2o6nVyBC0KbYmB35py6NZwXd9d/k1HN6a2lJ fl7OmXQA8ky+9pjkxAAvllcwGLFjeniPcb4qPUL+7lZfSP05xMiOjtzcxhaEDDqsGW7l suGxv5Sv3uMdxJcckm60D6+nWzOIl/e6lBQUpy71bj5ceDxCBNyr8zGDuawzmOoBT/cP gtSQ== MIME-Version: 1.0 X-Received: by 10.180.79.227 with SMTP id m3mr2184825wix.12.1362124974582; Fri, 01 Mar 2013 00:02:54 -0800 (PST) Sender: pluknet@gmail.com Received: by 10.194.86.167 with HTTP; Fri, 1 Mar 2013 00:02:54 -0800 (PST) In-Reply-To: <1208475167.3432384.1362099531469.JavaMail.root@erie.cs.uoguelph.ca> References: <20130228070515.GK2454@kib.kiev.ua> <1208475167.3432384.1362099531469.JavaMail.root@erie.cs.uoguelph.ca> Date: Fri, 1 Mar 2013 11:02:54 +0300 X-Google-Sender-Auth: NiIxkvWo5PgdXSyMFYs_GHzvosA Message-ID: Subject: Re: should vn_fullpath1() ever return a path with "." in it? From: Sergey Kandaurov To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 08:03:01 -0000 On 1 March 2013 04:58, Rick Macklem wrote: > Kostik Belousov wrote: >> On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: >> > Hi, >> > >> > Sergey Kandaurov reported a problem where getcwd() returns a >> > path with "/./" imbedded in it for an NFSv4 mount. This is >> > caused by a mount point crossing on the server when at the >> > server's root because vn_fullpath1() uses VV_ROOT to spot >> > mount point crossings. >> > >> > The current workaround is to use the sysctls: >> > debug.disablegetcwd=1 >> > debug.disablefullpath=1 >> > >> > However, it would be nice to fix this when vn_fullpath1() >> > is being used. >> > >> > A simple fix is to have vn_fullpath1() fail when it finds >> > "." as a directory match in the path. When vn_fullpath1() >> > fails, the syscalls fail and that allows the libc algorithm >> > to be used (which works for this case because it doesn't >> > depend on VV_ROOT being set, etc). 
>> > >> > So, I am wondering if a patch (I have attached one) that >> > makes vn_fullpath1() fail when it matches "." will break >> > anything else? (I don't think so, since the code checks >> > for VV_ROOT in the loop above the check for a match of >> > ".", but I am not sure?) >> > >> > Thanks for any input w.r.t. this, rick >> >> > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 >> > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 >> > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v >> > startvp, NULL, 0, 0); >> > break; >> > } >> > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || >> > + buf[buflen + 1] == '/')) { >> > + /* >> > + * Fail if it matched ".". This should only happen >> > + * for NFSv4 mounts that cross server mount points. >> > + */ >> > + CACHE_RUNLOCK(); >> > + vrele(vp); >> > + numfullpathfail1++; >> > + error = ENOENT; >> > + SDT_PROBE(vfs, namecache, fullpath, return, >> > + error, vp, NULL, 0, 0); >> > + break; >> > + } >> > buf[--buflen] = '/'; >> > slash_prefixed = 1; >> > } >> >> I do not quite understand this. Did the dvp (parent) vnode returned by >> VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? >> It must be, for the correct operation, but also it should cause the >> almost >> infinite loop in the vn_fullpath1(). The loop is not really infinite >> due >> to a limited size of the buffer where the infinite amount of "./" is >> placed. >> >> Anyway, I think we should do better than this patch, even if it is >> legitimate. I think that the better place to check the condition is >> the >> default implementation of VOP_VPTOCNP(). Am I right that this is where >> it broke for you ? >> >> diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c >> index 00d064e..1dd0185 100644 >> --- a/sys/kern/vfs_default.c >> +++ b/sys/kern/vfs_default.c >> @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) >> error = ENOMEM; >> goto out; >> } >> - bcopy(dp->d_name, buf + i, dp->d_namlen); >> - error = 0; >> + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { >> + error = ENOENT; >> + } else { >> + bcopy(dp->d_name, buf + i, dp->d_namlen); >> + error = 0; >> + } >> goto out; >> } >> } while (len > 0 || !eofflag); > > Yes, this patch fixes the problem too. If you think it is safe to > do this, I can commit the patch in mid-April. Maybe Sergey can > test it? > > Thanks yet again, rick > Hi Rick Sorry but I am no longer able to test NFSv4. 
-- wbr, pluknet From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 11:26:43 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A9F28330 for ; Fri, 1 Mar 2013 11:26:43 +0000 (UTC) (envelope-from mailinglists.tech@gmail.com) Received: from mail-qa0-f47.google.com (mail-qa0-f47.google.com [209.85.216.47]) by mx1.freebsd.org (Postfix) with ESMTP id 74F4C838 for ; Fri, 1 Mar 2013 11:26:43 +0000 (UTC) Received: by mail-qa0-f47.google.com with SMTP id j8so4940908qah.13 for ; Fri, 01 Mar 2013 03:26:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to :content-type; bh=1/1bIr7tY46lLqU0iBTmNKABSYkFx/QwcpeqTeRnUIU=; b=AHL839mAsZIj9LIrnrAAA4N5s/4fOeFuFU4ZXaDizbC18+EctLm6pMwm3b6a5S1Hiu 4e30j25dh5Pho4VYtcp8TKEk4gsUbk3Vs2GXTN9COuX7BrM67qG2USrqGyKVk4LWOBLT rX0m38fxBSOE8Adxbhh54Eg1MCzS2Zoo4S/zAlSchiMh3BUF5MQYBuR+Gp4ocz9lDz52 vevkb97Pcar47kkIRe85oX8RMJ+NpVQpVYYQkjaDR4YiShk+e3uOZqPLCrkjedaLUj58 2ezFA+IztbCOOZIXamHVyfdMoZo6TagncAMv+0A9XSgya1EYy3FMB5DbWGbdFk60hXIW vNHg== MIME-Version: 1.0 X-Received: by 10.229.203.78 with SMTP id fh14mr3515476qcb.143.1362137197659; Fri, 01 Mar 2013 03:26:37 -0800 (PST) Received: by 10.49.110.70 with HTTP; Fri, 1 Mar 2013 03:26:37 -0800 (PST) Date: Fri, 1 Mar 2013 12:26:37 +0100 Message-ID: Subject: I am to silly to mount a zpool while boot From: tech mailinglists To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 11:26:43 -0000 Hello all, I think that I only can be an idiot to get in such a problem but I am not able to mount a zpool via fstab while boot. I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount at /home). I now tried everything I could find. So my fstab entry looks like this: home /home zfs rw,late 0 0 The real problem is that after a reboot the zpool is no longer imported, I really don't know why I always have to reimport the pool via zpool import -d /dev home. Because of this the filesystem never can be mounted via fstab while boot and I get dropped into a shell where I need to do this always manually. So why the pool always isn't imported after boot and how can I solve this issue? And is the fstab entry correct itself? So would it work when the pool gets imported with it's name befor the fstab entry is parsed? Hope that someone give me a few hints or a solution. 
Best Regards From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 11:28:20 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C279F497; Fri, 1 Mar 2013 11:28:20 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 82D62854; Fri, 1 Mar 2013 11:28:20 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 9F5BE4AC59; Fri, 1 Mar 2013 15:28:02 +0400 (MSK) Date: Fri, 1 Mar 2013 15:27:56 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <612776324.20130301152756@serebryakov.spb.ru> To: Ivan Voras , Don Lewis Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> <1502041051.20130228185647@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 11:28:20 -0000 Hello, Ivan. You wrote on 28 February 2013 at 21:01:46: >> One time, Kirk say, that delayed writes are Ok for SU until bottom >> layer doesn't lie about operation completeness. geom_raid5 could >> delay writes (in hope that next writes will combine nicely and allow >> not to do read-calculate-write cycle for read alone), but it never >> mark BIO complete until it is really completed (layers down to >> geom_raid5 returns completion). So, every BIO in wait queue is "in >> flight" from GEOM/VFS point of view. Maybe, it is fatal for journal :( IV> It shouldn't be - it could be a bug. I understand that it proves nothing, but I've tried to reproduce the "previous crash corrupts the FS in a journal-undetectable way" theory by killing a virtual system while there is massive writing to a geom_raid5-based FS (on virtual drives, unfortunately). I've done 15 tries (as it is manual testing, it takes about 1-1.5 hours total), and every time the FS was OK after a double fsck (first with the journal and then without it). Of course, there was MASSIVE loss of data, as the timeout and cache size in geom_raid5 were set very high (sometimes the FS turns out to be empty after unpacking 50% of the SVN mirror seed, crash and check), but the FS was consistent every time!
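(A rough sketch of that kind of double check, with a made-up provider name /dev/raid5/r5 and assuming that a plain run performs the SU+J journal recovery while -f forces the full traditional check:)

  fsck_ffs -y /dev/raid5/r5      # first pass: journalled (SU+J) recovery
  fsck_ffs -y -f /dev/raid5/r5   # second pass: full check that does not rely on the journal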
--=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 12:03:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8519FA36 for ; Fri, 1 Mar 2013 12:03:18 +0000 (UTC) (envelope-from peter.maloney@brockmann-consult.de) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.17.10]) by mx1.freebsd.org (Postfix) with ESMTP id 065039AC for ; Fri, 1 Mar 2013 12:03:17 +0000 (UTC) Received: from [10.3.0.26] ([141.4.215.32]) by mrelayeu.kundenserver.de (node=mreu3) with ESMTP (Nemesis) id 0MQYci-1UNazE271Y-00Tot0; Fri, 01 Mar 2013 13:03:12 +0100 Message-ID: <513098FF.8030806@brockmann-consult.de> Date: Fri, 01 Mar 2013 13:03:11 +0100 From: Peter Maloney User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: I am to silly to mount a zpool while boot References: In-Reply-To: X-Enigmail-Version: 1.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Provags-ID: V02:K0:0xjhKHGwD7FWeX/IoJiN7QakZcMgAwKNFpFVsLn9faq nvvChqD6Ok0SXu+Cx+ylPlcVKLvUkXzuEJtT8z4gFfYIj09rOn +A99LXg8139dsqhaj72hY2FDuOK7naoDkMb3AYIpg0v2KcmLi+ 9qRUZbsC874NabiSGPihDQ8BC0oTScc8vgIqwtdULQBDoO8F4x AoA0I2sW1MhtsPoXahO5JGi0eXCCI3KXSyPQMuMmOZQvGwRr5U ABZ+RiOD7KDUmVFkSnwRWzWEQ7RhzzelvvxuMHG+bPveygrJIs OZGD6qlRsBmrOCnYvKLJG08tfwWxgay9C89Xh0Pd3/HtHtOgzd 2nTiEKms0z72CUDtdmBN3v5mrg+Q1rvKq6imq3Eu0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 12:03:18 -0000 For the mount, don't use fstab. use: zfs set mountpoint=/home poolname/path/to/dataset And for the import, add zfs_enable="YES" to rc.conf. And I think that's it. (all my FreeBSD systems are pure zfs, so not sure what troubles you would get if you had UFS on root) On 2013-03-01 12:26, tech mailinglists wrote: > Hello all, > > I think that I only can be an idiot to get in such a problem but I am > not able to mount a zpool via fstab while boot. > > I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for > /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount > at /home). > > I now tried everything I could find. So my fstab entry looks like this: > > home /home zfs rw,late 0 0 > > The real problem is that after a reboot the zpool is no longer > imported, I really don't know why I always have to reimport the pool > via zpool import -d /dev home. Because of this the filesystem never > can be mounted via fstab while boot and I get dropped into a shell > where I need to do this always manually. > > So why the pool always isn't imported after boot and how can I solve this issue? > > And is the fstab entry correct itself? So would it work when the pool > gets imported with it's name befor the fstab entry is parsed? > > Hope that someone give me a few hints or a solution. > > Best Regards > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- -------------------------------------------- Peter Maloney Brockmann Consult Max-Planck-Str. 
2 21502 Geesthacht Germany Tel: +49 4152 889 300 Fax: +49 4152 889 333 E-mail: peter.maloney@brockmann-consult.de Internet: http://www.brockmann-consult.de -------------------------------------------- From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 12:12:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 114DFB00 for ; Fri, 1 Mar 2013 12:12:10 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-vc0-f173.google.com (mail-vc0-f173.google.com [209.85.220.173]) by mx1.freebsd.org (Postfix) with ESMTP id AA8C09E9 for ; Fri, 1 Mar 2013 12:12:09 +0000 (UTC) Received: by mail-vc0-f173.google.com with SMTP id fy27so1911125vcb.18 for ; Fri, 01 Mar 2013 04:12:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=ZKZJFHMdL08yFQGdOAjRuuCu7YP/ZcMNWY0MhF5phS0=; b=c95mfvf4ci+IyQ1OHScSxunzP3Zg+SGquJJ9o8CBfPC0Ka1xE9mg88dnoP+hCs79k9 Y6JbAvsFoWIxaVbqm6E/25XgWEQKnA56tp0Q667Sel3nsApI1jYwwryZNjKsl1ZWRS5E uq6/ElP56cj7ztSuT/s1bB6qxy13dI4J6ajDt6kaT8k2uPYjiW1ly3oT3SguAP6E64vV ndvdop9NqCvOCDE1gZ90/vOithiZ5+D9a7nQjmrGIwrn41M7stMoajEQiEcSESZwQAN3 1gOVIXWIc/i4/nBml2BdwznCynO+Q0kQXfXbdSKt0oMwVUSaDZ27yAaRqo+Zof94pKuP 2ctA== MIME-Version: 1.0 X-Received: by 10.58.205.179 with SMTP id lh19mr4025462vec.7.1362139923599; Fri, 01 Mar 2013 04:12:03 -0800 (PST) Received: by 10.58.223.170 with HTTP; Fri, 1 Mar 2013 04:12:03 -0800 (PST) In-Reply-To: <513098FF.8030806@brockmann-consult.de> References: <513098FF.8030806@brockmann-consult.de> Date: Fri, 1 Mar 2013 12:12:03 +0000 Message-ID: Subject: Re: I am to silly to mount a zpool while boot From: Tom Evans To: Peter Maloney Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 12:12:10 -0000 On Fri, Mar 1, 2013 at 12:03 PM, Peter Maloney wrote: > For the mount, don't use fstab. use: > > zfs set mountpoint=/home poolname/path/to/dataset > > And for the import, add > > zfs_enable="YES" > > to rc.conf. > > > And I think that's it. (all my FreeBSD systems are pure zfs, so not sure > what troubles you would get if you had UFS on root) > I have UFS root, ZFS for /usr, /var etc, due to BIOS/loader issues when initially trying to get ZFS boot working on this box. 
This is the total contents of fstab: /dev/gpt/root / ufs rw 1 1 /dev/gpt/swap1 none swap sw 0 0 /dev/gpt/swap2 none swap sw 0 0 The ZFS fs is mounted by the mountpoint property: > $ zfs get mountpoint tank NAME PROPERTY VALUE SOURCE tank mountpoint /tank default ZFS is loaded as usual, by adding zfs_load="YES" to /boot/loader.conf and zfs_enable="YES" to /etc/rc.conf Hope that helps Tom From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 14:57:34 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 34C0627C; Fri, 1 Mar 2013 14:57:34 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id BF5051C5; Fri, 1 Mar 2013 14:57:33 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAILBMFGDaFvO/2dsb2JhbABEhk+7ZIESc4IfAQEFIwRSGw4KAgINGQJZBhOIE65oki6BI4wqgRM0B4ItgRMDiGuNWIljhwiDJoFLPg X-IronPort-AV: E=Sophos;i="4.84,761,1355115600"; d="scan'208";a="18880314" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 01 Mar 2013 09:57:26 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id BCFB9B3F13; Fri, 1 Mar 2013 09:57:26 -0500 (EST) Date: Fri, 1 Mar 2013 09:57:26 -0500 (EST) From: Rick Macklem To: Sergey Kandaurov Message-ID: <298688524.3444408.1362149846756.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: Subject: Re: should vn_fullpath1() ever return a path with "." in it? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 14:57:34 -0000 Sergey Kandaurov wrote: > On 1 March 2013 04:58, Rick Macklem wrote: > > Kostik Belousov wrote: > >> On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > >> > Hi, > >> > > >> > Sergey Kandaurov reported a problem where getcwd() returns a > >> > path with "/./" imbedded in it for an NFSv4 mount. This is > >> > caused by a mount point crossing on the server when at the > >> > server's root because vn_fullpath1() uses VV_ROOT to spot > >> > mount point crossings. > >> > > >> > The current workaround is to use the sysctls: > >> > debug.disablegetcwd=1 > >> > debug.disablefullpath=1 > >> > > >> > However, it would be nice to fix this when vn_fullpath1() > >> > is being used. > >> > > >> > A simple fix is to have vn_fullpath1() fail when it finds > >> > "." as a directory match in the path. When vn_fullpath1() > >> > fails, the syscalls fail and that allows the libc algorithm > >> > to be used (which works for this case because it doesn't > >> > depend on VV_ROOT being set, etc). > >> > > >> > So, I am wondering if a patch (I have attached one) that > >> > makes vn_fullpath1() fail when it matches "." will break > >> > anything else? (I don't think so, since the code checks > >> > for VV_ROOT in the loop above the check for a match of > >> > ".", but I am not sure?) > >> > > >> > Thanks for any input w.r.t. 
this, rick > >> > >> > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > >> > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > >> > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > >> > startvp, NULL, 0, 0); > >> > break; > >> > } > >> > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || > >> > + buf[buflen + 1] == '/')) { > >> > + /* > >> > + * Fail if it matched ".". This should only happen > >> > + * for NFSv4 mounts that cross server mount points. > >> > + */ > >> > + CACHE_RUNLOCK(); > >> > + vrele(vp); > >> > + numfullpathfail1++; > >> > + error = ENOENT; > >> > + SDT_PROBE(vfs, namecache, fullpath, return, > >> > + error, vp, NULL, 0, 0); > >> > + break; > >> > + } > >> > buf[--buflen] = '/'; > >> > slash_prefixed = 1; > >> > } > >> > >> I do not quite understand this. Did the dvp (parent) vnode returned > >> by > >> VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name > >> ? > >> It must be, for the correct operation, but also it should cause the > >> almost > >> infinite loop in the vn_fullpath1(). The loop is not really > >> infinite > >> due > >> to a limited size of the buffer where the infinite amount of "./" > >> is > >> placed. > >> > >> Anyway, I think we should do better than this patch, even if it is > >> legitimate. I think that the better place to check the condition is > >> the > >> default implementation of VOP_VPTOCNP(). Am I right that this is > >> where > >> it broke for you ? > >> > >> diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c > >> index 00d064e..1dd0185 100644 > >> --- a/sys/kern/vfs_default.c > >> +++ b/sys/kern/vfs_default.c > >> @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) > >> error = ENOMEM; > >> goto out; > >> } > >> - bcopy(dp->d_name, buf + i, dp->d_namlen); > >> - error = 0; > >> + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { > >> + error = ENOENT; > >> + } else { > >> + bcopy(dp->d_name, buf + i, dp->d_namlen); > >> + error = 0; > >> + } > >> goto out; > >> } > >> } while (len > 0 || !eofflag); > > > > Yes, this patch fixes the problem too. If you think it is safe to > > do this, I can commit the patch in mid-April. Maybe Sergey can > > test it? > > > > Thanks yet again, rick > > > > Hi Rick > Sorry but I am no longer able to test NFSv4. > No problem. I can reproduce the problem, so I think it's fine w.r.t. testing to see if it fixes the bug. 
Thanks, rick > -- > wbr, > pluknet From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 16:54:25 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B200DCEE for ; Fri, 1 Mar 2013 16:54:25 +0000 (UTC) (envelope-from dean.jones@oregonstate.edu) Received: from smtp1.oregonstate.edu (smtp1.oregonstate.edu [128.193.15.35]) by mx1.freebsd.org (Postfix) with ESMTP id 828CF990 for ; Fri, 1 Mar 2013 16:54:25 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.oregonstate.edu (Postfix) with ESMTP id 3DA573E3F4 for ; Fri, 1 Mar 2013 08:53:18 -0800 (PST) X-Virus-Scanned: amavisd-new at oregonstate.edu Received: from smtp1.oregonstate.edu ([127.0.0.1]) by localhost (smtp.oregonstate.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id wdsGWVbxZvSx for ; Fri, 1 Mar 2013 08:53:18 -0800 (PST) Received: from mail-ia0-f181.google.com (mail-ia0-f181.google.com [209.85.210.181]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by smtp1.oregonstate.edu (Postfix) with ESMTPSA id EA1973E4E8 for ; Fri, 1 Mar 2013 08:53:17 -0800 (PST) Received: by mail-ia0-f181.google.com with SMTP id w33so2770263iag.40 for ; Fri, 01 Mar 2013 08:53:17 -0800 (PST) X-Received: by 10.50.193.200 with SMTP id hq8mr13075152igc.101.1362156797089; Fri, 01 Mar 2013 08:53:17 -0800 (PST) MIME-Version: 1.0 Received: by 10.64.33.161 with HTTP; Fri, 1 Mar 2013 08:52:56 -0800 (PST) In-Reply-To: References: <512FE773.3060903@physics.umn.edu> From: Dean Jones Date: Fri, 1 Mar 2013 08:52:56 -0800 Message-ID: Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers To: Freddie Cash Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 16:54:25 -0000 On Thu, Feb 28, 2013 at 7:30 PM, Freddie Cash wrote: > You label the drive with something that tells you: > - enclosure > - column > - row > > For example, we use the following pattern: encX-A-# > > Where X tells you which enclosure it's in, A tells you which column it's in > (letters start at A increasing to the right), and # tells you the disk in > the column, numbered top-down. > > Whether you label the entire drive using glabel or just a GPT partition is > up to you. We use GPT labels. > I like your labeling convention. I'll add that glabel is FreeBSD specific, so if a pool might ever be imported under another OS that GPT labels are universal. 
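(To make the convention concrete, a minimal sketch -- the device name da5, the pool name tank and the enc1-A-3 label are placeholders, not taken from this thread:)

  gpart create -s gpt da5
  gpart add -t freebsd-zfs -l enc1-A-3 da5   # GPT label appears as /dev/gpt/enc1-A-3
  zpool replace tank da5 gpt/enc1-A-3        # assuming the failed vdev showed up as da5 in zpool status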
From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 18:00:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C33ED3C1; Fri, 1 Mar 2013 18:00:54 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452]) by mx1.freebsd.org (Postfix) with ESMTP id 9DEE5D75; Fri, 1 Mar 2013 18:00:54 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id r21I0pBD034998; Fri, 1 Mar 2013 10:00:51 -0800 (PST) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201303011800.r21I0pBD034998@chez.mckusick.com> To: lev@freebsd.org Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) In-reply-to: <352538988.20130301102237@serebryakov.spb.ru> Date: Fri, 01 Mar 2013 10:00:51 -0800 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: freebsd-fs@freebsd.org, Don Lewis X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 18:00:54 -0000 > Date: Fri, 1 Mar 2013 10:22:37 +0400 > From: Lev Serebryakov > To: Don Lewis > Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- > Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org > > DL> The fact that the filesystem code called panic() indicates that the > DL> filesystem was already corrupt by that point. That's a likely reason > DL> for fsck complaining about the unexpected SU+J inconsistency. > > DL> Incorrect write ordering that allowed the filesystem to become > DL> inconsistent because some pending writes were lost because of the panic > DL> might not be necessary, but this might have allowed an earlier crash > DL> where a full fsck was skipped to leave the filesystem in this state. > As far, as I understand, if this theory is right (file system > corruption which left unnoticed by "standard" fsck), it is bug in FFS > SU+J too, as it should not be corrupted by reordered writes (if > writes is properly reported as completed even if they were > reordered). If the bitmaps are left corrupted (in particular if blocks are marked free that are actually in use), then that panic can occur. Such a state should never be possible when running with SU even if you have crashed multiple times and restarted without running fsck. To reduce the number of possible points of failure, I suggest that you try running with just SU (i.e., turn off the SU+J jornalling). you can do this with `tunefs -j disable /dev/fsdisk'. This will turn off journalling, but not soft updates. You can verify this by then running `tunefs -p /dev/fsdisk' to ensure that soft updates are still enabled. As you have already stated, the filesystem is fine with reordered writes provided that they are not completed (iodone) until they are well and truely on the disk. > DL> This panic might also be a result of the bug fixed in 246877, but I have > DL> my doubts about that. > It was not MFCed :( > > -- > // Black Lion AKA Lev Serebryakov I will MFC 246876 and 246877 once they have been in head long enough to have confidence that they will not cause trouble. 
That means at least a month (well more than the two weeks they have presently been there). Note these changes only pass the barrier request down to the GEOM layer. I don't know whether it actually makes it to the drive layer and if it does whether the drive layer actually implements it. My goal was to get the ball rolling. Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 20:16:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D810FDBB for ; Fri, 1 Mar 2013 20:16:53 +0000 (UTC) (envelope-from gperez@entel.upc.edu) Received: from violet.upc.es (violet.upc.es [147.83.2.51]) by mx1.freebsd.org (Postfix) with ESMTP id 694901598 for ; Fri, 1 Mar 2013 20:16:52 +0000 (UTC) Received: from ackerman2.upc.es (ackerman2.upc.es [147.83.2.244]) by violet.upc.es (8.14.1/8.13.1) with ESMTP id r21KGjGb004662 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL); Fri, 1 Mar 2013 21:16:45 +0100 Received: from [192.168.1.110] (247.Red-81-39-132.dynamicIP.rima-tde.net [81.39.132.247]) (authenticated bits=0) by ackerman2.upc.es (8.14.4/8.14.4) with ESMTP id r21KGhh4008484 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 1 Mar 2013 21:16:45 +0100 Message-ID: <51310CAA.1020701@entel.upc.edu> Date: Fri, 01 Mar 2013 21:16:42 +0100 From: =?ISO-8859-1?Q?Gustau_P=E9rez_i_Querol?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130220 Thunderbird/17.0.3 MIME-Version: 1.0 To: Graham Allan Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers References: <512FE773.3060903@physics.umn.edu> In-Reply-To: <512FE773.3060903@physics.umn.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.70 on 147.83.2.244 X-Mail-Scanned: Criba 2.0 + Clamd X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (violet.upc.es [147.83.2.51]); Fri, 01 Mar 2013 21:16:46 +0100 (CET) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 20:16:53 -0000 Al 01/03/2013 00:25, En/na Graham Allan ha escrit: > Sorry to come in late on this thread but I've been struggling with > thinking about the same issue, from a different perspective. > > Several months ago we created our first "large" ZFS storage system, > using 42 drives plus a few SSDs in one of the oft-used Supermicro > 45-drive chassis. It has been working really nicely but has led to > some puzzling over the best way to do some things when we build more. > > We made our pool using geom drive labels. Ever since, I've been > wondering if this really gives any advantage - at least for this type > of system. If you need to replace a drive, you don't really know which > enclosure slot any given da device is, and so our answer has been to > dig around using sg3_utils commands wrapped in a bit of perl, to try > and correlate the da device to the slot via the drive serial number. > > At this point, having a geom label just seems like an extra bit of > indirection to increase my confusion :-) Although setting the geom > label to the drive serial number might be a serious improvement... 
> > We're about to add a couple more of these shelves to the system, > giving a total of 135 drives (although each shelf would be a separate > pool), and given that they will be standard consumer grade drives, > some frequency of replacement is a given. > > Does anyone have any good tips on how to manage a large number of > drives in a zfs pool like this? > I don't have such a large array, I have about 8 or 10 drives at most but I'd go with Freddie's convention. I'd also go with GPT labels instead of geom labels because the former are universal. I'd also ensure that you can easily identify driver with leds. Either by issuing commands to the disk controller (I use mfiutil to visually identify them) or by using ses, but you probably have though. Greets, Gustau -- Salut i força, Gustau --------------------------------------------------------------------------- Prou top-posting : http://ca.wikipedia.org/wiki/Top-posting Stop top-posting : http://en.wikipedia.org/wiki/Posting_style O O O Gustau Pérez i Querol O O O Unitat de Gestió dels departaments O O O Matemàtica Aplicada IV i Enginyeria Telemàtica Universitat Politècnica de Catalunya Edifici C3 - Despatx S101-B UPC Campus Nord UPC C/ Jordi Girona, 1-3 08034 - Barcelona From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 20:23:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EDABBEBE; Fri, 1 Mar 2013 20:23:01 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7B1F115E9; Fri, 1 Mar 2013 20:23:01 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 65EFB4AC58; Sat, 2 Mar 2013 00:22:51 +0400 (MSK) Date: Sat, 2 Mar 2013 00:22:44 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1352492388.20130302002244@serebryakov.spb.ru> To: Kirk McKusick Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) In-Reply-To: <201303011800.r21I0pBD034998@chez.mckusick.com> References: <352538988.20130301102237@serebryakov.spb.ru> <201303011800.r21I0pBD034998@chez.mckusick.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, Don Lewis X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 20:23:02 -0000 Hello, Kirk. You wrote 1 =D0=BC=D0=B0=D1=80=D1=82=D0=B0 2013 =D0=B3., 22:00:51: >> As far, as I understand, if this theory is right (file system >> corruption which left unnoticed by "standard" fsck), it is bug in FFS >> SU+J too, as it should not be corrupted by reordered writes (if >> writes is properly reported as completed even if they were >> reordered). KM> If the bitmaps are left corrupted (in particular if blocks are marked KM> free that are actually in use), then that panic can occur. Such a state KM> should never be possible when running with SU even if you have crashed KM> multiple times and restarted without running fsck. 
I run fsck every time (ok, every half-year) the server crashes due to my awkward experiments on a live system, but I run it as it runs by default: with the journal (after the upgrade to 9-STABLE), not as a full old-fashioned run. KM> To reduce the number of possible points of failure, I suggest that KM> you try running with just SU (i.e., turn off the SU+J jornalling). KM> you can do this with `tunefs -j disable /dev/fsdisk'. This will KM> turn off journalling, but not soft updates. You can verify this KM> by then running `tunefs -p /dev/fsdisk' to ensure that soft updates KM> are still enabled. And wait another half a year :) I'm trying to reproduce this situation on a VM (VirtualBox with virtual HDDs), but no luck (yet?). KM> I will MFC 246876 and 246877 once they have been in head long enough KM> to have confidence that they will not cause trouble. That means at KM> least a month (well more than the two weeks they have presently been KM> there). KM> Note these changes only pass the barrier request down to the GEOM KM> layer. I don't know whether it actually makes it to the drive layer KM> and if it does whether the drive layer actually implements it. My KM> goal was to get the ball rolling. I have mixed feelings about these barriers. IMHO, all writes to UFS (FFS) could and should be divided into two classes: data writes and metadata writes (including the journal, as FFS doesn't have data journaling). IMHO (it is the last time I type these 4 letters, but please add them when you read this before and after each of my sentences, as I'm not an FS expert of any grade), data writes could be done as best effort until fsync() is called (or the file is opened with the appropriate flag, which is equivalent to an automatic fsync() after each write). They could be delayed, reordered, etc. But metadata should have some strong guarantees (and fsync()'ed data too, of course). Such a division could allow the best possible performance and consistent FS metadata (maybe not consistent user data -- but every application which needs strong guarantees, like an RDBMS, uses fsync() anyway). Now you add a "BARRIER" write. It looks too strong to use often. It will force writing of ALL data from the caches, even if your intention is to write only 2 or 3 blocks of metadata. It could solve problems with FS metadata, but it will degrade performance, especially under multithreaded load. An update of the inode map for creating a 0-byte file by one process (protected with a barrier) will flush the whole data cache (maybe hundreds of megabytes) of another one. It is better than nothing, but it is not the best solution. Every write should be marked as "critical" or "loose", and critical-marked buffers (BIOs) must be written ASAP and before all other _critical_ BIOs (not before all BIOs issued after them, with or without the flag). So, a barrier should affect only other barriers (ordered writes). The default, "loose" semantics (for data) would be exactly what we have now. It is very hard to implement the contract "It only ensure that buffers written before that buffer will get to the media before any buffers written after that buffer" in any other way than a full flush, which, as I stated above, will hurt performance in cases such as effective RAID5-like implementations, which gain a lot from combining writes by spatial (not time) locality. And for a full flush (which is needed sometimes, of course) we already have the BIO_FLUSH command. Anyway, I'll support the new semantics in geom_raid5 ASAP.
But, unfortunately, now it could be supported as it is simple write followed by BIO_FLUSH -- not very effective :( --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 20:43:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2DC6CCCC; Fri, 1 Mar 2013 20:43:53 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-ia0-x22d.google.com (mail-ia0-x22d.google.com [IPv6:2607:f8b0:4001:c02::22d]) by mx1.freebsd.org (Postfix) with ESMTP id E3B8B172A; Fri, 1 Mar 2013 20:43:52 +0000 (UTC) Received: by mail-ia0-f173.google.com with SMTP id h37so3043435iak.18 for ; Fri, 01 Mar 2013 12:43:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=2g9kPW/UNrvBQ2y31CE+9hcRlAGhq335WEw+3Nrux7U=; b=mtWTcfOt5u88QYbcR1TfnYITgplpKRngh+tX1j3IyJ9Ddgnj8QrjOAQKyuH3VQhR5N X7SBcot2NCug+7abb5BdSiXRo/Zmf8hOYe8BegYoL9IsbGqIqmRphUJbejMsGPjaA5vb zcPjSXESytJLwGdTy33JSWfc7X/Yq2dvHbEfWGbgc2O2cn1giKZA3J2QuWRR9PYfQSw7 pI553v+fvGwVDluF0JuDjbJNJTdrchXlySfDB6xV3rqp0YCsxVjzCU9PTyvkbKC4pbXE cC3vRYbwjHxa4GzLBiLMKG5Fq7WQFYg74Vox5QpyXdwnhgbSkG37aE3AsA2qRuYFN0uU bMSA== X-Received: by 10.42.126.133 with SMTP id e5mr7497058ics.17.1362170632573; Fri, 01 Mar 2013 12:43:52 -0800 (PST) MIME-Version: 1.0 Received: by 10.64.63.12 with HTTP; Fri, 1 Mar 2013 12:43:22 -0800 (PST) In-Reply-To: References: <20130121221617.GA23909@icarus.home.lan> <50FED818.7070704@FreeBSD.org> <20130125083619.GA51096@icarus.home.lan> <20130125211232.GA3037@icarus.home.lan> <20130125212559.GA1772@icarus.home.lan> <20130125213209.GA1858@icarus.home.lan> <20130126011754.GA1806@icarus.home.lan> <51267055.3040500@FreeBSD.org> From: Chris Rees Date: Fri, 1 Mar 2013 20:43:22 +0000 Message-ID: Subject: Re: disk "flipped" - a known problem? To: mav@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Jeremy Chadwick , "freebsd-fs@freebsd.org" , Andriy Gapon X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 20:43:53 -0000 On 23 February 2013 09:39, Chris Rees wrote: > > On 21 Feb 2013 19:07, "Alexander Motin" wrote: >> >> On 26.01.2013 03:17, Jeremy Chadwick wrote: >> > Okay, I've figured out the exact, 100% reproducible condition that >> > causes the situation. It took me a lot of tries and a digital pocket >> > recorder to take verbal notes (there are just too many things to look at >> > simultaneously), but I've figured it out. >> > >> > I'm sorry for the verbosity, but it's necessary. >> > >> > Assume the disk we're talking about is /dev/ada5. >> > >> > 1. Prior to any issues, we have this: >> > >> > root@icarus:~ # ls -l /dev/ada5* /dev/xpt* /dev/pass5* >> > crw-r----- 1 root operator 0x8c Jan 25 16:41 /dev/ada5 >> > crw------- 1 root operator 0x75 Jan 25 16:35 /dev/pass5 >> > crw------- 1 root operator 0x51 Jan 25 16:35 /dev/xpt0 >> > >> > 2. ada5 begins experiencing issues -- ATA commands (CDBs) submit do not >> > get a response (not going to discuss how/why that can happen). >> > >> > 3. 
These types of messages are seen on console (naturally the CDB and >> > request type will vary -- in this case it was because I was doing the dd >> > zero'ing, thus tickling the bad sector/naughty firmware on the drive): >> > >> > Jan 25 16:29:28 icarus kernel: ahcich5: Timeout on slot 0 port 0 >> > Jan 25 16:29:28 icarus kernel: ahcich5: is 00000000 cs 00000000 ss >> > 00000001 rs 00000001 tfd 40 serr 00000000 cmd 0004c017 >> > Jan 25 16:29:28 icarus kernel: ahcich5: AHCI reset... >> > Jan 25 16:29:28 icarus kernel: ahcich5: SATA connect time=1000us >> > status=00000113 >> > Jan 25 16:29:28 icarus kernel: ahcich5: AHCI reset: device found >> > Jan 25 16:29:28 icarus kernel: (ada5:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. >> > ACB: 61 80 80 77 01 40 00 00 00 00 00 00 >> > Jan 25 16:29:28 icarus kernel: (ada5:ahcich5:0:0:0): CAM status: Command >> > timeout >> > Jan 25 16:29:28 icarus kernel: (ada5:ahcich5:0:0:0): Retrying command >> > >> > 4. Any I/O submit to ada5 during this time blocks (this is normal). >> > >> > 5. **While this situation is happening**, something using xpt(4) >> > attempts to submit a CDB to the disk (ex. smartctl -a /dev/ada5). >> > This request also blocks (again, normal). >> > >> > 6. Physical device falls off bus, or CAM kicks the disk off the bus. >> > Doesn't matter which. We see messages resembling this (boy am I tired >> > of this interspersed output problem): >> > >> > Jan 25 16:29:32 icarus kernel: (ada5:ahcich5:0:0:0): lost device >> > Jan 25 16:29:32 icarus kernel: (pass5:ahcich5:0:0:0): lost device >> > Jan 25 16:29:32 icarus kernel: (ada5:ahcich5:0:0:0): removing device >> > entry >> > Jan 25 16:29:32 icarus kernel: (pass5:ahcich5:0:0:0): passdevgonecb: >> > devfs entry is gone >> > >> > 7. Standard I/O requests fail with errno=6 "Device not configured". >> > xpt(4) requests also fail with the same errno. >> > >> > 8. Device-wise, at this stage all we have is: >> > >> > root@icarus:~ # ls -l /dev/ada5* /dev/xpt* /dev/pass5* >> > crw------- 1 root operator 0x51 Jan 25 16:35 /dev/xpt0 >> > >> > 9. Device comes back online for whatever reason. FreeBSD sees the disk, >> > blah blah blah: >> > >> > Jan 25 16:30:16 icarus kernel: GEOM: new disk ada5 >> > Jan 25 16:30:16 icarus kernel: ada5: >> > ATA-7 SATA 1.x device >> > Jan 25 16:30:16 icarus kernel: ada5: Serial Number WD-WMAP41573589 >> > Jan 25 16:30:16 icarus kernel: ada5: 150.000MB/s transfers (SATA 1.x, >> > UDMA6, PIO 8192bytes) >> > Jan 25 16:30:16 icarus kernel: ada5: Command Queueing enabled >> > Jan 25 16:30:16 icarus kernel: ada5: 143089MB (293046768 512 byte >> > sectors: 16H 63S/T 16383C) >> > Jan 25 16:30:16 icarus kernel: ada5: Previously was known as ad14 >> > >> > ...um, where's pass5? >> > >> > 10. /dev/pass5 is now completely (permanently) missing: >> > >> > root@icarus:~ # ls -l /dev/ada5* /dev/xpt* /dev/pass5* >> > crw-r----- 1 root operator 0x99 Jan 25 16:42 /dev/ada5 >> > crw------- 1 root operator 0x51 Jan 25 16:35 /dev/xpt0 >> > >> > 11. Any further attempts to communicate via xpt(4) with ada5 fail. >> > Detaching and reattaching the disk does not fix the issue; the only fix >> > is to reboot the system. >> > >> > 12. "camcontrol debug -IPXp scbus5" results in tons and tons of output >> > all pertaining to xpt(4). It looks like xpt(4) is in some kind of >> > loop. 
>> > >> > Below is my verbose boot (with non-kernel things removed), which >> > also includes "camcontrol debug" output once things are in a bad state: >> > >> > http://jdc.koitsu.org/freebsd/xpt_oddity.log >> > >> > In this log you'll see that after 1 CAM timeout I yanked the drive, then >> > roughly 30 seconds later reinserted it. >> > >> > If you need me to turn on CAM debugging *prior* to the above, I can do >> > that, just let me know. >> > >> > The important step is #5. Without that, the problem shown in #9/10/11 >> > does not happen. >> > >> > It's a good thing I don't run smartd(8) -- most users I see using that >> > software set the interval to something like 180s or 60s. Imagine this >> > frustration: "okay so the disk fell off the bus, but what, now I can't >> > talk to it with SMART? Uhhh... Err, works now? Whatever". >> >> I think, the problem may already be fixed in HEAD by r244014 by ken@. >> I've just merged it to 9-STABLE at r247115. So if it is still possible >> to reproduce the situation, it would be good to try. > > I think I've been having the same troubles since upgrading from 9.0, so I'm > going to try applying that to 9.1-R and I'll also give feedback. Yup, I no longer get weird disconnects after this patch (5 days later now). Thank you very much! Chris From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 22:31:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9DF74944 for ; Fri, 1 Mar 2013 22:31:23 +0000 (UTC) (envelope-from mailinglists.tech@gmail.com) Received: from mail-ea0-x235.google.com (mail-ea0-x235.google.com [IPv6:2a00:1450:4013:c01::235]) by mx1.freebsd.org (Postfix) with ESMTP id 208CB1AAB for ; Fri, 1 Mar 2013 22:31:22 +0000 (UTC) Received: by mail-ea0-f181.google.com with SMTP id i13so417348eaa.40 for ; Fri, 01 Mar 2013 14:31:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=pZBvZY3F7igvyYzyt7hHx5fcHOhoKTgqbulafVkaIGI=; b=bf2nIe/kZBgyVx1PjUBPRoDlydin0poLWz8q7tG1BeO33XZS7B3gq2IgUgUZW/QqnE XQ0kunX34OJuuUSTnd8C4VtAip1avhphC2pd+8DfgPTudrsrkgJZt/GgaqWHiLCTJU9E rHFbN1ZQeyQpgnCXFcJXKsiKFPtetes0XGnkzkTkoddrYrLinvIfxzs+nmjli4ckBk4o IRUaar/txFFV+T80P0jFnOHhBy3Kv4z6jAnAuipvcv9qT4ObMhD59vAngbwExAILbuRc sQjGK2XMVwOG6Xe/yfkavO7u9Gf8Py4ac+Wkr6PtSJGbT6yWWM4bZwpnpEkEILfI91pR JYUQ== X-Received: by 10.14.3.70 with SMTP id 46mr32385815eeg.2.1362177082289; Fri, 01 Mar 2013 14:31:22 -0800 (PST) Received: from [127.0.0.1] (ashlynn.lippux.de. 
[5.9.218.242]) by mx.google.com with ESMTPS id 3sm19345585eej.6.2013.03.01.14.31.20 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 01 Mar 2013 14:31:21 -0800 (PST) Message-ID: <51312C32.6000207@gmail.com> Date: Fri, 01 Mar 2013 23:31:14 +0100 From: tech mailinglists User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Peter Maloney Subject: Re: I am to silly to mount a zpool while boot References: <513098FF.8030806@brockmann-consult.de> In-Reply-To: <513098FF.8030806@brockmann-consult.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 22:31:23 -0000 On 01.03.2013 13:03, Peter Maloney wrote: > For the mount, don't use fstab. use: > > zfs set mountpoint=/home poolname/path/to/dataset > > And for the import, add > > zfs_enable="YES" > > to rc.conf. > > > And I think that's it. (all my FreeBSD systems are pure zfs, so not sure > what troubles you would get if you had UFS on root) > > > On 2013-03-01 12:26, tech mailinglists wrote: >> Hello all, >> >> I think that I only can be an idiot to get in such a problem but I am >> not able to mount a zpool via fstab while boot. >> >> I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for >> /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount >> at /home). >> >> I now tried everything I could find. So my fstab entry looks like this: >> >> home /home zfs rw,late 0 0 >> >> The real problem is that after a reboot the zpool is no longer >> imported, I really don't know why I always have to reimport the pool >> via zpool import -d /dev home. Because of this the filesystem never >> can be mounted via fstab while boot and I get dropped into a shell >> where I need to do this always manually. >> >> So why the pool always isn't imported after boot and how can I solve this issue? >> >> And is the fstab entry correct itself? So would it work when the pool >> gets imported with it's name befor the fstab entry is parsed? >> >> Hope that someone give me a few hints or a solution. >> >> Best Regards >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > Hello all, a few of the things I already had done. But the real problem is I think that the pool doesn't get imported automatically. I read that ZFS searches in special directories when it tries to import. So is there a way to set an option which says that it should search in /dev? I always have to do this after reboot: zpool import -d /dev tank Then tank (the pool) gets mounted at /tank and the dataset tank/home gets mounted on /home. So I think that the import of the zpool fails. I have set zfs_enable="YES" in /etc/rc.conf, and also zfs_load="YES" as a boot parameter (it shows up in kenv), and commented out the fstab entry. I read that the import should normally happen automatically when the module is loaded and zfs is enabled, but I think the fact that my pool is located on /dev/xbd2 is the problem.
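For reference, the piece that normally makes such an import persist on FreeBSD is the pool cache file: at boot the zfs module imports whatever pools are recorded in /boot/zfs/zpool.cache, and a pool whose cachefile property points somewhere else (or to "none") will not be remembered after a manual import. A minimal sketch of the usual fix, assuming the pool really is named tank and is otherwise healthy (the commands are illustrative, not taken from this thread):

zpool import -d /dev tank                        # one-time manual import, scanning /dev for labels
zpool set cachefile=/boot/zfs/zpool.cache tank   # record the pool in the cache file read at boot
# plus, in /etc/rc.conf:
zfs_enable="YES"
# and in /boot/loader.conf:
zfs_load="YES"

With that in place no fstab entry is needed; once the pool is imported at boot, ZFS mounts tank/home according to its mountpoint property.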
Best Regards From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 22:52:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DB0B5151 for ; Fri, 1 Mar 2013 22:52:27 +0000 (UTC) (envelope-from lkchen@k-state.edu) Received: from ksu-out.merit.edu (ksu-out.merit.edu [207.75.117.132]) by mx1.freebsd.org (Postfix) with ESMTP id A90101B6C for ; Fri, 1 Mar 2013 22:52:27 +0000 (UTC) X-Merit-ExtLoop1: 1 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAFAPgvMVHPS3TT/2dsb2JhbAA6CoZPuRCDYBZzgh8BAQUjYg8aAg0ZAlmILKA5jlWJMohogSOMLxCBWIIXgRMDiGueQ4FSgVSBTD0 X-IronPort-AV: E=Sophos;i="4.84,765,1355115600"; d="scan'208";a="905809341" X-MERIT-SOURCE: KSU Received: from ksu-sfpop-mailstore02.merit.edu ([207.75.116.211]) by sfpop-ironport05.merit.edu with ESMTP; 01 Mar 2013 17:52:20 -0500 Date: Fri, 1 Mar 2013 17:52:20 -0500 (EST) From: "Lawrence K. Chen, P.Eng." To: freebsd-fs@freebsd.org Message-ID: <1602333081.21816316.1362178340105.JavaMail.root@k-state.edu> In-Reply-To: <51310CAA.1020701@entel.upc.edu> Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [129.130.0.181] X-Mailer: Zimbra 7.2.2_GA_2852 (ZimbraWebClient - GC25 ([unknown])/7.2.2_GA_2852) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 22:52:27 -0000 ----- Original Message ----- > Al 01/03/2013 00:25, En/na Graham Allan ha escrit: > > Sorry to come in late on this thread but I've been struggling with > > thinking about the same issue, from a different perspective. > > > > Several months ago we created our first "large" ZFS storage system, > > using 42 drives plus a few SSDs in one of the oft-used Supermicro > > 45-drive chassis. It has been working really nicely but has led to > > some puzzling over the best way to do some things when we build > > more. > > > > We made our pool using geom drive labels. Ever since, I've been > > wondering if this really gives any advantage - at least for this > > type > > of system. If you need to replace a drive, you don't really know > > which > > enclosure slot any given da device is, and so our answer has been > > to > > dig around using sg3_utils commands wrapped in a bit of perl, to > > try > > and correlate the da device to the slot via the drive serial > > number. > > > > At this point, having a geom label just seems like an extra bit of > > indirection to increase my confusion :-) Although setting the geom > > label to the drive serial number might be a serious improvement... > > > > We're about to add a couple more of these shelves to the system, > > giving a total of 135 drives (although each shelf would be a > > separate > > pool), and given that they will be standard consumer grade drives, > > some frequency of replacement is a given. > > > > Does anyone have any good tips on how to manage a large number of > > drives in a zfs pool like this? > > > > I don't have such a large array, I have about 8 or 10 drives at > most but I'd go with Freddie's convention. I'd also go with GPT > labels > instead of geom labels because the former are universal. > > I'd also ensure that you can easily identify driver with leds. 
> Either by issuing commands to the disk controller (I use mfiutil to > visually identify them) or by using ses, but you probably have > though. > I only have 15 drives...(12 HDDs and 3 SSDs) but the ordering of drives seemed to randomize on every boot (wonder now if the controller was doing some kind of staggering in spin ups. And, their other drivers cope with it. They provide a v1.1 driver for FreeBSD 7.2 or source to the v1.0 driver.) And, then everything moved around when I changed controllers a few times. I had resorted at one point to putting device.hints to force all the drives to keep their mapping. Which caused problems elsewhere, and a mess when I added another controller. But, then I changed to more meaningful GPT labels and exported and re-imported my zpools with '-d /dev/gpt', and now things are ok. L From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 22:59:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 606F1262 for ; Fri, 1 Mar 2013 22:59:58 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com [209.85.216.48]) by mx1.freebsd.org (Postfix) with ESMTP id 269E71BBD for ; Fri, 1 Mar 2013 22:59:57 +0000 (UTC) Received: by mail-qa0-f48.google.com with SMTP id j8so62186qah.7 for ; Fri, 01 Mar 2013 14:59:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=GhBTFB57zu0Vttuf+Ri9l7x168Ijm2PqKMToODLvdPU=; b=ckpEc/1xClSQ7C06GxFafSH9eCs7OnffZPp+Lz9l1mWTDO0d4E9NBJDRk3sk9Z0B6X epQJtRcWn32a1jccLr1IhcEUWkqEneWTQSUDNzRaPCmIJ7JjtpH8Q/emifGbDW9EtQro ES1BmCZEh4r7HWtto5csh/YBDQx5RR2YPwKpydPgnDau0CzhdNKMbP1dlK/IHGq6Ly84 7w9bAf9JALZW6HK+c8EiXCel0L9a30iDlUqQCOhlNxivASQCYlG1WvH3r4UXtBn03lL3 ORg/OajsyA3BjzblKx24E/O6kzM1+wEf1i59soNLapTCWl/pet0T3V6XNAwbkjadgpk1 Ezdg== MIME-Version: 1.0 X-Received: by 10.224.203.131 with SMTP id fi3mr22428502qab.77.1362178797242; Fri, 01 Mar 2013 14:59:57 -0800 (PST) Received: by 10.49.106.233 with HTTP; Fri, 1 Mar 2013 14:59:57 -0800 (PST) In-Reply-To: <51312C32.6000207@gmail.com> References: <513098FF.8030806@brockmann-consult.de> <51312C32.6000207@gmail.com> Date: Fri, 1 Mar 2013 14:59:57 -0800 Message-ID: Subject: Re: I am to silly to mount a zpool while boot From: Freddie Cash To: tech mailinglists Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 22:59:58 -0000 What's the output of: zfs get mountpoint tank/home On Fri, Mar 1, 2013 at 2:31 PM, tech mailinglists < mailinglists.tech@gmail.com> wrote: > Am 01.03.2013 13:03, schrieb Peter Maloney: > >> For the mount, don't use fstab. use: >> >> zfs set mountpoint=/home poolname/path/to/dataset >> >> And for the import, add >> >> zfs_enable="YES" >> >> to rc.conf. >> >> >> And I think that's it. (all my FreeBSD systems are pure zfs, so not sure >> what troubles you would get if you had UFS on root) >> >> >> On 2013-03-01 12:26, tech mailinglists wrote: >> >>> Hello all, >>> >>> I think that I only can be an idiot to get in such a problem but I am >>> not able to mount a zpool via fstab while boot. 
>>> >>> I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for >>> /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount >>> at /home). >>> >>> I now tried everything I could find. So my fstab entry looks like this: >>> >>> home /home zfs rw,late 0 0 >>> >>> The real problem is that after a reboot the zpool is no longer >>> imported, I really don't know why I always have to reimport the pool >>> via zpool import -d /dev home. Because of this the filesystem never >>> can be mounted via fstab while boot and I get dropped into a shell >>> where I need to do this always manually. >>> >>> So why the pool always isn't imported after boot and how can I solve >>> this issue? >>> >>> And is the fstab entry correct itself? So would it work when the pool >>> gets imported with it's name befor the fstab entry is parsed? >>> >>> Hope that someone give me a few hints or a solution. >>> >>> Best Regards >>> ______________________________**_________________ >>> freebsd-fs@freebsd.org mailing list >>> http://lists.freebsd.org/**mailman/listinfo/freebsd-fs >>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org >>> " >>> >> >> > Hello all, > > a few of the things I already had done. But the real problem is I think > that the pool doesn't get imported automatically. I read that ZFS searches > in special directories when it tries to import. So is there a way to set an > option which says that it should search in /dev? I always have to do this > after reboot: > > zpool import -d /dev tank > > Than tank (pool) gets mounted at /tank and the zvol tank/home gets mounted > on /home. > > So I think that the import of the zpool fails. I have set zfs_enable="YES" > in /etc/rc.conf also zfs_load=YES as boot parameter which gets shown in > kenv and commented out the fstab entry. So I read that the import normally > should work automatically when the module is loaded and zfs is enabled but > I think the fact that my pool is located on /dev/xbd2 is the problem. 
> > Best Regards > ______________________________**_________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/**mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org > " > -- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 00:53:06 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 04737FA6 for ; Sat, 2 Mar 2013 00:53:06 +0000 (UTC) (envelope-from mailinglists.tech@gmail.com) Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 945881F39 for ; Sat, 2 Mar 2013 00:53:05 +0000 (UTC) Received: by mail-ee0-f54.google.com with SMTP id c41so2845077eek.13 for ; Fri, 01 Mar 2013 16:53:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type; bh=LZfbTPfAfRNl3GvNhvGjv2sStgBu3yKwitfl7qU21MU=; b=PXseLQIXhYIFVIFYgN6N5wm7RC/fmDxeXa3wggxetX5D577iMN5BDg7KtFheFWIUwo UVkLjflPteG4jbsBQURHxsYTL/el74z6koXGktiFhVWWDDmSzTrOD6Yw3jXqqihhwEmh qYI/+yf4rdrxJXjSMYn4QazYMneuFWmNk5NufK7tBQJO9o1J3cIxnAkr3ZZ3TpEoKcp5 NAUNji4j/YUYm1E9aqEi2VEsDxObqeSnbQHgFyG07qgxmj0ktOe2gvlZjUud2KR1mRP6 vbmRCUFtAy27ecPSpkfqn2hF7gjwhGpnrJJJ66Ro5AxqD0jpCkQ9m/yi0UBacrV47J7Y CbSQ== X-Received: by 10.14.3.133 with SMTP id 5mr32752698eeh.43.1362185584336; Fri, 01 Mar 2013 16:53:04 -0800 (PST) Received: from [127.0.0.1] (ashlynn.lippux.de. [5.9.218.242]) by mx.google.com with ESMTPS id d47sm19817075eem.9.2013.03.01.16.53.01 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 01 Mar 2013 16:53:03 -0800 (PST) Message-ID: <51314D67.7040704@gmail.com> Date: Sat, 02 Mar 2013 01:52:55 +0100 From: tech mailinglists User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Freddie Cash Subject: Re: I am to silly to mount a zpool while boot References: <513098FF.8030806@brockmann-consult.de> <51312C32.6000207@gmail.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 00:53:06 -0000 Am 01.03.2013 23:59, schrieb Freddie Cash: > What's the output of: > > zfs get mountpoint tank/home > > > On Fri, Mar 1, 2013 at 2:31 PM, tech mailinglists > > wrote: > > Am 01.03.2013 13:03, schrieb Peter Maloney: > > For the mount, don't use fstab. use: > > zfs set mountpoint=/home poolname/path/to/dataset > > And for the import, add > > zfs_enable="YES" > > to rc.conf. > > > And I think that's it. (all my FreeBSD systems are pure zfs, > so not sure > what troubles you would get if you had UFS on root) > > > On 2013-03-01 12:26, tech mailinglists wrote: > > Hello all, > > I think that I only can be an idiot to get in such a > problem but I am > not able to mount a zpool via fstab while boot. > > I have a FreeBSD i386 PV Xen DomU running with 3 disks > xbd0 (ext2 for > /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name > home to mount > at /home). > > I now tried everything I could find. 
So my fstab entry > looks like this: > > home /home zfs rw,late 0 0 > > The real problem is that after a reboot the zpool is no longer > imported, I really don't know why I always have to > reimport the pool > via zpool import -d /dev home. Because of this the > filesystem never > can be mounted via fstab while boot and I get dropped into > a shell > where I need to do this always manually. > > So why the pool always isn't imported after boot and how > can I solve this issue? > > And is the fstab entry correct itself? So would it work > when the pool > gets imported with it's name befor the fstab entry is parsed? > > Hope that someone give me a few hints or a solution. > > Best Regards > _______________________________________________ > freebsd-fs@freebsd.org > mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to > "freebsd-fs-unsubscribe@freebsd.org > " > > > > Hello all, > > a few of the things I already had done. But the real problem is I > think that the pool doesn't get imported automatically. I read > that ZFS searches in special directories when it tries to import. > So is there a way to set an option which says that it should > search in /dev? I always have to do this after reboot: > > zpool import -d /dev tank > > Than tank (pool) gets mounted at /tank and the zvol tank/home gets > mounted on /home. > > So I think that the import of the zpool fails. I have set > zfs_enable="YES" in /etc/rc.conf also zfs_load=YES as boot > parameter which gets shown in kenv and commented out the fstab > entry. So I read that the import normally should work > automatically when the module is loaded and zfs is enabled but I > think the fact that my pool is located on /dev/xbd2 is the problem. > > Best Regards > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to > "freebsd-fs-unsubscribe@freebsd.org > " > > > > > -- > Freddie Cash > fjwcash@gmail.com The mountpoint of tank/home is set to /home. 
The output looks like this: NAME PROPERTY VALUE SOURCE tank/home mountpoint /home local Best Regards From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 01:50:02 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EBF69628 for ; Sat, 2 Mar 2013 01:50:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id DB4EB1A9 for ; Sat, 2 Mar 2013 01:50:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r221o1Tr096279 for ; Sat, 2 Mar 2013 01:50:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r221o1Po096271; Sat, 2 Mar 2013 01:50:01 GMT (envelope-from gnats) Date: Sat, 2 Mar 2013 01:50:01 GMT Message-Id: <201303020150.r221o1Po096271@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: "Steven Hartland" Subject: Re: kern/153695: [patch] [zfs] Booting from zpool created on 4k-sector drive doesn' t work X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Steven Hartland List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 01:50:03 -0000 The following reply was made to PR kern/153695; it has been noted by GNATS. From: "Steven Hartland" To: , Cc: Subject: Re: kern/153695: [patch] [zfs] Booting from zpool created on 4k-sector drive doesn't work Date: Sat, 2 Mar 2013 01:43:06 -0000 So this is no longer a problem and can be closed? ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 06:44:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 799142FB for ; Sat, 2 Mar 2013 06:44:51 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.21.123]) by mx1.freebsd.org (Postfix) with ESMTP id D65F7D9F for ; Sat, 2 Mar 2013 06:44:50 +0000 (UTC) Received: from [193.68.136.207] (digsys207-136.pip.digsys.bg [193.68.136.207]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.6/8.14.6) with ESMTP id r226It6G085564 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Sat, 2 Mar 2013 08:18:56 +0200 (EET) (envelope-from daniel@digsys.bg) References: <512FE773.3060903@physics.umn.edu> Mime-Version: 1.0 (1.0) In-Reply-To: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Message-Id: X-Mailer: iPad Mail (10B146) From: Daniel Kalchev Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... 
serial numbers Date: Sat, 2 Mar 2013 08:18:56 +0200 To: Freddie Cash Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 06:44:51 -0000 On 01.03.2013, at 05:30, Freddie Cash wrote: > For example, we use the following pattern: encX-A-# > > Where X tells you which enclosure it's in, A tells you which column it's in > (letters start at A increasing to the right), and # tells you the disk in > the column, numbered top-down. We use similar labeling, but usually rely on the vendor's drive cage labels and do not use column numbers. But if your enclosures have column labels it makes sense. Anything that makes it obvious for the technician to locate the drive without consulting too much documentation makes sense. Just stick to one coordinate system for all enclosures in one location :) Using labels greatly simplifies ZFS management in cases of disaster - you may have to boot another recovery system and no scripts or hard wired drive information may be available to assist you. Daniel From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 15:12:13 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 339F7BD5 for ; Sat, 2 Mar 2013 15:12:12 +0000 (UTC) (envelope-from gperez@entel.upc.edu) Received: from violet.upc.es (violet.upc.es [147.83.2.51]) by mx1.freebsd.org (Postfix) with ESMTP id A1C2AE6 for ; Sat, 2 Mar 2013 15:12:11 +0000 (UTC) Received: from ackerman2.upc.es (ackerman2.upc.es [147.83.2.244]) by violet.upc.es (8.14.1/8.13.1) with ESMTP id r22FC8At011156 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL); Sat, 2 Mar 2013 16:12:09 +0100 Received: from [192.168.1.110] (247.Red-81-39-132.dynamicIP.rima-tde.net [81.39.132.247]) (authenticated bits=0) by ackerman2.upc.es (8.14.4/8.14.4) with ESMTP id r22FC7L0018977 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 2 Mar 2013 16:12:08 +0100 Message-ID: <513216C6.6030108@entel.upc.edu> Date: Sat, 02 Mar 2013 16:12:06 +0100 From: =?ISO-8859-1?Q?Gustau_P=E9rez_i_Querol?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130220 Thunderbird/17.0.3 MIME-Version: 1.0 To: "Lawrence K. Chen, P.Eng." Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers References: <1602333081.21816316.1362178340105.JavaMail.root@k-state.edu> In-Reply-To: <1602333081.21816316.1362178340105.JavaMail.root@k-state.edu> X-Scanned-By: MIMEDefang 2.70 on 147.83.2.244 X-Mail-Scanned: Criba 2.0 + Clamd X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (violet.upc.es [147.83.2.51]); Sat, 02 Mar 2013 16:12:09 +0100 (CET) Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 15:12:13 -0000 On 01/03/2013 23:52, Lawrence K. Chen, P.Eng. wrote: > > > I only have 15 drives...(12 HDDs and 3 SSDs) but the ordering of drives seemed to randomize on every boot (wonder now if the controller was doing some kind of staggering in spin ups.
And, their other drivers cope with it. They provide a v1.1 driver for FreeBSD 7.2 or source to the v1.0 driver.) And, then everything moved around when I changed controllers a few times. > > I had resorted at one point to putting device.hints to force all the drives to keep their mapping. Which caused problems elsewhere, and a mess when I added another controller. But, then I changed to more meaningful GPT labels and exported and re-imported my zpools with '-d /dev/gpt', and now things are ok. That reordering issue is what made me switch to geom labels first (IIRC I did so back in the 5.x era) and later to GPT labels. GPT labels let me use those drives easily with ZFS, especially because the labels belong to the partition scheme and are thus filesystem independent. What I found especially useful is being able to identify those drives. When a drive fails I know where it is because of the simple naming convention I use; but having a blinking LED helps a lot, especially when writing the recovery plan for the rest of the team. Gus > L > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- Salut i força, Gustau --------------------------------------------------------------------------- Prou top-posting : http://ca.wikipedia.org/wiki/Top-posting Stop top-posting : http://en.wikipedia.org/wiki/Posting_style O O O Gustau Pérez i Querol O O O Unitat de Gestió dels departaments O O O Matemàtica Aplicada IV i Enginyeria Telemàtica Universitat Politècnica de Catalunya Edifici C3 - Despatx S101-B UPC Campus Nord UPC C/ Jordi Girona, 1-3 08034 - Barcelona
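To make the labeling workflow discussed in this thread concrete, here is a minimal sketch of preparing a new or replacement disk with a GPT label and importing a pool by label; the device name (da5), the label (enc0-A-3, following the encX-A-# pattern mentioned above) and the pool name (tank) are illustrative, not taken from the thread:

gpart create -s gpt da5                      # only for a blank disk with no partition table yet
gpart add -t freebsd-zfs -l enc0-A-3 da5     # -l makes the partition appear as /dev/gpt/enc0-A-3
zpool export tank
zpool import -d /dev/gpt tank                # vdevs are now listed by label rather than daX number
zpool status tank                            # shows gpt/enc0-A-3, which survives device reordering

Because the label is stored in the GPT partition entry itself, it follows the physical disk regardless of which controller, slot or device node it shows up on, which is exactly what makes it useful when writing a recovery plan.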