From owner-freebsd-fs@FreeBSD.ORG Sun Jan 13 15:30:47 2013
From: Warren Block <wblock@wonkity.com>
To: kpneal@pobox.com
Cc: freebsd-fs@freebsd.org
Date: Sun, 13 Jan 2013 08:30:43 -0700 (MST)
Subject: Re: Using glabel
In-Reply-To: <20130113062702.GA63271@neutralgood.org>

On Sun, 13 Jan 2013, kpneal@pobox.com wrote:

>> You can use glabel to label your disks or partition the disks with
>> gpart (using the GPT scheme) and let gpt put a label on each (-l
>> flag).
>
> Don't use glabel pretty much ever. It stores the label inside the
> partition (or disk). If the end of the partition is ever touched then
> the label goes *poof*.
> Stick to gpt labels.

If you label a partition, the label device will be one block smaller in
size. The metadata is hidden and safe, as long as it is accessed
through the label device.

# diskinfo -v /dev/da0p1
/dev/da0p1
        512             # sectorsize
        512000          # mediasize in bytes (500k)
        1000            # mediasize in sectors

# glabel label teeny /dev/da0p1
# diskinfo -v /dev/label/teeny
/dev/label/teeny
        512             # sectorsize
        511488          # mediasize in bytes (499k)
        999             # mediasize in sectors

Note the size in sectors. The problem is that sometimes people don't
realize that the label device (/dev/label/teeny) is offering those
extra features and will continue to use the raw partition in newfs
commands and such.

Anyway, GPT labels are still preferable to glabel because they can be
created at the same time as partitions and don't use any extra
metadata.

ZFS has its own metadata, and newer versions are supposed to leave the
last megabyte or so unused to allow for actual versus nominal disk
sizes. I'm not clear whether there's a good reason to use additional
labels instead of just giving ZFS the whole disk. Unless you aren't
planning on using the whole disk for ZFS, of course.
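The arithmetic behind those two diskinfo listings can be sketched in a few lines (an editor's illustration, not part of the original mail): glabel stores its metadata in the last sector of the provider, so the exported label device is exactly one sector smaller than the raw partition.

```python
# Sketch (illustration, not from the original mail): glabel keeps its
# metadata in the provider's last sector, so the label device it
# exposes is one sector smaller than the raw partition.
sectorsize = 512
raw_sectors = 1000                   # /dev/da0p1, per diskinfo above
label_sectors = raw_sectors - 1      # /dev/label/teeny
label_bytes = label_sectors * sectorsize
print(label_sectors, label_bytes)    # 999 511488, matching diskinfo
```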
From owner-freebsd-fs@FreeBSD.ORG Sun Jan 13 18:22:00 2013
From: "Ronald Klop" <ronald-freebsd8@klop.yi.org>
To: freebsd-fs@freebsd.org
Date: Sun, 13 Jan 2013 19:21:52 +0100
Subject: Re: Using glabel

On Sun, 13 Jan 2013 16:30:43 +0100, Warren Block <wblock@wonkity.com>
wrote:

> On Sun, 13 Jan 2013, kpneal@pobox.com wrote:
>
>>> You can use glabel to label your disks or
partition the disks with gpart
>>> (using the GPT scheme) and let gpt put a label on each (-l flag).
>>
>> Don't use glabel pretty much ever. It stores the label inside the
>> partition (or disk). If the end of the partition is ever touched
>> then the label goes *poof*. Stick to gpt labels.
>
> If you label a partition, the label device will be one block smaller
> in size. The metadata is hidden and safe, as long as it is accessed
> through the label device.
>
> [...]
>
> Anyway, GPT labels are still preferable to glabel because they can be
> created at the same time as partitions and don't use any extra
> metadata.
>
> ZFS has its own metadata, and newer versions are supposed to leave
> the last megabyte or so unused to allow for actual versus nominal
> disk sizes. I'm not clear whether there's a good reason to use
> additional labels instead of just giving ZFS the whole disk. Unless
> you aren't planning on using the whole disk for ZFS, of course.

GPT labels are also portable between different OSes. If you ever want
to import your pool with OpenSolaris or the more recent forks of
OpenSolaris, they will understand GPT. Glabel is FreeBSD-only.

Ronald.
From owner-freebsd-fs@FreeBSD.ORG Sun Jan 13 21:28:46 2013
From: Chris Ross <cross+freebsd@distal.com>
To: freebsd-fs@freebsd.org
Cc: "freebsd-sparc64@freebsd.org"
Date: Sun, 13 Jan 2013 16:28:43 -0500
Subject: ZFS loader crash on sparc64 (since Oct 2012)
Message-Id: <4031F492-C30C-4F5D-BF8E-B2D61FFD0EAD@distal.com>

Since this is a loader crash related to ZFS, someone wisely pointed out
that posting to freebsd-fs might be a good idea. There's a long thread
on the freebsd-sparc64 list about this issue, which you can find at:

http://list-archives.org/2012/12/23/freebsd-sparc64-freebsd-org/changes-to-kern-geom-debugflags/f/4758203564

But the relevant details are that I determined that stable/9 changed
for sparc64 on October 28, with revision 242230.
This was noted as a merge by avg for revision 241289, and appears to be
part of a bunch of changes he made on October 6 (revs 241282-241294,
plus some others nearby).

I'm interested in getting this fixed, so I can build a bootloader that
doesn't cause my sparc64 to hit a divide-by-zero trap.

Please feel free to contact me with any questions about what I found.
(Most of it is in the freebsd-sparc64 thread linked to above, but I'm
happy to describe anything unclear or recount from memory.)

Thanks!

    - Chris

From owner-freebsd-fs@FreeBSD.ORG Sun Jan 13 22:22:47 2013
From: Eitan Adler <lists@eitanadler.com>
To: freebsd-fs@freebsd.org
Date: Sun, 13 Jan 2013 17:22:10 -0500
Subject: Re: What are the limits for FFS file systems and assorted questions

Can anyone provide an up-to-date answer for the following:

If these are all already perfect and correct, can you please tell me
so?

On 18 December 2012 23:13, Eitan Adler wrote:
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/book.html#ffs-limits
> Are the bugs listed still bugs?
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/book.html#mount-foreign-fs
> Is this completely true? Should it be updated?
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/book.html#alternate-directory-layout
> Does this still deserve to be listed as a FAQ?
--
Eitan Adler

From owner-freebsd-fs@FreeBSD.ORG Sun Jan 13 23:52:47 2013
From: Nicolas Rachinsky <nicolas@i.0x5.de>
To: Steven Hartland
Cc: freebsd-fs
Date: Mon, 14 Jan 2013 00:52:39 +0100
Subject: Re: slowdown of zfs (tx->tx)
Message-ID: <20130113235239.GA16318@mid.pc5.i.0x5.de>

* Steven Hartland [2013-01-11 13:58 -0000]:
> TBH looks like you're just saturating your disks with the number of
> IOPs you're doing.

But now a backup takes forever (16 hours and more) that took less than
30 minutes two weeks ago.

I duplicated the complete setup to another (slower) server. There the
backups are slower than they originally were on this machine, but they
are much faster than backups on this machine are now.
Nicolas

--
http://www.rachinsky.de/nicolas

From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 00:05:36 2013
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Nicolas Rachinsky"
Cc: freebsd-fs
Date: Mon, 14 Jan 2013 00:05:54 -0000
Subject: Re: slowdown of zfs (tx->tx)
Message-ID: <778BCD159C6546A4ADC7D9FBBFC3DA8A@multiplay.co.uk>

----- Original Message
-----
From: "Nicolas Rachinsky"
To: "Steven Hartland"
Cc: "freebsd-fs"
Sent: Sunday, January 13, 2013 11:52 PM
Subject: Re: slowdown of zfs (tx->tx)

> * Steven Hartland [2013-01-11 13:58 -0000]:
>> TBH looks like you're just saturating your disks with the number of
>> IOPs you're doing.
>
> But now a backup takes forever (16 hours and more) that took less
> than 30 minutes two weeks ago.
>
> I duplicated the complete setup to another (slower) server. There the
> backups are slower than they were on this machine, but they are much
> faster than on this machine.

It's not something silly like having 4k disks which aren't 4k-aligned,
is it? IIRC you were using rsync; is there a reason why you aren't
using zfs send/recv?

Regards
Steve
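The 4k-alignment question above can be checked mechanically. A minimal sketch (an editor's illustration, not from the thread), assuming the partition start is reported in 512-byte sectors as gpart does:

```python
# Sketch (not from the thread): a partition on a 4k-sector ("Advanced
# Format") drive is aligned when its starting offset, counted in
# 512-byte sectors, is a multiple of 8 (8 * 512 = 4096 bytes).
def is_4k_aligned(start_sector_512):
    return start_sector_512 % 8 == 0

print(is_4k_aligned(63))    # False: the classic MBR offset is misaligned
print(is_4k_aligned(2048))  # True: a 1 MiB offset is aligned
```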
From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 02:38:01 2013
From: Bruce Evans <brde@optusnet.com.au>
To: Eitan Adler
Cc: freebsd-fs@FreeBSD.org
Date: Mon, 14 Jan 2013 13:37:45 +1100 (EST)
Subject: Re: What are the limits for FFS file systems and assorted questions
Message-ID: <20130114120607.T1405@besplex.bde.org>

On Sun, 13 Jan 2013, Eitan Adler
wrote:

> Can anyone provide an up to date answer for the following:
>
> If these are all already perfect and correct can you please tell me
> so?
>
> On 18 December 2012 23:13, Eitan Adler wrote:
>
>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/book.html#ffs-limits
>> Are the bugs listed still bugs?

This is almost useless, since it pre-dates ffs2. It seems to be derived
from something I wrote in a mailing list. The "Should Work" column in
the table is now implemented, but it is only for ffs1 and is buggy for
a block size of 8K (the limit should be 16TB, not 32TB). The wording of
the descriptions could be improved.

All related known bugs for ffs1, including the ones described there,
were fixed 5-15 years ago. But recent work on ext2fs showed a new one
-- a very minor one that only recently became reachable, and one that
has been fixed in Linux ext2fs: there is a block count (di_nblocks in
ffs[1-2]) that is only 32 bits in ffs1 and in ext2fs (actually it only
has 31 bits in ffs1 and in FreeBSD-ext2fs, since it is signed). Fs
block numbers in these fs's are also 32 (or 31) bits, but this block
counter doesn't suffice for counting them because it has units of
512-byte blocks while fs block numbers have larger units. When this
block counter overflows, the only (?) thing broken is st_nblocks in
stat(2).

One way of fixing this is to limit the file size to 1TB - 1. This would
also simplify describing the limit. This is only a serious restriction
for sparse files. With the default block size of 32K, ffs1 can only
handle file systems of size 64TB. It can only handle 1 non-sparse file
of size nearly 64TB, or 63 non-sparse files of size 1TB-1. It is now
barely reasonable to have non-sparse files of these sizes, but systems
with such files probably wouldn't be using ffs1.

Sparse files are more interesting. You can fit a large number of sparse
files of size 64TB-1 on a file system of size just a few GB, and also
write them in less than a day or two.
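The 1TB and 64TB figures above follow from simple arithmetic. A sketch of that arithmetic (an editor's illustration, not part of the mail), assuming the signed 32-bit di_nblocks counts 512-byte units and ffs1 block numbers likewise have 31 usable bits:

```python
# Sketch (illustration only): limits implied by signed 32-bit counters.
DEV_BSIZE = 512                  # di_nblocks counts 512-byte units
max_count = 2**31 - 1            # 31 usable bits, since the field is signed

# st_nblocks stops being trustworthy just below 1 TB of physical blocks:
overflow_bytes = max_count * DEV_BSIZE
assert overflow_bytes == 2**40 - 512      # one sector short of 1 TB

# With 31-bit fs block numbers and the default 32K block size, the file
# system itself tops out just under 64 TB -- the "64TB" figure above:
fs_limit_bytes = max_count * 32768
assert fs_limit_bytes == 2**46 - 32768    # one block short of 64 TB
```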
Also, the potentially-overflowing block counter is for physical blocks,
so it can't overflow for fairly sparse files. Thus restricting the file
size to 1TB-1 would break some cases unnecessarily.

ffs2 generally gives much larger limits for file system sizes but
halves the limits for file sizes (since block numbers are twice as
large, the block size must be twice as large to fit the same number of
block numbers in an indirect block).

>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/book.html#mount-foreign-fs
>> Is this completely true? Should it be updated?

This doesn't give much detail, so there is less to go wrong in it.

>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/book.html#alternate-directory-layout
>> Does this still deserve to be listed as a FAQ?

I think it never did, since it is about a technical problem that can't
really be solved outside of the file system, especially with today's
disk sizes allowing tens if not thousands of times as many files as
when it was written in 1998, or thousands if not millions of times as
many files as when ffs was written in ~1983. With millions of files,
you just can't make much difference with a few changes to the directory
layout. It was written by mckusick in 1998, so it is also out of date
with respect to the better layout policies that he implemented in ffs
in 2001.

BTW, cp(1) still has bogus sorting related to this. It sorts files so
that non-directory files are copied before directory files, because it
knows too much about ffs's internals and about ffs being the only file
system. Perhaps this is still good if the file system is ffs, but I
think it is better to preserve any existing order that you get from the
command line or from a directory traversal (use fts and specify pre- or
post-order). But the sorting function is of low quality and tends to
destroy any existing order:
- it uses qsort(), which gives an unstable sort for items that compare
  equal
- everything except directories vs non-directories compares equal.
The result is that if you have a perfectly sorted list on the command
line, say consisting of all regular files in alphabetical order, then
the order is very unstable. Except the instability is very stable -- it
is usually close to a perfect inversion of the order. Anyway, this
instability makes it impossible either to preserve existing orders in
file hierarchies or to specify optimal orders on the command line.

Bruce

From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 09:40:13 2013
From: Nicolas Rachinsky <nicolas@i.0x5.de>
To: Artem Belevich
Cc: freebsd-fs
Date: Mon, 14 Jan 2013 10:40:10 +0100
Subject: Re: slowdown of zfs (tx->tx)
Message-ID: <20130114094010.GA75529@mid.pc5.i.0x5.de>
* Artem Belevich [2013-01-11 12:39 -0800]:
> On Thu, Jan 10, 2013 at 11:34 PM, Nicolas Rachinsky wrote:
> > * Nicolas Rachinsky [2013-01-10 20:39 +0100]:
> >> after replacing one of the controllers, all problems seem to have
> >> disappeared. Thank you very much for your advice!
> >
> > Now the problem is back.
> >
> > After changing the controller, there were no more timeouts logged.
> >
> > No UDMA_CRC_Error_Count changed.
>
> Is there anything special about ada8? It does seem to have noticeably
> higher service time compared to other disks.

Nothing I know of. The disks are Samsung HD103UJ and HD103SI, multiple
of each type.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f 100 100 051 Pre-fail Always - 0
  3 Spin_Up_Time            0x0007 073 073 011 Pre-fail Always - 8890
  4 Start_Stop_Count        0x0032 100 100 000 Old_age  Always - 32
  5 Reallocated_Sector_Ct   0x0033 094 094 010 Pre-fail Always - 166
  7 Seek_Error_Rate         0x000f 100 100 051 Pre-fail Always - 0
  8 Seek_Time_Performance   0x0025 100 100 015 Pre-fail Offline - 10872
  9 Power_On_Hours          0x0032 099 099 000 Old_age  Always - 5688
 10 Spin_Retry_Count        0x0033 100 100 051 Pre-fail Always - 0
 11 Calibration_Retry_Count 0x0012 100 100 000 Old_age  Always - 0
 12 Power_Cycle_Count       0x0032 100 100 000 Old_age  Always - 31
 13 Read_Soft_Error_Rate    0x000e 100 100 000 Old_age  Always - 0
183 Runtime_Bad_Block       0x0032 100 100 000 Old_age  Always - 0
184 End-to-End_Error        0x0033 100 100 000 Pre-fail Always - 0
187 Reported_Uncorrect      0x0032 100 100 000 Old_age  Always - 0
188 Command_Timeout         0x0032 100 100 000 Old_age  Always - 0
190 Airflow_Temperature_Cel 0x0022 078 069 000 Old_age  Always - 22 (Min/Max 21/25)
194 Temperature_Celsius     0x0022 077 067 000 Old_age  Always - 23 (Min/Max 21/26)
195 Hardware_ECC_Recovered  0x001a 100 100 000
Old_age Always - 1259614646
196 Reallocated_Event_Count 0x0032 096 096 000 Old_age  Always - 166
197 Current_Pending_Sector  0x0012 100 100 000 Old_age  Always - 0
198 Offline_Uncorrectable   0x0030 100 100 000 Old_age  Offline - 0
199 UDMA_CRC_Error_Count    0x003e 100 100 000 Old_age  Always - 0
200 Multi_Zone_Error_Rate   0x000a 100 099 000 Old_age  Always - 5
201 Soft_Read_Error_Rate    0x000a 100 100 000 Old_age  Always - 0

Reallocated_Sector_Ct did not increase during the last days.

> Could you do gstat with a 1-second interval? Some of the 5-second
> samples show that ada8 is the bottleneck -- it has its request queue
> full (L(q)=10) when all other drives were done with their jobs. And
> that's a 5-sec average. Its write service time also seems to be a lot
> higher than for other drives.

Attached. I have replaced ada8 with ada9, which is a Western Digital
Caviar Black. Now ada0 and ada4 seem to be the bottleneck. But I don't
understand the intervals without any disk activity.

> Does the drive have its write cache disabled by any chance? That
> could explain why it takes so much longer to service writes.

No, camcontrol identify says it's enabled.

> Can you remove ada8 and see if your performance goes back to normal?

The problem still persists.

Thank you for your help!
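The eyeball analysis described above (a queue pinned at L(q)=10, outsized write service times) can be sketched as a small filter. This is an editor's illustration, not a tool from the thread, using rows taken from one sample in the attached gstat output:

```python
# Toy sketch (illustration only): flag devices that look like the
# bottleneck in a gstat sample -- a pinned request queue (L(q) >= 10)
# or a write service time far above the sample median.
from statistics import median

def parse_row(line):
    # gstat row layout: L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
    f = line.split()
    return {"lq": int(f[0]), "ms_w": float(f[7]), "name": f[9]}

sample = [  # rows from one busy sample in the attachment
    "10 429 0 0 0.0 429 997 2.5 18.1 ada0",
    "0 458 21 27 1.2 437 994 3.1 17.7 ada1",
    "0 406 0 0 0.0 406 988 2.6 14.6 ada2",
    "10 335 0 0 0.0 335 938 4.1 22.9 ada4",
]
rows = [parse_row(l) for l in sample]
med = median(r["ms_w"] for r in rows)
suspects = [r["name"] for r in rows if r["lq"] >= 10 or r["ms_w"] > 3 * med]
print(suspects)  # ['ada0', 'ada4'] -- the drives suspected above
```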
Nicolas

--
http://www.rachinsky.de/nicolas

[Attachment: gstat.txt]

dT: 1.001s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 0 0 0 0.0 0 0 0.0 0.0 ad4
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b
0 0 0 0 0.0 0 0 0.0 0.0 ad6
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1
0 0 0 0 0.0 0 0 0.0 0.0 ada0
0 0 0 0 0.0 0 0 0.0 0.0 ada1
0 0 0 0 0.0 0 0 0.0 0.0 ada2
0 0 0 0 0.0 0 0 0.0 0.0 ada3
0 0 0 0 0.0 0 0 0.0 0.0 ada4
0 0 0 0 0.0 0 0 0.0 0.0 ada5
0 0 0 0 0.0 0 0 0.0 0.0 ada6
0 0 0 0 0.0 0 0 0.0 0.0 ada7
0 0 0 0 0.0 0 0 0.0 0.0 ada8
0 0 0 0 0.0 0 0 0.0 0.0 ada9
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal

dT: 1.001s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 0 0 0 0.0 0 0 0.0 0.0 ad4
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b
0 0 0 0 0.0 0 0 0.0 0.0 ad6
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1
0 46 0 0 0.0 46 33 0.9 0.9 ada0
0 47 0 0 0.0 47 32 3.3 1.9 ada1
0 47 0 0 0.0 47 32 3.3 1.9 ada2
0 47 0 0 0.0 47 32 3.3 1.9 ada3
0 49 0 0 0.0 49 33 0.9 0.9 ada4
0 49 0 0 0.0 49 33 3.6 2.1 ada5
0 46 0 0 0.0 46 33 1.0 0.9 ada6
0 46 0 0 0.0 46 33 0.8 0.8 ada7
0 0 0 0 0.0 0 0 0.0 0.0 ada8
0 49 0 0 0.0 49 33 0.8 0.8 ada9
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal

dT: 1.001s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 0 0 0 0.0 0 0 0.0 0.0 ad4
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b
0 0 0 0 0.0 0 0 0.0 0.0 ad6
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1
10 429 0 0 0.0 429 997 2.5 18.1 ada0
0 458 21 27 1.2 437 994 3.1 17.7 ada1
0 406 0 0 0.0 406
988 2.6 14.6 ada2
0 427 0 0 0.0 427 989 2.0 12.5 ada3
10 335 0 0 0.0 335 938 4.1 22.9 ada4
0 419 0 0 0.0 419 990 2.1 11.9 ada5
0 434 0 0 0.0 434 1005 2.1 13.1 ada6
0 486 25 133 5.2 461 1006 1.4 12.1 ada7
0 0 0 0 0.0 0 0 0.0 0.0 ada8
0 441 20 35 6.2 421 994 1.4 13.1 ada9
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal

dT: 1.002s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 0 0 0 0.0 0 0 0.0 0.0 ad4
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b
0 0 0 0 0.0 0 0 0.0 0.0 ad6
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1
0 275 0 0 0.0 274 334 12.7 56.6 ada0
0 278 11 14 3.0 266 308 2.0 28.7 ada1
0 305 0 0 0.0 303 315 1.5 28.7 ada2
0 303 0 0 0.0 301 311 1.6 14.7 ada3
0 311 0 0 0.0 309 375 15.4 69.2 ada4
0 285 0 0 0.0 283 310 2.1 15.8 ada5
0 282 0 0 0.0 280 306 1.7 18.6 ada6
0 307 11 17 2.3 294 318 1.0 19.1 ada7
0 0 0 0 0.0 0 0 0.0 0.0 ada8
0 329 9 6 0.5 318 312 0.7 12.4 ada9
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal

dT: 1.000s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 0 0 0 0.0 0 0 0.0 0.0 ad4
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b
0 0 0 0 0.0 0 0 0.0 0.0 ad6
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1
0 0 0 0 0.0 0 0 0.0 0.0 ada0
0 0 0 0 0.0 0 0 0.0 0.0 ada1
0 0 0 0 0.0 0 0 0.0 0.0 ada2
0 0 0 0 0.0 0 0 0.0 0.0 ada3
0 0 0 0 0.0 0 0 0.0 0.0 ada4
0 0 0 0 0.0 0 0 0.0 0.0 ada5
0 0 0 0 0.0 0 0 0.0 0.0 ada6
0 0 0 0 0.0 0 0 0.0 0.0 ada7
0 0 0 0 0.0 0 0 0.0 0.0 ada8
0 0 0 0 0.0 0 0 0.0 0.0 ada9
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a
0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b
0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027
0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027
0 0 0 0 0.0 0 0 0.0 0.0
mirror/ROOT121027.journal dT: 1.000s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 23 7 7 0.2 16 21 1.3 0.6 ada0 0 29 15 39 7.8 14 20 1.6 3.6 ada1 0 17 2 2 0.2 15 20 1.4 0.6 ada2 0 16 2 2 0.2 14 19 1.4 0.6 ada3 0 19 5 5 0.2 14 19 1.2 0.5 ada4 0 19 5 5 0.2 14 19 1.2 0.5 ada5 0 23 7 7 0.2 16 22 1.2 0.6 ada6 0 29 13 9 0.2 16 20 1.1 0.6 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 28 14 42 11.8 14 21 1.0 11.5 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.001s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 0 0 0 0.0 0 0 0.0 0.0 ada0 0 0 0 0 0.0 0 0 0.0 0.0 ada1 0 0 0 0 0.0 0 0 0.0 0.0 ada2 0 0 0 0 0.0 0 0 0.0 0.0 ada3 0 0 0 0 0.0 0 0 0.0 0.0 ada4 0 0 0 0 0.0 0 0 0.0 0.0 ada5 0 0 0 0 0.0 0 0 0.0 0.0 ada6 0 0 0 0 0.0 0 0 0.0 0.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 0 0 0 0.0 0 0 0.0 0.0 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.001s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 0 0 0 0.0 0 0 0.0 0.0 ada0 0 0 0 0 0.0 0 0 0.0 0.0 ada1 0 0 0 0 0.0 0 0 0.0 0.0 ada2 0 0 0 0 0.0 0 0 0.0 0.0 ada3 0 0 0 0 0.0 0 0 0.0 0.0 ada4 0 0 0 0 0.0 0 0 0.0 0.0 ada5 0 0 0 0 0.0 0 0 0.0 0.0 ada6 0 0 0 0 0.0 0 0 
0.0 0.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 0 0 0 0.0 0 0 0.0 0.0 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.002s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 142 0 0 0.0 142 624 2.6 7.8 ada0 0 148 0 0 0.0 148 628 3.4 9.2 ada1 0 147 0 0 0.0 147 634 2.1 7.2 ada2 0 148 0 0 0.0 148 629 2.3 7.6 ada3 0 146 0 0 0.0 146 633 1.7 6.6 ada4 5 140 0 0 0.0 140 623 3.2 8.6 ada5 0 149 0 0 0.0 149 634 1.8 6.9 ada6 0 142 0 0 0.0 142 624 1.3 6.1 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 146 0 0 0.0 146 627 1.4 6.2 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.000s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 10 842 0 0 0.0 842 714 8.1 80.0 ada0 0 872 42 85 8.3 830 715 2.6 42.4 ada1 0 943 0 0 0.0 943 764 1.3 18.3 ada2 0 954 0 0 0.0 954 773 1.5 20.0 ada3 10 815 0 0 0.0 815 700 7.8 73.2 ada4 0 935 0 0 0.0 935 750 1.6 21.4 ada5 0 880 0 0 0.0 880 753 3.4 40.4 ada6 0 910 46 133 8.6 864 704 1.4 25.7 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 925 44 71 7.7 881 710 1.8 35.9 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.001s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 
0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 53 0 0 0.0 51 49 19.0 33.0 ada0 0 6 0 0 0.0 4 4 0.4 30.9 ada1 0 6 0 0 0.0 4 4 0.3 29.2 ada2 0 6 0 0 0.0 4 4 0.3 5.1 ada3 0 41 0 0 0.0 39 38 24.8 32.6 ada4 0 6 0 0 0.0 4 4 0.4 7.6 ada5 0 6 0 0 0.0 4 4 0.3 14.9 ada6 0 6 0 0 0.0 4 4 0.2 10.2 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 6 0 0 0.0 4 4 0.5 9.7 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.000s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 0 0 0 0.0 0 0 0.0 0.0 ada0 0 0 0 0 0.0 0 0 0.0 0.0 ada1 0 0 0 0 0.0 0 0 0.0 0.0 ada2 0 0 0 0 0.0 0 0 0.0 0.0 ada3 0 0 0 0 0.0 0 0 0.0 0.0 ada4 0 0 0 0 0.0 0 0 0.0 0.0 ada5 0 0 0 0 0.0 0 0 0.0 0.0 ada6 0 0 0 0 0.0 0 0 0.0 0.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 0 0 0 0.0 0 0 0.0 0.0 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.000s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 19 6 6 3.4 13 19 1.5 2.6 ada0 0 28 14 41 12.1 14 21 1.7 10.8 ada1 0 15 1 1 8.7 14 21 1.4 1.4 ada2 0 15 1 1 0.2 14 21 1.6 0.6 ada3 0 19 5 5 4.4 14 21 1.5 2.8 ada4 0 19 5 5 3.7 14 21 1.7 2.4 ada5 0 20 6 6 2.7 14 21 1.5 2.2 ada6 0 26 13 9 7.9 13 18 1.4 4.4 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 27 14 9 17.8 13 20 1.4 10.3 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 
0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.002s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 0 0 0 0.0 0 0 0.0 0.0 ada0 0 0 0 0 0.0 0 0 0.0 0.0 ada1 0 0 0 0 0.0 0 0 0.0 0.0 ada2 0 0 0 0 0.0 0 0 0.0 0.0 ada3 0 0 0 0 0.0 0 0 0.0 0.0 ada4 0 0 0 0 0.0 0 0 0.0 0.0 ada5 0 0 0 0 0.0 0 0 0.0 0.0 ada6 0 0 0 0 0.0 0 0 0.0 0.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 0 0 0 0.0 0 0 0.0 0.0 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.001s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 0 0 0 0.0 0 0 0.0 0.0 ada0 0 0 0 0 0.0 0 0 0.0 0.0 ada1 0 0 0 0 0.0 0 0 0.0 0.0 ada2 0 0 0 0 0.0 0 0 0.0 0.0 ada3 0 0 0 0 0.0 0 0 0.0 0.0 ada4 0 0 0 0 0.0 0 0 0.0 0.0 ada5 0 0 0 0 0.0 0 0 0.0 0.0 ada6 0 0 0 0 0.0 0 0 0.0 0.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 0 0 0 0.0 0 0 0.0 0.0 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.001s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 0 0 0 0.0 0 0 0.0 0.0 ada0 0 0 0 0 0.0 0 0 0.0 0.0 ada1 0 0 0 0 0.0 0 0 0.0 0.0 ada2 0 0 0 0 0.0 0 0 0.0 0.0 ada3 0 0 0 0 0.0 0 0 0.0 0.0 ada4 0 0 0 0 0.0 0 0 
0.0 0.0 ada5 0 0 0 0 0.0 0 0 0.0 0.0 ada6 0 0 0 0 0.0 0 0 0.0 0.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 0 0 0 0.0 0 0 0.0 0.0 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.001s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 0 0 0 0.0 0 0 0.0 0.0 ada0 0 0 0 0 0.0 0 0 0.0 0.0 ada1 0 0 0 0 0.0 0 0 0.0 0.0 ada2 0 0 0 0 0.0 0 0 0.0 0.0 ada3 0 0 0 0 0.0 0 0 0.0 0.0 ada4 0 0 0 0 0.0 0 0 0.0 0.0 ada5 0 0 0 0 0.0 0 0 0.0 0.0 ada6 0 0 0 0 0.0 0 0 0.0 0.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 0 0 0 0.0 0 0 0.0 0.0 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.002s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 87 0 0 0.0 87 481 3.0 4.2 ada0 0 88 0 0 0.0 88 477 3.8 4.9 ada1 0 92 0 0 0.0 92 483 2.8 4.1 ada2 0 89 0 0 0.0 89 477 2.3 3.5 ada3 0 100 0 0 0.0 100 480 1.8 3.4 ada4 0 100 0 0 0.0 100 475 3.6 5.4 ada5 0 87 0 0 0.0 87 483 1.8 3.2 ada6 0 89 0 0 0.0 89 480 1.6 3.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 100 0 0 0.0 100 474 1.5 3.1 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 6 0 0 0.0 6 68 0.2 0.0 mirror/ROOT121027.journal dT: 1.001s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 
0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 10 449 0 0 0.0 449 400 9.0 49.6 ada0 0 477 26 68 4.3 451 418 2.6 21.9 ada1 0 469 0 0 0.0 469 417 1.7 11.2 ada2 0 466 0 0 0.0 466 412 1.7 10.9 ada3 10 378 0 0 0.0 378 305 9.3 43.0 ada4 0 475 0 0 0.0 475 414 1.8 12.1 ada5 0 430 0 0 0.0 430 407 8.7 41.3 ada6 0 511 27 35 8.0 484 423 1.2 14.2 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 505 26 49 10.4 479 416 0.7 17.8 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal dT: 1.002s w: 1.000s L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 0 0 0 0 0.0 0 0 0.0 0.0 ad4 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad4s1b 0 0 0 0 0.0 0 0 0.0 0.0 ad6 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1 0 139 0 0 0.0 137 155 6.4 24.4 ada0 0 138 9 4 1.1 127 131 1.8 27.3 ada1 0 153 0 0 0.0 151 137 1.7 22.6 ada2 0 150 0 0 0.0 148 135 1.3 11.2 ada3 0 140 0 0 0.0 138 237 15.8 41.0 ada4 0 141 0 0 0.0 139 134 1.3 14.1 ada5 0 129 0 0 0.0 127 128 1.4 16.2 ada6 0 158 5 2 0.7 151 128 0.5 15.0 ada7 0 0 0 0 0.0 0 0 0.0 0.0 ada8 0 153 5 2 11.9 146 138 0.6 19.9 ada9 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1a 0 0 0 0 0.0 0 0 0.0 0.0 ad6s1b 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/SWAP121027 0 0 0 0 0.0 0 0 0.0 0.0 mirror/ROOT121027.journal --6TrnltStXW4iwmi0-- From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 09:43:46 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id BA7B2F3A for ; Mon, 14 Jan 2013 09:43:46 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id 6FB07F94 for ; Mon, 14 Jan 2013 09:43:46 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 
3Yl8rd512Vz7ySF; Mon, 14 Jan 2013 10:43:45 +0100 (CET)
Date: Mon, 14 Jan 2013 10:43:45 +0100
From: Nicolas Rachinsky
To: Steven Hartland
Subject: Re: slowdown of zfs (tx->tx)
Message-ID: <20130114094345.GB75529@mid.pc5.i.0x5.de>
References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111111147.GA34160@mid.pc5.i.0x5.de> <20130113235239.GA16318@mid.pc5.i.0x5.de> <778BCD159C6546A4ADC7D9FBBFC3DA8A@multiplay.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <778BCD159C6546A4ADC7D9FBBFC3DA8A@multiplay.co.uk>
X-Powered-by: FreeBSD
X-Homepage: http://www.rachinsky.de
X-PGP-Keyid: 887BAE72
X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72
X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 14 Jan 2013 09:43:46 -0000

* Steven Hartland [2013-01-14 00:05 -0000]:
> ----- Original Message ----- From: "Nicolas Rachinsky"
> To: "Steven Hartland"
> Cc: "freebsd-fs"
> Sent: Sunday, January 13, 2013 11:52 PM
> Subject: Re: slowdown of zfs (tx->tx)
>
> >* Steven Hartland [2013-01-11 13:58 -0000]:
> >>TBH it looks like you're just saturating your disks with the number
> >>of IOPs you're doing.
> >
> >But now a backup takes forever (16 hours and more) that took less than
> >30 minutes two weeks ago.
> >
> >I duplicated the complete setup to another (slower) server. There the
> >backups are slower than they were on this machine, but they are now
> >much faster than on this machine.
>
> It's not something silly like having 4k disks that aren't 4k-aligned,
> is it?

No, these are 512 bytes/sector disks. And it became this slow without any change (hardware or software).

> IIRC you were using rsync; is there a reason why you aren't using
> zfs send/recv?

Most of the machines that are backed up to this machine run Linux and don't have ZFS.

Nicolas
--
http://www.rachinsky.de/nicolas

From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 10:58:41 2013
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EC964B73 for ; Mon, 14 Jan 2013 10:58:41 +0000 (UTC) (envelope-from joh.hendriks@gmail.com)
Received: from mail-la0-f51.google.com (mail-la0-f51.google.com [209.85.215.51]) by mx1.freebsd.org (Postfix) with ESMTP id 76FA2307 for ; Mon, 14 Jan 2013 10:58:41 +0000 (UTC)
Received: by mail-la0-f51.google.com with SMTP id fj20so3698736lab.10 for ; Mon, 14 Jan 2013 02:58:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type; bh=3yQl+K7AqRR3ky63oHeeLRDoNxK7bqfmNqdZMlfWg7A=; b=aPwmF7lQ5mB91NH3WXS65qcErYVyy3+SLnc2ULfHf1rw737Cd1IuIpkAOgYSprsTue D8N65oOp3pwUok8Pqb3tB5Q0WKhiA/ficfk/I19mR1rpnFiIAPXqOvBXz39T/OscblCl I0HyXQHsZGKttymMkvmCbsyUZw4O6T4IDSzqulCYUqzZTg9XhWc96/zWaCBPVOvf86pN vBIc0SbkZz+pLNVKoFbt9+KFLFme3e702y5DW3LjNtzkQV9gvrw0B6Oy7Vcr3pXShRGW /HYDFw/tXxO3JOiK4Y0cJk2kOFnhaHskYaJCO4gpv7NpJsQNCnoOpMxRmbQLxelt7Hzt KhfQ==
X-Received: by 10.152.145.8 with SMTP id sq8mr80768163lab.21.1358161120058; Mon, 14 Jan 2013 02:58:40 -0800 (PST)
Received: from [192.168.50.105] (double-l.xs4all.nl.
[80.126.205.144]) by mx.google.com with ESMTPS id ox6sm5029619lab.16.2013.01.14.02.58.38 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 14 Jan 2013 02:58:39 -0800 (PST)
Message-ID: <50F3E4DC.8030704@gmail.com>
Date: Mon, 14 Jan 2013 11:58:36 +0100
From: Johan Hendriks
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: Nicolas Rachinsky
Subject: Re: slowdown of zfs (tx->tx)
References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de>
In-Reply-To: <20130114094010.GA75529@mid.pc5.i.0x5.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 14 Jan 2013 10:58:42 -0000

Nicolas Rachinsky wrote:
> * Artem Belevich [2013-01-11 12:39 -0800]:
>> On Thu, Jan 10, 2013 at 11:34 PM, Nicolas Rachinsky wrote:
>>> * Nicolas Rachinsky [2013-01-10 20:39 +0100]:
>>>> after replacing one of the controllers, all problems seem to have
>>>> disappeared. Thank you very much for your advice!
>>> Now the problem is back.
>>>
>>> After changing the controller, there were no more timeouts logged.
>>>
>>> No UDMA_CRC_Error_Count changed.
>>>
>> Is there anything special about ada8? It does seem to have noticeably
>> higher service time compared to other disks.
> Nothing I know of. The disks are Samsung HD103UJ and HD103SI, multiple
> of each type.
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f 100 100 051 Pre-fail Always  - 0
>   3 Spin_Up_Time            0x0007 073 073 011 Pre-fail Always  - 8890
>   4 Start_Stop_Count        0x0032 100 100 000 Old_age  Always  - 32
>   5 Reallocated_Sector_Ct   0x0033 094 094 010 Pre-fail Always  - 166
>   7 Seek_Error_Rate         0x000f 100 100 051 Pre-fail Always  - 0
>   8 Seek_Time_Performance   0x0025 100 100 015 Pre-fail Offline - 10872
>   9 Power_On_Hours          0x0032 099 099 000 Old_age  Always  - 5688
>  10 Spin_Retry_Count        0x0033 100 100 051 Pre-fail Always  - 0
>  11 Calibration_Retry_Count 0x0012 100 100 000 Old_age  Always  - 0
>  12 Power_Cycle_Count       0x0032 100 100 000 Old_age  Always  - 31
>  13 Read_Soft_Error_Rate    0x000e 100 100 000 Old_age  Always  - 0
> 183 Runtime_Bad_Block       0x0032 100 100 000 Old_age  Always  - 0
> 184 End-to-End_Error        0x0033 100 100 000 Pre-fail Always  - 0
> 187 Reported_Uncorrect      0x0032 100 100 000 Old_age  Always  - 0
> 188 Command_Timeout         0x0032 100 100 000 Old_age  Always  - 0
> 190 Airflow_Temperature_Cel 0x0022 078 069 000 Old_age  Always  - 22 (Min/Max 21/25)
> 194 Temperature_Celsius     0x0022 077 067 000 Old_age  Always  - 23 (Min/Max 21/26)
> 195 Hardware_ECC_Recovered  0x001a 100 100 000 Old_age  Always  - 1259614646
> 196 Reallocated_Event_Count 0x0032 096 096 000 Old_age  Always  - 166
> 197 Current_Pending_Sector  0x0012 100 100 000 Old_age  Always  - 0
> 198 Offline_Uncorrectable   0x0030 100 100 000 Old_age  Offline - 0
> 199 UDMA_CRC_Error_Count    0x003e 100 100 000 Old_age  Always  - 0
> 200 Multi_Zone_Error_Rate   0x000a 100 099 000 Old_age  Always  - 5
> 201 Soft_Read_Error_Rate    0x000a 100 100 000 Old_age  Always  - 0
>
> Reallocated_Sector_Ct did not increase during the last days.
>
>> Could you do gstat with a 1-second interval.
>> Some of the 5-second samples show that ada8 is the bottleneck -- it has its
>> request queue full (L(q)=10) when all other drives were done with their
>> jobs. And that's a 5-sec average. Its write service time also seems to be
>> a lot higher than for other drives.
> Attached. I have replaced ada8 with ada9, which is a Western Digital
> Caviar Black.
>
> Now ada0 and ada4 seem to be the bottleneck.
>
> But I don't understand the intervals without any disk activity.
>
>> Does the drive have its write cache disabled by any chance? That could
>> explain why it takes so much longer to service writes.
> No, camcontrol identify says it's enabled.
>
>> Can you remove ada8 and see if your performance goes back to normal?
> The problem still persists.
>
> Thank you for your help!
>
> Nicolas
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Could it be that something else is occupying the pool? I had to disable a security check run by periodic:

daily_status_security_neggrpperm_enable="NO"

After I disabled that check, my pool was performing normally again. If you do not have many snapshots it is no problem, but with a lot of snapshots this check stalls the pool.
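The bottleneck hunting Artem describes above (a drive with a persistently full request queue or an outsized %busy while its peers sit idle) can be mechanized with a small filter over gstat's plain-text rows. This is an illustrative sketch only, not something from the thread: the function name `find_bottlenecks` and the thresholds are invented for the example.

```python
# Hypothetical helper: scan gstat-style rows and flag devices whose
# queue depth (L(q)) or %busy suggests saturation, the way ada0/ada4
# stand out in the attached gstat.txt. Thresholds are arbitrary.

def find_bottlenecks(sample_lines, max_queue=8, max_busy=40.0):
    """Return device names whose L(q) or %busy exceeds the thresholds.

    Each data line is expected to look like a gstat row:
    L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
    """
    hot = []
    for line in sample_lines:
        fields = line.split()
        if len(fields) != 10:
            continue  # skip malformed rows
        try:
            lq = int(fields[0])       # queue length
            busy = float(fields[8])   # %busy
        except ValueError:
            continue  # header line ("L(q) ops/s ...") lands here
        if lq >= max_queue or busy >= max_busy:
            hot.append(fields[9])     # device name
    return hot

# Rows taken from one of the heavy samples in the attachment:
sample = [
    "L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name",
    "10 842 0 0 0.0 842 714 8.1 80.0 ada0",
    "0 872 42 85 8.3 830 715 2.6 42.4 ada1",
    "0 943 0 0 0.0 943 764 1.3 18.3 ada2",
    "10 815 0 0 0.0 815 700 7.8 73.2 ada4",
]
print(find_bottlenecks(sample))  # ['ada0', 'ada1', 'ada4']
```

ada0 and ada4 trip the queue-depth test (L(q)=10) and ada1 the %busy test, matching the drives Artem singles out by eye.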
gr
Johan

From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 11:06:46 2013
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 56FFA48E for ; Mon, 14 Jan 2013 11:06:46 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 3ACD463A for ; Mon, 14 Jan 2013 11:06:46 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r0EB6k3w086371 for ; Mon, 14 Jan 2013 11:06:46 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r0EB6jBM086369 for freebsd-fs@FreeBSD.org; Mon, 14 Jan 2013 11:06:45 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 14 Jan 2013 11:06:45 GMT
Message-Id: <201301141106.r0EB6jBM086369@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 14 Jan 2013 11:06:46 -0000

Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker     Resp. Description
--------------------------------------------------------------------------------
o kern/175179 fs [zfs] ZFS may attach wrong device on move
o kern/175101 fs [zfs] [nfs] ZFS NFSv4 ACL's allows user without perm t
o kern/175071 fs [ufs] [panic] softdep_deallocate_dependencies: unrecov
o kern/174950 fs [zfs] delete ZFS ACL have no effect
o kern/174949 fs [zfs] ZFS ACL: rwxp required to mkdir. p should not be
o kern/174948 fs [zfs] owner@ always have ZFS ACL full permissions. Sho
o kern/174372 fs [zfs] Pagefault appears to be related to ZFS
o kern/174315 fs [zfs] chflags uchg not supported
o kern/174310 fs [zfs] root point mounting broken on CURRENT with multi
o kern/174279 fs [ufs] UFS2-SU+J journal and filesystem corruption
o kern/174060 fs [ext2fs] Ext2FS system crashes (buffer overflow?)
o kern/173830 fs [zfs] Brain-dead simple change to ZFS error descriptio
o kern/173718 fs [zfs] phantom directory in zraid2 pool
f kern/173657 fs [nfs] strange UID map with nfsuserd
o kern/173363 fs [zfs] [panic] Panic on 'zpool replace' on readonly poo
o kern/173136 fs [unionfs] mounting above the NFS read-only share panic
o kern/172348 fs [unionfs] umount -f of filesystem in use with readonly
o kern/172334 fs [unionfs] unionfs permits recursive union mounts; caus
o kern/171626 fs [tmpfs] tmpfs should be noisier when the requested siz
o kern/171415 fs [zfs] zfs recv fails with "cannot receive incremental
o kern/170945 fs [gpt] disk layout not portable between direct connect
o bin/170778  fs [zfs] [panic] FreeBSD panics randomly
o kern/170680 fs [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA
o kern/170497 fs [xfs][panic] kernel will panic whenever I ls a mounted
o kern/169945 fs [zfs] [panic] Kernel panic while importing zpool (afte
o kern/169480 fs [zfs] ZFS stalls on heavy I/O
o kern/169398 fs [zfs] Can't remove file with permanent error
o kern/169339 fs panic while " : > /etc/123"
o kern/169319 fs [zfs] zfs resilver can't complete
o kern/168947 fs [nfs] [zfs] .zfs/snapshot directory is messed up when
o kern/168942 fs [nfs] [hang] nfsd hangs after being restarted (not -HU
o kern/168158 fs [zfs] incorrect parsing of sharenfs options in zfs (fs
o kern/167979 fs [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste
o kern/167977 fs [smbfs] mount_smbfs results are differ when utf-8 or U
o kern/167688 fs [fusefs] Incorrect signal handling with direct_io
o kern/167685 fs [zfs] ZFS on USB drive prevents shutdown / reboot
o kern/167612 fs [portalfs] The portal file system gets stuck inside po
o kern/167272 fs [zfs] ZFS Disks reordering causes ZFS to pick the wron
o kern/167260 fs [msdosfs] msdosfs disk was mounted the second time whe
o kern/167109 fs [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene
o kern/167105 fs [nfs] mount_nfs can not handle source exports wiht mor
o kern/167067 fs [zfs] [panic] ZFS panics the server
o kern/167065 fs [zfs] boot fails when a spare is the boot disk
o kern/167048 fs [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF
o kern/166912 fs [ufs] [panic] Panic after converting Softupdates to jo
o kern/166851 fs [zfs] [hang] Copying directory from the mounted UFS di
o kern/166477 fs [nfs] NFS data corruption.
o kern/165950 fs [ffs] SU+J and fsck problem
o kern/165923 fs [nfs] Writing to NFS-backed mmapped files fails if flu
o kern/165521 fs [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31
o kern/165392 fs Multiple mkdir/rmdir fails with errno 31
o kern/165087 fs [unionfs] lock violation in unionfs
o kern/164472 fs [ufs] fsck -B panics on particular data inconsistency
o kern/164370 fs [zfs] zfs destroy for snapshot fails on i386 and sparc
o kern/164261 fs [nullfs] [patch] fix panic with NFS served from NULLFS
o kern/164256 fs [zfs] device entry for volume is not created after zfs
o kern/164184 fs [ufs] [panic] Kernel panic with ufs_makeinode
o kern/163801 fs [md] [request] allow mfsBSD legacy installed in 'swap'
o kern/163770 fs [zfs] [hang] LOR between zfs&syncer + vnlru leading to
o kern/163501 fs [nfs] NFS exporting a dir and a subdir in that dir to
o kern/162944 fs [coda] Coda file system module looks broken in 9.0
o kern/162860 fs [zfs] Cannot share ZFS filesystem to hosts with a hyph
o kern/162751 fs [zfs] [panic] kernel panics during file operations
o kern/162591 fs [nullfs] cross-filesystem nullfs does not work as expe
o kern/162519 fs [zfs] "zpool import" relies on buggy realpath() behavi
o kern/162362 fs [snapshots] [panic] ufs with snapshot(s) panics when g
o kern/161968 fs [zfs] [hang] renaming snapshot with -r including a zvo
o kern/161864 fs [ufs] removing journaling from UFS partition fails on
o bin/161807  fs [patch] add option for explicitly specifying metadata
o kern/161579 fs [smbfs] FreeBSD sometimes panics when an smb share is
o kern/161533 fs [zfs] [panic] zfs receive panic: system ioctl returnin
o kern/161438 fs [zfs] [panic] recursed on non-recursive spa_namespace_
o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou
o kern/161280 fs [zfs] Stack overflow in gptzfsboot
o kern/161205 fs [nfs] [pfsync] [regression] [build] Bug report freebsd
o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty
o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3
o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic
o kern/160860 fs [ufs] Random UFS root filesystem corruption with SU+J
o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o
o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE
o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo
o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists
o kern/160591 fs [zfs] Fail to boot on zfs root with degraded raidz2 [r
o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil
o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha
o kern/159930 fs [ufs] [panic] kernel core
o kern/159402 fs [zfs][loader] symlinks cause I/O errors
o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by-
o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s
o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs()
o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option
o kern/159077 fs [zfs] Can't cd .. with latest zfs version
o kern/159048 fs [smbfs] smb mount corrupts large files
o kern/159045 fs [zfs] [hang] ZFS scrub freezes system
o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk
o kern/158802 fs amd(8) ICMP storm and unkillable process.
o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o
f kern/157929 fs [nfs] NFS slow read
o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip
o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov
o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and
o kern/156781 fs [zfs] zfs is losing the snapshot directory,
p kern/156545 fs [ufs] mv could break UFS on SMP systems
o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes
o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re
o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current
o kern/155587 fs [zfs] [panic] kernel panic with zfs
p kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No
o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104  fs [zfs][patch] use /dev prefix by default when importing
o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828 fs [msdosfs] Unable to create directories on external USB
o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1
p kern/154228 fs [md] md getting stuck in wdrain state
o kern/153996 fs [zfs] zfs root mount error while kernel is not located
o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716 fs [zfs] zpool scrub time remaining is incorrect
o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector
o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions
o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351 fs [zfs] locking directories/files in ZFS
o bin/153258  fs [patch][zfs] creating ZVOLs requires `refreservation'
s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w
o bin/153142  fs [zfs] ls -l outputs `ls: ./.zfs: Operation not support
o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small
o kern/152022 fs [nfs] nfs service hangs with linux client [regression]
o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory
o kern/151905 fs [zfs] page fault under load in /sbin/zfs
o bin/151713  fs [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648 fs [zfs] disk wait bug
o kern/151629 fs [fs] [patch] Skip empty directory entries during name
o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a
o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate
o kern/151251 fs [ufs] Can not create files on filesystem with heavy us
o kern/151226 fs [zfs] can't delete zfs snapshot
o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n
o kern/149208 fs mksnap_ffs(8) hang/deadlock
o kern/149173 fs [patch] [zfs] make OpenSolaris installa
o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities
o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE
o kern/148138 fs [zfs] zfs raidz pool commands freeze
o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt
o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786 fs [zfs] zpool import hangs with checksum errors
o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528 fs [zfs] Severe memory leak in ZFS on i386
o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an
f bin/145309  fs bsdlabel: Editing disk label invalidates the whole dev
o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank
o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189 fs [nfs] nfsd performs abysmally under load
o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416 fs [panic] Kernel panic on online filesystem optimization
s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825 fs [nfs] [panic] Kernel panic on NFS client
o bin/143572  fs [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212 fs [nfs] NFSv4 client strange work ...
o kern/143184 fs [zfs] [lor] zfs/bufwait LOR
o kern/142878 fs [zfs] [vfs] lock order reversal
o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real
o kern/142489 fs [zfs] [lor] allproc/zfs LOR
o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068 fs [ufs] BSD labels are got deleted spontaneously
o kern/141897 fs [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri
o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640 fs [zfs] snapshot crash
o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs
p bin/139651  fs [nfs] mount(8): read-only remount of NFS volume does n
o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot
o kern/138662 fs [panic] ffs_blkfree: freeing free block
o kern/138421 fs [ufs] [patch] remove UFS label limitations
o kern/138202 fs mount_msdosfs(1) see only 2Gb
o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873 fs [ntfs] Missing directories/files on NTFS volume
o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic
p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS
o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot
o kern/134491 fs [zfs] Hot spares are rather cold...
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis p kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style 
changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118318 fs [nfs] NFS server hangs under special circumstances o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 
fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 300 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 19:13:42 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 48135DA5 for ; Mon, 14 Jan 2013 19:13:42 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vb0-f47.google.com (mail-vb0-f47.google.com [209.85.212.47]) by mx1.freebsd.org (Postfix) with ESMTP id F04751C9 for ; Mon, 14 Jan 2013 19:13:41 +0000 (UTC) Received: by mail-vb0-f47.google.com with SMTP id e21so3912475vbm.34 for ; Mon, 14 Jan 2013 11:13:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=pZo+nyGb8Coy3KWiu9wDGHcznmwlSyv8uWLJMzxP8mo=; b=mq5NNJ+pWNc2HMqdufIE0bqP3uMKcLUCoBhDsyVVTRtmGDAGGj2ru4eE7C1unorfRk KX4G1HuZPMqgtpPW79SfU0VEqxUpZgN9Dc1nTkLGh2cb8KcfhsKf90v5wA6Z7WvtOYSY VKJ1E1j8dKVHLuUIkEZyV0WHiS1LqqHlsXJ8jEBa9L1ivNVWIfvH4wMpLKeG08ZOeg8D Wi+qiRoqpKTuJk/DtXQSD0fCDiInQc4w2Vqy8GR51TmBltQVOPu3onf/1KZxRnSnVXUb A8lmkUUTSp2gzCi1pba/KKWHav6IQAzZ7LIWfVMZwK0zwOeQaC9VAiErATYYgmO+blNV BKxw== MIME-Version: 1.0 Received: by 10.52.180.200 with SMTP id dq8mr89384491vdc.71.1358190820894; Mon, 14 Jan 2013 11:13:40 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.122.196 with HTTP; Mon, 14 Jan 2013 11:13:40 -0800 (PST) In-Reply-To: <20130114094010.GA75529@mid.pc5.i.0x5.de> References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> Date: Mon, 14 Jan 2013 11:13:40 -0800 X-Google-Sender-Auth: wj3keMDjo9kBGkdBzj1W7RwB6V0 Message-ID: Subject: Re: slowdown of zfs (tx->tx) From: Artem Belevich To: Nicolas Rachinsky Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Jan 2013 19:13:42 -0000

On Mon, Jan 14, 2013 at 1:40 AM, Nicolas Rachinsky wrote:
> 5 Reallocated_Sector_Ct 0x0033 094 094 010 Pre-fail Always - 166
> 195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 1259614646
> 196 Reallocated_Event_Count 0x0032 096 096 000 Old_age Always - 166

> Reallocated_Sector_Ct did not increase during the last days.

It does not matter IMHO. That hard drive has already accumulated quite a few
bad sectors that ECC could not deal with. There are apparently more
marginally bad sectors, but ECC deals with them for now. Once enough
bits rot, you'll get more bad sectors. I personally would replace the
drive.

>> Cound you do gstat with 1-second interval. Some of the 5-second
>> samples show that ada8 is the bottleneck -- it has its request queue
>> full (L(q)=10) when all other drives were done with their jobs. And
>> that's a 5-sec average. Its write service time also seems to be a lot
>> higher than for other drives.
>
> Attached. I have replaced ada8 by ada9, which is a Western Digital
> Caviar Black.
>
> Now ada0 and ada4 seem to be the bottleneck.
>
> But I don't understand the intervals without any disk activity.

It is puzzling. Is rsync still sleeping in tx->tx state? Try running
"procstat -kk <pid>" periodically. It will print an in-kernel stack
trace and may help give a clue where/why rsync is stuck.
--Artem From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 19:37:16 2013 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C2DCD731; Mon, 14 Jan 2013 19:37:16 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 0A59E34F; Mon, 14 Jan 2013 19:37:16 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id AFD43B95B; Mon, 14 Jan 2013 14:37:14 -0500 (EST) From: John Baldwin To: fs@freebsd.org Subject: [PATCH] Properly handle signals on interruptible NFS mounts Date: Mon, 14 Jan 2013 14:37:04 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) MIME-Version: 1.0 Content-Type: Text/Plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <201301141437.05040.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 14 Jan 2013 14:37:14 -0500 (EST) Cc: Rick Macklem , Doug Rabson X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Jan 2013 19:37:16 -0000 When the new RPC layer was brought in, the RPC_INTR return value (to indicate an RPC request was interrupted by a signal) was not handled in the NFS client. As a result, if an NFS request is interrupted by a signal (on a mount with the "intr" option), then the nfs_request() functions would fall through to the default case and return EACCES rather than EINTR. While here, I noticed that the new RPC layer also lost all of the RPC statistics the old client used to keep (but that are still reported in 'nfsstat -c'). 
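The status-to-errno translation described above can be sketched as a small stand-alone function. The enum and function below are illustrative stand-ins mirroring the patched logic, not the kernel krpc API:

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-in for the krpc status codes; not the kernel enum. */
enum rpc_stat {
	RPC_SUCCESS,
	RPC_TIMEDOUT,
	RPC_VERSMISMATCH,
	RPC_PROGVERSMISMATCH,
	RPC_INTR,
	RPC_SOMEOTHERFAILURE
};

/* Translate an RPC status into an errno, as the patched client does. */
static int
stat_to_errno(enum rpc_stat stat)
{
	switch (stat) {
	case RPC_SUCCESS:
		return (0);
	case RPC_TIMEDOUT:
		return (ETIMEDOUT);
	case RPC_VERSMISMATCH:
		return (EOPNOTSUPP);
	case RPC_PROGVERSMISMATCH:
		return (EPROTONOSUPPORT);
	case RPC_INTR:
		/* The case the old code lost: request interrupted by a signal. */
		return (EINTR);
	default:
		/* Everything else still collapses to EACCES. */
		return (EACCES);
	}
}
```

Without the RPC_INTR arm, an interrupted request falls into the default case, which is exactly the EACCES-instead-of-EINTR symptom described above.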
I've added back as many of the statistics as I could, but retries are not
easy to do as only the RPC layer knows about them and not the NFS client.

Index: fs/nfs/nfs_commonkrpc.c
===================================================================
--- fs/nfs/nfs_commonkrpc.c	(revision 245225)
+++ fs/nfs/nfs_commonkrpc.c	(working copy)
@@ -767,12 +767,18 @@
 	if (stat == RPC_SUCCESS) {
 		error = 0;
 	} else if (stat == RPC_TIMEDOUT) {
+		NFSINCRGLOBAL(newnfsstats.rpctimeouts);
 		error = ETIMEDOUT;
 	} else if (stat == RPC_VERSMISMATCH) {
+		NFSINCRGLOBAL(newnfsstats.rpcinvalid);
 		error = EOPNOTSUPP;
 	} else if (stat == RPC_PROGVERSMISMATCH) {
+		NFSINCRGLOBAL(newnfsstats.rpcinvalid);
 		error = EPROTONOSUPPORT;
+	} else if (stat == RPC_INTR) {
+		error = EINTR;
 	} else {
+		NFSINCRGLOBAL(newnfsstats.rpcinvalid);
 		error = EACCES;
 	}
 	if (error) {
Index: nfsclient/nfs_krpc.c
===================================================================
--- nfsclient/nfs_krpc.c	(revision 245225)
+++ nfsclient/nfs_krpc.c	(working copy)
@@ -549,14 +549,21 @@
 	 */
 	if (stat == RPC_SUCCESS)
 		error = 0;
-	else if (stat == RPC_TIMEDOUT)
+	else if (stat == RPC_TIMEDOUT) {
+		nfsstats.rpctimeouts++;
 		error = ETIMEDOUT;
-	else if (stat == RPC_VERSMISMATCH)
+	} else if (stat == RPC_VERSMISMATCH) {
+		nfsstats.rpcinvalid++;
 		error = EOPNOTSUPP;
-	else if (stat == RPC_PROGVERSMISMATCH)
+	} else if (stat == RPC_PROGVERSMISMATCH) {
+		nfsstats.rpcinvalid++;
 		error = EPROTONOSUPPORT;
-	else
+	} else if (stat == RPC_INTR) {
+		error = EINTR;
+	} else {
+		nfsstats.rpcinvalid++;
 		error = EACCES;
+	}
 	if (error)
 		goto nfsmout;
@@ -572,6 +579,7 @@
 	if (error == ENOMEM) {
 		m_freem(mrep);
 		AUTH_DESTROY(auth);
+		nfsstats.rpcinvalid++;
 		return (error);
 	}

--
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 19:45:30 2013 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D0481C00; Mon, 14 Jan 2013 19:45:30 +0000 (UTC)
(envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id A4AB765E; Mon, 14 Jan 2013 19:45:30 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 1E09DB95E; Mon, 14 Jan 2013 14:45:30 -0500 (EST) From: John Baldwin To: fs@freebsd.org Subject: [PATCH] Better handle NULL utimes() in the NFS client Date: Mon, 14 Jan 2013 14:45:29 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) MIME-Version: 1.0 Content-Type: Text/Plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <201301141445.29260.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 14 Jan 2013 14:45:30 -0500 (EST) Cc: Rick Macklem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Jan 2013 19:45:30 -0000

The NFS client tries to infer when an application has passed NULL to utimes()
so that it can let the server set the timestamp rather than using a
client-supplied timestamp. It does this by checking to see if the desired
timestamp's second matches the current second. However, this breaks
applications that are intentionally trying to set a specific timestamp within
the current second. In addition, utimes() sets a flag to indicate if NULL was
passed to utimes().
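The difference between the two checks can be shown with a small user-space sketch. The struct and flag value below are simplified stand-ins for the kernel's struct vattr and VA_UTIMES_NULL, not the real definitions:

```c
#include <assert.h>
#include <time.h>

#define VA_UTIMES_NULL	0x01	/* illustrative value; mirrors the vnode flag */

struct vattr_sketch {
	time_t	va_mtime_sec;	/* requested mtime, seconds */
	int	va_vaflags;
};

/*
 * Old heuristic: assume utimes(path, NULL) was used whenever the
 * requested second happens to equal the current second.
 */
static int
use_server_time_old(const struct vattr_sketch *va, time_t now)
{
	return (va->va_mtime_sec == now);
}

/* Patched check: rely on the flag utimes() sets for a NULL times argument. */
static int
use_server_time_new(const struct vattr_sketch *va)
{
	return ((va->va_vaflags & VA_UTIMES_NULL) != 0);
}
```

An application that deliberately sets a timestamp falling within the current second is misclassified by the time-based heuristic but handled correctly by the flag check.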
The patch below changes the NFS client to check this flag and only use the
server-supplied time in that case:

Index: fs/nfsclient/nfs_clport.c
===================================================================
--- fs/nfsclient/nfs_clport.c	(revision 225511)
+++ fs/nfsclient/nfs_clport.c	(working copy)
@@ -762,7 +762,7 @@
 		*tl = newnfs_false;
 	}
 	if (vap->va_atime.tv_sec != VNOVAL) {
-		if (vap->va_atime.tv_sec != curtime.tv_sec) {
+		if (!(vap->va_vaflags & VA_UTIMES_NULL)) {
 			NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED);
 			*tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT);
 			txdr_nfsv3time(&vap->va_atime, tl);
@@ -775,7 +775,7 @@
 		*tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE);
 	}
 	if (vap->va_mtime.tv_sec != VNOVAL) {
-		if (vap->va_mtime.tv_sec != curtime.tv_sec) {
+		if (!(vap->va_vaflags & VA_UTIMES_NULL)) {
 			NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED);
 			*tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT);
 			txdr_nfsv3time(&vap->va_mtime, tl);
Index: nfsclient/nfs_subs.c
===================================================================
--- nfsclient/nfs_subs.c	(revision 225511)
+++ nfsclient/nfs_subs.c	(working copy)
@@ -1119,7 +1119,7 @@
 		*tl = nfs_false;
 	}
 	if (va->va_atime.tv_sec != VNOVAL) {
-		if (va->va_atime.tv_sec != time_second) {
+		if (!(vattr.va_vaflags & VA_UTIMES_NULL)) {
 			tl = nfsm_build_xx(3 * NFSX_UNSIGNED, mb, bpos);
 			*tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT);
 			txdr_nfsv3time(&va->va_atime, tl);
@@ -1132,7 +1132,7 @@
 		*tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE);
 	}
 	if (va->va_mtime.tv_sec != VNOVAL) {
-		if (va->va_mtime.tv_sec != time_second) {
+		if (!(vattr.va_vaflags & VA_UTIMES_NULL)) {
 			tl = nfsm_build_xx(3 * NFSX_UNSIGNED, mb, bpos);
 			*tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT);
 			txdr_nfsv3time(&va->va_mtime, tl);

--
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 19:51:50 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP
id B7CAE16B; Mon, 14 Jan 2013 19:51:50 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id E5A186FB; Mon, 14 Jan 2013 19:51:49 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3YlQLD53Bzz7ySc; Mon, 14 Jan 2013 20:51:48 +0100 (CET) Date: Mon, 14 Jan 2013 20:51:48 +0100 From: Nicolas Rachinsky To: Artem Belevich Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130114195148.GA20540@mid.pc5.i.0x5.de> References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Jan 2013 19:51:50 -0000 * Artem Belevich [2013-01-14 11:13 -0800]: > On Mon, Jan 14, 2013 at 1:40 AM, Nicolas Rachinsky > wrote: > > 5 Reallocated_Sector_Ct 0x0033 094 094 010 Pre-fail Always - 166 > > 195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 1259614646 > > 196 Reallocated_Event_Count 0x0032 096 096 000 Old_age Always - 166 > > > Reallocated_Sector_Ct did not increase during the last days. > > It does not matter IMHO. That hard drive already got quite a few bad > sectors that ECC could not deal with. There are apparently more > marginally bad sectors, but ECC deals with it for now. Once enough > bits rot, you'll get more bad sectors. 
I personally would replace the > drive. Yes, I'll do that. > >> Cound you do gstat with 1-second interval. Some of the 5-second > >> samples show that ada8 is the bottleneck -- it has its request queue > >> full (L(q)=10) when all other drives were done with their jobs. And > >> that's a 5-sec average. Its write service time also seems to be a lot > >> higher than for other drives. > > > > Attached. I have replace ada8 by ada9, which is a Western Digital > > Caviar Black. > > > > Now ada0 and ada4 seem to be the bottleneck. > > > > But I don't understand the intervalls without any disk activity. > > It is puzzling. Is rsync still sleeping in tx->tx state? Try running > "procstat -kk " periodically. It will print in-kernel stack > trace and may help giving a clue where/why rsync is stuck. # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 1; done' | sort | uniq -c 100 PID TID COMM TDNAME KSTACK 1 36639 100574 rsync - 99 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.36; done' | sort | uniq -c 100 PID TID COMM TDNAME KSTACK 1 36639 100574 rsync - mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 txg_delay+0x137 dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_write+0x38a VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc 99 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.1; done' | sort | uniq -c 100 PID TID COMM TDNAME KSTACK 100 36639 100574 rsync - 
mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc Thanks in advance Nicolas -- http://www.rachinsky.de/nicolas From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 20:41:03 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 73244162 for ; Mon, 14 Jan 2013 20:41:03 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vb0-f51.google.com (mail-vb0-f51.google.com [209.85.212.51]) by mx1.freebsd.org (Postfix) with ESMTP id 1772D92F for ; Mon, 14 Jan 2013 20:41:02 +0000 (UTC) Received: by mail-vb0-f51.google.com with SMTP id fq11so4047416vbb.10 for ; Mon, 14 Jan 2013 12:41:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=XQIM1nDAEmRY7MDfEhauYCWiozqZHx65EbRdswFhdJE=; b=wK7sZLleTAtTH9cg7amqFyKXAFSxIbuYUvjARznHHQ81hypaT8UfmmToSCsNi4BqXr r14C9VrhOJLPqjpc34CnRo0e1ujXP1lxgHF2R9oQvLP4wBDFOFgdYV1+hyRedEmp4DV9 5vPVIIesiLyhe3DySZnUP2uS5oMAF5M5YdfjM0hLiMxoBCufSATVDbSD2V3u+geBz4iT 8FvEaohBpAyaAvQGQec2PDi+hsovSb3uRLsdhqno9xMTx7jxVIDIwzI9GL1Dg0PWM0jI 2dWi8k/juPcZx9jBfhOSTzsH7U5sfznghfIHtrYzAqCSKXuMM40h0QfKiwWPf2fiMgYe r/ig== MIME-Version: 1.0 Received: by 10.52.156.40 with SMTP id wb8mr90499872vdb.39.1358196062075; Mon, 14 Jan 2013 12:41:02 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.122.196 with HTTP; Mon, 14 Jan 2013 12:41:01 -0800 (PST) In-Reply-To: <20130114195148.GA20540@mid.pc5.i.0x5.de> References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> 
<20130114195148.GA20540@mid.pc5.i.0x5.de> Date: Mon, 14 Jan 2013 12:41:01 -0800 X-Google-Sender-Auth: g2rKpwHkwJdoYdnres2GBFw2hCM Message-ID: Subject: Re: slowdown of zfs (tx->tx) From: Artem Belevich To: Nicolas Rachinsky Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Jan 2013 20:41:03 -0000

txg_wait_open means that ZFS is waiting for ongoing transaction group
sync. There should've been some write activity in this case.

Check what zfs kernel threads are doing with procstat -kk on zfskern process.

--Artem

On Mon, Jan 14, 2013 at 11:51 AM, Nicolas Rachinsky wrote:
> * Artem Belevich [2013-01-14 11:13 -0800]:
>> On Mon, Jan 14, 2013 at 1:40 AM, Nicolas Rachinsky
>> wrote:
>> > 5 Reallocated_Sector_Ct 0x0033 094 094 010 Pre-fail Always - 166
>> > 195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 1259614646
>> > 196 Reallocated_Event_Count 0x0032 096 096 000 Old_age Always - 166
>>
>> > Reallocated_Sector_Ct did not increase during the last days.
>>
>> It does not matter IMHO. That hard drive already got quite a few bad
>> sectors that ECC could not deal with. There are apparently more
>> marginally bad sectors, but ECC deals with it for now. Once enough
>> bits rot, you'll get more bad sectors. I personally would replace the
>> drive.
>
> Yes, I'll do that.
>
>> >> Cound you do gstat with 1-second interval. Some of the 5-second
>> >> samples show that ada8 is the bottleneck -- it has its request queue
>> >> full (L(q)=10) when all other drives were done with their jobs. And
>> >> that's a 5-sec average. Its write service time also seems to be a lot
>> >> higher than for other drives.
>> >
>> > Attached. I have replace ada8 by ada9, which is a Western Digital
>> > Caviar Black.
>> >
>> > Now ada0 and ada4 seem to be the bottleneck.
>> >
>> > But I don't understand the intervalls without any disk activity.
>>
>> It is puzzling. Is rsync still sleeping in tx->tx state? Try running
>> "procstat -kk <pid>" periodically. It will print in-kernel stack
>> trace and may help giving a clue where/why rsync is stuck.
>
> # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 1; done' | sort | uniq -c
> 100 PID TID COMM TDNAME KSTACK
> 1 36639 100574 rsync -
> 99 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc
>
> # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.36; done' | sort | uniq -c
> 100 PID TID COMM TDNAME KSTACK
> 1 36639 100574 rsync - mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 txg_delay+0x137 dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_write+0x38a VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc
> 99 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc
>
> # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.1; done' | sort | uniq -c
> 100 PID TID COMM TDNAME KSTACK
> 100 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc
>
> Thanks in advance
>
> Nicolas
>
> --
> http://www.rachinsky.de/nicolas

From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 21:46:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from
mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9F9FEF8D; Mon, 14 Jan 2013 21:46:54 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id 9F719D50; Mon, 14 Jan 2013 21:46:53 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3YlSv031mZz7ySG; Mon, 14 Jan 2013 22:46:52 +0100 (CET) Date: Mon, 14 Jan 2013 22:46:52 +0100 From: Nicolas Rachinsky To: Artem Belevich Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130114214652.GA76779@mid.pc5.i.0x5.de> References: <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Jan 2013 21:46:54 -0000 * Artem Belevich [2013-01-14 12:41 -0800]: > txg_wait_open means that ZFS is waiting for ongoing transaction group > sync. There should've been some write activity in this case. > > Check what zfs kernel threads are doing with procstat -kk on zfskern process. 
# sh -c 'for i in `jot 1000`; do procstat -kk 47 ; sleep 0.1; done' | sort | uniq -c 1000 47 100083 zfskern arc_reclaim_thre mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 arc_reclaim_thread+0x29d fork_exit+0x11f fork_trampoline+0xe 1000 47 100084 zfskern l2arc_feed_threa mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 l2arc_feed_thread+0x1a8 fork_exit+0x11f fork_trampoline+0xe 1000 47 100224 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x11f fork_trampoline+0xe 165 47 100225 zfskern txg_thread_enter 1 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x5e5 dbuf_findbp+0x107 dbuf_prefetch+0x8f dmu_zfetch_dofetch+0x10b dmu_zfetch+0xaf8 dbuf_read+0x675 dnode_hold_impl+0xf2 dmu_buf_hold_array+0x38 dmu_write+0x53 space_map_sync+0x1ff metaslab_sync+0x13e vdev_sync+0x6e spa_sync+0x3ab txg_sync_thread+0x139 1 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x5e5 dbuf_will_dirty+0x60 dmu_write+0x82 space_map_sync+0x1ff metaslab_sync+0x13e vdev_sync+0x6e spa_sync+0x3ab txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe 1 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dsl_pool_sync+0x189 spa_sync+0x336 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe 81 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dsl_pool_sync+0x2c3 spa_sync+0x336 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe 719 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dsl_pool_sync+0xe0 spa_sync+0x336 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe 4 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 spa_sync+0x286 txg_sync_thread+0x139 fork_exit+0x11f 
fork_trampoline+0xe 2 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 spa_sync+0x370 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe 21 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 vdev_config_sync+0xe3 spa_sync+0x49a txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe 5 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 vdev_uberblock_sync_list+0xd0 vdev_config_sync+0x10f spa_sync+0x49a txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe 1000 PID TID COMM TDNAME KSTACK Thanks Nicolas -- http://www.rachinsky.de/nicolas From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 00:51:00 2013 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 233E5EAA; Tue, 15 Jan 2013 00:51:00 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 9861E8BC; Tue, 15 Jan 2013 00:50:59 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEACSn9FCDaFvO/2dsb2JhbABEhjq3WHOCHgEBAQQBAQEgKyALGw4KAgINGQIpAQkmBggHBAEcBId4DKUikFqBI4tjgxWBEwOIYYp8gi6BHI8tgxOBUTU X-IronPort-AV: E=Sophos;i="4.84,469,1355115600"; d="scan'208";a="11900158" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 14 Jan 2013 19:50:52 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 9DBE8B3F16; Mon, 14 Jan 2013 19:50:52 -0500 (EST) Date: Mon, 14 Jan 2013 19:50:52 -0500 (EST) From: Rick Macklem To: John Baldwin Message-ID: <21875538.1984621.1358211052621.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <201301141437.05040.jhb@freebsd.org> Subject: Re: [PATCH] 
Properly handle signals on interruptible NFS mounts MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Rick Macklem , Doug Rabson , fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 00:51:00 -0000 John Baldwin wrote: > When the new RPC layer was brought in, the RPC_INTR return value (to > indicate > an RPC request was interrupted by a signal) was not handled in the NFS > client. > As a result, if an NFS request is interrupted by a signal (on a mount > with the > "intr" option), then the nfs_request() functions would fall through to > the > default case and return EACCES rather than EINTR. While here, I > noticed that > the new RPC layer also lost all of the RPC statistics the old client > used to > keep (but that are still reported in 'nfsstat -c'). I've added back as > many > of the statistics as I could, but retries are not easy to do as only > the RPC > layer knows about them and not the NFS client. 
> > Index: fs/nfs/nfs_commonkrpc.c > =================================================================== > --- fs/nfs/nfs_commonkrpc.c (revision 245225) > +++ fs/nfs/nfs_commonkrpc.c (working copy) > @@ -767,12 +767,18 @@ > if (stat == RPC_SUCCESS) { > error = 0; > } else if (stat == RPC_TIMEDOUT) { > + NFSINCRGLOBAL(newnfsstats.rpctimeouts); > error = ETIMEDOUT; > } else if (stat == RPC_VERSMISMATCH) { > + NFSINCRGLOBAL(newnfsstats.rpcinvalid); > error = EOPNOTSUPP; > } else if (stat == RPC_PROGVERSMISMATCH) { > + NFSINCRGLOBAL(newnfsstats.rpcinvalid); > error = EPROTONOSUPPORT; > + } else if (stat == RPC_INTR) { > + error = EINTR; > } else { > + NFSINCRGLOBAL(newnfsstats.rpcinvalid); > error = EACCES; > } > if (error) { > Index: nfsclient/nfs_krpc.c > =================================================================== > --- nfsclient/nfs_krpc.c (revision 245225) > +++ nfsclient/nfs_krpc.c (working copy) > @@ -549,14 +549,21 @@ > */ > if (stat == RPC_SUCCESS) > error = 0; > - else if (stat == RPC_TIMEDOUT) > + else if (stat == RPC_TIMEDOUT) { > + nfsstats.rpctimeouts++; > error = ETIMEDOUT; > - else if (stat == RPC_VERSMISMATCH) > + } else if (stat == RPC_VERSMISMATCH) { > + nfsstats.rpcinvalid++; > error = EOPNOTSUPP; > - else if (stat == RPC_PROGVERSMISMATCH) > + } else if (stat == RPC_PROGVERSMISMATCH) { > + nfsstats.rpcinvalid++; > error = EPROTONOSUPPORT; > - else > + } else if (stat == RPC_INTR) { > + error = EINTR; > + } else { > + nfsstats.rpcinvalid++; > error = EACCES; > + } > if (error) > goto nfsmout; > > @@ -572,6 +579,7 @@ > if (error == ENOMEM) { > m_freem(mrep); > AUTH_DESTROY(auth); > + nfsstats.rpcinvalid++; > return (error); > } > This patch looks fine to me, rick > > -- > John Baldwin > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 
01:22:34 2013 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D0F9F5C3; Tue, 15 Jan 2013 01:22:34 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 7672D9CB; Tue, 15 Jan 2013 01:22:33 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAOCu9FCDaFvO/2dsb2JhbABEhjqzZYN0c4IeAQEFIwRSGw4KAgINGQJZBogspS+QW4EjjniBEwOIYY0qkEmDE4IG X-IronPort-AV: E=Sophos;i="4.84,469,1355115600"; d="scan'208";a="9060144" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 14 Jan 2013 20:20:55 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id F246FB3F26; Mon, 14 Jan 2013 20:20:54 -0500 (EST) Date: Mon, 14 Jan 2013 20:20:54 -0500 (EST) From: Rick Macklem To: John Baldwin Message-ID: <162405990.1985479.1358212854967.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <201301141445.29260.jhb@freebsd.org> Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Rick Macklem , fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 01:22:34 -0000 John Baldwin wrote: > The NFS client tries to infer when an application has passed NULL to > utimes() > so that it can let the server set the timestamp rather than using a > client- > supplied timestamp. 
It does this by checking to see if the desired > timestamp's second matches the current second. However, this breaks > applications that are intentionally trying to set a specific timestamp > within > the current second. In addition, utimes() sets a flag to indicate if > NULL was > passed to utimes(). The patch below changes the NFS client to check > this flag > and only use the server-supplied time in that case: > > Index: fs/nfsclient/nfs_clport.c > =================================================================== > --- fs/nfsclient/nfs_clport.c (revision 225511) > +++ fs/nfsclient/nfs_clport.c (working copy) > @@ -762,7 +762,7 @@ > *tl = newnfs_false; > } > if (vap->va_atime.tv_sec != VNOVAL) { > - if (vap->va_atime.tv_sec != curtime.tv_sec) { > + if (!(vap->va_vaflags & VA_UTIMES_NULL)) { > NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); > *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT); > txdr_nfsv3time(&vap->va_atime, tl); > @@ -775,7 +775,7 @@ > *tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE); > } > if (vap->va_mtime.tv_sec != VNOVAL) { > - if (vap->va_mtime.tv_sec != curtime.tv_sec) { > + if (!(vap->va_vaflags & VA_UTIMES_NULL)) { > NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); > *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT); > txdr_nfsv3time(&vap->va_mtime, tl); > Index: nfsclient/nfs_subs.c > =================================================================== > --- nfsclient/nfs_subs.c (revision 225511) > +++ nfsclient/nfs_subs.c (working copy) > @@ -1119,7 +1119,7 @@ > *tl = nfs_false; > } > if (va->va_atime.tv_sec != VNOVAL) { > - if (va->va_atime.tv_sec != time_second) { > + if (!(vattr.va_vaflags & VA_UTIMES_NULL)) { > tl = nfsm_build_xx(3 * NFSX_UNSIGNED, mb, bpos); > *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT); > txdr_nfsv3time(&va->va_atime, tl); > @@ -1132,7 +1132,7 @@ > *tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE); > } > if (va->va_mtime.tv_sec != VNOVAL) { > - if (va->va_mtime.tv_sec != time_second) { > + if (!(vattr.va_vaflags & 
VA_UTIMES_NULL)) { > tl = nfsm_build_xx(3 * NFSX_UNSIGNED, mb, bpos); > *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT); > txdr_nfsv3time(&va->va_mtime, tl); > > -- > John Baldwin I think this patch is ok, too. In the old days, a lot of NFS servers only stored times at a resolution of 1sec, which I think is why the code had the habit of comparing "seconds equal". If there is some app. out there that sets "current time" via utimes(2) with a curent time argument instead of a NULL argument would seem to be broken to me. (It is conceivable that some app. did this to avoid clock skew between the client and server, but I doubt it.) Have fun with it, rick ps: If you were concerned that the change might break something that depended on the old behaviour, you could apply the patch to the new client only. Then switching to an "oldnfs" mount would provide the old "same sec->set time to current time on the server" behaviour. From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 01:37:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E113CA3E for ; Tue, 15 Jan 2013 01:37:31 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vc0-f172.google.com (mail-vc0-f172.google.com [209.85.220.172]) by mx1.freebsd.org (Postfix) with ESMTP id 902F7A54 for ; Tue, 15 Jan 2013 01:37:31 +0000 (UTC) Received: by mail-vc0-f172.google.com with SMTP id fw7so4210442vcb.31 for ; Mon, 14 Jan 2013 17:37:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=ERJqw0lCTNCaIJYj1D/i06oFPviy76eK8ykVyi7zL5g=; b=pCgjNYjD/Kek39O97JLEWbnDnHpxYik7sZXMyZYI7YqzZ0nZGOdFDvMZsTseoeYSKR mftYEdOeGZKTSTltO9nXsXhP5VlOVis02sFju7NttLm+Kmg3mdjWwhf233+GCiuMEjRk fgB5TGr9JBMcXj9Z++EJxpjKu+NjjuV/xdzOU9a7pxODIwqz/nIbfAi3SRPl1OfEQNuQ 
Dx6Nx6yUtmQCPxyCh8XnB3dRTnhWNiokp/MJoyIEAp5UD7OEbxFx6UPHvuauH41E+XZz BCG9e0941iRSaQ+7QTuu/kRn5+oAi0Qk6ovlUwvUnSRCdN5HIiOt9STnbeVf3M7ikrPG Otng== MIME-Version: 1.0 Received: by 10.52.180.200 with SMTP id dq8mr90147517vdc.71.1358213845191; Mon, 14 Jan 2013 17:37:25 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.122.196 with HTTP; Mon, 14 Jan 2013 17:37:25 -0800 (PST) In-Reply-To: <20130114214652.GA76779@mid.pc5.i.0x5.de> References: <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> Date: Mon, 14 Jan 2013 17:37:25 -0800 X-Google-Sender-Auth: ooDSOCBbgBRv9mcQWqd_AE-gd9U Message-ID: Subject: Re: slowdown of zfs (tx->tx) From: Artem Belevich To: Nicolas Rachinsky Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 01:37:31 -0000 On Mon, Jan 14, 2013 at 1:46 PM, Nicolas Rachinsky wrote: > * Artem Belevich [2013-01-14 12:41 -0800]: >> txg_wait_open means that ZFS is waiting for ongoing transaction group >> sync. There should've been some write activity in this case. >> >> Check what zfs kernel threads are doing with procstat -kk on zfskern pro= cess. 
> > # sh -c 'for i in `jot 1000`; do procstat -kk 47 ; sleep 0.1; done' | sort | uniq -c
> 1000 47 100083 zfskern arc_reclaim_thre mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 arc_reclaim_thread+0x29d fork_exit+0x11f fork_trampoline+0xe
> 1000 47 100084 zfskern l2arc_feed_threa mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 l2arc_feed_thread+0x1a8 fork_exit+0x11f fork_trampoline+0xe
> 1000 47 100224 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x11f fork_trampoline+0xe
> 165 47 100225 zfskern txg_thread_enter
> 1 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x5e5 dbuf_findbp+0x107 dbuf_prefetch+0x8f dmu_zfetch_dofetch+0x10b dmu_zfetch+0xaf8 dbuf_read+0x675 dnode_hold_impl+0xf2 dmu_buf_hold_array+0x38 dmu_write+0x53 space_map_sync+0x1ff metaslab_sync+0x13e vdev_sync+0x6e spa_sync+0x3ab txg_sync_thread+0x139
> 1 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x5e5 dbuf_will_dirty+0x60 dmu_write+0x82 space_map_sync+0x1ff metaslab_sync+0x13e vdev_sync+0x6e spa_sync+0x3ab txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 1 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dsl_pool_sync+0x189 spa_sync+0x336 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 81 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dsl_pool_sync+0x2c3 spa_sync+0x336 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 719 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dsl_pool_sync+0xe0 spa_sync+0x336 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 4 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61
spa_sync+0x286 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 2 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 spa_sync+0x370 txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 21 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 vdev_config_sync+0xe3 spa_sync+0x49a txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 5 47 100225 zfskern txg_thread_enter mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 vdev_uberblock_sync_list+0xd0 vdev_config_sync+0x10f spa_sync+0x49a txg_sync_thread+0x139 fork_exit+0x11f fork_trampoline+0xe
> 1000 PID TID COMM TDNAME KSTACK

OK, the threads responsible for transaction sync seem to be stuck in zio_wait, which is in turn waiting for some task thread to finish its work.

Now you need to figure out what those task threads are doing. 'procstat -kk 0' will dump a few hundred taskq threads, most of them zfs-related. On an idle box (8.3/amd64 in my case) most of them have the same stack trace, looking like this (modulo offsets):

mi_switch+0x196 sleepq_wait+0x42 _sleep+0x3c0 taskqueue_thread_loop+0xbe fork_exit+0x11f fork_trampoline+0xe

Look for stack traces that don't match that pattern.
--Artem From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 02:50:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0FEAF8F8 for ; Tue, 15 Jan 2013 02:50:10 +0000 (UTC) (envelope-from edward@gogrid.com) Received: from smtp1.servepath.com (smtp1.servepath.com [216.93.160.25]) by mx1.freebsd.org (Postfix) with ESMTP id F1894DBA for ; Tue, 15 Jan 2013 02:50:09 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=january; d=gogrid.com; h=Received:Message-ID:Date:From:User-Agent:MIME-Version:To:Subject:References:Content-Type:Content-Transfer-Encoding; b=g3CuU3DzA2u5FyFQ5EKi3d3IjijQUObTp6D3GOMTKXTJmi8/Jn+cUMUGYjdezKtgt2ZW04+Ec0T6po3mhq4U6aHPZLYzLKD6HVUjV5CQlTyXrn2f4NCwLXrFSDzY3g6p; Received: from [192.168.7.178] by smtp1.servepath.com with esmtp (Exim 4.68 (FreeBSD)) (envelope-from ) id 1Tuw3a-000GHb-F3 for freebsd-fs@freebsd.org; Mon, 14 Jan 2013 18:15:18 -0800 Message-ID: <50F4BBE7.7050207@gogrid.com> Date: Mon, 14 Jan 2013 18:16:07 -0800 From: Edward Xiao User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: HAST + ZFS self healing? Hot spares? 
References: 4DD5A1CF.70807@itassistans.se Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 02:50:10 -0000 From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 04:51:35 2013 Return-Path: Delivered-To: fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B0E841BD; Tue, 15 Jan 2013 04:51:35 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au [211.29.132.194]) by mx1.freebsd.org (Postfix) with ESMTP id 374117BE; Tue, 15 Jan 2013 04:51:34 +0000 (UTC) Received: from c211-30-173-106.carlnfd1.nsw.optusnet.com.au (c211-30-173-106.carlnfd1.nsw.optusnet.com.au [211.30.173.106]) by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r0F4pNaE013436 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 15 Jan 2013 15:51:25 +1100 Date: Tue, 15 Jan 2013 15:51:23 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Rick Macklem Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client In-Reply-To: <162405990.1985479.1358212854967.JavaMail.root@erie.cs.uoguelph.ca> Message-ID: <20130115141019.H1444@besplex.bde.org> References: <162405990.1985479.1358212854967.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=P/xiHV8u c=1 sm=1 a=S8Qr1IbAvFsA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=U1Z5fgpPGSMA:10 a=9QiI2z3JOZ09_-QNc5AA:9 a=CjuIK1q_8ugA:10 a=TEtd8y5WR3g2ypngnwZWYw==:117 Cc: Rick Macklem , fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 04:51:35 -0000 On Mon, 14 Jan 2013, Rick Macklem wrote: > John Baldwin wrote: >> The NFS client tries to infer when an application has passed NULL to >> utimes() >> so that it can let the server set the timestamp rather than using a >> client- >> supplied timestamp. It does this by checking to see if the desired >> timestamp's second matches the current second. However, this breaks >> applications that are intentionally trying to set a specific timestamp >> within >> the current second. In addition, utimes() sets a flag to indicate if >> NULL was >> passed to utimes(). The patch below changes the NFS client to check >> this flag >> and only use the server-supplied time in that case: It is certainly an error to not check VA_UTIMES_NULL at all. I think the flag (or the NULL pointer) cannot be passed to the server, so the best we can do for the VA_UTIMES_NULL case is read the current time on the client and pass it to the server. Upper layers have already read the current time, but have passed us VA_UTIMES_NULL so that we can tell that the pointer was originally null so that we can do the different permissions checks for this case. >> Index: fs/nfsclient/nfs_clport.c >> =================================================================== >> --- fs/nfsclient/nfs_clport.c (revision 225511) >> +++ fs/nfsclient/nfs_clport.c (working copy) >> @@ -762,7 +762,7 @@ >> *tl = newnfs_false; >> } >> if (vap->va_atime.tv_sec != VNOVAL) { >> - if (vap->va_atime.tv_sec != curtime.tv_sec) { >> + if (!(vap->va_vaflags & VA_UTIMES_NULL)) { >> NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); >> *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT); >> txdr_nfsv3time(&vap->va_atime, tl); >> @@ -775,7 +775,7 @@ >> *tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE); >> ... Something mangled the patch so that it is hard to see what it does. It just uses the flag instead of guessing. 
I can't see anything that does the different permissions check for the VA_UTIMES_NULL case, and testing shows that this case is just broken, at least for an old version of the old nfs client -- the same permissions are required for all cases, but write permission is supposed to be enough for the VA_UTIMES_NULL case (since write permission is sufficient for setting the mtime to the current time (plus epsilon) using write(2) and truncate(2). Setting the atime to the current time should require no more and no less than read permission, since it can be done using read(2), but utimes(NULL) requires write permission for that too). > In the old days, a lot of NFS servers only stored times at a > resolution of 1sec, which I think is why the code had the habit > of comparing "seconds equal". I think this is not the reason for the check here. > If there is some app. out there > that sets "current time" via utimes(2) with a curent time argument > instead of a NULL argument would seem to be broken to me. > (It is conceivable that some app. did this to avoid clock > skew between the client and server, but I doubt it.) Apps have no alternative to using the NULL arg if they have write permission to the file but don't own it. Oops, on looking at the code I now think it _is_ possible to pass the request to set the current time on the server, since in the NFSV3SATTRTIME_TOSERVER case we just pass this case value and not any time value to the server, so the server has no option but to use its current time. It is not surprising that the permissions checks for this don't work right. I thought that the client was responsible for most permissions checks, but can't find many or the relevant one here. The NFSV3SATTRTIME_TOSERVER code on the server sets VA_UTIMES_NULL, so I would have thought that the permissions check on the server does the right thing. 
There are some large timestamping bugs nearby:

- the old nfs server code for NFSV3SATTRTIME_TOSERVER uses getnanotime() to read the current time. This violates the system's policy set by the vfs.timestamp_precision sysctl in most cases, since using getnanotime() is the worst supported policy and is not the default. The old nfs client uses the correct function to read the current time, vfs_timestamp(), in nfs_create(), but this is the only use of vfs_timestamp() in old nfs code. I think most cases use the server time and thus use the correct function iff the leaf server file system uses the correct function.

- the new nfs server code for NFSV3SATTRTIME_TOSERVER macro-izes all reads of the current time except one as NFSGETTIME(). This uses getmicrotime(), so it violates the system's policy in all cases, since using getmicrotime() is not a supported policy (using microtime() is supported). The one exception is a hard-coded getmicrotime() in fs/nfsclient/nfs_clport.c whose use is visible in the above patch. This one really didn't matter, because only the seconds part of curtime was used. It was just a micro-pessimization and style bug. The (not quite) correct way to get the seconds part is to use time_second, as is done in the old nfs client. (This way is not quite correct because there are some races and non-monotonicities reading the times. In the above check, vap->va_atime.tv_sec might have been read by a more precise clock than curtime.tv_sec. Then the check might give a false positive or negative. But the check is only a heuristic, and is inherently racy, so this doesn't really matter.) With the above patch the check becomes a different pessimization and style bug: the curtime variable becomes unused except for its incorrect initialization.

New nfs code never uses the correct function vfs_timestamp(). Following the system policy for file timestamps causes some problems for utimes(NULL) too. Old versions hard-coded microtime(). Current versions use vfs_timestamp().
The latter is better, but tends to give different results than utimes(non_NULL), since few or no applications know anything about the system's policy. touch(1) probably should know, but doesn't. So the simple "touch foo" gives various results, depending:

- touch(1) starts with gettimeofday(). This gives microseconds resolution and usually microseconds accuracy if its result is used.
- touch then tries utimes(non_NULL) with the current time that it just read. This usually works, giving microseconds resolution, etc. This is OK, but often different from the system policy.
- touch then tries utimes(NULL). If this works, then it follows the system policy.

Another problem is that not all file systems support nanoseconds resolution, so not all system policies or utimes() requests can be honored. I would usually prefer the system's policy to be enforced as far as possible. Thus if the system's policy is microseconds resolution, then times with nanoseconds resolution should be rounded down to the nearest microsecond. This case is most useful since utimes() cannot preserve times with more than microseconds resolution. Utilities like cp(1) blindly round the times given in nanoseconds by stat(2) to ones that can be written by utimes(2), so this often happens in an uncontrollable way anyway (POSIX is finally getting around to specifying permissible errors for unrepresentable resolutions). But sometimes I want utimes() to preserve times as well as possible.
Bruce From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 12:52:59 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D94589A for ; Tue, 15 Jan 2013 12:52:59 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com [209.85.216.48]) by mx1.freebsd.org (Postfix) with ESMTP id 8A304655 for ; Tue, 15 Jan 2013 12:52:59 +0000 (UTC) Received: by mail-qa0-f48.google.com with SMTP id l8so121903qaq.7 for ; Tue, 15 Jan 2013 04:52:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=MUVcmjAyXuMFqtoC/Sb1vGEv1i/1T1T7E8U0HKFX6FE=; b=c85W1MlFJTEVEHyuqaZegPM0sVc6hJ30z89gQBMa8SB6vHLdVY7MAL++yYhTbSdFvC cVEt/IGpfBYG6Ocelzlwau6F/AXRdfVuv+x22jxtF99qwM/1uqWZ4212zWUxGvU0hvN8 U5mCqVKSPuLm2gqu1F+1HsKe7TTO2JzVbGIBQkwS6h8DRH4FBuPS+us9qktaByh/AvCW ZVXZ/u2Mhih+zeuEBPvooztSw516UDZy9qj625xlyO05TTGYjtlLFPTv+MldYC776DIJ He0Hqzi6YoEky/iPy51c9WYEa8aPLHuMgnweVI752EIW2MhCs4OgPs/W9GGUiqN2rKn9 6XIw== MIME-Version: 1.0 Received: by 10.224.60.12 with SMTP id n12mr75306031qah.23.1358254378886; Tue, 15 Jan 2013 04:52:58 -0800 (PST) Received: by 10.229.78.96 with HTTP; Tue, 15 Jan 2013 04:52:58 -0800 (PST) Date: Tue, 15 Jan 2013 15:52:58 +0300 Message-ID: Subject: getcwd lies on/under nfs4-mounted zfs dataset From: Sergey Kandaurov To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 12:52:59 -0000 Hi. We stuck with the problem getting wrong current directory path when sitting on/under zfs dataset filesystem mounted over NFSv4. Both nfs server and client are 10.0-CURRENT from December or so. 
The component path "user3" unexpectedly appears to be "." (dot).

nfs-client:/home/user3 # pwd
/home/.
nfs-client:/home/user3/var/run # pwd
/home/./var/run
nfs-client:~ # procstat -f 3225
PID COMM FD T V FLAGS REF OFFSET PRO NAME
3225 a.out text v r r-------- - - - /home/./var/a.out
3225 a.out ctty v c rw------- - - - /dev/pts/2
3225 a.out cwd v d r-------- - - - /home/./var
3225 a.out root v d r-------- - - - /

The used setup follows.

1. NFS Server with local ZFS:
# cat /etc/exports
V4: / -sec=sys
# zfs list
pool1 10.4M 122G 580K /pool1
pool1/user3 on /pool1/user3 (zfs, NFS exported, local, nfsv4acls)
Exports list on localhost:
/pool1/user3 109.70.28.0
/pool1 109.70.28.0
# zfs get sharenfs pool1/user3
NAME PROPERTY VALUE SOURCE
pool1/user3 sharenfs -alldirs -maproot=root -network=109.70.28.0/24 local

2. pool1 is mounted on the NFSv4 client:
nfs-server:/pool1 on /home (nfs, noatime, nfsv4acls)

So on the NFS client the "pool1/user3" dataset comes up at /home/user3:
/ - ufs
/home - zpool-over-nfsv4
/home/user3 - zfs dataset "pool1/user3"

At the same time it works as expected when we're not on a zfs dataset, but directly on its parent zfs pool (also over NFSv4), e.g.
nfs-client:/home/non_dataset_dir # pwd /home/non_dataset_dir The ls command works as expected: nfs-client:/# ls -dl /home/user3/var/ drwxrwxrwt+ 6 root wheel 6 Jan 10 16:19 /home/user3/var/ -- wbr, pluknet From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 16:56:44 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 63E107E2; Tue, 15 Jan 2013 16:56:44 +0000 (UTC) (envelope-from trasz@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 240F2779; Tue, 15 Jan 2013 16:56:44 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r0FGuh3e048016; Tue, 15 Jan 2013 16:56:43 GMT (envelope-from trasz@freefall.freebsd.org) Received: (from trasz@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r0FGuh0C048012; Tue, 15 Jan 2013 16:56:43 GMT (envelope-from trasz) Date: Tue, 15 Jan 2013 16:56:43 GMT Message-Id: <201301151656.r0FGuh0C048012@freefall.freebsd.org> To: trasz@FreeBSD.org, freebsd-fs@FreeBSD.org, trasz@FreeBSD.org From: trasz@FreeBSD.org Subject: Re: kern/174948: [zfs] owner@ always have ZFS ACL full permissions. Should not be the case. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 16:56:44 -0000 Synopsis: [zfs] owner@ always have ZFS ACL full permissions. Should not be the case. Responsible-Changed-From-To: freebsd-fs->trasz Responsible-Changed-By: trasz Responsible-Changed-When: Tue Jan 15 16:56:43 UTC 2013 Responsible-Changed-Why: I'll take it. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=174948

From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 16:57:17 2013
Date: Tue, 15 Jan 2013 16:57:17 GMT
Message-Id: <201301151657.r0FGvHME048109@freefall.freebsd.org>
From: trasz@FreeBSD.org
To: trasz@FreeBSD.org, freebsd-fs@FreeBSD.org
Subject: Re: kern/174949: [zfs] ZFS ACL: rwxp required to mkdir. p should not be required.

Synopsis: [zfs] ZFS ACL: rwxp required to mkdir. p should not be required.

Responsible-Changed-From-To: freebsd-fs->trasz
Responsible-Changed-By: trasz
Responsible-Changed-When: Tue Jan 15 16:57:16 UTC 2013
Responsible-Changed-Why:
I'll take it.
http://www.freebsd.org/cgi/query-pr.cgi?pr=174949

From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 16:57:28 2013
Date: Tue, 15 Jan 2013 16:57:28 GMT
Message-Id: <201301151657.r0FGvSh8048203@freefall.freebsd.org>
From: trasz@FreeBSD.org
To: trasz@FreeBSD.org, freebsd-fs@FreeBSD.org
Subject: Re: kern/174950: [zfs] delete ZFS ACL have no effect

Synopsis: [zfs] delete ZFS ACL have no effect

Responsible-Changed-From-To: freebsd-fs->trasz
Responsible-Changed-By: trasz
Responsible-Changed-When: Tue Jan 15 16:57:28 UTC 2013
Responsible-Changed-Why:
I'll take it.
http://www.freebsd.org/cgi/query-pr.cgi?pr=174950

From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 17:01:05 2013
Date: Tue, 15 Jan 2013 17:01:05 GMT
Message-Id: <201301151701.r0FH15kx049842@freefall.freebsd.org>
From: trasz@FreeBSD.org
To: trasz@FreeBSD.org, freebsd-fs@FreeBSD.org
Subject: Re: kern/175101: [zfs] [nfs] ZFS NFSv4 ACL's allows user without perm to delete and update timestamp

Synopsis: [zfs] [nfs] ZFS NFSv4 ACL's allows user without perm to delete and update timestamp

Responsible-Changed-From-To: freebsd-fs->trasz
Responsible-Changed-By: trasz
Responsible-Changed-When: Tue Jan 15 17:01:04 UTC 2013
Responsible-Changed-Why:
I'll take it.
http://www.freebsd.org/cgi/query-pr.cgi?pr=175101

From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 19:55:25 2013
Date: Tue, 15 Jan 2013 11:55:22 -0800
Subject: Re: CAM hangs in 9-STABLE?
[Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE]
From: olivier
To: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Cc: ken@freebsd.org, Andriy Gapon

Dear All,

Still experiencing the same hangs I reported earlier with 9.1. I've been
running a kernel with WITNESS enabled to provide more information.

During an occurrence of the hang, running "show alllocks" gave:

Process 25777 (sysctl) thread 0xfffffe014c5b2920 (102567)
exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff811e34c0) locked @ /usr/src/sys/dev/usb/usb_transfer.c:3171
Process 25750 (sshd) thread 0xfffffe015a688000 (104313)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0bb98) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 24922 (cnid_dbd) thread 0xfffffe0187ac4920 (103597)
shared lockmgr zfs (zfs) r = 0 (0xfffffe0973062488) locked @ /usr/src/sys/kern/vfs_syscalls.c:3591
Process 24117 (sshd) thread 0xfffffe07bd914490 (104195)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0a8f0) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 1243 (java) thread 0xfffffe01ca85d000 (102704)
exclusive sleep mutex pmap (pmap) r = 0 (0xfffffe015aec1440) locked @ /usr/src/sys/amd64/amd64/pmap.c:4840
exclusive rw pmap pv global (pmap pv global) r = 0 (0xffffffff81409780) locked @ /usr/src/sys/amd64/amd64/pmap.c:4802
exclusive sleep mutex vm page (vm page) r = 0 (0xffffffff813f0a80) locked @ /usr/src/sys/vm/vm_object.c:1128
exclusive sleep mutex vm object (standard object) r = 0 (0xfffffe01458e43a0) locked @ /usr/src/sys/vm/vm_object.c:1076
shared sx vm map (user) (vm map (user)) r = 0 (0xfffffe015aec1388) locked @ /usr/src/sys/vm/vm_map.c:2045
Process 994 (nfsd) thread
0xfffffe015a0df000 (102426)
shared lockmgr zfs (zfs) r = 0 (0xfffffe0c3b505878) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760
Process 994 (nfsd) thread 0xfffffe015a0f8490 (102422)
exclusive lockmgr zfs (zfs) r = 0 (0xfffffe02db3b3e60) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760
Process 931 (syslogd) thread 0xfffffe015af18920 (102365)
shared lockmgr zfs (zfs) r = 0 (0xfffffe0141dd6680) locked @ /usr/src/sys/kern/vfs_syscalls.c:3591
Process 22 (syncer) thread 0xfffffe0125077000 (100279)
exclusive lockmgr syncer (syncer) r = 0 (0xfffffe015a2ff680) locked @ /usr/src/sys/kern/vfs_subr.c:1809

I don't have full "show lockedvnods" output because the output does not get
captured by ddb after using "capture on", it doesn't fit on a single screen,
and doesn't get piped into a "more" equivalent. What I did manage to get
(copied by hand, typos possible) is:

0xfffffe0c3b5057e0:
tag zfs, type VREG
usecount 1, writecount 0, refcount 1 mountedhere 0
flags (VI_ACTIVE)
v_object 0xfffffe089bc1b828 ref 0 pages 0
lock type zfs: SHARED (count 1)

0xfffffe02db3b3dc8:
tag zfs, type VREG
usecount 6, writecount 0, refcount 6 mountedhere 0
flags (VI_ACTIVE)
v_object 0xfffffe0b79583ae0 ref 0 pages 0
lock type zfs: EXCL by thread 0xfffffe015a0f8490 (pid 994)
with exclusive waiters pending

The output of "show witness" is at http://pastebin.com/eSRb3FEu
The output of "alltrace" is at http://pastebin.com/X1LruNrf
(a number of threads are stuck in zio_wait, none I can find in
zio_interrupt, and according to gstat and disks eventually going to sleep
all disk IO seems to be stuck for good; I think Andriy explained earlier
that these criteria might indicate this is a ZFS hang).

The output of "show geom" is at http://pastebin.com/6nwQbKr4
The output of "vmstat -i" is at http://pastebin.com/9LcZ7Mi0
Interrupts are occurring at a normal rate during the hang, as far as I can tell.

Any help would be greatly appreciated.
Thanks
Olivier

PS: my kernel was compiled from 9-STABLE from December, with CAM and ahci
from 9.0 (in the hope it would fix the hangs I was experiencing in plain
9-STABLE; obviously the hangs are still occurring). The rest of my
configuration is the same as posted earlier.

On Mon, Dec 24, 2012 at 9:42 PM, olivier wrote:
> Dear All
> It turns out that reverting to an older version of the mps driver did not
> fix the ZFS hangs I've been struggling with in 9.1 and 9-STABLE after all
> (they just took a bit longer to occur again, possibly just by chance). I
> followed steps along lines suggested by Andriy to collect more information
> when the problem occurs. Hopefully this will help figure out what's going
> on.
>
> As far as I can tell, what happens is that at some point IO operations to
> a bunch of drives that belong to different pools get stuck. For these
> drives, gstat shows no activity but 1 pending operation, as such:
>
> L(q) ops/s   r/s  kBps  ms/r   w/s  kBps  ms/w   d/s  kBps  ms/d %busy Name
>    1     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da1
>
> I've been running gstat in a loop (every 100s) to monitor the machine.
> Just before the hang occurs, everything seems fine (see full gstat output
> below). Right after the hang occurs a number of drives seem stuck (see full
> gstat output below). Notably, some stuck drives are seen through the mps
> driver and others through the mpt driver. So the problem doesn't seem to be
> driver-specific.
I have had the problem occur (at a lower frequency) on
> similar machines that don't use the mpt driver (and only have 1 disk
> provided through mps), so the problem doesn't seem to be caused by the mpt
> driver (and is likely not caused by defective hardware). Since based on the
> information I provided earlier Andriy thinks the problem might not
> originate in ZFS, perhaps that means that the problem is in the CAM layer?
>
> camcontrol tags -v (as suggested by Andriy) in the hung state shows for
> example:
>
> (pass56:mpt1:0:8:20): dev_openings  254
> (pass56:mpt1:0:8:20): dev_active      1
> (pass56:mpt1:0:8:20): devq_openings 254
> (pass56:mpt1:0:8:20): devq_queued     0
> (pass56:mpt1:0:8:20): held            0
> (pass56:mpt1:0:8:20): mintags         2
> (pass56:mpt1:0:8:20): maxtags       255
>
> (I'm not providing full camcontrol tags output below because I couldn't
> get it to run during the specific hang I documented most thoroughly; the
> example above is from a different occurrence of the hang.)
>
> The buses don't seem completely frozen: if I manually remove drives while
> the machine is hanging, that's picked up by the mpt driver, which prints
> out corresponding messages to the console. But camcontrol reset all or
> rescan all don't seem to do anything.
>
> I've tried reducing vfs.zfs.vdev.min_pending and vfs.zfs.vdev.max_pending
> to 1, to no avail.
>
> Any suggestions to resolve this problem, work around it, or further
> investigate it would be greatly appreciated!
> Thanks a lot
> Olivier
>
> Detailed information:
>
> Output of procstat -a -kk when the machine is hanging is available at
> http://pastebin.com/7D2KtT35 (not putting it here because it's pretty
> long)
>
> dmesg is available at http://pastebin.com/9zJQwWJG . Note that I'm using
> LUN masking, so the "illegal requests" reported aren't really errors. Maybe
> one day if I get my problems sorted out I'll use geom multipathing instead.
>
> My kernel config is:
>
> include GENERIC
> ident MYKERNEL
>
> options IPSEC
> device crypto
>
> options OFED            # Infiniband protocol
>
> device mlx4ib           # ConnectX Infiniband support
> device mlxen            # ConnectX Ethernet support
> device mthca            # Infinihost cards
> device ipoib            # IP over IB devices
>
> options ATA_CAM         # Handle legacy controllers with CAM
> options ATA_STATIC_ID   # Static device numbering
>
> options KDB
> options DDB
>
> Full output of gstat just before the hang (at most 100s before the hang):
> L(q) ops/s   r/s  kBps  ms/r   w/s  kBps  ms/w   d/s  kBps  ms/d %busy Name
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da2
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da0
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  DEV/da2/da2
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  DEV/da0/da0
>    1    85    48    79   4.7    35    84   0.5     0     0   0.0  24.3  da1
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  DEV/da1/da1
>    1    83    47    77   4.3    34    79   0.5     0     0   0.0  22.1  da4
>    1  1324  1303 21433   0.6    19    42   0.7     0     0   0.0  79.8  da3
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da5
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da6
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da7
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da8
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da9
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da10
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da11
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da12
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da13
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da14
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da15
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da16
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da17
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da18
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da19
>    0    97    57    93   3.5    38    84   0.3     0     0   0.0  21.3  da20
>    0    85    47    69   3.3    36    86   0.4     0     0   0.0  16.8  da21
>    0  1666  1641 18992   0.3    23    43   0.4     0     0   0.0  57.9  da22
>    0    93    55    98   3.5    36    87   0.4     0     0   0.0  20.6  da23
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da24
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da25
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da26
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da27
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da28
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  da29
> 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da30 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da31 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da32 > 0 1200 0 0 0.0 1198 11751 0.6 0 0 > 0.0 67.3 da33 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da34 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da35 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da36 > 0 81 44 67 2.0 35 84 0.3 0 0 > 0.0 10.1 da37 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da38 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da39 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da40 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da41 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da42 > 1 1020 999 22028 0.8 19 42 0.7 0 0 > 0.0 84.8 da43 > 0 1050 1029 23479 0.8 19 47 0.7 0 0 > 0.0 83.3 da44 > 1 1006 984 22758 0.8 21 46 0.6 0 0 > 0.0 84.8 da45 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da46 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da47 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da48 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da49 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da50 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 cd0 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da4/da4 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da3/da3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da5/da5 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da6/da6 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da7/da7 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da8/da8 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da9/da9 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da10/da10 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da11/da11 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da12/da12 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da13/da13 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da14/da14 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da15/da15 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da16/da16 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da17/da17 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da18/da18 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da19/da19 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da20/da20 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da21/da21 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da22/da22 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da23/da23 > 0 0 0 
0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da24/da24 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da25/da25 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26/da26 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 PART/da26/da26 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da26p1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da26p2 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da26p3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da27/da27 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da28/da28 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da29/da29 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da30/da30 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da31/da31 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da32/da32 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da33/da33 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da34/da34 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da35/da35 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da36/da36 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da37/da37 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da38/da38 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da39/da39 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da40/da40 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da41/da41 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da42/da42 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da43/da43 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da44/da44 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da45/da45 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da46/da46 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da47/da47 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da48/da48 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da49/da49 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da50/da50 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/cd0/cd0 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26p1/da26p1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26p2/da26p2 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 LABEL/da26p1/da26p1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 gptid/84d4487b-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26p3/da26p3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 LABEL/da26p2/da26p2 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 
gptid/b4255780-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 > DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da25 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 > DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da40 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da41 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da26p3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da29 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da30 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da24 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da6 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da7 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da16 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da17 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da20 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da21 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da37 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da23 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da4 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da43 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da44 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da22 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da33 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da45 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da3 > > > Full output of gstat just after the hang (at most 100s after the hang): > L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps > ms/d %busy Name > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da2 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da0 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da2/da2 > 0 0 0 0 0.0 0 0 0.0 0 0 
> 0.0 0.0 DEV/da0/da0 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da1/da1 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da4 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da5 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da6 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da7 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da8 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da9 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da10 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da11 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da12 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da13 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da14 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da15 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da16 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da17 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da18 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da19 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da20 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da21 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da22 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da23 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da24 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da25 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da26 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da27 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da28 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da29 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da30 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da31 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da32 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da33 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da34 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da35 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da36 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da37 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da38 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da39 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da40 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da41 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da42 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da43 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da44 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da45 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da46 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da47 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da48 > 0 0 0 0 
0.0 0 0 0.0 0 0 > 0.0 0.0 da49 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da50 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 cd0 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da4/da4 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da3/da3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da5/da5 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da6/da6 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da7/da7 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da8/da8 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da9/da9 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da10/da10 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da11/da11 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da12/da12 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da13/da13 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da14/da14 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da15/da15 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da16/da16 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da17/da17 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da18/da18 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da19/da19 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da20/da20 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da21/da21 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da22/da22 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da23/da23 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da24/da24 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da25/da25 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26/da26 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 PART/da26/da26 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da26p1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da26p2 > 1 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 da26p3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da27/da27 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da28/da28 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da29/da29 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da30/da30 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da31/da31 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da32/da32 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da33/da33 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da34/da34 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da35/da35 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da36/da36 > 0 0 0 0 0.0 0 0 0.0 
0 0 > 0.0 0.0 DEV/da37/da37 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da38/da38 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da39/da39 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da40/da40 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da41/da41 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da42/da42 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da43/da43 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da44/da44 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da45/da45 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da46/da46 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da47/da47 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da48/da48 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da49/da49 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da50/da50 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/cd0/cd0 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26p1/da26p1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26p2/da26p2 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 LABEL/da26p1/da26p1 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 gptid/84d4487b-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 DEV/da26p3/da26p3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 LABEL/da26p2/da26p2 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 gptid/b4255780-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 > DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da25 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 > DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da40 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da41 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da26p3 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da29 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da30 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da24 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da6 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 ZFS::VDEV/zfs::vdev/da7 > 0 0 0 0 0.0 0 0 0.0 0 0 > 0.0 0.0 
ZFS::VDEV/zfs::vdev/da16
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da17
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da20
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da21
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da37
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da23
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da1
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da4
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da43
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da44
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da22
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da33
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da45
>    0     0     0     0   0.0     0     0   0.0     0     0   0.0   0.0  ZFS::VDEV/zfs::vdev/da3
>
> On Thu, Dec 13, 2012 at 10:14 PM, olivier wrote:
>
>> For what it's worth, I think I might have solved my problem by reverting
>> to an older version of the mps driver. I checked out a recent version of
>> 9-STABLE and reversed the changes in
>> http://svnweb.freebsd.org/base?view=revision&revision=230592 (perhaps
>> there was a simpler way of reverting to the older mps driver). So far so
>> good, no hang even when hammering the file system.
>>
>> This does not conclusively prove that the new LSI mps driver is at fault,
>> but that seems to be a likely explanation.
>>
>> Thanks to everybody who pointed me in the right direction. Hope this
>> helps others who run into similar problems with 9.1
>> Olivier
>>
>> On Thu, Dec 13, 2012 at 10:14 AM, olivier wrote:
>>
>>> On Thu, Dec 13, 2012 at 9:54 AM, Andriy Gapon wrote:
>>>
>>>> Google for "zfs deadman". This is already committed upstream and I
>>>> think that it is imported into FreeBSD, but I am not sure... Maybe
>>>> it's imported just into the vendor area and is not merged yet.
>>>
>>> Yes, that's exactly what I had in mind. The logic for panicking makes
>>> sense.
>>> As far as I can tell you're correct that deadman is in the vendor area
>>> but not merged. Any idea when it might make it into 9-STABLE?
>>> Thanks
>>> Olivier
>>>
>>>> So, when enabled this logic would panic a system as a way of letting
>>>> know that something is wrong. You can read in the links why panic was
>>>> selected for this job.
>>>>
>>>> And speaking FreeBSD-centric - I think that our CAM layer would be a
>>>> perfect place to detect such issues in non-ZFS-specific way.
>>>>
>>>> --
>>>> Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 20:28:00 2013
Date: Tue, 15 Jan 2013 14:58:42 -0500
Message-Id: <201301151458.42874.jhb@freebsd.org>
In-Reply-To: <20130115141019.H1444@besplex.bde.org>
From: John Baldwin
To: Bruce Evans
Cc: Rick Macklem, fs@freebsd.org
Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client

On Monday, January 14, 2013 11:51:23 pm Bruce Evans wrote:
> On Mon, 14 Jan 2013, Rick Macklem wrote:
>
> > John Baldwin wrote:
> >> The NFS client tries to infer when an application has passed NULL to
> >> utimes() so that it can let the server set the timestamp rather than
> >> using a client-supplied timestamp. It does this by checking to see if
> >> the desired timestamp's second matches the current second. However,
> >> this breaks applications that are intentionally trying to set a
> >> specific timestamp within the current second. In addition, utimes()
> >> sets a flag to indicate if NULL was passed to utimes(). The patch
> >> below changes the NFS client to check this flag and only use the
> >> server-supplied time in that case:
>
> It is certainly an error to not check VA_UTIMES_NULL at all. I think
> the flag (or the NULL pointer) cannot be passed to the server, so the
> best we can do for the VA_UTIMES_NULL case is read the current time on
> the client and pass it to the server. Upper layers have already read
> the current time, but have passed us VA_UTIMES_NULL so that we can tell
> that the pointer was originally null so that we can do the different
> permissions checks for this case.
> > >> Index: fs/nfsclient/nfs_clport.c > >> =================================================================== > >> --- fs/nfsclient/nfs_clport.c (revision 225511) > >> +++ fs/nfsclient/nfs_clport.c (working copy) > >> @@ -762,7 +762,7 @@ > >> *tl = newnfs_false; > >> } > >> if (vap->va_atime.tv_sec != VNOVAL) { > >> - if (vap->va_atime.tv_sec != curtime.tv_sec) { > >> + if (!(vap->va_vaflags & VA_UTIMES_NULL)) { > >> NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); > >> *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT); > >> txdr_nfsv3time(&vap->va_atime, tl); > >> @@ -775,7 +775,7 @@ > >> *tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE); > >> ... > > Something mangled the patch so that it is hard to see what it does. It > just uses the flag instead of guessing. > > I can't see anything that does the different permissions check for > the VA_UTIMES_NULL case, and testing shows that this case is just broken, > at least for an old version of the old nfs client -- the same permissions > are required for all cases, but write permission is supposed to be > enough for the VA_UTIMES_NULL case (since write permission is sufficient > for setting the mtime to the current time (plus epsilon) using write(2) > and truncate(2). Setting the atime to the current time should require > no more and no less than read permission, since it can be done using > read(2), but utimes(NULL) requires write permission for that too). Correct. All the other uses of VA_UTIMES_NULL in the tree are to provide the permissions check you describe and there is a large comment about it in ufs_setattr(). Other filesystems have comments that reference ufs_setattr(). I think these checks should be done in nfs_setattr() rather than in the routine to build an NFS attribute object however. Fixing NFS to properly use vfs_timestamp() seems to be a larger project. 
-- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 22:46:05 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4A131EC; Tue, 15 Jan 2013 22:46:05 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id 029C7EAB; Tue, 15 Jan 2013 22:46:04 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3Ym68h4Ct7z7ySF; Tue, 15 Jan 2013 23:45:56 +0100 (CET) Date: Tue, 15 Jan 2013 23:45:56 +0100 From: Nicolas Rachinsky To: Artem Belevich Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130115224556.GA41774@mid.pc5.i.0x5.de> References: <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 22:46:05 -0000 * Artem Belevich [2013-01-14 17:37 -0800]: > OK. Threads responsible for transaction sync seem to be stuck in zio_wait. > zio_wait is in turn waiting for some task thread to be done with its work. > Now you need to figure out what those task threads are doing. > > 'procstat -kk 0' will dump a few hundred taskq threads. Most of them > would be zfs related.
On an idle box (8.3/amd64 in my case) most of > them would have the same stack trace looking like this (modulo > offsets): > > mi_switch+0x196 sleepq_wait+0x42 _sleep+0x3c0 > taskqueue_thread_loop+0xbe fork_exit+0x11f fork_trampoline+0xe > > Look for stack traces that don't match that pattern. There are some of these. root@bolte ~# sh -c 'for i in `jot 1000`; do procstat -kk 0 ; sleep 0.1 ; done' | sort | uniq -c | grep -v -F 'mi_switch+0x176 sleepq_wait+0x42 _sleep+0x317 taskqueue_thread_loop+0xbe fork_exit+0x11f fork_trampoline+0xe' | sort -n 1 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 1 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dmu_buf_hold_array_by_dnode+0x22b dmu_read+0x89 space_map_load+0x108 metaslab_activate+0xdc metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f 1 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _sx_xlock_hard+0x305 _sx_xlock+0x4e metaslab_alloc+0x77b zio_dva_allocate+0x1aa zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 1 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _sx_xlock_hard+0x305 _sx_xlock+0x4e metaslab_alloc+0x77b zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 1 0 100098 kernel zio_write_issue_ mi_switch+0x176 turnstile_wait+0x1cb _mtx_lock_sleep+0xb0 taskqueue_member+0xe8 zio_execute+0x10c taskqueue_run_locked+0x85 
taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 1 0 100099 kernel zio_write_issue_ mi_switch+0x176 critical_exit+0xa5 intr_event_handle+0xb3 intr_execute_handlers+0x5f lapic_handle_intr+0x37 Xapic_isr1+0xa5 space_map_remove+0x81 space_map_load+0x1a4 metaslab_activate+0xdc metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 1 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 1 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _sx_xlock_hard+0x305 _sx_xlock+0x4e metaslab_alloc+0x77b zio_dva_allocate+0x1aa zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 1 0 100106 kernel zio_write_intr_1 1 0 100108 kernel zio_write_intr_3 1 0 100109 kernel zio_write_intr_4 1 0 100109 kernel zio_write_intr_4 mi_switch+0x176 turnstile_wait+0x1cb _mtx_lock_sleep+0xb0 _sleep+0x251 taskqueue_thread_loop+0xbe fork_exit+0x11f fork_trampoline+0xe 1 0 100110 kernel zio_write_intr_5 mi_switch+0x176 turnstile_wait+0x1cb _mtx_lock_sleep+0xb0 _sleep+0x251 taskqueue_thread_loop+0xbe fork_exit+0x11f fork_trampoline+0xe 2 0 100040 kernel nfe0 taskq 2 0 100096 kernel zio_read_intr_0 2 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 2 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 
metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 2 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x5e5 dbuf_findbp+0x107 dbuf_prefetch+0x8f dmu_prefetch+0x1bb space_map_load+0x289 metaslab_activate+0xdc metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 2 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _sx_xlock_hard+0x305 _sx_xlock+0x4e metaslab_alloc+0x77b zio_dva_allocate+0x9a zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 3 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e 3 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e 3 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dmu_buf_hold_array_by_dnode+0x22b dmu_read+0x89 space_map_load+0x108 metaslab_activate+0xdc metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 4 0 100098 kernel zio_write_issue_ mi_switch+0x176 
sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 6 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 7 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 7 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 12 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 14 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 zio_wait+0x61 dmu_buf_hold_array_by_dnode+0x22b dmu_read+0x89 space_map_load+0x108 metaslab_activate+0xdc metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 
taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 18 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 23 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 26 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 31 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 37 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 zio_ready+0x17d zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 84 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 taskqueue_run_locked+0x85 
taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 89 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x1aa zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 145 0 100099 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 147 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe 313 0 100099 kernel zio_write_issue_ 329 0 100098 kernel zio_write_issue_ 998 0 100040 kernel nfe0 taskq mi_switch+0x176 sleepq_wait+0x42 msleep_spin+0x1a2 taskqueue_thread_loop+0x71 fork_exit+0x11f fork_trampoline+0xe 1000 0 100000 kernel swapper mi_switch+0x176 sleepq_timedwait+0x42 _sleep+0x301 scheduler+0x357 mi_startup+0x77 btext+0x2c 1000 0 100017 kernel acpi_task_0 mi_switch+0x176 sleepq_wait+0x42 msleep_spin+0x1a2 taskqueue_thread_loop+0x71 fork_exit+0x11f fork_trampoline+0xe 1000 0 100018 kernel acpi_task_1 mi_switch+0x176 sleepq_wait+0x42 msleep_spin+0x1a2 taskqueue_thread_loop+0x71 fork_exit+0x11f fork_trampoline+0xe 1000 0 100019 kernel acpi_task_2 mi_switch+0x176 sleepq_wait+0x42 msleep_spin+0x1a2 taskqueue_thread_loop+0x71 fork_exit+0x11f fork_trampoline+0xe 1000 0 100041 kernel nfe1 taskq mi_switch+0x176 sleepq_wait+0x42 msleep_spin+0x1a2 taskqueue_thread_loop+0x71 fork_exit+0x11f fork_trampoline+0xe 1000 PID TID COMM TDNAME KSTACK Thank you very much Nicolas -- http://www.rachinsky.de/nicolas From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 22:49:04 2013 
Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0EE961F7; Tue, 15 Jan 2013 22:49:04 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 79BE6EF6; Tue, 15 Jan 2013 22:49:03 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap8EAJXb9VCDaFvO/2dsb2JhbABFhjq3YnOCHgEBBAEjBFIFFg4KAgINGQJZBogmBqYEgkCOc4EjjwKBEwOIYY0rkEmDE4IG X-IronPort-AV: E=Sophos;i="4.84,475,1355115600"; d="scan'208";a="9236221" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 15 Jan 2013 17:49:00 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 2F355B3F0D; Tue, 15 Jan 2013 17:49:00 -0500 (EST) Date: Tue, 15 Jan 2013 17:49:00 -0500 (EST) From: Rick Macklem To: Bruce Evans Message-ID: <1149390778.2023367.1358290140175.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20130115141019.H1444@besplex.bde.org> Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Rick Macklem , fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 22:49:04 -0000 Bruce Evans wrote: > On Mon, 14 Jan 2013, Rick Macklem wrote: > > > John Baldwin wrote: > >> The NFS client tries to infer when an application has passed NULL > >> to > >> utimes() > >> so that it can let the server set the timestamp rather than using a > >> client- > >> supplied 
timestamp. It does this by checking to see if the desired > >> timestamp's second matches the current second. However, this breaks > >> applications that are intentionally trying to set a specific > >> timestamp > >> within > >> the current second. In addition, utimes() sets a flag to indicate > >> if > >> NULL was > >> passed to utimes(). The patch below changes the NFS client to check > >> this flag > >> and only use the server-supplied time in that case: > > It is certainly an error to not check VA_UTIMES_NULL at all. I think > the flag (or the NULL pointer) cannot be passed to the server, so the > best we can do for the VA_UTIMES_NULL case is read the current time on > the client and pass it to the server. Upper layers have already read > the current time, but have passed us VA_UTIMES_NULL so that we can > tell > that the pointer was originally null so that we can do the different > permissions checks for this case. > > >> Index: fs/nfsclient/nfs_clport.c > >> =================================================================== > >> --- fs/nfsclient/nfs_clport.c (revision 225511) > >> +++ fs/nfsclient/nfs_clport.c (working copy) > >> @@ -762,7 +762,7 @@ > >> *tl = newnfs_false; > >> } > >> if (vap->va_atime.tv_sec != VNOVAL) { > >> - if (vap->va_atime.tv_sec != curtime.tv_sec) { > >> + if (!(vap->va_vaflags & VA_UTIMES_NULL)) { > >> NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); > >> *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT); > >> txdr_nfsv3time(&vap->va_atime, tl); > >> @@ -775,7 +775,7 @@ > >> *tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE); > >> ... > > Something mangled the patch so that it is hard to see what it does. It > just uses the flag instead of guessing. 
> > I can't see anything that does the different permissions check for > the VA_UTIMES_NULL case, and testing shows that this case is just > broken, > at least for an old version of the old nfs client -- the same > permissions > are required for all cases, but write permission is supposed to be > enough for the VA_UTIMES_NULL case (since write permission is > sufficient > for setting the mtime to the current time (plus epsilon) using > write(2) > and truncate(2). Setting the atime to the current time should require > no more and no less than read permission, since it can be done using > read(2), but utimes(NULL) requires write permission for that too). > I did a quick test on a -current client/server and it seems to work ok. The client uses SET_TIME_TO_SERVER and the server sets VA_UTIMES_NULL for this case. At least it works for a UFS exported volume. > > In the old days, a lot of NFS servers only stored times at a > > resolution of 1sec, which I think is why the code had the habit > > of comparing "seconds equal". > > I think this is not the reason for the check here. > > > If there is some app. out there > > that sets "current time" via utimes(2) with a curent time argument > > instead of a NULL argument would seem to be broken to me. > > (It is conceivable that some app. did this to avoid clock > > skew between the client and server, but I doubt it.) > > Apps have no alternative to using the NULL arg if they have write > permission > to the file but don't own it. > > Oops, on looking at the code I now think it _is_ possible to pass the > request to set the current time on the server, since in the > NFSV3SATTRTIME_TOSERVER case we just pass this case value and not > any time value to the server, so the server has no option but to use > its current time. It is not surprising that the permissions checks > for this don't work right. I thought that the client was responsible > for most permissions checks, but can't find many or the relevant one > here. 
The NFSV3SATTRTIME_TOSERVER code on the server sets > VA_UTIMES_NULL, so I would have thought that the permissions check on > the server does the right thing. > As noted above, it seems to work correctly for the new server in -current, at least for UFS exports. Normally a server will do permission checking for NFS RPCs. There is nothing stopping a client from doing a check and returning an error, but traditionally a server has not trusted a client to do so. (I'm not sure if adding a check in the client is what jhb@ was referring to in his reply to this?) > There are some large timestamping bugs nearby: > > - the old nfs server code for NFSV3SATTRTIME_TOSERVER uses > getnanotime() > to read the current time. This violates the system's policy set by > the vfs.timestamp_precision sysctl in most cases, since using getnanotime() > is the worst supported policy and is not the default. > > The old nfs client uses the correct function to read the current > time, vfs_timestamp(), in nfs_create(), but this is the only use of > vfs_timestamp() in old nfs code. I think most cases use the server > time and thus use the correct function iff the leaf server file > system uses the correct function. > > - the new nfs server code for NFSV3SATTRTIME_TOSERVER macro-izes all > reads of the current time except 1 as NFSGETTIME(). This uses > getmicrotime(), so it violates the system's policy in all cases, > since using getmicrotime() is not a supported policy (using > microtime() is supported). The 1 exception is a hard-coded > getmicrotime() in fs/nfsclient/nfs_clport.c whose use is visible > in the above patch. This one really didn't matter, because only the > seconds part of curtime was used. It was just a micro-pessimization > and style bug. The (not quite) correct way to get the seconds part > is to use time_second, as is done in the old nfs client. > (This way is not quite correct because there are some races and > non-monotonicities reading the times.
In the above check, > vap->va_atime.tv_sec might have been read by a more precise clock > than curtime.tv_sec. Then the check might give a false positive > or negative. But the check is only a heuristic, and is inherently > racy, so this doesn't really matter. > With the above patch the check becomes a different pessimization and > style bug. The curtime variable becomes unused except for its > incorrect initialization. > In this case, after the patch is applied, curtime and getmicrotime() can just be deleted (as you noted, above). > New nfs code never uses the correct function vfs_timestamp(). This needs to be fixed. Until now, I would have had no idea what is the correct interface. (When I did the port, I just used a call that seemed to return what I wanted.;-) Having said that, after reading what you wrote below, it is not obvious to me what the correct fix is? (It seems to be a choice between microtime() and vfs_timestamp()?) > > Following the system policy for file timestamps causes some problems > for utimes(NULL) too. Old versions hard-coded microtime(). Current > versions use vfs_timestamp(). The latter is better, but tends to > give different results than utimes(non_NULL), since few or no > applications know anything about the system's policy. touch(1) > probably should know, but doesn't. So the simple "touch foo" gives > various results, depending: > - touch(1) starts with gettimeofday(). This gives microseconds > resolution and usually microseconds accuracy if its result is used. > - touch then tries utimes(non_NULL) with the current time that it > just read. This usually works, giving microseconds resolution, > etc. This is OK, but often different from the system policy. > - touch then tries utimes(NULL). If this works, then it follows the > system policy. > > Another problem is that not all file systems support nanoseconds > resolutions, so not all system policies or utimes() requests can > be honored.
> > I would usually prefer the system's policy to be enforced as far as > possible. Thus if the system's policy is microseconds resolution, > then times with nanoseconds resolution should be rounded down to the > nearest microsecond. This case is most useful since utimes() cannot > preserve times with more than microseconds resolution. Utilities like > cp(1) blindly round the times given in nanoseconds by stat(2) to ones > that can be written by utimes(2), so this often happens in an > uncontrollable way anyway (POSIX is finally getting around to > specifying > permissible errors for unrepresentable resolutions). But sometimes I > want utimes() to preserve times as well as possible. > > Bruce From owner-freebsd-fs@FreeBSD.ORG Tue Jan 15 23:32:07 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E6F1B3F1 for ; Tue, 15 Jan 2013 23:32:07 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id AF2B527C for ; Tue, 15 Jan 2013 23:32:07 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAG/l9VCDaFvO/2dsb2JhbABFhjq3YnOCHgEBAQQBAQEgKyALGw4KAgINGQIjBgEJJgYIBwQBHASHZgMPDKV/gkCGZQ2HfoEjimWBCIMVgRMDiGGKfViBVoEcihuFEoMTgVE1 X-IronPort-AV: E=Sophos;i="4.84,475,1355115600"; d="scan'208";a="12075217" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 15 Jan 2013 18:32:05 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 3C2F0B3F18; Tue, 15 Jan 2013 18:32:05 -0500 (EST) Date: Tue, 15 Jan 2013 18:32:05 -0500 (EST) From: Rick Macklem To: Sergey Kandaurov Message-ID: <2118820107.2024400.1358292725230.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: Subject: Re: getcwd lies on/under 
nfs4-mounted zfs dataset MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jan 2013 23:32:08 -0000 pluknet@gmail.com wrote: > Hi. > > We are stuck with a problem where we get the wrong current directory path > when sitting on/under a zfs dataset filesystem mounted over NFSv4. > Both nfs server and client are 10.0-CURRENT from December or so. > > The component path "user3" unexpectedly appears to be "." (dot). > nfs-client:/home/user3 # pwd > /home/. > nfs-client:/home/user3/var/run # pwd > /home/./var/run > > nfs-client:~ # procstat -f 3225 > PID COMM FD T V FLAGS REF OFFSET PRO NAME > 3225 a.out text v r r-------- - - - /home/./var/a.out > 3225 a.out ctty v c rw------- - - - /dev/pts/2 > 3225 a.out cwd v d r-------- - - - /home/./var > 3225 a.out root v d r-------- - - - / > > The setup used follows. > > 1. NFS Server with local ZFS: > # cat /etc/exports > V4: / -sec=sys > > # zfs list > pool1 10.4M 122G 580K /pool1 > pool1/user3 on /pool1/user3 (zfs, NFS exported, local, nfsv4acls) > > Exports list on localhost: > /pool1/user3 109.70.28.0 > /pool1 109.70.28.0 > > # zfs get sharenfs pool1/user3 > NAME PROPERTY VALUE SOURCE > pool1/user3 sharenfs -alldirs -maproot=root -network=109.70.28.0/24 > local > > 2. pool1 is mounted on NFSv4 client: > nfs-server:/pool1 on /home (nfs, noatime, nfsv4acls) > > So on the NFS client the "pool1/user3" dataset appears at /home/user3. > / - ufs > /home - zpool-over-nfsv4 > /home/user3 - zfs dataset "pool1/user3" > > At the same time it works as expected when we're not on a zfs dataset, > but directly on its parent zfs pool (also over NFSv4), e.g.
> nfs-client:/home/non_dataset_dir # pwd > /home/non_dataset_dir > > The ls command works as expected: > nfs-client:/# ls -dl /home/user3/var/ > drwxrwxrwt+ 6 root wheel 6 Jan 10 16:19 /home/user3/var/ > Well, if you are just looking for a work around, you could try mounting /home/user3 separately. Otherwise, here's roughly what needs to happen for it to work. (There may be some additional trick(s) I am not aware of.) On the server, ZFS must report: - different fsids for /home vs /home/user3 - fileno (A) must be the same value for "." and ".." for the zfs dataset root (and set VV_ROOT on the vnode) - fileno (B) for "user3" reported by readdir() on /home must be different than what "." and ".." report. Then the NFS server will report a different value (B) for Mounted_on_fileno than it does for Fileno (A), when the client gets attributes for the directory /home/user3. When the client sees Mounted_on_fileno != Fileno, it knows it is at a server mount point boundary and should report the correct stuff to stat() and readdir(). I haven't tested this for a while, so it might be broken for UFS as well. If that's the case, I can probably try and track down the problem here. If not, you can capture packets when you do the getcwd() and then look at them in wireshark, so you can see what the server is returning for Fileno and Mounted_on_fileno. They should be different for "/home/user3" and the latter one should be the value returned by Readdir of "/home" for the "user3" entry. I won't be in a position to look at a wireshark trace until April, so I can't help with that at this time. Since I've never used ZFS, I have no idea what it considers a "mount point"? (Generically, within a mount point there needs to be "same fsid" and a unique set of "fileno" values for all objects. When crossing the mount point, VV_ROOT needs to be set and the mounted_on_vp (or whatever it's called) must refer to the parent. "/home" for this case.) 
I don't know if this helps, rick ps: Solaris10 clients don't get this to work, so you always need to mount each server file system separately, which is the "work around" I suggested at the beginning of this post. > -- > wbr, > pluknet > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 00:07:44 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 99B89260 for ; Wed, 16 Jan 2013 00:07:44 +0000 (UTC) (envelope-from rcartwri@asu.edu) Received: from mail-oa0-f48.google.com (mail-oa0-f48.google.com [209.85.219.48]) by mx1.freebsd.org (Postfix) with ESMTP id 63E69713 for ; Wed, 16 Jan 2013 00:07:44 +0000 (UTC) Received: by mail-oa0-f48.google.com with SMTP id h2so816736oag.35 for ; Tue, 15 Jan 2013 16:07:38 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:x-gm-message-state; bh=IVTKzE3xkDriny8v4brXDu2cpIUqrxpTT97TyGQjVEc=; b=QCHCzYMMg+WAeYkZCRroLHFeKqjYr1HdJ6AmaVFVrj0w76Ih5nd0DI7VRujZEwEUl6 fThNh9HhDjfxbxsJIzeIrYjcSKJ3W0cctTQGkKzZ0dic5kJ7tGSHCpqis1zEF+DY5FTQ eHT0S116SMwsZ6Jta+5y7KK3sc+RNhS13rHsZRw4Q90Snzj+/+mL1m33rumdw1aSuYTG G91aQS+lHG5tI587zyo/uL0b51S4ZgqdNQeKn5aYdyLTqrpmbeM3MQYn475I5u8i4u2c 1MAeIOiUpQ+kU9nuyYWZeXDeRblaCX42HmdW5l2ExdKN9QUKTSai5zTsUHdooOHDSeC4 NXpg== MIME-Version: 1.0 Received: by 10.182.235.70 with SMTP id uk6mr23274848obc.54.1358294858555; Tue, 15 Jan 2013 16:07:38 -0800 (PST) Received: by 10.76.173.101 with HTTP; Tue, 15 Jan 2013 16:07:38 -0800 (PST) In-Reply-To: References: Date: Tue, 15 Jan 2013 17:07:38 -0700 Message-ID: Subject: Re: CAM hangs in 9-STABLE? 
[Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE] From: "Reed A. Cartwright" To: olivier Content-Type: text/plain; charset=ISO-8859-1 X-Gm-Message-State: ALoCoQmAQOk86bShZ+ExMD9igtOjThBtIy1ywzwWkit1RMiFqSA8tl1wcgXm4Su7dQLt/gDJYFdW Cc: freebsd-fs@freebsd.org, ken@freebsd.org, "freebsd-stable@freebsd.org" , Andriy Gapon X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 00:07:44 -0000 I don't know if this is relevant or not, but a deadlock was recently fixed in the VFS code: http://svnweb.freebsd.org/base?view=revision&revision=244795 On Tue, Jan 15, 2013 at 12:55 PM, olivier wrote: > Dear All, > Still experiencing the same hangs I reported earlier with 9.1. I've been > running a kernel with WITNESS enabled to provide more information. > > During an occurrence of the hang, running show alllocks gave > > Process 25777 (sysctl) thread 0xfffffe014c5b2920 (102567) > exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff811e34c0) locked @ > /usr/src/sys/dev/usb/usb_transfer.c:3171 > Process 25750 (sshd) thread 0xfffffe015a688000 (104313) > exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0bb98) locked @ > /usr/src/sys/kern/uipc_sockbuf.c:148 > Process 24922 (cnid_dbd) thread 0xfffffe0187ac4920 (103597) > shared lockmgr zfs (zfs) r = 0 (0xfffffe0973062488) locked @ > /usr/src/sys/kern/vfs_syscalls.c:3591 > Process 24117 (sshd) thread 0xfffffe07bd914490 (104195) > exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0a8f0) locked @ > /usr/src/sys/kern/uipc_sockbuf.c:148 > Process 1243 (java) thread 0xfffffe01ca85d000 (102704) > exclusive sleep mutex pmap (pmap) r = 0 (0xfffffe015aec1440) locked @ > /usr/src/sys/amd64/amd64/pmap.c:4840 > exclusive rw pmap pv global (pmap pv global) r = 0 (0xffffffff81409780) > locked @ /usr/src/sys/amd64/amd64/pmap.c:4802 > exclusive sleep mutex vm page (vm 
page) r = 0 (0xffffffff813f0a80) locked @ > /usr/src/sys/vm/vm_object.c:1128 > exclusive sleep mutex vm object (standard object) r = 0 > (0xfffffe01458e43a0) locked @ /usr/src/sys/vm/vm_object.c:1076 > shared sx vm map (user) (vm map (user)) r = 0 (0xfffffe015aec1388) locked @ > /usr/src/sys/vm/vm_map.c:2045 > Process 994 (nfsd) thread 0xfffffe015a0df000 (102426) > shared lockmgr zfs (zfs) r = 0 (0xfffffe0c3b505878) locked @ > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760 > Process 994 (nfsd) thread 0xfffffe015a0f8490 (102422) > exclusive lockmgr zfs (zfs) r = 0 (0xfffffe02db3b3e60) locked @ > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760 > Process 931 (syslogd) thread 0xfffffe015af18920 (102365) > shared lockmgr zfs (zfs) r = 0 (0xfffffe0141dd6680) locked @ > /usr/src/sys/kern/vfs_syscalls.c:3591 > Process 22 (syncer) thread 0xfffffe0125077000 (100279) > exclusive lockmgr syncer (syncer) r = 0 (0xfffffe015a2ff680) locked @ > /usr/src/sys/kern/vfs_subr.c:1809 > > I don't have full "show lockedvnods" output because the output does not get > captured by ddb after using "capture on", it doesn't fit on a single > screen, and doesn't get piped into a "more" equivalent. 
What I did manage > to get (copied by hand, typos possible) is: > > 0xfffffe0c3b5057e0: 0xfffffe0c3b5057e0: tag zfs, type VREG > tag zfs, type VREG > usecount 1, writecount 0, refcount 1 mountedhere 0 > usecount 1, writecount 0, refcount 1 mountedhere 0 > flags (VI_ACTIVE) > flags (VI_ACTIVE) > v_object 0xfffffe089bc1b828 ref 0 pages 0 > v_object 0xfffffe089bc1b828 ref 0 pages 0 > lock type zfs: SHARED (count 1) > lock type zfs: SHARED (count 1) > > 0xfffffe02db3b3dc8: 0xfffffe02db3b3dc8: tag zfs, type VREG > tag zfs, type VREG > usecount 6, writecount 0, refcount 6 mountedhere 0 > usecount 6, writecount 0, refcount 6 mountedhere 0 > flags (VI_ACTIVE) > flags (VI_ACTIVE) > v_object 0xfffffe0b79583ae0 ref 0 pages 0 > v_object 0xfffffe0b79583ae0 ref 0 pages 0 > lock type zfs: EXCL by thread 0xfffffe015a0f8490 (pid 994) > lock type zfs: EXCL by thread 0xfffffe015a0f8490 (pid 994) > with exclusive waiters pending > with exclusive waiters pending > > The output of show witness is at http://pastebin.com/eSRb3FEu > > The output of alltrace is at http://pastebin.com/X1LruNrf (a number of > threads are stuck in zio_wait, none I can find in zio_interrupt, and > according to gstat and disks eventually going to sleep all disk IO seems to > be stuck for good; I think Andriy explained earlier that these criteria > might indicate this is a ZFS hang). > > The output of show geom is at http://pastebin.com/6nwQbKr4 > > The output of vmstat -i is at http://pastebin.com/9LcZ7Mi0 Interrupts are > occurring at a normal rate during the hang, as far as I can tell. > > Any help would be greatly appreciated. > Thanks > Olivier > PS: my kernel was compiled from 9-STABLE from December, with CAM and ahci > from 9.0 (in the hope it would fix the hangs I was experiencing in plain > 9-STABLE; obviously the hangs are still occurring). The rest of my > configuration is the same as posted earlier. 
> > On Mon, Dec 24, 2012 at 9:42 PM, olivier wrote: > >> Dear All >> It turns out that reverting to an older version of the mps driver did not >> fix the ZFS hangs I've been struggling with in 9.1 and 9-STABLE after all >> (they just took a bit longer to occur again, possibly just by chance). I >> followed steps along lines suggested by Andriy to collect more information >> when the problem occurs. Hopefully this will help figure out what's going >> on. >> >> As far as I can tell, what happens is that at some point IO operations to >> a bunch of drives that belong to different pools get stuck. For these >> drives, gstat shows no activity but 1 pending operation, as such: >> >> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps >> ms/d %busy Name >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da1 >> >> I've been running gstat in a loop (every 100s) to monitor the machine. >> Just before the hang occurs, everything seems fine (see full gstat output >> below). Right after the hang occurs a number of drives seem stuck (see full >> gstat output below). Notably, some stuck drives are seen through the mps >> driver and others through the mpt driver. So the problem doesn't seem to be >> driver-specific. I have had the problem occur (at a lower frequency) on >> similar machines that don't use the mpt driver (and only have 1 disk >> provided through mps), so the problem doesn't seem to be caused by the mpt >> driver (and is likely not caused by defective hardware). Since based on the >> information I provided earlier Andriy thinks the problem might not >> originate in ZFS, perhaps that means that the problem is in the CAM layer? 
>> >> camcontrol tags -v (as suggested by Andriy) in the hung state shows for >> example >> >> (pass56:mpt1:0:8:20): dev_openings 254 >> (pass56:mpt1:0:8:20): dev_active 1 >> (pass56:mpt1:0:8:20): devq_openings 254 >> (pass56:mpt1:0:8:20): devq_queued 0 >> (pass56:mpt1:0:8:20): held 0 >> (pass56:mpt1:0:8:20): mintags 2 >> (pass56:mpt1:0:8:20): maxtags 255 >> (I'm not providing full camcontrol tags output below because I couldn't >> get it to run during the specific hang I documented most thoroughly; the >> example above is from a different occurrence of the hang). >> >> The buses don't seem completely frozen: if I manually remove drives while >> the machine is hanging, that's picked up by the mpt driver, which prints >> out corresponding messages to the console. But camcontrol reset all or >> rescan all don't seem to do anything. >> >> I've tried reducing vfs.zfs.vdev.min_pending and vfs.zfs.vdev.max_pending >> to 1, to no avail. >> >> Any suggestions to resolve this problem, work around it, or further >> investigate it would be greatly appreciated! >> Thanks a lot >> Olivier >> >> Detailed information: >> >> Output of procstat -a -kk when the machine is hanging is available at >> http://pastebin.com/7D2KtT35 (not putting it here because it's pretty >> long) >> >> dmesg is available at http://pastebin.com/9zJQwWJG . Note that I'm using >> LUN masking, so the "illegal requests" reported aren't really errors. Maybe >> one day if I get my problems sorted out I'll use geom multipathing instead. 
>> >> My kernel config is >> include GENERIC >> ident MYKERNEL >> >> options IPSEC >> device crypto >> >> options OFED # Infiniband protocol >> >> device mlx4ib # ConnectX Infiniband support >> device mlxen # ConnectX Ethernet support >> device mthca # Infinihost cards >> device ipoib # IP over IB devices >> >> options ATA_CAM # Handle legacy controllers with CAM >> options ATA_STATIC_ID # Static device numbering >> >> options KDB >> options DDB >> >> >> >> Full output of gstat just before the hang (at most 100s before the hang): >> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps >> ms/d %busy Name >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da0 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da2/da2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da0/da0 >> 1 85 48 79 4.7 35 84 0.5 0 0 >> 0.0 24.3 da1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da1/da1 >> 1 83 47 77 4.3 34 79 0.5 0 0 >> 0.0 22.1 da4 >> 1 1324 1303 21433 0.6 19 42 0.7 0 0 >> 0.0 79.8 da3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da5 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da6 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da7 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da8 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da9 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da10 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da11 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da12 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da13 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da14 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da15 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da16 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da17 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da18 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da19 >> 0 97 57 93 3.5 38 84 0.3 0 0 >> 0.0 21.3 da20 >> 0 85 47 69 3.3 36 86 0.4 0 0 >> 0.0 16.8 da21 >> 0 1666 1641 18992 0.3 23 43 0.4 0 0 >> 0.0 57.9 da22 >> 0 93 55 98 3.5 36 87 0.4 0 0 >> 0.0 20.6 da23 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da24 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da25 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 
0.0 da27 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da28 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da29 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da30 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da31 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da32 >> 0 1200 0 0 0.0 1198 11751 0.6 0 0 >> 0.0 67.3 da33 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da34 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da35 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da36 >> 0 81 44 67 2.0 35 84 0.3 0 0 >> 0.0 10.1 da37 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da38 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da39 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da40 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da41 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da42 >> 1 1020 999 22028 0.8 19 42 0.7 0 0 >> 0.0 84.8 da43 >> 0 1050 1029 23479 0.8 19 47 0.7 0 0 >> 0.0 83.3 da44 >> 1 1006 984 22758 0.8 21 46 0.6 0 0 >> 0.0 84.8 da45 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da46 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da47 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da48 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da49 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da50 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 cd0 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da4/da4 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da3/da3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da5/da5 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da6/da6 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da7/da7 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da8/da8 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da9/da9 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da10/da10 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da11/da11 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da12/da12 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da13/da13 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da14/da14 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da15/da15 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da16/da16 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da17/da17 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da18/da18 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da19/da19 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 
0.0 DEV/da20/da20 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da21/da21 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da22/da22 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da23/da23 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da24/da24 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da25/da25 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26/da26 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 PART/da26/da26 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26p1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26p2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26p3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da27/da27 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da28/da28 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da29/da29 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da30/da30 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da31/da31 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da32/da32 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da33/da33 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da34/da34 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da35/da35 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da36/da36 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da37/da37 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da38/da38 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da39/da39 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da40/da40 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da41/da41 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da42/da42 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da43/da43 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da44/da44 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da45/da45 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da46/da46 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da47/da47 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da48/da48 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da49/da49 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da50/da50 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/cd0/cd0 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26p1/da26p1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26p2/da26p2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 LABEL/da26p1/da26p1 >> 0 0 0 0 
0.0 0 0 0.0 0 0 >> 0.0 0.0 gptid/84d4487b-34e3-11e2-b773-00259058949a >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26p3/da26p3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 LABEL/da26p2/da26p2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 gptid/b4255780-34e3-11e2-b773-00259058949a >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 >> DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da25 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 >> DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da40 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da41 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da26p3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da29 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da30 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da24 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da6 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da7 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da16 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da17 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da20 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da21 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da37 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da23 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da4 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da43 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da44 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da22 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da33 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da45 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da3 >> >> >> Full output of 
gstat just after the hang (at most 100s after the hang): >> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps >> ms/d %busy Name >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da0 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da2/da2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da0/da0 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da1/da1 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da4 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da5 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da6 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da7 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da8 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da9 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da10 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da11 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da12 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da13 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da14 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da15 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da16 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da17 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da18 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da19 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da20 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da21 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da22 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da23 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da24 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da25 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da27 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da28 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da29 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da30 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da31 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da32 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da33 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da34 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da35 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da36 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da37 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da38 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da39 >> 0 0 0 0 
0.0 0 0 0.0 0 0 >> 0.0 0.0 da40 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da41 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da42 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da43 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da44 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da45 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da46 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da47 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da48 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da49 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da50 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 cd0 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da4/da4 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da3/da3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da5/da5 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da6/da6 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da7/da7 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da8/da8 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da9/da9 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da10/da10 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da11/da11 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da12/da12 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da13/da13 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da14/da14 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da15/da15 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da16/da16 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da17/da17 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da18/da18 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da19/da19 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da20/da20 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da21/da21 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da22/da22 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da23/da23 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da24/da24 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da25/da25 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26/da26 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 PART/da26/da26 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26p1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26p2 >> 1 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 da26p3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da27/da27 >> 0 0 0 0 0.0 0 0 
0.0 0 0 >> 0.0 0.0 DEV/da28/da28 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da29/da29 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da30/da30 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da31/da31 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da32/da32 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da33/da33 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da34/da34 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da35/da35 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da36/da36 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da37/da37 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da38/da38 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da39/da39 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da40/da40 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da41/da41 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da42/da42 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da43/da43 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da44/da44 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da45/da45 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da46/da46 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da47/da47 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da48/da48 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da49/da49 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da50/da50 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/cd0/cd0 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26p1/da26p1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26p2/da26p2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 LABEL/da26p1/da26p1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 gptid/84d4487b-34e3-11e2-b773-00259058949a >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 DEV/da26p3/da26p3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 LABEL/da26p2/da26p2 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 gptid/b4255780-34e3-11e2-b773-00259058949a >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 >> DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da25 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 >> DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a 
>> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da40 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da41 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da26p3 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da29 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da30 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da24 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da6 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da7 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da16 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da17 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da20 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da21 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da37 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da23 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da1 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da4 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da43 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da44 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da22 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da33 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da45 >> 0 0 0 0 0.0 0 0 0.0 0 0 >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da3 >> >> >> On Thu, Dec 13, 2012 at 10:14 PM, olivier wrote: >> >>> For what it's worth, I think I might have solved my problem by reverting >>> to an older version of the mps driver. I checked out a recent version of >>> 9-STABLE and reversed the changes in >>> http://svnweb.freebsd.org/base?view=revision&revision=230592 (perhaps >>> there was a simpler way of reverting to the older mps driver). So far so >>> good, no hang even when hammering the file system. >>> >>> This does not conclusively prove that the new LSI mps driver is at fault, >>> but that seems to be a likely explanation. 
>>> >>> Thanks to everybody who pointed me in the right direction. Hope this >>> helps others who run into similar problems with 9.1 >>> Olivier >>> >>> >>> On Thu, Dec 13, 2012 at 10:14 AM, olivier wrote: >>> >>>> >>>> >>>> On Thu, Dec 13, 2012 at 9:54 AM, Andriy Gapon wrote: >>>> >>>>> Google for "zfs deadman". This is already committed upstream and I >>>>> think that it >>>>> is imported into FreeBSD, but I am not sure... Maybe it's imported >>>>> just into the >>>>> vendor area and is not merged yet. >>>>> >>>> >>>> Yes, that's exactly what I had in mind. The logic for panicking makes >>>> sense. >>>> As far as I can tell you're correct that deadman is in the vendor area >>>> but not merged. Any idea when it might make it into 9-STABLE? >>>> Thanks >>>> Olivier >>>> >>>> >>>> >>>> >>>>> So, when enabled this logic would panic a system as a way of letting >>>>> know that >>>>> something is wrong. You can read in the links why panic was selected >>>>> for this job. >>>>> >>>>> And speaking FreeBSD-centric - I think that our CAM layer would be a >>>>> perfect place >>>>> to detect such issues in non-ZFS-specific way. >>>>> >>>>> -- >>>>> Andriy Gapon >>>>> >>>> >>>> >>> >> > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" -- Reed A. 
Cartwright, PhD Assistant Professor of Genomics, Evolution, and Bioinformatics School of Life Sciences Center for Evolutionary Medicine and Informatics The Biodesign Institute Arizona State University From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 00:16:47 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id CC3AE599 for ; Wed, 16 Jan 2013 00:16:47 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vc0-f180.google.com (mail-vc0-f180.google.com [209.85.220.180]) by mx1.freebsd.org (Postfix) with ESMTP id 8CF557C4 for ; Wed, 16 Jan 2013 00:16:47 +0000 (UTC) Received: by mail-vc0-f180.google.com with SMTP id p16so762001vcq.39 for ; Tue, 15 Jan 2013 16:16:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=RZ4UGXlPyiFkjowbzCuy+4v/0dm3vdNPzEKHocW0Nhc=; b=lf1S93PQXn8v7iWzMVhoRF/N8O2yUWBseY9wFk9fN5Zgtmf72aFFP02lv/rfrNUYuY zzVIR55EAaGxe0XMFqpcLbfy4uQcfx5bMa4zxI2tkW+quHkLE6fZNqCh9d3wbQjC4ZY0 eRbe2wL4vGQFQwocjUjW1yE0OmRUeuDKVdJgWTyhKloHJ2C6P7scU2RHlo3/WD8A/buF V2uL12pS/nmVW/g+fjEr/F19be4OBihNm9KyEylErzdz5FoBWqVslhqqwoEpDb37S/zT 0g+RzyrVpDUtsPCDknCo86OPo9KhD2FY4VP/X2ka411KbQrBbgiuOYsIdf02PZQBgiOP dc9w== MIME-Version: 1.0 Received: by 10.220.153.201 with SMTP id l9mr106550322vcw.33.1358295401115; Tue, 15 Jan 2013 16:16:41 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.122.196 with HTTP; Tue, 15 Jan 2013 16:16:40 -0800 (PST) In-Reply-To: <20130115224556.GA41774@mid.pc5.i.0x5.de> References: <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> Date: 
Tue, 15 Jan 2013 16:16:40 -0800 X-Google-Sender-Auth: kDVbLxBGd15xXKwDJfHspV66yps Message-ID: Subject: Re: slowdown of zfs (tx->tx) From: Artem Belevich To: Nicolas Rachinsky Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 00:16:47 -0000 On Tue, Jan 15, 2013 at 2:45 PM, Nicolas Rachinsky wrote: > 147 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe It appears that lots of threads are stuck in the metaslab_activate->space_map_load_wait path. This sounds like CR# 6876962 in Solaris: "degraded write performance with threads held up by space_map_load_wait(). This bug is fixed in patch 147440-05, -06 or -07, which is current and contains the fix." Alas, I could not find specifics on how the issue got fixed and whether the same fix is present in illumos and FreeBSD. You may want to update your system to a very recent FreeBSD as quite a few fixes were recently imported from illumos. Hopefully it will deal with the issue. I'm out of ideas otherwise. Sorry. 
--Artem From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 00:56:52 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 49CFD7C9; Wed, 16 Jan 2013 00:56:52 +0000 (UTC) (envelope-from prvs=1728d5906c=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 5EB15A16; Wed, 16 Jan 2013 00:56:51 +0000 (UTC) Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50001720881.msg; Wed, 16 Jan 2013 00:56:43 +0000 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 16 Jan 2013 00:56:43 +0000 (not processed: message from valid local sender) X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=1728d5906c=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: <00F86FD0E85D4EEEA1A01E115497F022@multiplay.co.uk> From: "Steven Hartland" To: "Artem Belevich" , "Nicolas Rachinsky" References: <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> Subject: Re: slowdown of zfs (tx->tx) Date: Wed, 16 Jan 2013 00:57:03 -0000 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 00:56:52 -0000 ----- Original Message ----- From: "Artem Belevich" To: "Nicolas Rachinsky" 
Cc: "freebsd-fs" Sent: Wednesday, January 16, 2013 12:16 AM Subject: Re: slowdown of zfs (tx->tx) > On Tue, Jan 15, 2013 at 2:45 PM, Nicolas Rachinsky > wrote: >> 147 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 >> metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 taskqueue_run_locked+0x85 >> taskqueue_thread_loop+0x4e fork_exit+0x11f fork_trampoline+0xe > > It appears that lots of threads are stuck in the > metaslab_activate->space_map_load_wait path. This sounds like CR# > 6876962 in Solaris: "degraded write performance with threads held up > by space_map_load_wait(). This bug is fixed in patch 147440-05, -06 or > -07, which is current and contains the fix." Alas, I could not find > specifics on how the issue got fixed and whether the same fix is > present in illumos and FreeBSD. > > You may want to update your system to very recent FreeBSD as quite a > few fixes were recently imported from illumos. Hopefully it will deal > with the issue. I'm out of ideas otherwise. Sorry. That would tend to indicate it's blocking on write. If that is the case, and the rsync is copying from this box with little else doing writes, atime updates could be what is causing the issue. A test for this would be to use the following to disable atime and see if that helps: zfs set atime=off [filesystem] Also, out of interest, does the pool have many snapshots? Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.
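[Editor's note: Steve's atime test above can be sketched as a short shell session. "tank/data" is a placeholder dataset name, not one taken from the thread.]

```shell
# Show whether atime updates are enabled on the dataset rsync reads from
# ("tank/data" is a placeholder for the actual dataset).
zfs get atime tank/data

# Turn atime off so plain reads stop generating metadata writes,
# then repeat the rsync and see whether the tx->tx stalls persist.
zfs set atime=off tank/data

# Count snapshots under the pool, per Steve's follow-up question.
zfs list -t snapshot -r tank | wc -l
```

These commands need a live ZFS pool and root privileges; the property can be flipped back with `zfs set atime=on tank/data` if it makes no difference.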
From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 02:23:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 41B80DED; Wed, 16 Jan 2013 02:23:32 +0000 (UTC) (envelope-from olivier777a7@gmail.com) Received: from mail-la0-f48.google.com (mail-la0-f48.google.com [209.85.215.48]) by mx1.freebsd.org (Postfix) with ESMTP id C17A0E9A; Wed, 16 Jan 2013 02:23:30 +0000 (UTC) Received: by mail-la0-f48.google.com with SMTP id ej20so871696lab.35 for ; Tue, 15 Jan 2013 18:23:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=WxoEaKQMpUbcYwd6CadgFniPrFeyKzP9yeqLzzSeJao=; b=otcxFn/U7LRdxP1bOUJRrt4l9CYWDemIBHjEqORBzAwp4ky1pnwGI57opXWN5xbIXO U9F87ArtiVxRFH/wIrFQYLjiKjjjR6u16WxmaiBxSmKiiMJyE86HL6EO8UdOfIhBjrk8 dnf57JBg8kyIsXNww3wq3H3s7OW+9SA+VaOcYxKiYGyD4mmUCL5cfPZOK3bj+oTtDt5L hkIEGUyOLGh/9v3/H3szSetsP5mMp2O5AEgY4BP9SIzVOLcFY1U7PyggJiZjXMDQeCe8 NgTlqTxWHb7PqyxqDuQtICH65yPTqjrND1fTGeI6WiZOplbSmPFjecA51f70dNgJJ94s 2tkg== MIME-Version: 1.0 Received: by 10.152.144.164 with SMTP id sn4mr87688027lab.57.1358303004305; Tue, 15 Jan 2013 18:23:24 -0800 (PST) Received: by 10.114.78.41 with HTTP; Tue, 15 Jan 2013 18:23:24 -0800 (PST) In-Reply-To: References: Date: Tue, 15 Jan 2013 18:23:24 -0800 Message-ID: Subject: Re: CAM hangs in 9-STABLE? [Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE] From: olivier To: "Reed A. 
Cartwright" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, ken@freebsd.org, "freebsd-stable@freebsd.org" , Andriy Gapon X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 02:23:32 -0000 My understanding is that the locks (and pieces of kernel code) involved are different. Maybe someone more knowledgeable than I am can comment. Thanks for the suggestion... Olivier On Tue, Jan 15, 2013 at 4:07 PM, Reed A. Cartwright wrote: > I don't know if this is relevant or not, but a deadlock was recently > fixed in the VFS code: > > http://svnweb.freebsd.org/base?view=revision&revision=244795 > > On Tue, Jan 15, 2013 at 12:55 PM, olivier wrote: > > Dear All, > > Still experiencing the same hangs I reported earlier with 9.1. I've been > > running a kernel with WITNESS enabled to provide more information.
> > > > During an occurrence of the hang, running show alllocks gave > > > > Process 25777 (sysctl) thread 0xfffffe014c5b2920 (102567) > > exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff811e34c0) locked @ > > /usr/src/sys/dev/usb/usb_transfer.c:3171 > > Process 25750 (sshd) thread 0xfffffe015a688000 (104313) > > exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0bb98) locked @ > > /usr/src/sys/kern/uipc_sockbuf.c:148 > > Process 24922 (cnid_dbd) thread 0xfffffe0187ac4920 (103597) > > shared lockmgr zfs (zfs) r = 0 (0xfffffe0973062488) locked @ > > /usr/src/sys/kern/vfs_syscalls.c:3591 > > Process 24117 (sshd) thread 0xfffffe07bd914490 (104195) > > exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0a8f0) locked @ > > /usr/src/sys/kern/uipc_sockbuf.c:148 > > Process 1243 (java) thread 0xfffffe01ca85d000 (102704) > > exclusive sleep mutex pmap (pmap) r = 0 (0xfffffe015aec1440) locked @ > > /usr/src/sys/amd64/amd64/pmap.c:4840 > > exclusive rw pmap pv global (pmap pv global) r = 0 (0xffffffff81409780) > > locked @ /usr/src/sys/amd64/amd64/pmap.c:4802 > > exclusive sleep mutex vm page (vm page) r = 0 (0xffffffff813f0a80) > locked @ > > /usr/src/sys/vm/vm_object.c:1128 > > exclusive sleep mutex vm object (standard object) r = 0 > > (0xfffffe01458e43a0) locked @ /usr/src/sys/vm/vm_object.c:1076 > > shared sx vm map (user) (vm map (user)) r = 0 (0xfffffe015aec1388) > locked @ > > /usr/src/sys/vm/vm_map.c:2045 > > Process 994 (nfsd) thread 0xfffffe015a0df000 (102426) > > shared lockmgr zfs (zfs) r = 0 (0xfffffe0c3b505878) locked @ > > > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760 > > Process 994 (nfsd) thread 0xfffffe015a0f8490 (102422) > > exclusive lockmgr zfs (zfs) r = 0 (0xfffffe02db3b3e60) locked @ > > > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760 > > Process 931 (syslogd) thread 0xfffffe015af18920 (102365) > > shared lockmgr zfs (zfs) r = 0 
(0xfffffe0141dd6680) locked @ > > /usr/src/sys/kern/vfs_syscalls.c:3591 > > Process 22 (syncer) thread 0xfffffe0125077000 (100279) > > exclusive lockmgr syncer (syncer) r = 0 (0xfffffe015a2ff680) locked @ > > /usr/src/sys/kern/vfs_subr.c:1809 > > > > I don't have full "show lockedvnods" output because the output does not > get > > captured by ddb after using "capture on", it doesn't fit on a single > > screen, and doesn't get piped into a "more" equivalent. What I did manage > > to get (copied by hand, typos possible) is: > > > > 0xfffffe0c3b5057e0: 0xfffffe0c3b5057e0: tag zfs, type VREG > > tag zfs, type VREG > > usecount 1, writecount 0, refcount 1 mountedhere 0 > > usecount 1, writecount 0, refcount 1 mountedhere 0 > > flags (VI_ACTIVE) > > flags (VI_ACTIVE) > > v_object 0xfffffe089bc1b828 ref 0 pages 0 > > v_object 0xfffffe089bc1b828 ref 0 pages 0 > > lock type zfs: SHARED (count 1) > > lock type zfs: SHARED (count 1) > > > > 0xfffffe02db3b3dc8: 0xfffffe02db3b3dc8: tag zfs, type VREG > > tag zfs, type VREG > > usecount 6, writecount 0, refcount 6 mountedhere 0 > > usecount 6, writecount 0, refcount 6 mountedhere 0 > > flags (VI_ACTIVE) > > flags (VI_ACTIVE) > > v_object 0xfffffe0b79583ae0 ref 0 pages 0 > > v_object 0xfffffe0b79583ae0 ref 0 pages 0 > > lock type zfs: EXCL by thread 0xfffffe015a0f8490 (pid 994) > > lock type zfs: EXCL by thread 0xfffffe015a0f8490 (pid 994) > > with exclusive waiters pending > > with exclusive waiters pending > > > > The output of show witness is at http://pastebin.com/eSRb3FEu > > > > The output of alltrace is at http://pastebin.com/X1LruNrf (a number of > > threads are stuck in zio_wait, none I can find in zio_interrupt, and > > according to gstat and disks eventually going to sleep all disk IO seems > to > > be stuck for good; I think Andriy explained earlier that these criteria > > might indicate this is a ZFS hang). 
> > > > The output of show geom is at http://pastebin.com/6nwQbKr4 > > > > The output of vmstat -i is at http://pastebin.com/9LcZ7Mi0 Interrupts > are > > occurring at a normal rate during the hang, as far as I can tell. > > > > Any help would be greatly appreciated. > > Thanks > > Olivier > > PS: my kernel was compiled from 9-STABLE from December, with CAM and ahci > > from 9.0 (in the hope it would fix the hangs I was experiencing in plain > > 9-STABLE; obviously the hangs are still occurring). The rest of my > > configuration is the same as posted earlier. > > > > On Mon, Dec 24, 2012 at 9:42 PM, olivier wrote: > > > >> Dear All > >> It turns out that reverting to an older version of the mps driver did > not > >> fix the ZFS hangs I've been struggling with in 9.1 and 9-STABLE after > all > >> (they just took a bit longer to occur again, possibly just by chance). I > >> followed steps along lines suggested by Andriy to collect more > information > >> when the problem occurs. Hopefully this will help figure out what's > going > >> on. > >> > >> As far as I can tell, what happens is that at some point IO operations > to > >> a bunch of drives that belong to different pools get stuck. For these > >> drives, gstat shows no activity but 1 pending operation, as such: > >> > >> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps > >> ms/d %busy Name > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da1 > >> > >> I've been running gstat in a loop (every 100s) to monitor the machine. > >> Just before the hang occurs, everything seems fine (see full gstat > output > >> below). Right after the hang occurs a number of drives seem stuck (see > full > >> gstat output below). Notably, some stuck drives are seen through the mps > >> driver and others through the mpt driver. So the problem doesn't seem > to be > >> driver-specific. 
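[Editor's note: the "gstat in a loop (every 100s)" monitoring described above could look something like the following; the interval, log path, and use of batch mode are illustrative choices, not details from the thread.]

```shell
#!/bin/sh
# Append a timestamped gstat snapshot to a log every 100 seconds, so the
# per-device queue state just before and just after a hang is preserved.
LOG=/var/log/gstat-hang.log
while true; do
    date >> "$LOG"
    gstat -b >> "$LOG"    # -b: print one batch-mode snapshot and exit
    sleep 100
done
```

A stuck device then shows up in the log as L(q) >= 1 with zero ops/s, as in the captures below.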
I have had the problem occur (at a lower frequency) on > >> similar machines that don't use the mpt driver (and only have 1 disk > >> provided through mps), so the problem doesn't seem to be caused by the > mpt > >> driver (and is likely not caused by defective hardware). Since based on > the > >> information I provided earlier Andriy thinks the problem might not > >> originate in ZFS, perhaps that means that the problem is in the CAM > layer? > >> > >> camcontrol tags -v (as suggested by Andriy) in the hung state shows for > >> example > >> > >> (pass56:mpt1:0:8:20): dev_openings 254 > >> (pass56:mpt1:0:8:20): dev_active 1 > >> (pass56:mpt1:0:8:20): devq_openings 254 > >> (pass56:mpt1:0:8:20): devq_queued 0 > >> (pass56:mpt1:0:8:20): held 0 > >> (pass56:mpt1:0:8:20): mintags 2 > >> (pass56:mpt1:0:8:20): maxtags 255 > >> (I'm not providing full camcontrol tags output below because I couldn't > >> get it to run during the specific hang I documented most thoroughly; the > >> example above is from a different occurrence of the hang). > >> > >> The buses don't seem completely frozen: if I manually remove drives > while > >> the machine is hanging, that's picked up by the mpt driver, which prints > >> out corresponding messages to the console. But camcontrol reset all or > >> rescan all don't seem to do anything. > >> > >> I've tried reducing vfs.zfs.vdev.min_pending and > vfs.zfs.vdev.max_pending > >> to 1, to no avail. > >> > >> Any suggestions to resolve this problem, work around it, or further > >> investigate it would be greatly appreciated! > >> Thanks a lot > >> Olivier > >> > >> Detailed information: > >> > >> Output of procstat -a -kk when the machine is hanging is available at > >> http://pastebin.com/7D2KtT35 (not putting it here because it's pretty > >> long) > >> > >> dmesg is available at http://pastebin.com/9zJQwWJG . Note that I'm > using > >> LUN masking, so the "illegal requests" reported aren't really errors. 
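[Editor's note: the vfs.zfs.vdev.min_pending/max_pending tuning tried above is done with sysctl(8). These names are the FreeBSD 8.x/9.x ones; later releases replaced them with the vfs.zfs.vdev.*_max_active knobs.]

```shell
# Inspect the current per-vdev queue depth limits.
sysctl vfs.zfs.vdev.min_pending vfs.zfs.vdev.max_pending

# Drop both to 1 to rule out deep vdev queues as a factor,
# as tried above (requires root; takes effect immediately).
sysctl vfs.zfs.vdev.min_pending=1
sysctl vfs.zfs.vdev.max_pending=1
```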
> Maybe > >> one day if I get my problems sorted out I'll use geom multipathing > instead. > >> > >> My kernel config is > >> include GENERIC > >> ident MYKERNEL > >> > >> options IPSEC > >> device crypto > >> > >> options OFED # Infiniband protocol > >> > >> device mlx4ib # ConnectX Infiniband support > >> device mlxen # ConnectX Ethernet support > >> device mthca # Infinihost cards > >> device ipoib # IP over IB devices > >> > >> options ATA_CAM # Handle legacy controllers with CAM > >> options ATA_STATIC_ID # Static device numbering > >> > >> options KDB > >> options DDB > >> > >> > >> > >> Full output of gstat just before the hang (at most 100s before the > hang): > >> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps > >> ms/d %busy Name > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da0 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da2/da2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da0/da0 > >> 1 85 48 79 4.7 35 84 0.5 0 0 > >> 0.0 24.3 da1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da1/da1 > >> 1 83 47 77 4.3 34 79 0.5 0 0 > >> 0.0 22.1 da4 > >> 1 1324 1303 21433 0.6 19 42 0.7 0 0 > >> 0.0 79.8 da3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da5 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da6 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da7 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da8 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da9 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da10 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da11 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da12 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da13 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da14 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da15 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da16 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da17 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da18 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da19 > >> 0 97 57 93 3.5 38 84 0.3 0 0 > >> 0.0 21.3 da20 > >> 0 85 47 69 3.3 36 86 0.4 0 0 > >> 0.0 16.8 da21 > >> 0 1666 1641 18992 
0.3 23 43 0.4 0 0 > >> 0.0 57.9 da22 > >> 0 93 55 98 3.5 36 87 0.4 0 0 > >> 0.0 20.6 da23 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da24 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da25 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da27 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da28 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da29 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da30 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da31 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da32 > >> 0 1200 0 0 0.0 1198 11751 0.6 0 0 > >> 0.0 67.3 da33 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da34 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da35 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da36 > >> 0 81 44 67 2.0 35 84 0.3 0 0 > >> 0.0 10.1 da37 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da38 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da39 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da40 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da41 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da42 > >> 1 1020 999 22028 0.8 19 42 0.7 0 0 > >> 0.0 84.8 da43 > >> 0 1050 1029 23479 0.8 19 47 0.7 0 0 > >> 0.0 83.3 da44 > >> 1 1006 984 22758 0.8 21 46 0.6 0 0 > >> 0.0 84.8 da45 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da46 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da47 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da48 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da49 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da50 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 cd0 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da4/da4 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da3/da3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da5/da5 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da6/da6 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da7/da7 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da8/da8 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da9/da9 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da10/da10 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da11/da11 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 
DEV/da12/da12 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da13/da13 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da14/da14 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da15/da15 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da16/da16 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da17/da17 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da18/da18 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da19/da19 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da20/da20 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da21/da21 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da22/da22 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da23/da23 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da24/da24 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da25/da25 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26/da26 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 PART/da26/da26 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26p1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26p2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26p3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da27/da27 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da28/da28 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da29/da29 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da30/da30 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da31/da31 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da32/da32 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da33/da33 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da34/da34 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da35/da35 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da36/da36 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da37/da37 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da38/da38 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da39/da39 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da40/da40 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da41/da41 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da42/da42 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da43/da43 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 
DEV/da44/da44 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da45/da45 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da46/da46 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da47/da47 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da48/da48 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da49/da49 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da50/da50 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/cd0/cd0 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26p1/da26p1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26p2/da26p2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 LABEL/da26p1/da26p1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 gptid/84d4487b-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26p3/da26p3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 LABEL/da26p2/da26p2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 gptid/b4255780-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 > >> > DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da25 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 > >> > DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da40 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da41 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da26p3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da29 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da30 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da24 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da6 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da7 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da16 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da17 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da20 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 
0.0 ZFS::VDEV/zfs::vdev/da21 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da37 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da23 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da4 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da43 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da44 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da22 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da33 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da45 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da3 > >> > >> > >> Full output of gstat just after the hang (at most 100s after the hang): > >> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps > >> ms/d %busy Name > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da0 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da2/da2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da0/da0 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da1/da1 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da4 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da5 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da6 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da7 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da8 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da9 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da10 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da11 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da12 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da13 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da14 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da15 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da16 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da17 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da18 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da19 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da20 > 
>> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da21 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da22 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da23 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da24 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da25 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da27 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da28 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da29 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da30 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da31 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da32 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da33 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da34 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da35 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da36 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da37 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da38 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da39 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da40 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da41 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da42 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da43 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da44 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da45 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da46 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da47 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da48 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da49 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da50 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 cd0 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da4/da4 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da3/da3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da5/da5 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da6/da6 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da7/da7 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da8/da8 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da9/da9 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da10/da10 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da11/da11 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da12/da12 
> >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da13/da13 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da14/da14 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da15/da15 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da16/da16 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da17/da17 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da18/da18 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da19/da19 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da20/da20 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da21/da21 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da22/da22 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da23/da23 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da24/da24 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da25/da25 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26/da26 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 PART/da26/da26 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26p1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26p2 > >> 1 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 da26p3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da27/da27 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da28/da28 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da29/da29 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da30/da30 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da31/da31 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da32/da32 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da33/da33 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da34/da34 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da35/da35 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da36/da36 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da37/da37 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da38/da38 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da39/da39 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da40/da40 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da41/da41 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da42/da42 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da43/da43 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da44/da44 > 
>> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da45/da45 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da46/da46 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da47/da47 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da48/da48 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da49/da49 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da50/da50 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/cd0/cd0 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26p1/da26p1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26p2/da26p2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 LABEL/da26p1/da26p1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 gptid/84d4487b-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 DEV/da26p3/da26p3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 LABEL/da26p2/da26p2 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 gptid/b4255780-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 > >> > DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da25 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 > >> > DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da40 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da41 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da26p3 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da29 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da30 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da24 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da6 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da7 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da16 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da17 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da20 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 
ZFS::VDEV/zfs::vdev/da21 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da37 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da23 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da1 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da4 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da43 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da44 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da22 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da33 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da45 > >> 0 0 0 0 0.0 0 0 0.0 0 0 > >> 0.0 0.0 ZFS::VDEV/zfs::vdev/da3 > >> > >> > >> On Thu, Dec 13, 2012 at 10:14 PM, olivier > wrote: > >> > >>> For what it's worth, I think I might have solved my problem by > reverting > >>> to an older version of the mps driver. I checked out a recent version > of > >>> 9-STABLE and reversed the changes in > >>> http://svnweb.freebsd.org/base?view=revision&revision=230592 (perhaps > >>> there was a simpler way of reverting to the older mps driver). So far > so > >>> good, no hang even when hammering the file system. > >>> > >>> This does not conclusively prove that the new LSI mps driver is at > fault, > >>> but that seems to be a likely explanation. > >>> > >>> Thanks to everybody who pointed me in the right direction. Hope this > >>> helps others who run into similar problems with 9.1 > >>> Olivier > >>> > >>> > >>> On Thu, Dec 13, 2012 at 10:14 AM, olivier > wrote: > >>> > >>>> > >>>> > >>>> On Thu, Dec 13, 2012 at 9:54 AM, Andriy Gapon > wrote: > >>>> > >>>>> Google for "zfs deadman". This is already committed upstream and I > >>>>> think that it > >>>>> is imported into FreeBSD, but I am not sure... Maybe it's imported > >>>>> just into the > >>>>> vendor area and is not merged yet. > >>>>> > >>>> > >>>> Yes, that's exactly what I had in mind. The logic for panicking makes > >>>> sense. 
> >>>> As far as I can tell you're correct that deadman is in the vendor area > >>>> but not merged. Any idea when it might make it into 9-STABLE? > >>>> Thanks > >>>> Olivier > >>>> > >>>> > >>>> > >>>> > >>>>> So, when enabled this logic would panic a system as a way of letting > >>>>> know that > >>>>> something is wrong. You can read in the links why panic was selected > >>>>> for this job. > >>>>> > >>>>> And speaking FreeBSD-centric - I think that our CAM layer would be a > >>>>> perfect place > >>>>> to detect such issues in non-ZFS-specific way. > >>>>> > >>>>> -- > >>>>> Andriy Gapon > >>>>> > >>>> > >>>> > >>> > >> > > _______________________________________________ > > freebsd-stable@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org > " > > > > -- > Reed A. Cartwright, PhD > Assistant Professor of Genomics, Evolution, and Bioinformatics > School of Life Sciences > Center for Evolutionary Medicine and Informatics > The Biodesign Institute > Arizona State University > From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 02:50:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B06936A4 for ; Wed, 16 Jan 2013 02:50:57 +0000 (UTC) (envelope-from freebsd@deman.com) Received: from plato.corp.nas.com (plato.corp.nas.com [66.114.32.138]) by mx1.freebsd.org (Postfix) with ESMTP id 792BFFEA for ; Wed, 16 Jan 2013 02:50:57 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by plato.corp.nas.com (Postfix) with ESMTP id F0DA112D75DFE for ; Tue, 15 Jan 2013 18:43:49 -0800 (PST) X-Virus-Scanned: amavisd-new at corp.nas.com Received: from plato.corp.nas.com ([127.0.0.1]) by localhost (plato.corp.nas.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id sfVJv17MLc76 for ; Tue, 15 Jan 2013 18:43:49 -0800 (PST) Received: from 
[192.168.0.120] (c-50-135-255-120.hsd1.wa.comcast.net [50.135.255.120]) by plato.corp.nas.com (Postfix) with ESMTPSA id 6F8ED12D75DF3 for ; Tue, 15 Jan 2013 18:43:49 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: HAST + ZFS self healing? Hot spares? From: Michael DeMan In-Reply-To: <50F4BBE7.7050207@gogrid.com> Date: Tue, 15 Jan 2013 18:43:49 -0800 Content-Transfer-Encoding: 7bit Message-Id: <6214EC5B-D846-4B0D-AF14-7AB9F91D2F82@deman.com> References: 4DD5A1CF.70807@itassistans.se <50F4BBE7.7050207@gogrid.com> To: freebsd-fs@freebsd.org X-Mailer: Apple Mail (2.1499) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 02:50:57 -0000 Was there supposed to be any content in this envelope? On Jan 14, 2013, at 6:16 PM, Edward Xiao wrote: > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 03:10:41 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EB3EBB97 for ; Wed, 16 Jan 2013 03:10:41 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id A298C1E1 for ; Wed, 16 Jan 2013 03:10:41 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAPIZ9lCDaFvO/2dsb2JhbABFhjq3aXOCHgEBAQMBAQEBIAQnIAsFFg4KERkCBB8GAQkmBggHBAEcBIdmAwkGDKYQgkCGXA2HfowIgQiDFYETA4hhhieEVliBVoEcihuFEoMTgVE1 X-IronPort-AV: E=Sophos;i="4.84,476,1355115600"; d="scan'208";a="9262898" Received: from 
erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 15 Jan 2013 22:10:35 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 5E9ABB3F2E; Tue, 15 Jan 2013 22:10:35 -0500 (EST) Date: Tue, 15 Jan 2013 22:10:35 -0500 (EST) From: Rick Macklem To: Sergey Kandaurov Message-ID: <980540815.2029630.1358305835362.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: Subject: Re: getcwd lies on/under nfs4-mounted zfs dataset MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_2029629_336101775.1358305835359" X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 03:10:42 -0000 ------=_Part_2029629_336101775.1358305835359 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit pluknet@gmail.com wrote: > Hi. > > We stuck with the problem getting wrong current directory path > when sitting on/under zfs dataset filesystem mounted over NFSv4. > Both nfs server and client are 10.0-CURRENT from December or so. > > The component path "user3" unexpectedly appears to be "." (dot). > nfs-client:/home/user3 # pwd > /home/. > nfs-client:/home/user3/var/run # pwd > /home/./var/run > Yep, it was broken for UFS too. I think the attached patch for the client might fix it. (It fixes a trivial test case for UFS, but I haven't gone through the code to check if it might break something else.) I vaguely recall bumping into a non-FreeBSD server that only returned the Mounted_on_fileno attribute for mount points at a testing bakeathon and hacking around the problem that caused. 
(I think that hack made it into head, oops.;-) So I wouldn't try this
patch on a production-type system, but if you can test it, that would be
great.

Sorry about the breakage, rick

> nfs-client:~ # procstat -f 3225
> PID  COMM  FD   T V FLAGS     REF OFFSET PRO NAME
> 3225 a.out text v r r-------- -   -      -   /home/./var/a.out
> 3225 a.out ctty v c rw------- -   -      -   /dev/pts/2
> 3225 a.out cwd  v d r-------- -   -      -   /home/./var
> 3225 a.out root v d r-------- -   -      -   /
>
> The setup used follows.
>
> 1. NFS server with local ZFS:
> # cat /etc/exports
> V4: / -sec=sys
>
> # zfs list
> pool1 10.4M 122G 580K /pool1
> pool1/user3 on /pool1/user3 (zfs, NFS exported, local, nfsv4acls)
>
> Exports list on localhost:
> /pool1/user3 109.70.28.0
> /pool1 109.70.28.0
>
> # zfs get sharenfs pool1/user3
> NAME        PROPERTY VALUE                                          SOURCE
> pool1/user3 sharenfs -alldirs -maproot=root -network=109.70.28.0/24 local
>
> 2. pool1 is mounted on the NFSv4 client:
> nfs-server:/pool1 on /home (nfs, noatime, nfsv4acls)
>
> So that on the NFS client the "pool1/user3" dataset shows up at /home/user3.
> /           - ufs
> /home       - zpool-over-nfsv4
> /home/user3 - zfs dataset "pool1/user3"
>
> At the same time it works as expected when we're not on a zfs dataset,
> but directly on its parent zfs pool (also over NFSv4), e.g.
> nfs-client:/home/non_dataset_dir # pwd > /home/non_dataset_dir > > The ls command works as expected: > nfs-client:/# ls -dl /home/user3/var/ > drwxrwxrwt+ 6 root wheel 6 Jan 10 16:19 /home/user3/var/ > > -- > wbr, > pluknet > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" ------=_Part_2029629_336101775.1358305835359 Content-Type: text/x-patch; name=client-getcwd.patch Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename=client-getcwd.patch LS0tIGZzL25mcy9uZnNwcm90by5oLnNhdjIJMjAxMy0wMS0xNSAyMTozNDo0OS4wMDAwMDAwMDAg LTA1MDAKKysrIGZzL25mcy9uZnNwcm90by5oCTIwMTMtMDEtMTUgMjE6MzY6NTUuMDAwMDAwMDAw IC0wNTAwCkBAIC05ODQsNyArOTg0LDggQEAgc3RydWN0IG5mc3YzX3NhdHRyIHsKICAJTkZTQVRU UkJNX1NQQUNFVVNFRCB8CQkJCQkJXAogIAlORlNBVFRSQk1fVElNRUFDQ0VTUyB8CQkJCQkJXAog IAlORlNBVFRSQk1fVElNRU1FVEFEQVRBIHwJCQkJCVwKLSAJTkZTQVRUUkJNX1RJTUVNT0RJRlkp CisgCU5GU0FUVFJCTV9USU1FTU9ESUZZIHwJCQkJCQlcCisJTkZTQVRUUkJNX01PVU5URURPTkZJ TEVJRCkKIAogLyoKICAqIFN1YnNldCBvZiB0aGUgYWJvdmUgdGhhdCB0aGUgV3JpdGUgUlBDIGdl dHMuCi0tLSBmcy9uZnMvbmZzX2NvbW1vbnN1YnMuYy5zYXYyCTIwMTMtMDEtMTUgMjE6Mzg6NTMu MDAwMDAwMDAwIC0wNTAwCisrKyBmcy9uZnMvbmZzX2NvbW1vbnN1YnMuYwkyMDEzLTAxLTE1IDIx OjQwOjA0LjAwMDAwMDAwMCAtMDUwMApAQCAtMTcyNiw2ICsxNzI2LDcgQEAgbmZzdjRfbG9hZGF0 dHIoc3RydWN0IG5mc3J2X2Rlc2NyaXB0ICpuZAogCQkJICAgIGlmICgqdGwrKykKIAkJCQlwcmlu dGYoIk5GU3Y0IG1vdW50ZWQgb24gZmlsZWlkID4gMzJiaXRzXG4iKTsKIAkJCSAgICBuYXAtPm5h X21udG9uZmlsZW5vID0gdGh5cDsKKwkJCSAgICBuYXAtPm5hX2ZpbGVpZCA9IG5hcC0+bmFfbW50 b25maWxlbm87CiAJCQl9CiAJCQlhdHRyc3VtICs9IE5GU1hfSFlQRVI7CiAJCQlicmVhazsK ------=_Part_2029629_336101775.1358305835359-- From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 03:49:20 2013 Return-Path: Delivered-To: fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D681726E; Wed, 16 
Jan 2013 03:49:20 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by mx1.freebsd.org (Postfix) with ESMTP id 5BC7B344; Wed, 16 Jan 2013 03:49:19 +0000 (UTC) Received: from c211-30-173-106.carlnfd1.nsw.optusnet.com.au (c211-30-173-106.carlnfd1.nsw.optusnet.com.au [211.30.173.106]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r0G3nBT5001550 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 16 Jan 2013 14:49:13 +1100 Date: Wed, 16 Jan 2013 14:49:11 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: John Baldwin Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client In-Reply-To: <201301151458.42874.jhb@freebsd.org> Message-ID: <20130116134627.S1060@besplex.bde.org> References: <162405990.1985479.1358212854967.JavaMail.root@erie.cs.uoguelph.ca> <20130115141019.H1444@besplex.bde.org> <201301151458.42874.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=P/xiHV8u c=1 sm=1 a=S8Qr1IbAvFsA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=U1Z5fgpPGSMA:10 a=kNFpKb6NvubOC5D93twA:9 a=CjuIK1q_8ugA:10 a=TEtd8y5WR3g2ypngnwZWYw==:117 Cc: Rick Macklem , fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 03:49:20 -0000 On Tue, 15 Jan 2013, John Baldwin wrote: > On Monday, January 14, 2013 11:51:23 pm Bruce Evans wrote: >> >> I can't see anything that does the different permissions check for >> the VA_UTIMES_NULL case, and testing shows that this case is just broken, >> at least for an old version of the old nfs client -- the same permissions >> are required for all cases, but write permission is supposed to be >> enough for the VA_UTIMES_NULL case (since write 
permission is sufficient
>> for setting the mtime to the current time (plus epsilon) using write(2)
>> and truncate(2). Setting the atime to the current time should require
>> no more and no less than read permission, since it can be done using
>> read(2), but utimes(NULL) requires write permission for that too).
>
> Correct. All the other uses of VA_UTIMES_NULL in the tree are to
> provide the permissions check you describe and there is a large
> comment about it in ufs_setattr(). Other filesystems have comments
> that reference ufs_setattr(). I think these checks should be done
> in nfs_setattr() rather than in the routine to build an NFS attribute
> object however.

Perhaps it can be done in vfs. There are some technical problems with
this, but perhaps they are small. One is that file systems might not
even have any timestamps.

(I forgot to mention a related problem with the error handling. The
permissions checks for utimes() are usually too strict for file systems
that only have fake ownerships, like msdosfs. OTOH, msdosfs also doesn't
have atimes for most variants of the file system. Since utimes() is
supposed to set both the mtime and the atime (especially in the non-NULL
case), it strictly cannot work on msdosfs. But msdosfs is not strict
about this. It silently ignores the atimes when it can't set them.)

> Fixing NFS to properly use vfs_timestamp() seems to be a larger
> project.

I think it is smaller. For the new nfs code, it is not as simple as
changing the NFSGETTIME() macro, since nfs wants extra precision in most
cases (for things like comparing cache times), and very rarely wants the
semantics of vfs_timestamp(). I somehow missed seeing even more
confusion in this area:
- the new nfs code also has a macro NFSGETNANOTIME() which reduces to
getnanotime().
- monotonic times should be used if possible, but the new nfs code only
uses them for NFSD_MONOSEC (which is used a lot) and in 2 places in
nfs_commonkrpc.c where a hard-coded getmicrouptime() is used.
- NFSGETNANOTIME() is used 4 times. But there are 4 hard-coded uses of
getnanotime(). The latter are mostly in places where vfs_timestamp() is
correct, for things like n_atim for fifos.

nfs almost never needs to set file timestamps, since most file
timestamps are set by leaf (non-nfs) file systems on the server. The
only exceptions that I know about are the ones already noted
(utimes(NULL) on the server, something in create() (?), and n_atim for
fifos on the client (?)). n_atim for special files on the client was an
exception when special files were supported.

In the old nfs code:
- there are no NFSGET*TIME() macros, and there don't seem to be any
get*time() calls instead either. The get*time() calls used are almost
exactly the same ones as in the new nfs code, with the only exception
that I noticed being the ones in the server for utimes(NULL).
- monotonic times are only used in the same 2 places in
nfs_commonkrpc.c. The non-monotonic time_second is used a lot
(hard-coded), I think in much the same places where the new nfs server
code uses time_uptime via NFSD_MONOSEC.

I don't like obfuscating standard time calls using macros. Others that
I don't like:
- both the old and the new nfs client use NFS_TIMESPEC_COMPARE()
instead of the standard and better timespeccmp(). NFS_TIMESPEC_COMPARE()
is more verbose and only compares for equality. Its implementation is
home made and not based on timespeccmp().
- the new nfs client also has a macro NFS_CMPTIME(). This gives the
same result as NFS_TIMESPEC_COMPARE() (or timespeccmp(..., ==)). It
works accidentally for both timespecs and timevals, due to the POSIX bug
that the struct members for timespecs abuse the prefix for timevals in
their spelling.
Code derived from the old nfs client has not been translated to use the new macro (the "new" macro is probably actually older and may even be older or more likely just from a different code base than FreeBSD's timespeccmp()). - the new nfs client also has a macro NFS_SETTIME(). This doesn't actually set the time, but converts from the global timeval `time' to a timespec in the same way as the standard macro TIMEVAL_TO_TIMESPEC() would if it is passed a pointer to the global timeval. Accessing the global `time' like this would give races. Fortunately, the global `time' doesn't exist in FreeBSD, so of course this macro is never used. I like NFSD_MONOSEC, however. Global variables give a much more fragile and harder to translate API than function calls and function-like macros. Bruce From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 05:19:17 2013 Return-Path: Delivered-To: fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 921C69C9; Wed, 16 Jan 2013 05:19:17 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from fallbackmx07.syd.optusnet.com.au (fallbackmx07.syd.optusnet.com.au [211.29.132.9]) by mx1.freebsd.org (Postfix) with ESMTP id D282B8B2; Wed, 16 Jan 2013 05:19:16 +0000 (UTC) Received: from mail27.syd.optusnet.com.au (mail27.syd.optusnet.com.au [211.29.133.168]) by fallbackmx07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r0G5JBQC009796; Wed, 16 Jan 2013 16:19:11 +1100 Received: from c211-30-173-106.carlnfd1.nsw.optusnet.com.au (c211-30-173-106.carlnfd1.nsw.optusnet.com.au [211.30.173.106]) by mail27.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r0G5J1mS028275 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 16 Jan 2013 16:19:02 +1100 Date: Wed, 16 Jan 2013 16:19:01 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Rick Macklem Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client In-Reply-To: 
<1149390778.2023367.1358290140175.JavaMail.root@erie.cs.uoguelph.ca> Message-ID: <20130116151051.O1060@besplex.bde.org> References: <1149390778.2023367.1358290140175.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=Or8XUFDt c=1 sm=1 a=S8Qr1IbAvFsA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=U1Z5fgpPGSMA:10 a=0YgfEWQ0QP7ufM_Kn4MA:9 a=CjuIK1q_8ugA:10 a=TEtd8y5WR3g2ypngnwZWYw==:117 Cc: Rick Macklem , fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 05:19:17 -0000 On Tue, 15 Jan 2013, Rick Macklem wrote: > Bruce Evans wrote: >> I can't see anything that does the different permissions check for >> the VA_UTIMES_NULL case, and testing shows that this case is just >> broken, >> at least for an old version of the old nfs client -- the same >> permissions >> are required for all cases, but write permission is supposed to be >> enough for the VA_UTIMES_NULL case (since write permission is >> sufficient >> for setting the mtime to the current time (plus epsilon) using >> write(2) >> and truncate(2). Setting the atime to the current time should require >> no more and no less than read permission, since it can be done using >> read(2), but utimes(NULL) requires write permission for that too). >> > I did a quick test on a -current client/server and it seems to work ok. > The client uses SET_TIME_TO_SERVER and the server sets VA_UTIMES_NULL > for this case. At least it works for a UFS exported volume. It's not working for me with newnfs from 4 Mar 2012: $ mount | grep /c besplex:/c on /c (nfs, asynchronous) $ ls -l /c/tmp/z -rw-rw-rw- 1 root wheel 0 Jan 16 15:12 /c/tmp/z # Not even root owns it, since root on the client is mapped to 0xFFFFFFFFE. 
$ touch /c/tmp/z
touch: /c/tmp/z: Operation not permitted
$ touch -r . /c/tmp/z
touch: /c/tmp/z: Operation not permitted
touch: /c/tmp/z: Operation not permitted

The error messages from touch are confusing. For plain touch:
- it fails twice using utimes(), with errno EPERM and no error message
- it then succeeds using read(), write() and truncate()
- it then prints an error message
- it then exits with status 0.

This is with an old version of touch. It always prints an error message
if it reaches the read()/write()/truncate() step (rw() function):
- if rw() succeeded, then it prints an error message after the rw()
returns. rw() fails to preserve errno, so the errno for this step is
garbage, but it is usually the one from the second failing utimes().
- if rw() fails, it prints an error message internally. The errno for
this is now correct.

The current version of touch is even more broken. Someone removed the
rw() step from it, under the naive assumption that utimes() actually
works.

For touch -r:
- it fails twice using utimes(), with errno EPERM and no error message.
Now even trying the second time (with utimes(NULL)) is a bug. A comment
says that there is nothing else that we can do in this case, but the
code actually falls through and does something wrong (it tries to set to
the current time instead of to the specified time). This bug is fixed in
the current version.
- since it is not supposed to do anything more, it prints an error
message after the first utimes() failure. It also sets rval to 1 to give
an exit status of 1 later.
- then it continues the same as for the plain touch case:
- it then "succeeds" using read(), write() and truncate(), but this
success is in clobbering the timestamps to the current time
- it then prints an error message despite "succeeding"
- it then exits with status 1.

The nfs error is just for the second utimes() in the plain touch case.
This should succeed (it succeeds on a local ffs file system).
Also, when it fails, the correct errno is EACCES, not EPERM. This works correctly after changing the file mode to readonly and using the buggy touch -r to reach the second utimes() -- the error is now EACCES for both nfs and local ffs. So it seems that the server ffs is being reached correctly, but the non-error case for utimes(NULL) is being mishandled somewhere. This is not due to some maproot magic, since the same error occurs for the non-error case when the ownership is changed to a mere user (!= the test user). >> Oops, on looking at the code I now think it _is_ possible to pass the >> request to set the current time on the server, since in the >> NFSV3SATTRTIME_TOSERVER case we just pass this case value and not >> any time value to the server, so the server has no option but to use >> its current time. It is not surprising that the permissions checks >> for this don't work right. I thought that the client was responsible >> for most permissions checks, but can't find many or the relevant one >> here. The NFSV3SATTRTIME_TOSERVER code on the server sets >> VA_UTIMES_NULL, so I would have thought that the permissions check on >> the server does the right thing. >> > As noted above, it seems to work correctly for the new server in -current, > at least for UFS exports. > > Normally a server will do permission checking for NFS RPCs. There is nothing > stopping a client from doing a check and returning an error, but traditionally > a server has not trusted a client to do so. (I'm not sure if adding a check > in the client is what jhb@ was referring to in his reply to this?) Checking in the client doesn't seem right now. The bug seems to be a different one on the server. >> There are some large timestamping bugs nearby: >> >> - the old nfs server code for NFSV3SATTRTIME_TOSERVER uses >> getnanotime() >> to read the current time. 
This violates the system's policy set by
>> the vfs.timestamp_precision sysctl in most cases, since using
>> getnanotime() is the worst supported policy and is not the default.
>> ...
>
>> New nfs code never uses the correct function vfs_timestamp().
> This needs to be fixed. Until now, I would have had no idea what is the
> correct interface. (When I did the port, I just used a call that seemed
> to return what I wanted.;-)
>
> Having said that, after reading what you wrote below, it is not obvious
> to me what the correct fix is? (It seems to be a choice between
> microtime() and vfs_timestamp()?)

Just use vfs_timestamp() whenever generating a file timestamp but not
for other purposes. Like permissions checking, the client very rarely
generates file timestamps, and even on the server most timestamps are
not generated by nfs directly. So there are only a few places to check
and change. We know about fifos and the utimes(NULL) case in the server
(the latter is emulating upper layers in vfs) before calling
VOP_SETATTR().

I wonder how well the fifo code works. Its timestamps aren't very
important, but they should be synced to the server very occasionally.
Bruce

From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 07:38:03 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 48B2A93E; Wed, 16 Jan 2013 07:38:03 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id 60986F49; Wed, 16 Jan 2013 07:38:02 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3YmKyb6vnkz7ySF; Wed, 16 Jan 2013 08:37:59 +0100 (CET) Date: Wed, 16 Jan 2013 08:37:59 +0100 From: Nicolas Rachinsky To: Artem Belevich Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130116073759.GA47781@mid.pc5.i.0x5.de> References: <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 07:38:03 -0000 * Artem Belevich [2013-01-15 16:16 -0800]: > On Tue, Jan 15, 2013 at 2:45 PM, Nicolas Rachinsky > wrote: > > 147 0 100098 kernel zio_write_issue_ mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 space_map_load_wait+0x20 metaslab_activate+0x73 metaslab_alloc+0x7b2 zio_dva_allocate+0x9a zio_execute+0xc3 taskqueue_run_locked+0x85 taskqueue_thread_loop+0x4e
fork_exit+0x11f fork_trampoline+0xe > > It appears that lots of threads are stuck in > metaslab_activate->space_map_load_wait path. This sounds like CR# > 6876962 in Solaris: "degraded write performance with threads held up > by space_map_load_wait(). This bug is fixed in patch 147440-05, -06 or > -07, which is current and contains the fix." Alas, I could not find > specifics on how the issue got fixed and whether the same fix is > present in illumos and FreeBSD. > > You may want to update your system to very recent FreeBSD as quite a > few fixes were recently imported from illumos. Hopefully it will deal > with the issue. I'm out of ideas otherwise. Sorry. Do you mean -CURRENT or -STABLE with very recent? Or just 9.1? Thank you for your efforts! Nicolas -- http://www.rachinsky.de/nicolas From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 07:42:46 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 92A1CC74; Wed, 16 Jan 2013 07:42:46 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id 179DDF98; Wed, 16 Jan 2013 07:42:46 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3YmL452m1Yz7ySF; Wed, 16 Jan 2013 08:42:45 +0100 (CET) Date: Wed, 16 Jan 2013 08:42:45 +0100 From: Nicolas Rachinsky To: Steven Hartland Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130116074245.GB47781@mid.pc5.i.0x5.de> References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <00F86FD0E85D4EEEA1A01E115497F022@multiplay.co.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <00F86FD0E85D4EEEA1A01E115497F022@multiplay.co.uk> X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 
X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 07:42:46 -0000 * Steven Hartland [2013-01-16 00:57 -0000]: > > ----- Original Message ----- From: "Artem Belevich" > > To: "Nicolas Rachinsky" > Cc: "freebsd-fs" > Sent: Wednesday, January 16, 2013 12:16 AM > Subject: Re: slowdown of zfs (tx->tx) > > > >It appears that lots of threads are stuck in > >metaslab_activate->space_map_load_wait path. This sounds like CR# > >6876962 in Solaris: "degraded write performance with threads held up > >by space_map_load_wait(). This bug is fixed in patch 147440-05, -06 or > >-07, which is current and contains the fix." Alas, I could not find > >specifics on how the issue got fixed and whether the same fix is > >present in illumos and FreeBSD. > > That would tend to indicate its blocking on write. If this is the case > yet the rsync is copying from this box, with little else doing writes > it could be atime which is causing the issue. I was probably misformulating my mail. The rsync writes to the local zpool. > A test for this would be to use the following to disable atime and see > if that helps: > zfs set atime=off [filesystem] > > Also out of interest does the pool have many snapshots? There are 115 filesystems. 84 of these have between 10 and 20 snapshots. 
Nicolas -- http://www.rachinsky.de/nicolas From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 08:45:04 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A99CA2C6 for ; Wed, 16 Jan 2013 08:45:04 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vb0-f53.google.com (mail-vb0-f53.google.com [209.85.212.53]) by mx1.freebsd.org (Postfix) with ESMTP id 6B99035F for ; Wed, 16 Jan 2013 08:45:04 +0000 (UTC) Received: by mail-vb0-f53.google.com with SMTP id b23so1039545vbz.26 for ; Wed, 16 Jan 2013 00:45:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=C3Jkc6hVnOvIT58v6Kf2LAxUDOYu5fzO+97ZfbORdoI=; b=qUxrD63WSN+dyaDeaDbT6Zv7eDvdLsf69b3WIO7tGmDLkRyNzqVSzi5GR1h4bxl4Cc t4AnapxPmS4NxOVCjC5Pqb3cjU683WpjQ8Ofl+dGXyY2uFsNOLR5O8AMOcpEI6TwfWCf IVhTVm6Zk27JKB21EYobjhmQgooKFVZxVW9Py5xT8cAeIfJlmylCXPzjZoTT1VfBiPzJ R5N954n6BSeVcyun1ktdg5zX20HU7qNjbYCWTl5sKkbOtE7g7Ve279Sv1zKY52f3jS5g I+C7Ch6brN9r3UvkTRVSwZiYFgHjBjIi6b1pDtEsJYHtK1z5w3d0DRGyKjgP//SQEftZ JFow== MIME-Version: 1.0 X-Received: by 10.52.156.40 with SMTP id wb8mr273529vdb.39.1358325901454; Wed, 16 Jan 2013 00:45:01 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.122.196 with HTTP; Wed, 16 Jan 2013 00:45:01 -0800 (PST) In-Reply-To: <20130116073759.GA47781@mid.pc5.i.0x5.de> References: <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <20130116073759.GA47781@mid.pc5.i.0x5.de> Date: Wed, 16 Jan 2013 00:45:01 -0800 X-Google-Sender-Auth: B8y6J4yhknJ3SFmTekEizE9Tq_w Message-ID: Subject: Re: slowdown of zfs (tx->tx) From: Artem Belevich To: Nicolas Rachinsky 
Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 08:45:04 -0000 On Tue, Jan 15, 2013 at 11:37 PM, Nicolas Rachinsky wrote: >> You may want to update your system to very recent FreeBSD as quite a >> few fixes were recently imported from illumos. Hopefully it will deal >> with the issue. I'm out of ideas otherwise. Sorry. > > Do you mean -CURRENT or -STABLE with very recent? Or just 9.1? -HEAD or -STABLE (-8 or -9). --Artem From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 09:39:42 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F05B6277; Wed, 16 Jan 2013 09:39:42 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 0C3E8875; Wed, 16 Jan 2013 09:39:41 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA29126; Wed, 16 Jan 2013 11:39:31 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1TvPT1-000Nsg-DI; Wed, 16 Jan 2013 11:39:31 +0200 Message-ID: <50F67551.5020704@FreeBSD.org> Date: Wed, 16 Jan 2013 11:39:29 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Nicolas Rachinsky Subject: Re: slowdown of zfs (tx->tx) References: <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> 
<20130115224556.GA41774@mid.pc5.i.0x5.de> In-Reply-To: X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 09:39:43 -0000 on 16/01/2013 02:16 Artem Belevich said the following: > It appears that lots of threads are stuck in > metaslab_activate->space_map_load_wait path. Nicolas, another thing to check - is your pool nearly full. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 09:50:10 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DD1A9471; Wed, 16 Jan 2013 09:50:10 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id 97B7591E; Wed, 16 Jan 2013 09:50:10 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3YmNv50LWCz7ySc; Wed, 16 Jan 2013 10:50:09 +0100 (CET) Date: Wed, 16 Jan 2013 10:50:09 +0100 From: Nicolas Rachinsky To: Andriy Gapon Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130116095009.GA36867@mid.pc5.i.0x5.de> References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <50F67551.5020704@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <50F67551.5020704@FreeBSD.org> X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 09:50:10 -0000 * Andriy Gapon [2013-01-16 11:39 +0200]: > on 16/01/2013 02:16 Artem Belevich said the following: > > It appears that lots of threads are stuck in > > metaslab_activate->space_map_load_wait path. > > another thing to check - is your pool nearly full. Don't think so: NAME USED AVAIL REFER MOUNTPOINT pool1 5.52T 697G 11.9M /pool1 Nicolas -- http://www.rachinsky.de/nicolas From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 10:14:33 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DE53FB06; Wed, 16 Jan 2013 10:14:33 +0000 (UTC) (envelope-from prvs=1728d5906c=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 5C08DA5C; Wed, 16 Jan 2013 10:14:32 +0000 (UTC) Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50001723873.msg; Wed, 16 Jan 2013 10:14:30 +0000 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 16 Jan 2013 10:14:30 +0000 (not processed: message from valid local sender) X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=1728d5906c=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: From: "Steven Hartland" To: "Nicolas Rachinsky" , "Andriy Gapon" References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <50F67551.5020704@FreeBSD.org> <20130116095009.GA36867@mid.pc5.i.0x5.de> Subject: Re: slowdown of zfs (tx->tx) Date: Wed, 16 Jan 2013 10:14:54 -0000 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original 
Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 10:14:33 -0000 ----- Original Message ----- From: "Nicolas Rachinsky" >* Andriy Gapon [2013-01-16 11:39 +0200]: >> on 16/01/2013 02:16 Artem Belevich said the following: >> > It appears that lots of threads are stuck in >> > metaslab_activate->space_map_load_wait path. >> >> another thing to check - is your pool nearly full. > > Don't think so: > NAME USED AVAIL REFER MOUNTPOINT > pool1 5.52T 697G 11.9M /pool1 You only have ~11% free so yer it is pretty full ;-) Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
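For readers following along, the arithmetic behind Steve's "~11% free" figure can be checked against the `zfs list` output quoted above. A minimal sketch (the hard-coded sizes come from the thread; the parsing helper is purely illustrative and not part of any ZFS tool):

```python
# Rough free-space check from the thread's `zfs list` figures
# (USED 5.52T, AVAIL 697G). Illustration only.

def to_gib(value: str) -> float:
    """Convert a zfs(8)-style size string like '5.52T' or '697G' to GiB."""
    units = {"K": 1.0 / (1024 * 1024), "M": 1.0 / 1024, "G": 1.0, "T": 1024.0}
    return float(value[:-1]) * units[value[-1]]

used = to_gib("5.52T")    # bytes in use, in GiB
avail = to_gib("697G")    # bytes still available, in GiB
free_pct = 100.0 * avail / (used + avail)

print(f"free: {free_pct:.1f}%")   # prints "free: 11.0%" -- i.e. ~89% full
```

So the pool is well past the 70-80% utilization point mentioned later in the thread, even though 697G sounds like plenty of space in absolute terms.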
From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 10:20:25 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7C89BC2D; Wed, 16 Jan 2013 10:20:25 +0000 (UTC) (envelope-from prvs=1728d5906c=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 7EBD1AAD; Wed, 16 Jan 2013 10:20:24 +0000 (UTC) Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50001723941.msg; Wed, 16 Jan 2013 10:20:22 +0000 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 16 Jan 2013 10:20:22 +0000 (not processed: message from valid local sender) X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=1728d5906c=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: <98723B7F45F643F3A96FDB6B9285E935@multiplay.co.uk> From: "Steven Hartland" To: "Nicolas Rachinsky" References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <00F86FD0E85D4EEEA1A01E115497F022@multiplay.co.uk> <20130116074245.GB47781@mid.pc5.i.0x5.de> Subject: Re: slowdown of zfs (tx->tx) Date: Wed, 16 Jan 2013 10:20:46 -0000 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 10:20:25 -0000 ----- Original Message ----- From: "Nicolas Rachinsky" To: "Steven Hartland" Cc: "Artem 
Belevich" ; "freebsd-fs" Sent: Wednesday, January 16, 2013 7:42 AM Subject: Re: slowdown of zfs (tx->tx) >* Steven Hartland [2013-01-16 00:57 -0000]: >> >> ----- Original Message ----- From: "Artem Belevich" >> >> To: "Nicolas Rachinsky" >> Cc: "freebsd-fs" >> Sent: Wednesday, January 16, 2013 12:16 AM >> Subject: Re: slowdown of zfs (tx->tx) >> >> >> >It appears that lots of threads are stuck in >> >metaslab_activate->space_map_load_wait path. This sounds like CR# >> >6876962 in Solaris: "degraded write performance with threads held up >> >by space_map_load_wait(). This bug is fixed in patch 147440-05, -06 or >> >-07, which is current and contains the fix." Alas, I could not find >> >specifics on how the issue got fixed and whether the same fix is >> >present in illumos and FreeBSD. >> >> That would tend to indicate its blocking on write. If this is the case >> yet the rsync is copying from this box, with little else doing writes >> it could be atime which is causing the issue. > > I was probably misformulating my mail. The rsync writes to the local > zpool. > >> A test for this would be to use the following to disable atime and see >> if that helps: >> zfs set atime=off [filesystem] If you don't need atime, I would still recommend setting atime=off. >> Also out of interest does the pool have many snapshots? > > There are 115 filesystems. 84 of these have between 10 and 20 > snapshots. Hmm, so over 1000 snapshots; that's not going to help. If that is something that has built up over time, combined with increased disk usage, it could well explain the slowdown you're seeing, and would also explain why you're seeing threads taking time in metaslab_activate->space_map_load_wait. Are the snapshots something you can clear down and test to see if that improves things? Out of interest, what type of data are you working with, and is it compressible? If it is, it might be worth testing with compression enabled.
Regards Steve From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 12:05:35 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1BAD7574 for ; Wed, 16 Jan 2013 12:05:35 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 61E14140 for ; Wed, 16 Jan 2013 12:05:34 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA00475; Wed, 16 Jan 2013 14:05:30 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1TvRkI-000O2E-51; Wed, 16 Jan 2013 14:05:30 +0200 Message-ID: <50F69788.2040506@FreeBSD.org> Date: Wed, 16 Jan 2013 14:05:28 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Nicolas Rachinsky Subject: Re: slowdown of zfs (tx->tx) References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <50F67551.5020704@FreeBSD.org> <20130116095009.GA36867@mid.pc5.i.0x5.de> In-Reply-To: X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 12:05:35 -0000 on 16/01/2013 12:14 Steven Hartland said the following: > ----- Original Message ----- From: "Nicolas Rachinsky" > >> * Andriy Gapon [2013-01-16 11:39 +0200]: >>> on 16/01/2013 02:16 Artem Belevich said the following: >>> > It appears that lots of threads are stuck in >>> > metaslab_activate->space_map_load_wait path. >>> >>> another thing to check - is your pool nearly full. >> >> Don't think so: >> NAME USED AVAIL REFER MOUNTPOINT >> pool1 5.52T 697G 11.9M /pool1 > > You only have ~11% free so yer it is pretty full ;-) Nicolas, just in case, Steve is not kidding. Those free hundreds of gigabytes could be spread over the terabytes and could be quite fragmented if the pool has a history of adding and removing lots of files. ZFS could be spending quite a lot of time in that case when it looks for some free space and tries to minimize further fragmentation. Empirical/anecdotal safe limit on pool utilization is said to be about 70-80%. You can test if this guess is true by doing the following: kgdb -w (kgdb) set metaslab_min_alloc_size=4096 If performance noticeably improves after that, then this is your problem indeed. 
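The effect Andriy describes can be illustrated with a toy model (an illustration of the idea only, not ZFS's actual metaslab allocator; all names and numbers here are made up): when free space is plentiful but fragmented into small runs, an allocator that insists on a large minimum contiguous allocation has to load and scan many space maps before finding a qualifying run, while a 4 KiB minimum is satisfied almost immediately.

```python
# Toy fragmentation model: 200 metaslabs whose free space is entirely
# 8 KiB fragments, plus one metaslab at the end with a single large run.
metaslabs = [[8192] * 1000 for _ in range(200)]  # fragmented free space
metaslabs.append([64 * 1024 * 1024])             # one 64 MiB run at the end

def metaslabs_scanned(min_alloc: int) -> int:
    """Count metaslabs examined before one with a run >= min_alloc turns up."""
    for scanned, segments in enumerate(metaslabs, start=1):
        if max(segments) >= min_alloc:
            return scanned
    return len(metaslabs)

print(metaslabs_scanned(10 * 1024 * 1024))  # large threshold: scans all 201
print(metaslabs_scanned(4096))              # 4 KiB threshold: stops at 1
```

This is why lowering metaslab_min_alloc_size is a useful diagnostic: if write latency improves sharply afterwards, the pool was spending its time in exactly this kind of space-map search.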
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 13:01:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 27307CE8; Wed, 16 Jan 2013 13:01:23 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id EBF2F680; Wed, 16 Jan 2013 13:01:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=In-Reply-To:Message-Id:From:Mime-Version:Date:References:Subject:Cc:To:Content-Type; bh=DZp3JXw6VlNx5xcPoteJ0jCEm4wk1vL/UOH1VAdtMVw=; b=ohtn4sAFxTYUlXYroLgRYEJ0xeLYjUODKf7vaziOceUbylyGlS1UFef6QPFvVKNscdrMf0wcJrREcQJf2c6R/GXCi4ctq2ALfmuGK3jlRt4C4heHdGMoeFxjGu9QTOoH; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1TvScG-0007uH-7t; Wed, 16 Jan 2013 07:01:16 -0600 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpsa id 1358341270-24241-86284/5/1; Wed, 16 Jan 2013 13:01:10 +0000 Content-Type: text/plain; format=flowed; delsp=yes To: Andriy Gapon , Nicolas Rachinsky Subject: Re: slowdown of zfs (tx->tx) References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <50F67551.5020704@FreeBSD.org> <20130116095009.GA36867@mid.pc5.i.0x5.de> Date: Wed, 16 Jan 2013 07:01:10 -0600 Mime-Version: 1.0 From: Mark Felder Message-Id: In-Reply-To: <20130116095009.GA36867@mid.pc5.i.0x5.de> User-Agent: Opera Mail/12.12 (FreeBSD) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 13:01:23 -0000 On Wed, 16 Jan 2013 
03:50:09 -0600, Nicolas Rachinsky wrote: > * Andriy Gapon [2013-01-16 11:39 +0200]: >> on 16/01/2013 02:16 Artem Belevich said the following: >> > It appears that lots of threads are stuck in >> > metaslab_activate->space_map_load_wait path. >> >> another thing to check - is your pool nearly full. > > Don't think so: > NAME USED AVAIL REFER MOUNTPOINT > pool1 5.52T 697G 11.9M /pool1 > Never let your ZFS pool go above 80% or you'll have very, very poor performance. From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 14:26:47 2013 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D684FF76; Wed, 16 Jan 2013 14:26:47 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 65382C41; Wed, 16 Jan 2013 14:26:47 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap8EAAO49lCDaFvO/2dsb2JhbABFhjq3YnOCHgEBBAEjVgUWDgoCAg0ZAlkGiCYGpmmRKYEji1KDMIETA4hhjSuQSYMTggY X-IronPort-AV: E=Sophos;i="4.84,479,1355115600"; d="scan'208";a="9317566" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 16 Jan 2013 09:26:46 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 08BA7B3F15; Wed, 16 Jan 2013 09:26:46 -0500 (EST) Date: Wed, 16 Jan 2013 09:26:46 -0500 (EST) From: Rick Macklem To: Bruce Evans Message-ID: <1642392672.2036529.1358346406018.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20130116151051.O1060@besplex.bde.org> Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: 
Rick Macklem , fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Jan 2013 14:26:47 -0000 Bruce Evans wrote: > On Tue, 15 Jan 2013, Rick Macklem wrote: > > > Bruce Evans wrote: > > >> I can't see anything that does the different permissions check for > >> the VA_UTIMES_NULL case, and testing shows that this case is just > >> broken, > >> at least for an old version of the old nfs client -- the same > >> permissions > >> are required for all cases, but write permission is supposed to be > >> enough for the VA_UTIMES_NULL case (since write permission is > >> sufficient > >> for setting the mtime to the current time (plus epsilon) using > >> write(2) > >> and truncate(2). Setting the atime to the current time should > >> require > >> no more and no less than read permission, since it can be done > >> using > >> read(2), but utimes(NULL) requires write permission for that too). > >> > > I did a quick test on a -current client/server and it seems to work > > ok. > > The client uses SET_TIME_TO_SERVER and the server sets > > VA_UTIMES_NULL > > for this case. At least it works for a UFS exported volume. > > It's not working for me with newnfs from 4 Mar 2012: > > $ mount | grep /c > besplex:/c on /c (nfs, asynchronous) > $ ls -l /c/tmp/z > -rw-rw-rw- 1 root wheel 0 Jan 16 15:12 /c/tmp/z > # Not even root owns it, since root on the client is mapped to > 0xFFFFFFFFE. > $ touch /c/tmp/z > touch: /c/tmp/z: Operation not permitted > $ touch -r . /c/tmp/z > touch: /c/tmp/z: Operation not permitted > touch: /c/tmp/z: Operation not permitted > Well, I just ran essentially the same test, using the new client patched with jhb@'s patch and an up to date server and I got the same behaviour as when doing the touch locally on the file in the file system. 
- when not the file owner, but having write permissions touch - worked for both local and NFS mount touch -r - failed with Operation not permitted for both local and NFS mount The test I had done before used a trivial program that just did a utimes(NULL) and it worked as non-owner with write access, as well. The server appears to have been patched for this at r157325 (Apr. 2006). Maybe your server hasn't been patched for this? rick > The error message from touch are confusing. For plain touch: > - it fails twice using utimes(), with errno EPERM and no error message > - it then succeeds using read(), write() and truncate() > - it then prints an error message > - it then exits with status 0. > This is with an old version of touch. It always prints an error > message > if it reaches the read()/write()/truncate() step (rw() function): > - if rw() succeeded, then it prints an error message after the rw() > returns. rw() fails to preserve errno, so the errno for this step > is garbage, but it is usually the one from the second failing > utimes(). > - if rw() fails, it prints an error message internally. The errno for > this is now correct. > The current version of touch is even more broken. Someone removed the > rw() step from it, under the naive assumption that utimes() actually > works. > > For touch -r: > - it fails twice using utimes(), with errno EPERM and no error > message. > Now even trying the second time (with utimes(NULL) is a bug. A > comment says that there is nothing else that we can do in this case, > but the code actually falls through and does something wrong (it > tries to set to the current time instead of to the specified time). > This bug fixed in the current version. > - since it is not supposed to do anything more, it prints an error > message > after the first utimes() failure. It also sets rval to 1 to give an > exit status of 1 later. 
> - then it continues the same as for the plain touch case: > - it then "succeeds" using read(), write() and truncate(), but this > success is in clobbering the timestamps to the current time > - it then prints an error message despite "succeeding" > - it then exits with status 1. > > The nfs error is just for the second utimes() in the plain touch case. > This should succeed (it succeeds on a local ffs file system). Also, > when > it fails, the correct errno is EACCES, not EPERM. This works correctly > after changing the file mode to readonly and using the buggy touch -r > to reach the second utimes() -- the error is now EACCES for both nfs > and local ffs. So it seems that the server ffs is being reached > correctly, but the non-error case for utimes(NULL) is being mishandled > somewhere. This is not due to some maproot magic, since the same error > occurs for the non-error case when the ownership is changed to a mere > user (!= the test user). > > >> Oops, on looking at the code I now think it _is_ possible to pass > >> the > >> request to set the current time on the server, since in the > >> NFSV3SATTRTIME_TOSERVER case we just pass this case value and not > >> any time value to the server, so the server has no option but to > >> use > >> its current time. It is not surprising that the permissions checks > >> for this don't work right. I thought that the client was > >> responsible > >> for most permissions checks, but can't find many or the relevant > >> one > >> here. The NFSV3SATTRTIME_TOSERVER code on the server sets > >> VA_UTIMES_NULL, so I would have thought that the permissions check > >> on > >> the server does the right thing. > >> > > As noted above, it seems to work correctly for the new server in > > -current, > > at least for UFS exports. > > > > Normally a server will do permission checking for NFS RPCs. 
There is > > nothing > > stopping a client from doing a check and returning an error, but > > traditionally > > a server has not trusted a client to do so. (I'm not sure if adding > > a check > > in the client is what jhb@ was referring to in his reply to this?) > > Checking in the client doesn't seem right now. The bug seems to be a > different one on the server. > > >> There are some large timestamping bugs nearby: > >> > >> - the old nfs server code for NFSV3SATTRTIME_TOSERVER uses > >> getnanotime() > >> to read the current time. This violates the system's policy set by > >> the vfs.timestamp precision in most cases, since using > >> getnanotime() > >> is the worst supported policy and is not the defaul. > >> ... > > > >> New nfs code never uses the correct function vfs_timestamp(). > > This needs to be fixed. Until now, I would have had no idea what is > > the > > correct interface. (When I did the port, I just used a call that > > seemed > > to return what I wanted.;-) > > > > Having said that, after reading what you wrote below, it is not > > obvious > > to me what the correct fix is? (It seems to be a choice between > > microtime() > > and vfs_timestamp()?) > > Just use vfs_timestamp() whenever generating a file timestamp but not > for > other purposes. Like permissions checking, the client very rarely > generates > file timestamps, and even on the server most timestamps are not > generated > by nfs directly. So there are only a few places to check and change. > We > know about fifos and the utimes(NULL) case in the server (the latter > is > emulating upper layers in vfs) before calling VOP_SETATTR(). I wonder > how well the fifo code works. Its timestamps aren't very important, > but > they should be synced to the server very occasionally. 
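The two utimes(2) cases under discussion, explicit times versus a NULL times pointer meaning "set to now", can be exercised from Python, whose os.utime() wraps utimes()/utimensat(). This sketch runs as the file's owner, so both forms succeed; the permission difference that touch(1) trips over above (NULL times needing only write access, explicit times requiring ownership or root) only shows up when a different user with write permission makes the calls.

```python
import os
import tempfile
import time

# Scratch file we own; removed again at the end.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name

hour_ago = time.time() - 3600

# Explicit times (the "touch -r" case): requires ownership (or root).
os.utime(path, (hour_ago, hour_ago))
mtime_explicit = os.stat(path).st_mtime

# NULL times (the VA_UTIMES_NULL case, "set to current time"):
# write permission alone is supposed to be sufficient for this form.
os.utime(path, None)
mtime_null = os.stat(path).st_mtime

os.unlink(path)
```

Running the same two calls as a non-owner with write access is where the behaviours diverge: the first call should fail with EPERM while the second should succeed, which is the distinction the server-side VA_UTIMES_NULL check is meant to implement.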
> > Bruce

From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 14:33:53 2013
Date: Wed, 16 Jan 2013 09:33:51 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Sergey Kandaurov
Cc: freebsd-fs@freebsd.org
Message-ID: <227703439.2036949.1358346831962.JavaMail.root@erie.cs.uoguelph.ca>
Subject: Re: getcwd lies on/under nfs4-mounted zfs dataset

pluknet@gmail.com wrote:
> Hi.
>
> We are stuck with a problem: we get a wrong current directory path
> when sitting on/under a zfs dataset filesystem mounted over NFSv4.
> Both the nfs server and client are 10.0-CURRENT from December or so.
>
> The path component "user3" unexpectedly appears as "." (dot):
> nfs-client:/home/user3 # pwd
> /home/.
> nfs-client:/home/user3/var/run # pwd
> /home/./var/run
>
Although you are welcome to try the patch I emailed you yesterday, I
think it will result in the tree traversal algorithm in libc complaining
about a cycle at some point, because there could be another node in the
file system on the server that has the same fileno as the
mounted-on-fileno.

I need to take a close look at getcwd() and see how it handles mount
point crossings. The trick is that the NFSv4 client must make getcwd()
happy, but also try to avoid duplicate fileno (i-node #) values within a
subtree of the mount that has a given fsid.

Since I can reproduce it here, I'll work on it and post if/when I have a
better patch.

rick

> nfs-client:~ # procstat -f 3225
>  PID COMM  FD   T V FLAGS     REF OFFSET PRO NAME
> 3225 a.out text v r r-------- -   -      -   /home/./var/a.out
> 3225 a.out ctty v c rw------- -   -      -   /dev/pts/2
> 3225 a.out cwd  v d r-------- -   -      -   /home/./var
> 3225 a.out root v d r-------- -   -      -   /
>
> The setup used follows.
>
> 1. NFS server with local ZFS:
> # cat /etc/exports
> V4: / -sec=sys
>
> # zfs list
> pool1 10.4M 122G 580K /pool1
> pool1/user3 on /pool1/user3 (zfs, NFS exported, local, nfsv4acls)
>
> Exports list on localhost:
> /pool1/user3 109.70.28.0
> /pool1 109.70.28.0
>
> # zfs get sharenfs pool1/user3
> NAME        PROPERTY VALUE                                          SOURCE
> pool1/user3 sharenfs -alldirs -maproot=root -network=109.70.28.0/24 local
>
> 2. pool1 is mounted on the NFSv4 client:
> nfs-server:/pool1 on /home (nfs, noatime, nfsv4acls)
>
> So on the NFS client the "pool1/user3" dataset appears at /home/user3:
> / - ufs
> /home - zpool-over-nfsv4
> /home/user3 - zfs dataset "pool1/user3"
>
> At the same time it works as expected when we're not on a zfs dataset
> but directly on its parent zfs pool (also over NFSv4), e.g.
> nfs-client:/home/non_dataset_dir # pwd
> /home/non_dataset_dir
>
> The ls command works as expected:
> nfs-client:/# ls -dl /home/user3/var/
> drwxrwxrwt+ 6 root wheel 6 Jan 10 16:19 /home/user3/var/
>
> --
> wbr,
> pluknet
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Wed Jan 16 23:47:43 2013
From: Olivier Cochard-Labbé <cochard@gmail.com>
Date: Thu, 17 Jan 2013 00:47:21 +0100
Message-ID:
Subject: Reproducible crash with tmpfs on 9.1-release
To: freebsd-fs@freebsd.org

Hi,

I managed to reproduce a crash on 9.1-RELEASE (amd64) by compiling
software on a tmpfs workdir (cf. PR kern/175353).

My first machine is an 8-core box with 56 GB of RAM but no swap, so I
didn't get a core dump on it. I then reproduced the crash on a smaller
machine with 4 cores and only 4 GB of RAM, but with swap.

I've put the files from my /var/crash online for anyone interested
(with the exception of the full vmcore):
http://gugus69.free.fr/freebsd/tmpfs/core0/

Happy debugging!

Olivier

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 00:13:38 2013
Date: Thu, 17 Jan 2013 00:13:38 GMT
Message-Id: <201301170013.r0H0DcRD013087@freefall.freebsd.org>
From: linimon@FreeBSD.org
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org
Subject: Re: kern/175353: [tmpfs] [panic] panic during building a nanobsd image + ports

Old Synopsis: tmpfs panic during building a nanobsd image + ports
New Synopsis: [tmpfs] [panic] panic during building a nanobsd image + ports

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Thu Jan 17 00:13:20 UTC 2013
Responsible-Changed-Why: Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=175353

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 00:42:58 2013
Date: Wed, 16 Jan 2013 19:42:57 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Sergey Kandaurov
Cc: freebsd-fs@freebsd.org
Message-ID: <1171241649.2066788.1358383377496.JavaMail.root@erie.cs.uoguelph.ca>
Subject: Re: getcwd lies on/under nfs4-mounted zfs dataset
Content-Type:
text/plain; charset=utf-8

pluknet@gmail.com wrote:
> Hi.
>
> We are stuck with a problem: we get a wrong current directory path
> when sitting on/under a zfs dataset filesystem mounted over NFSv4.
> Both the nfs server and client are 10.0-CURRENT from December or so.
>
> The path component "user3" unexpectedly appears as "." (dot):
> nfs-client:/home/user3 # pwd
> /home/.
> nfs-client:/home/user3/var/run # pwd
> /home/./var/run
>
Ok, I've figured out what is going on. The algorithm in libc works, but
vn_fullpath1() doesn't. The latter assumes that mount points are marked
with VV_ROOT etc. For the "pseudo mount points" (which are mount points
within the directory tree on the NFSv4 server), this isn't the case.

If you:
  sysctl debug.disablecwd=1
  sysctl debug.disablefullpath=1
it works. (At least for the UFS case I tested.)

I can't see how this can be made to work correctly for vn_fullpath1()
unless it was re-written to use the same algorithm that
lib/libc/gen/getcwd.c implements.

I was pretty sure this used to work. Maybe the syscalls used to be
disabled by default or weren't used by the libc functions?

Anyhow, sorry about the confusing posts while I figured out what was
going on,

rick
ps: Don't use the patch I posted. It isn't needed and will break stuff.

> [...]
> --
> wbr,
> pluknet

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 02:31:33 2013
Received: by mail10.syd.optusnet.com.au; Thu, 17 Jan 2013
13:31:30 +1100
Date: Thu, 17 Jan 2013 13:31:26 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Rick Macklem
Cc: fs@freebsd.org
Message-ID: <20130117132903.O1225@besplex.bde.org>
In-Reply-To: <1642392672.2036529.1358346406018.JavaMail.root@erie.cs.uoguelph.ca>
Subject: Re: [PATCH] Better handle NULL utimes() in the NFS client

On Wed, 16 Jan 2013, Rick Macklem wrote:

> Bruce Evans wrote:
>> On Tue, 15 Jan 2013, Rick Macklem wrote:
>>> Bruce Evans wrote:
>>>> I can't see anything that does the different permissions check for
>>>> the VA_UTIMES_NULL case, and testing shows that this case is just
>>>> broken, at least for an old version of the old nfs client -- the same
>>>> ...
>>> I did a quick test on a -current client/server and it seems to work
>>> ok. The client uses SET_TIME_TO_SERVER and the server sets
>>> VA_UTIMES_NULL for this case. At least it works for a UFS exported
>>> volume.
>>
>> It's not working for me with newnfs from 4 Mar 2012:
>> ...
> Well, I just ran essentially the same test, using the new client patched
> with jhb@'s patch and an up-to-date server, and I got the same behaviour
> as when doing the touch locally on the file in the file system:
> - when not the file owner, but having write permission:
>   touch - worked for both the local and NFS mounts
>   touch -r - failed with "Operation not permitted" for
>   both the local and NFS mounts
>
> The test I had done before used a trivial program that just did a
> utimes(NULL), and it worked as non-owner with write access, as well.
>
> The server appears to have been patched for this at r157325 (Apr. 2006).
>
> Maybe your server hasn't been patched for this?

Indeed it hasn't -- it is missing setting of VA_UTIMES_NULL.

Bruce

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 03:11:10 2013
In-Reply-To: <20130109023327.GA1888@FreeBSD.org>
References:
<20130109023327.GA1888@FreeBSD.org>
Date: Thu, 17 Jan 2013 11:11:08 +0800
Subject: Re: rc.d script for memory based zfs intent log
From: Marcelo Araujo
To: John

2013/1/9 John

> http://people.freebsd.org/~jwd/memzil.txt

Hello John,

In my point of view this script seems very useful, and I will probably
use it in my product.

As an example, I ran into a few problems on system reboot/shutdown. I
use a ramdisk as ZIL; on a normal reboot/shutdown the ramdisk will
disappear, because right now there is nothing to detach the ramdisk
safely. I believe that with this script I can attach a new ZIL on every
boot and detach it safely when performing a reboot/shutdown.

I use some tricks with my ZIL: it is mirrored using RAM, and the RAM has
its own battery to protect the data; my main problem is what I described
above. I don't think sync=disabled is a good option, as you can lose
data.

Nice script! Thanks!

--
Marcelo Araujo
araujo@FreeBSD.org

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 04:21:44 2013
Date: Wed, 16 Jan 2013 20:21:43 -0800
References: <20130109023327.GA1888@FreeBSD.org>
Subject:
Re: rc.d script for memory based zfs intent log
From: Matthew Ahrens
To: araujo@freebsd.org

Pardon my inexperience with FreeBSD, but is this ramdisk persistent
across reboots? POST doesn't overwrite it?

--matt

On Wed, Jan 16, 2013 at 7:11 PM, Marcelo Araujo wrote:
> [...]
>
> Nice script!
>
> Thanks!
> --
> Marcelo Araujo
> araujo@FreeBSD.org

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 04:33:00 2013
References: <20130109023327.GA1888@FreeBSD.org>
Date: Thu, 17 Jan 2013 12:32:59 +0800
Subject: Re: rc.d script for memory based zfs intent log
From: Marcelo Araujo
To: Matthew Ahrens
Cc: FreeBSD Filesystems, John
Reply-To: araujo@FreeBSD.org

2013/1/17 Matthew Ahrens

> Pardon my inexperience with FreeBSD, but is this ramdisk persistent
> across reboots? POST doesn't overwrite it?
>
> --matt

Hello Matthew,

No, it is not persistent; a new ramdisk is created at boot time. But as
in my case, where I have a persistent ZIL that is mirrored, with small
changes to this script you can have a persistent ZIL.

My setup is a bit special: I have the ZIL on a mirrored ramdisk across
two different RAM devices protected by battery. When I perform a
shutdown, it dumps all the information from the ramdisk to an SSD, and
at boot it does the opposite: it creates the ramdisk, restores it from
the SSD, and I just attach it to the pool once again.

Best Regards,

--
Marcelo Araujo
araujo@FreeBSD.org

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 04:42:54 2013
References: <20130109023327.GA1888@FreeBSD.org>
Date: Thu, 17 Jan 2013 12:42:53 +0800
Subject: Re: rc.d script for memory based zfs intent log
From: Marcelo Araujo
To: John
Cc: FreeBSD Filesystems

2013/1/17 Marcelo Araujo
> 2013/1/9 John
>> http://people.freebsd.org/~jwd/memzil.txt
>
> Hello John,
>
> In my point of view this script seems very useful, and I will probably
> use it in my product.

Dear John,

Just another thing... you must add the KEYWORD: shutdown at the
beginning of your script, or otherwise it won't be called when you
perform a shutdown.
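For readers unfamiliar with rc.d, the keyword Marcelo means lives in the rcorder(8) comment block at the top of the script. Below is a minimal illustrative sketch, not John's actual memzil script: the script name, md unit, and pool name are made up, and the start/stop bodies are only placeholders for the real attach/detach logic.

```sh
#!/bin/sh
#
# PROVIDE: memzil
# REQUIRE: zfs
# KEYWORD: shutdown
#
# Without "KEYWORD: shutdown" above, rc.shutdown (which collects scripts
# via "rcorder -k shutdown") never runs this script's stop method, so the
# ramdisk-backed log device would vanish without being detached.

. /etc/rc.subr

name="memzil"
rcvar="memzil_enable"
start_cmd="memzil_start"
stop_cmd="memzil_stop"

memzil_start()
{
	# Create a ramdisk and attach it to the pool as a log device
	# (hypothetical sizes/names for illustration).
	mdconfig -a -t malloc -s 1g -u 99
	zpool add tank log md99
}

memzil_stop()
{
	# Detach the log device cleanly before the ramdisk disappears.
	zpool remove tank md99
	mdconfig -d -u 99
}

load_rc_config $name
run_rc_command "$1"
```

With this header in place, `service memzil stop` is invoked automatically during an orderly shutdown.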
Best Regards,

--
Marcelo Araujo
araujo@FreeBSD.org

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 05:10:54 2013
Date: Thu, 17 Jan 2013 07:10:48 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Olivier Cochard-Labbé
Subject: Re: Reproducible crash with tmpfs on 9.1-release
Message-ID: <20130117051048.GK2522@kib.kiev.ua>
Cc: freebsd-fs@freebsd.org

On Thu, Jan 17, 2013 at 12:47:21AM +0100, Olivier Cochard-Labbé wrote:
> Hi,
> I managed to reproduce a crash on 9.1-release (amd64) by compiling
> software on a tmpfs workdir (cf. PR kern/175353).
> My first machine is an 8-core with 56GB RAM, but without swap, so I
> didn't get a core dump on it.
> Then I reproduced the crash on a smaller machine with 4 cores and only
> 4GB RAM, but with swap.
> I've put the files from my /var/crash online for anyone interested
> (with the exception of the full vmcore):
> http://gugus69.free.fr/freebsd/tmpfs/core0/

This looks like a unionfs problem, and not tmpfs. Unionfs is known to be
broken in varying ways.

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 09:33:06 2013
Date: Thu, 17 Jan 2013 10:32:59 +0100
From: Nicolas Rachinsky <nicolas@i.0x5.de>
To: Andriy Gapon
Subject: Re: slowdown of zfs (tx->tx)
Message-ID: <20130117093259.GA83951@mid.pc5.i.0x5.de>
In-Reply-To:
<50F69788.2040506@FreeBSD.org>
Cc: freebsd-fs

* Andriy Gapon [2013-01-16 14:05 +0200]:
> on 16/01/2013 12:14 Steven Hartland said the following:
> > You only have ~11% free so yer it is pretty full ;-)
>
> just in case, Steve is not kidding.
>
> Those free hundreds of gigabytes could be spread over the terabytes
> and could be quite fragmented if the pool has a history of adding and
> removing lots of files. ZFS could be spending quite a lot of time in
> that case when it looks for some free space and tries to minimize
> further fragmentation.
>
> The empirical/anecdotal safe limit on pool utilization is said to be
> about 70-80%.
>
> You can test if this guess is true by doing the following:
>   kgdb -w
>   (kgdb) set metaslab_min_alloc_size=4096
>
> If performance noticeably improves after that, then this is your
> problem indeed.

I tried this, but I didn't notice any difference in performance.

Next I'll try the update Artem suggested.

Thanks

Nicolas

--
http://www.rachinsky.de/nicolas

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 10:34:11 2013
In-Reply-To: <20130117051048.GK2522@kib.kiev.ua>
From: Olivier Cochard-Labbé <cochard@gmail.com>
Date: Thu, 17 Jan 2013 11:33:49 +0100
Subject: Re: Reproducible crash with tmpfs on 9.1-release
To: Konstantin Belousov
Cc: freebsd-fs@freebsd.org
List-Id:
Filesystems

On Thu, Jan 17, 2013 at 6:10 AM, Konstantin Belousov wrote:
>
> This looks like a unionfs problem, and not tmpfs. Unionfs is known to
> be broken in varying ways.

Yes, I'm using unionfs too, but I only hit this problem when I use
tmpfs as the workdir.

By the way, applying mjg's patch (in stable/9 as of r245351):
http://people.freebsd.org/~mjg/patches/lockmgr-noshare-interlock.diff
solves this problem on both of my machines! No more crashes :-)

Regards,

Olivier

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 10:40:01 2013
Message-Id: <201301171040.r0HAe1S9029360@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Olivier Cochard-Labbé
Subject: Re: kern/175353: [tmpfs] [panic] panic during building a nanobsd image + ports
X-List-Received-Date: Thu, 17 Jan 2013
10:40:01 -0000 The following reply was made to PR kern/175353; it has been noted by GNATS. From: =?ISO-8859-1?Q?Olivier_Cochard=2DLabb=E9?= To: bug-followup@freebsd.org Cc: Subject: Re: kern/175353: [tmpfs] [panic] panic during building a nanobsd image + ports Date: Thu, 17 Jan 2013 11:36:22 +0100 Applying mjg's patch from revision 245351 (kern_lock.c) solves this problem on my 2 machines. Regards, Olivier From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 19:19:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2DF32118 for ; Thu, 17 Jan 2013 19:19:23 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from cpsmtpb-ews05.kpnxchange.com (cpsmtpb-ews05.kpnxchange.com [213.75.39.8]) by mx1.freebsd.org (Postfix) with ESMTP id BAD9CB9E for ; Thu, 17 Jan 2013 19:19:22 +0000 (UTC) Received: from cpsps-ews29.kpnxchange.com ([10.94.84.195]) by cpsmtpb-ews05.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Thu, 17 Jan 2013 20:17:06 +0100 Received: from CPSMTPM-TLF104.kpnxchange.com ([195.121.3.7]) by cpsps-ews29.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Thu, 17 Jan 2013 20:17:06 +0100 Received: from sjakie.klop.ws ([212.182.167.131]) by CPSMTPM-TLF104.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Thu, 17 Jan 2013 20:18:14 +0100 Received: from 212-182-167-131.ip.telfort.nl (localhost [127.0.0.1]) by sjakie.klop.ws (Postfix) with ESMTP id 7B3DD98AE for ; Thu, 17 Jan 2013 20:18:14 +0100 (CET) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org Subject: Re: slowdown of zfs (tx->tx) References: <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <50F67551.5020704@FreeBSD.org> <20130116095009.GA36867@mid.pc5.i.0x5.de> <50F69788.2040506@FreeBSD.org> <20130117093259.GA83951@mid.pc5.i.0x5.de> Date: Thu, 17 Jan 2013
20:18:14 +0100 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: "Ronald Klop" Message-ID: In-Reply-To: <20130117093259.GA83951@mid.pc5.i.0x5.de> User-Agent: Opera Mail/12.12 (FreeBSD) X-OriginalArrivalTime: 17 Jan 2013 19:18:14.0831 (UTC) FILETIME=[65EDABF0:01CDF4E7] X-RcptDomain: freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Jan 2013 19:19:23 -0000 On Thu, 17 Jan 2013 10:32:59 +0100, Nicolas Rachinsky wrote: > * Andriy Gapon [2013-01-16 14:05 +0200]: >> on 16/01/2013 12:14 Steven Hartland said the following: >> > You only have ~11% free so yer it is pretty full ;-) >> >> just in case, Steve is not kidding. >> >> Those free hundreds of gigabytes could be spread over the terabytes and >> could be >> quite fragmented if the pool has a history of adding and removing lots >> of files. >> ZFS could be spending quite a lot of time in that case when it looks >> for some >> free space and tries to minimize further fragmentation. >> >> Empirical/anecdotal safe limit on pool utilization is said to be about >> 70-80%. >> >> You can test if this guess is true by doing the following: >> kgdb -w >> (kgdb) set metaslab_min_alloc_size=4096 >> >> If performance noticeably improves after that, then this is your >> problem indeed. > > I tried this, but I didn't notice any difference in performance. > > Next I'll try the update Artem suggested. > > Thanks > > Nicolas Did you already try to free some space? Ronald. 
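The 70-80% utilization guideline discussed above can be checked against the CAP column of `zpool list`. As a rough illustration (the pool name `tank` and the echoed sample figure are invented for this example, not taken from the thread), a filter that flags pools past the threshold could look like:

```shell
# On a real system, feed 'zpool list -H -o name,capacity' into the awk
# script; the echo below just stands in for that output.
echo "tank 89%" | awk '{
    gsub(/%/, "", $2)                       # strip the percent sign
    if ($2 + 0 > 80)
        print $1 " is " $2 "% full - allocation slowdowns are likely"
    else
        print $1 " is " $2 "% full - ok"
}'
```

With the sample input this prints `tank is 89% full - allocation slowdowns are likely`.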
From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 19:57:45 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id CC2A8FC7; Thu, 17 Jan 2013 19:57:45 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7AAFBDF5; Thu, 17 Jan 2013 19:57:45 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id D57F7B924; Thu, 17 Jan 2013 14:57:44 -0500 (EST) From: John Baldwin To: freebsd-fs@freebsd.org Subject: [PATCH] Use vfs_timestamp() instead of getnanotime() in NFS Date: Thu, 17 Jan 2013 14:57:43 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <162405990.1985479.1358212854967.JavaMail.root@erie.cs.uoguelph.ca> <20130115141019.H1444@besplex.bde.org> <201301151458.42874.jhb@freebsd.org> In-Reply-To: <201301151458.42874.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301171457.43800.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Thu, 17 Jan 2013 14:57:44 -0500 (EST) Cc: Rick Macklem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Jan 2013 19:57:45 -0000 On Tuesday, January 15, 2013 2:58:42 pm John Baldwin wrote: > Fixing NFS to properly use vfs_timestamp() seems to be a larger > project. Actually, I have a patch that I think does this below. It builds, have not yet booted it (but will do so in a bit). 
Index: fs/nfsclient/nfs_clstate.c
===================================================================
--- fs/nfsclient/nfs_clstate.c	(revision 245225)
+++ fs/nfsclient/nfs_clstate.c	(working copy)
@@ -4611,7 +4611,7 @@
 	}
 	dp = nfscl_finddeleg(clp, np->n_fhp->nfh_fh, np->n_fhp->nfh_len);
 	if (dp != NULL && (dp->nfsdl_flags & NFSCLDL_WRITE)) {
-		NFSGETNANOTIME(&dp->nfsdl_modtime);
+		vfs_timestamp(&dp->nfsdl_modtime);
 		dp->nfsdl_flags |= NFSCLDL_MODTIMESET;
 	}
 	NFSUNLOCKCLSTATE();
Index: fs/nfsclient/nfs_clvnops.c
===================================================================
--- fs/nfsclient/nfs_clvnops.c	(revision 245225)
+++ fs/nfsclient/nfs_clvnops.c	(working copy)
@@ -3247,7 +3247,7 @@
 	 */
 	mtx_lock(&np->n_mtx);
 	np->n_flag |= NACC;
-	getnanotime(&np->n_atim);
+	vfs_timestamp(&np->n_atim);
 	mtx_unlock(&np->n_mtx);
 	error = fifo_specops.vop_read(ap);
 	return error;
@@ -3266,7 +3266,7 @@
 	 */
 	mtx_lock(&np->n_mtx);
 	np->n_flag |= NUPD;
-	getnanotime(&np->n_mtim);
+	vfs_timestamp(&np->n_mtim);
 	mtx_unlock(&np->n_mtx);
 	return(fifo_specops.vop_write(ap));
 }
@@ -3286,7 +3286,7 @@
 
 	mtx_lock(&np->n_mtx);
 	if (np->n_flag & (NACC | NUPD)) {
-		getnanotime(&ts);
+		vfs_timestamp(&ts);
 		if (np->n_flag & NACC)
 			np->n_atim = ts;
 		if (np->n_flag & NUPD)
Index: fs/nfsserver/nfs_nfsdport.c
===================================================================
--- fs/nfsserver/nfs_nfsdport.c	(revision 245225)
+++ fs/nfsserver/nfs_nfsdport.c	(working copy)
@@ -1476,7 +1476,7 @@
 	struct vattr va;
 
 	VATTR_NULL(&va);
-	getnanotime(&va.va_mtime);
+	vfs_timestamp(&va.va_mtime);
 	(void) VOP_SETATTR(vp, &va, cred);
 	(void) nfsvno_getattr(vp, nvap, cred, p, 1);
 }
@@ -2248,7 +2248,6 @@
 {
 	u_int32_t *tl;
 	struct nfsv2_sattr *sp;
-	struct timeval curtime;
 	int error = 0, toclient = 0;
 
 	switch (nd->nd_flag & (ND_NFSV2 | ND_NFSV3 | ND_NFSV4)) {
@@ -2307,9 +2306,7 @@
 			toclient = 1;
 			break;
 		case NFSV3SATTRTIME_TOSERVER:
-			NFSGETTIME(&curtime);
-			nvap->na_atime.tv_sec = curtime.tv_sec;
-			nvap->na_atime.tv_nsec = curtime.tv_usec * 1000;
+			vfs_timestamp(&nvap->na_atime);
 			nvap->na_vaflags |= VA_UTIMES_NULL;
 			break;
 		};
@@ -2321,9 +2318,7 @@
 			nvap->na_vaflags &= ~VA_UTIMES_NULL;
 			break;
 		case NFSV3SATTRTIME_TOSERVER:
-			NFSGETTIME(&curtime);
-			nvap->na_mtime.tv_sec = curtime.tv_sec;
-			nvap->na_mtime.tv_nsec = curtime.tv_usec * 1000;
+			vfs_timestamp(&nvap->na_mtime);
 			if (!toclient)
 				nvap->na_vaflags |= VA_UTIMES_NULL;
 			break;
@@ -2353,7 +2348,6 @@
 	u_char *cp, namestr[NFSV4_SMALLSTR + 1];
 	uid_t uid;
 	gid_t gid;
-	struct timeval curtime;
 
 	error = nfsrv_getattrbits(nd, attrbitp, NULL, &retnotsup);
 	if (error)
@@ -2488,9 +2482,7 @@
 				toclient = 1;
 				attrsum += NFSX_V4TIME;
 			} else {
-				NFSGETTIME(&curtime);
-				nvap->na_atime.tv_sec = curtime.tv_sec;
-				nvap->na_atime.tv_nsec = curtime.tv_usec * 1000;
+				vfs_timestamp(&nvap->na_atime);
 				nvap->na_vaflags |= VA_UTIMES_NULL;
 			}
 			break;
@@ -2515,9 +2507,7 @@
 				nvap->na_vaflags &= ~VA_UTIMES_NULL;
 				attrsum += NFSX_V4TIME;
 			} else {
-				NFSGETTIME(&curtime);
-				nvap->na_mtime.tv_sec = curtime.tv_sec;
-				nvap->na_mtime.tv_nsec = curtime.tv_usec * 1000;
+				vfs_timestamp(&nvap->na_mtime);
 				if (!toclient)
 					nvap->na_vaflags |= VA_UTIMES_NULL;
 			}
Index: nfsclient/nfs_vnops.c
===================================================================
--- nfsclient/nfs_vnops.c	(revision 245225)
+++ nfsclient/nfs_vnops.c	(working copy)
@@ -3458,7 +3458,7 @@
 	 */
 	mtx_lock(&np->n_mtx);
 	np->n_flag |= NACC;
-	getnanotime(&np->n_atim);
+	vfs_timestamp(&np->n_atim);
 	mtx_unlock(&np->n_mtx);
 	error = fifo_specops.vop_read(ap);
 	return error;
@@ -3477,7 +3477,7 @@
 	 */
 	mtx_lock(&np->n_mtx);
 	np->n_flag |= NUPD;
-	getnanotime(&np->n_mtim);
+	vfs_timestamp(&np->n_mtim);
 	mtx_unlock(&np->n_mtx);
 	return(fifo_specops.vop_write(ap));
 }
@@ -3497,7 +3497,7 @@
 
 	mtx_lock(&np->n_mtx);
 	if (np->n_flag & (NACC | NUPD)) {
-		getnanotime(&ts);
+		vfs_timestamp(&ts);
 		if (np->n_flag & NACC)
 			np->n_atim = ts;
 		if (np->n_flag & NUPD)
Index: nfsserver/nfs_srvsubs.c
===================================================================
--- nfsserver/nfs_srvsubs.c	(revision 245225)
+++ nfsserver/nfs_srvsubs.c	(working copy)
@@ -1393,7 +1393,7 @@
 			toclient = 1;
 			break;
 		case NFSV3SATTRTIME_TOSERVER:
-			getnanotime(&(a)->va_atime);
+			vfs_timestamp(&(a)->va_atime);
 			a->va_vaflags |= VA_UTIMES_NULL;
 			break;
 		}
@@ -1409,7 +1409,7 @@
 			a->va_vaflags &= ~VA_UTIMES_NULL;
 			break;
 		case NFSV3SATTRTIME_TOSERVER:
-			getnanotime(&(a)->va_mtime);
+			vfs_timestamp(&(a)->va_mtime);
 			if (toclient == 0)
 				a->va_vaflags |= VA_UTIMES_NULL;
 			break;
-- 
John Baldwin
From owner-freebsd-fs@FreeBSD.ORG Thu Jan 17 23:05:49 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 95326F6D; Thu, 17 Jan 2013 23:05:49 +0000 (UTC) (envelope-from mjg@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 6F02BB15; Thu, 17 Jan 2013 23:05:49 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r0HN5nBW066428; Thu, 17 Jan 2013 23:05:49 GMT (envelope-from mjg@freefall.freebsd.org) Received: (from mjg@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r0HN5nLN066424; Thu, 17 Jan 2013 23:05:49 GMT (envelope-from mjg) Date: Thu, 17 Jan 2013 23:05:49 GMT Message-Id: <201301172305.r0HN5nLN066424@freefall.freebsd.org> To: mjg@FreeBSD.org, freebsd-fs@FreeBSD.org, mjg@FreeBSD.org From: mjg@FreeBSD.org Subject: Re: kern/175353: [tmpfs] [panic] panic during building a nanobsd image + ports X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Jan 2013 23:05:49 -0000 Synopsis: [tmpfs] [panic] panic during building a nanobsd image + ports Responsible-Changed-From-To: freebsd-fs->mjg Responsible-Changed-By: mjg
Responsible-Changed-When: Thu Jan 17 23:05:48 UTC 2013 Responsible-Changed-Why: Take http://www.freebsd.org/cgi/query-pr.cgi?pr=175353 From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 00:49:29 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5149F779; Fri, 18 Jan 2013 00:49:29 +0000 (UTC) (envelope-from cross+freebsd@distal.com) Received: from mail.distal.com (mail.distal.com [IPv6:2001:470:e24c:200::ae25]) by mx1.freebsd.org (Postfix) with ESMTP id E2D8A8C; Fri, 18 Jan 2013 00:49:28 +0000 (UTC) Received: from magrathea.distal.com (magrathea.distal.com [IPv6:2001:470:e24c:200:ea06:88ff:feca:960e]) (authenticated bits=0) by mail.distal.com (8.14.3/8.14.3) with ESMTP id r0I0nPuC014155 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Thu, 17 Jan 2013 19:49:25 -0500 (EST) Content-Type: text/plain; charset=iso-8859-1 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: Changes to kern.geom.debugflags? 
From: Chris Ross In-Reply-To: <50F82846.6030104@FreeBSD.org> Date: Thu, 17 Jan 2013 19:49:24 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <315EDE17-4995-4819-BC82-E9B7D942E82A@distal.com> References: <7AA0B5D0-D49C-4D5A-8FA0-AA57C091C040@distal.com> <6A0C1005-F328-4C4C-BB83-CA463BD85127@distal.com> <20121225232507.GA47735@alchemy.franken.de> <8D01A854-97D9-4F1F-906A-7AB59BF8850B@distal.com> <6FC4189B-85FA-466F-AA00-C660E9C16367@distal.com> <20121230032403.GA29164@pix.net> <56B28B8A-2284-421D-A666-A21F995C7640@distal.com> <20130104234616.GA37999@alchemy.franken.de> <50F82846.6030104@FreeBSD.org> To: Andriy Gapon X-Mailer: Apple Mail (2.1499) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.2 (mail.distal.com [IPv6:2001:470:e24c:200::ae25]); Thu, 17 Jan 2013 19:49:26 -0500 (EST) Cc: "freebsd-fs@freebsd.org" , Kurt Lidl , "freebsd-sparc64@freebsd.org" , Marius Strobl X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 00:49:29 -0000 On Jan 17, 2013, at 11:35 , Andriy Gapon wrote: > on 08/01/2013 03:53 Chris Ross said the following: >> >> Out of curiosity, I did try 242229. It boots. So, the problem occurred with 242230, which >> came from 241289. FYI. > > Chris, > > thank you for triaging and analyzing this problem. And sorry for the long delay > (caused by the New Year craziness you mentioned earlier). > > The problem is that arch_zfs_probe methods are expected only to probe for ZFS > disks/partitions, but they are not allowed to execute any other ZFS operations. > I assumed this to be true and forgot to check sparc64_zfs_probe. Mea culpa. > > Could you please test the following patch? Thank you, Andriy. Much as you'd expect, that patch solves the problem.
I get some of the printf()s that I'd put into zfs_fmtdev(), and the system loads successfully. Please commit that patch, and if you could, change the comment just below the last portion of it that is now not quite accurate (since you moved the mentioned code). Thanks again! How long will this take to get to stable/9? Being new to FreeBSD, I'm not too familiar with the process of HEAD/stable/etc. (In NetBSD, it would be a commit followed by a pull request.) - Chris From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 02:23:43 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7A6D0669; Fri, 18 Jan 2013 02:23:43 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id E31D6654; Fri, 18 Jan 2013 02:23:42 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAG6x+FCDaFvO/2dsb2JhbABFhkW0CYN/c4IeAQEEASMEUgUWDgoCAg0ZAlkGiCYGqVORdoEjjwOBEwOIYY0riU2GfIMTggY X-IronPort-AV: E=Sophos;i="4.84,488,1355115600"; d="scan'208";a="9745603" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 17 Jan 2013 21:23:35 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id DC8CCB3F44; Thu, 17 Jan 2013 21:23:35 -0500 (EST) Date: Thu, 17 Jan 2013 21:23:35 -0500 (EST) From: Rick Macklem To: John Baldwin Message-ID: <460209850.2108683.1358475815866.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <201301171457.43800.jhb@freebsd.org> Subject: Re: [PATCH] Use vfs_timestamp() instead of getnanotime() in NFS MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org, Rick Macklem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 02:23:43 -0000 John Baldwin wrote: > On Tuesday, January 15, 2013 2:58:42 pm John Baldwin wrote: > > Fixing NFS to properly use vfs_timestamp() seems to be a larger > > project. > > Actually, I have a patch that I think does this below. It builds, have > not > yet booted it (but will do so in a bit). > > Index: fs/nfsclient/nfs_clstate.c > =================================================================== > --- fs/nfsclient/nfs_clstate.c (revision 245225) > +++ fs/nfsclient/nfs_clstate.c (working copy) > @@ -4611,7 +4611,7 @@ > } > dp = nfscl_finddeleg(clp, np->n_fhp->nfh_fh, np->n_fhp->nfh_len); > if (dp != NULL && (dp->nfsdl_flags & NFSCLDL_WRITE)) { > - NFSGETNANOTIME(&dp->nfsdl_modtime); > + vfs_timestamp(&dp->nfsdl_modtime); > dp->nfsdl_flags |= NFSCLDL_MODTIMESET; > } > NFSUNLOCKCLSTATE(); Not sure about this case. Although nfsdl_modtime is being set for local use, it replaces the mtime returned by the NFS server while the delegation is in use. Ideally it would be the same resolution as the NFS server, but that resolution isn't known to the client. (It is often better than 1sec, which is the default for vfs_timestamp().) I'd be tempted to leave it (although the function used by the macro might need to be changed, since Bruce mentions getnanotime() isn't supposed to be used?). 
> Index: fs/nfsclient/nfs_clvnops.c > =================================================================== > --- fs/nfsclient/nfs_clvnops.c (revision 245225) > +++ fs/nfsclient/nfs_clvnops.c (working copy) > @@ -3247,7 +3247,7 @@ > */ > mtx_lock(&np->n_mtx); > np->n_flag |= NACC; > - getnanotime(&np->n_atim); > + vfs_timestamp(&np->n_atim); > mtx_unlock(&np->n_mtx); > error = fifo_specops.vop_read(ap); > return error; > @@ -3266,7 +3266,7 @@ > */ > mtx_lock(&np->n_mtx); > np->n_flag |= NUPD; > - getnanotime(&np->n_mtim); > + vfs_timestamp(&np->n_mtim); > mtx_unlock(&np->n_mtx); > return(fifo_specops.vop_write(ap)); > } > @@ -3286,7 +3286,7 @@ > > mtx_lock(&np->n_mtx); > if (np->n_flag & (NACC | NUPD)) { > - getnanotime(&ts); > + vfs_timestamp(&ts); > if (np->n_flag & NACC) > np->n_atim = ts; > if (np->n_flag & NUPD) > Index: fs/nfsserver/nfs_nfsdport.c > =================================================================== > --- fs/nfsserver/nfs_nfsdport.c (revision 245225) > +++ fs/nfsserver/nfs_nfsdport.c (working copy) > @@ -1476,7 +1476,7 @@ > struct vattr va; > > VATTR_NULL(&va); > - getnanotime(&va.va_mtime); > + vfs_timestamp(&va.va_mtime); > (void) VOP_SETATTR(vp, &va, cred); > (void) nfsvno_getattr(vp, nvap, cred, p, 1); > } > @@ -2248,7 +2248,6 @@ > { > u_int32_t *tl; > struct nfsv2_sattr *sp; > - struct timeval curtime; > int error = 0, toclient = 0; > > switch (nd->nd_flag & (ND_NFSV2 | ND_NFSV3 | ND_NFSV4)) { > @@ -2307,9 +2306,7 @@ > toclient = 1; > break; > case NFSV3SATTRTIME_TOSERVER: > - NFSGETTIME(&curtime); > - nvap->na_atime.tv_sec = curtime.tv_sec; > - nvap->na_atime.tv_nsec = curtime.tv_usec * 1000; > + vfs_timestamp(&nvap->na_atime); > nvap->na_vaflags |= VA_UTIMES_NULL; > break; > }; > @@ -2321,9 +2318,7 @@ > nvap->na_vaflags &= ~VA_UTIMES_NULL; > break; > case NFSV3SATTRTIME_TOSERVER: > - NFSGETTIME(&curtime); > - nvap->na_mtime.tv_sec = curtime.tv_sec; > - nvap->na_mtime.tv_nsec = curtime.tv_usec * 1000; > + vfs_timestamp(&nvap->na_mtime); > 
if (!toclient) > nvap->na_vaflags |= VA_UTIMES_NULL; > break; > @@ -2353,7 +2348,6 @@ > u_char *cp, namestr[NFSV4_SMALLSTR + 1]; > uid_t uid; > gid_t gid; > - struct timeval curtime; > > error = nfsrv_getattrbits(nd, attrbitp, NULL, &retnotsup); > if (error) > @@ -2488,9 +2482,7 @@ > toclient = 1; > attrsum += NFSX_V4TIME; > } else { > - NFSGETTIME(&curtime); > - nvap->na_atime.tv_sec = curtime.tv_sec; > - nvap->na_atime.tv_nsec = curtime.tv_usec * 1000; > + vfs_timestamp(&nvap->na_atime); > nvap->na_vaflags |= VA_UTIMES_NULL; > } > break; > @@ -2515,9 +2507,7 @@ > nvap->na_vaflags &= ~VA_UTIMES_NULL; > attrsum += NFSX_V4TIME; > } else { > - NFSGETTIME(&curtime); > - nvap->na_mtime.tv_sec = curtime.tv_sec; > - nvap->na_mtime.tv_nsec = curtime.tv_usec * 1000; > + vfs_timestamp(&nvap->na_mtime); > if (!toclient) > nvap->na_vaflags |= VA_UTIMES_NULL; > } > Index: nfsclient/nfs_vnops.c > =================================================================== > --- nfsclient/nfs_vnops.c (revision 245225) > +++ nfsclient/nfs_vnops.c (working copy) > @@ -3458,7 +3458,7 @@ > */ > mtx_lock(&np->n_mtx); > np->n_flag |= NACC; > - getnanotime(&np->n_atim); > + vfs_timestamp(&np->n_atim); > mtx_unlock(&np->n_mtx); > error = fifo_specops.vop_read(ap); > return error; > @@ -3477,7 +3477,7 @@ > */ > mtx_lock(&np->n_mtx); > np->n_flag |= NUPD; > - getnanotime(&np->n_mtim); > + vfs_timestamp(&np->n_mtim); > mtx_unlock(&np->n_mtx); > return(fifo_specops.vop_write(ap)); > } > @@ -3497,7 +3497,7 @@ > > mtx_lock(&np->n_mtx); > if (np->n_flag & (NACC | NUPD)) { > - getnanotime(&ts); > + vfs_timestamp(&ts); > if (np->n_flag & NACC) > np->n_atim = ts; > if (np->n_flag & NUPD) > Index: nfsserver/nfs_srvsubs.c > =================================================================== > --- nfsserver/nfs_srvsubs.c (revision 245225) > +++ nfsserver/nfs_srvsubs.c (working copy) > @@ -1393,7 +1393,7 @@ > toclient = 1; > break; > case NFSV3SATTRTIME_TOSERVER: > - getnanotime(&(a)->va_atime); > + 
vfs_timestamp(&(a)->va_atime); > a->va_vaflags |= VA_UTIMES_NULL; > break; > } > @@ -1409,7 +1409,7 @@ > a->va_vaflags &= ~VA_UTIMES_NULL; > break; > case NFSV3SATTRTIME_TOSERVER: > - getnanotime(&(a)->va_mtime); > + vfs_timestamp(&(a)->va_mtime); > if (toclient == 0) > a->va_vaflags |= VA_UTIMES_NULL; > break; > > -- > John Baldwin Other than nfsdl_modtime, the rest look ok to me, since they are either the times for the special files in the client or timestamps for server file systems. rick From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 06:19:42 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 039A4B54; Fri, 18 Jan 2013 06:19:42 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au [211.29.132.194]) by mx1.freebsd.org (Postfix) with ESMTP id 8715DF66; Fri, 18 Jan 2013 06:19:40 +0000 (UTC) Received: from c211-30-173-106.carlnfd1.nsw.optusnet.com.au (c211-30-173-106.carlnfd1.nsw.optusnet.com.au [211.30.173.106]) by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r0I6JTZv001575 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 18 Jan 2013 17:19:30 +1100 Date: Fri, 18 Jan 2013 17:19:29 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Rick Macklem Subject: Re: [PATCH] Use vfs_timestamp() instead of getnanotime() in NFS In-Reply-To: <460209850.2108683.1358475815866.JavaMail.root@erie.cs.uoguelph.ca> Message-ID: <20130118165934.K1042@besplex.bde.org> References: <460209850.2108683.1358475815866.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=Zty1sKHG c=1 sm=1 a=kdfE0iePi98A:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=9IEQsz3md4oA:10 a=g0RNfcms4CO3QVWdHTYA:9 a=CjuIK1q_8ugA:10 a=TEtd8y5WR3g2ypngnwZWYw==:117 Cc: 
freebsd-fs@FreeBSD.org, Rick Macklem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 06:19:42 -0000 On Thu, 17 Jan 2013, Rick Macklem wrote: > John Baldwin wrote: >> On Tuesday, January 15, 2013 2:58:42 pm John Baldwin wrote: >>> Fixing NFS to properly use vfs_timestamp() seems to be a larger >>> project. >> >> Actually, I have a patch that I think does this below. It builds, have >> not >> yet booted it (but will do so in a bit). >> >> Index: fs/nfsclient/nfs_clstate.c >> =================================================================== >> --- fs/nfsclient/nfs_clstate.c (revision 245225) >> +++ fs/nfsclient/nfs_clstate.c (working copy) >> @@ -4611,7 +4611,7 @@ >> } >> dp = nfscl_finddeleg(clp, np->n_fhp->nfh_fh, np->n_fhp->nfh_len); >> if (dp != NULL && (dp->nfsdl_flags & NFSCLDL_WRITE)) { >> - NFSGETNANOTIME(&dp->nfsdl_modtime); >> + vfs_timestamp(&dp->nfsdl_modtime); >> dp->nfsdl_flags |= NFSCLDL_MODTIMESET; >> } >> NFSUNLOCKCLSTATE(); > Not sure about this case. Although nfsdl_modtime is being set for local > use, it replaces the mtime returned by the NFS server while the delegation > is in use. Ideally it would be the same resolution as the NFS server, but > that resolution isn't known to the client. (It is often better than 1sec, > which is the default for vfs_timestamp().) The patch seems about right except for this. > I'd be tempted to leave it (although the function used by the macro might > need to be changed, since Bruce mentions getnanotime() isn't supposed to > be used?). For maximal precision and accuracy, nanotime() should be used. I'm not sure if you need to be at least as precise and accurate as the server. Having them synced to nanosecond accuracy is impossible, but getnanotime() gives <= 1/HZ of accuracy and it is easy for them to be synced with more accuracy than that.
Then the extra accuracy can be seen in server timestamps if the server is FreeBSD and uses vfs_timestamp() with either microtime() or nanotime(). Further style fixes: - remove the NFSGETNANOTIME() macro. It is only used in the above, and in 3 other instances where its use is bogus because only the seconds part is used. The `time_second' global gives the seconds part with the same (in)accuracy as getnanotime(). If you want maximal accuracy for just the seconds part, then bintime() should be used (this is slightly faster than microtime() and nanotime()). (get*time()'s seconds part is the same as time_second. This is inaccurate since it lags bintime()'s seconds part by up to 1/HZ seconds (so it differs by a full second for an average of one in every HZ readings). The difference is visible if one reader, say make(1) reads the time using bintime() while another reader, say vfs_timestamp() reads the time using getbintime().) >> Index: nfsserver/nfs_srvsubs.c >> =================================================================== >> --- nfsserver/nfs_srvsubs.c (revision 245225) >> +++ nfsserver/nfs_srvsubs.c (working copy) >> @@ -1393,7 +1393,7 @@ >> toclient = 1; >> break; >> case NFSV3SATTRTIME_TOSERVER: >> - getnanotime(&(a)->va_atime); >> + vfs_timestamp(&(a)->va_atime); >> a->va_vaflags |= VA_UTIMES_NULL; >> break; >> } >> @@ -1409,7 +1409,7 @@ >> a->va_vaflags &= ~VA_UTIMES_NULL; >> break; >> case NFSV3SATTRTIME_TOSERVER: >> - getnanotime(&(a)->va_mtime); >> + vfs_timestamp(&(a)->va_mtime); >> if (toclient == 0) >> a->va_vaflags |= VA_UTIMES_NULL; >> break; - parenthesizing 'a' is bogus.
Bruce From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 11:26:37 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DC6FB473; Fri, 18 Jan 2013 11:26:37 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id 4573AF7B; Fri, 18 Jan 2013 11:26:37 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3YnfxL12vTz7ySF; Fri, 18 Jan 2013 12:26:30 +0100 (CET) Date: Fri, 18 Jan 2013 12:26:30 +0100 From: Nicolas Rachinsky To: Artem Belevich Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130118112630.GA41074@mid.pc5.i.0x5.de> References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <20130116073759.GA47781@mid.pc5.i.0x5.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 11:26:37 -0000 * Artem Belevich [2013-01-16 00:45 -0800]: > On Tue, Jan 15, 2013 at 11:37 PM, Nicolas Rachinsky > wrote: > >> You may want to update your system to very recent FreeBSD as quite a > >> few fixes were recently imported from illumos. Hopefully it will deal > >> with the issue. I'm out of ideas otherwise. Sorry. > > > > Do you mean -CURRENT or -STABLE with very recent? Or just 9.1? > > -HEAD or -STABLE (-8 or -9). 
I have now updated the machine to stable/8 r245541. I have not updated the zpool. But the problem still occurs. Should I update the pool? Or try other things first? Nicolas -- http://www.rachinsky.de/nicolas From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 13:07:25 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9EC36A21 for ; Fri, 18 Jan 2013 13:07:25 +0000 (UTC) (envelope-from dppascual@gmail.com) Received: from mail-ie0-f170.google.com (mail-ie0-f170.google.com [209.85.223.170]) by mx1.freebsd.org (Postfix) with ESMTP id 79C4577B for ; Fri, 18 Jan 2013 13:07:25 +0000 (UTC) Received: by mail-ie0-f170.google.com with SMTP id k10so6353951iea.29 for ; Fri, 18 Jan 2013 05:07:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to :content-type; bh=wRO4nsto0fLyZC7+GaptBc0TY74Iv8a2Mah8PyEb/6E=; b=wHtp/NDI9OnG1B7+JFGvF/3MzZwRD0UrUXo0jrriGvePRHcoXjVykqVL7mQkKROWGz ZRNfSk95zk7dCpQQk8g9On7cJCYSMEnRHgMyetKPPGH4n8Oi1XzVK4CJSVP+qPtSoVtV 1e56dBqMnDOkbWXUzUNbppWbRW2s5H38o8/eoUdY2R6dzpXEjjF8u/43Ey9oXeor+WaB yPqcDB0zgIA1B9vbpGo6ZS7y3bdgkNXTE8QVrDI6NWknhCl4qRSXJhwXWv6fOk64ES9W HQBZqriv221SHKWHjZdm0iHs1Go3tY8OehT0X0KbyAWbke0squdna0qhnmaDPFGqVfZD BHMw== MIME-Version: 1.0 X-Received: by 10.50.16.210 with SMTP id i18mr2000604igd.53.1358514439132; Fri, 18 Jan 2013 05:07:19 -0800 (PST) Received: by 10.50.153.168 with HTTP; Fri, 18 Jan 2013 05:07:19 -0800 (PST) Date: Fri, 18 Jan 2013 14:07:19 +0100 Message-ID: Subject: Enable UNMAP in ZFS From: Dani To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 
Jan 2013 13:07:25 -0000 Hi all, I have installed FreeBSD 9.1 and have created a ZFS pool with SAS disks. How can I enable UNMAP on SSDs devices used as cache and log devices? Thanks you. Regards. From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 14:23:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E493ED43 for ; Fri, 18 Jan 2013 14:23:10 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-vc0-f169.google.com (mail-vc0-f169.google.com [209.85.220.169]) by mx1.freebsd.org (Postfix) with ESMTP id A138DA9D for ; Fri, 18 Jan 2013 14:23:10 +0000 (UTC) Received: by mail-vc0-f169.google.com with SMTP id gb23so3755736vcb.0 for ; Fri, 18 Jan 2013 06:23:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=6bJ80RVz1XvB/mB+9S14OvcMRoTrYHLmWL0plk9ncx8=; b=QOGybnry5ZmdcYTxRQ4aFHv7WPQIX6pLVnU9qW1wfQbvCJnMHWhoRJRCvSsZkQNvvH a/LD2fOFLtyh0o8gQTvVNNLP4UETCzmNR32PUwONjo8/MV7vEjzSfiWkLKdoiv2SEryw sWaU4v/DqlSSYvFXjGMSBWLpXR/0Z82HjpNdT4J4g1wgQFecvNEJfH01j8kiuPvd+Smt B77KjM4dNOHJDfA3tAG5O5pe9Gfb1Np+4lqGjgQlT3UpK3z0cQQB4nPLzLjcjrOmtdB0 M9WWRnzTPF4ICebV7SkuKTEhNYUAfaVvaT0u2izoO442vPYzUQOftKwFIC5dTWJPE1jv BrPg== MIME-Version: 1.0 X-Received: by 10.52.175.106 with SMTP id bz10mr8397064vdc.125.1358518983967; Fri, 18 Jan 2013 06:23:03 -0800 (PST) Received: by 10.58.145.196 with HTTP; Fri, 18 Jan 2013 06:23:03 -0800 (PST) In-Reply-To: References: Date: Fri, 18 Jan 2013 14:23:03 +0000 Message-ID: Subject: Re: Enable UNMAP in ZFS From: Tom Evans To: Dani Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 
2013 14:23:11 -0000 On Fri, Jan 18, 2013 at 1:07 PM, Dani wrote: > Hi all, > > I have installed FreeBSD 9.1 and have created a ZFS pool with SAS disks. > How can I enable UNMAP on SSDs devices used as cache and log devices? > > Thanks you. Regards. UNMAP, I don't know. FreeBSD ZFS has support for TRIM in 10-CURRENT. Oracle support UNMAP in Solaris 11.1, but judging from this, it's not that useful: http://docs.oracle.com/cd/E26502_01/html/E28978/gmibl.html Cheers Tom From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 14:57:43 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 195AFA6B for ; Fri, 18 Jan 2013 14:57:43 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.78]) by mx1.freebsd.org (Postfix) with ESMTP id C9FB2DF0 for ; Fri, 18 Jan 2013 14:57:42 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1TwDNy-0004Lu-Tr for freebsd-fs@freebsd.org; Fri, 18 Jan 2013 15:57:40 +0100 Received: from [81.21.138.17] (helo=ronaldradial.versatec.local) by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1TwDNz-00017M-1v for freebsd-fs@freebsd.org; Fri, 18 Jan 2013 15:57:39 +0100 Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org Subject: Re: Enable UNMAP in ZFS References: Date: Fri, 18 Jan 2013 15:57:37 +0100 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Ronald Klop" Message-ID: In-Reply-To: User-Agent: Opera Mail/12.12 (Win32) X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: / X-Spam-Score: -0.0 X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40 autolearn=disabled version=3.3.1 X-Scan-Signature: c74461a82029b6293650421ecb57b64a 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 14:57:43 -0000 On Fri, 18 Jan 2013 15:23:03 +0100, Tom Evans wrote: > On Fri, Jan 18, 2013 at 1:07 PM, Dani wrote: >> Hi all, >> >> I have installed FreeBSD 9.1 and have created a ZFS pool with SAS disks. >> How can I enable UNMAP on SSDs devices used as cache and log devices? >> >> Thanks you. Regards. > > UNMAP, I don't know. FreeBSD ZFS has support for TRIM in 10-CURRENT. > > Oracle support UNMAP in Solaris 11.1, but judging from this, it's not > that useful: > > http://docs.oracle.com/cd/E26502_01/html/E28978/gmibl.html > > Cheers > > Tom Isn't UNMAP the SCSI name for TRIM? Ronald. From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 15:02:01 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5707DCE6 for ; Fri, 18 Jan 2013 15:02:01 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from mail-gw14.york.ac.uk (mail-gw14.york.ac.uk [144.32.129.164]) by mx1.freebsd.org (Postfix) with ESMTP id 0EE57E3A for ; Fri, 18 Jan 2013 15:02:00 +0000 (UTC) Received: from ury.york.ac.uk ([144.32.108.81]:37640) by mail-gw14.york.ac.uk with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1TwDSB-0006Qv-KY for freebsd-fs@FreeBSD.org; Fri, 18 Jan 2013 15:01:59 +0000 Date: Fri, 18 Jan 2013 15:01:59 +0000 (GMT) From: Gavin Atkinson X-X-Sender: gavin@thunderhorn.york.ac.uk To: freebsd-fs@FreeBSD.org Subject: ZFS lock up 9-stable r244911 (Jan) Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 15:02:01 -0000 Hi all, I have a machine on which ZFS appears to have locked up, along with any processes that attempt to access the ZFS filesystem. This machine is running 9-stable amd64 r244911 (though from cvs, not SVN), and therefore I believe has all of avg's ZFS deadlock patches. This machine has both UFS and ZFS filesystems. All of the "system" filesystems are on UFS, and as a result the machine itself is responsive and I can investigate state using kgdb against the live kernel. I've included all thread backtraces, a couple of other bits relating to held locks, and ps/sysctl output at http://people.freebsd.org/~gavin/tay-zfs-hang.txt http://people.freebsd.org/~gavin/tay-sysctl-a.txt http://people.freebsd.org/~gavin/tay-ps-auxwwwH.txt This machine was in use as a pkgng package builder, using poudriere. Poudriere makes heavy use of zfs filesystems within jails, including "zfs get", "zfs set", "zfs snapshot", "zfs rollback", "zfs diff" and other commands, although there do not appear to be any instances of the zfs process running currently. At the time of the hang, 16 parallel builds were in progress. The underlying disk subsystem is a single hardware RAID-10 on a twa controller, and the zpool is on a single partition of this device. The RAID-10 itself is intact, and the controller reports no errors. There is no L2ARC or separate ZIL. The UFS filesystems (which still seem to be fully functional) are on separate partitions on the same underlying device, so I do not believe the underlying storage is having issues. I can "dd" from the underlying ZFS partition without problem. Nothing has been logged to /var/log/messages. I can keep this machine in this state for a couple of days, so I can get further details as required. I am happy to work with somebody to diagnose this hang further. Note, however, that the kernel does not have WITNESS etc. compiled in.
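For anyone wanting to collect similar state from a hung machine, a minimal sketch of one way to do it (not necessarily how the files above were generated; the output filenames are hypothetical, and procstat is FreeBSD-specific, so the script only prints a note elsewhere):

```shell
#!/bin/sh
# Gather the same kind of hang-debugging state as the files linked above.
if command -v procstat >/dev/null 2>&1; then
    procstat -kk -a > /tmp/thread-backtraces.txt  # kernel stacks of all threads
    ps auxwwwH      > /tmp/ps-auxwwwH.txt         # every thread, wide output
    sysctl -a       > /tmp/sysctl-a.txt
    echo "hang state written to /tmp"
else
    echo "procstat not found; this sketch targets FreeBSD"
fi
```

Unlike kgdb against the live kernel, this needs no debug symbols, which makes it an easy first step before deeper inspection.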
Thanks, Gavin From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 15:17:16 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 83C63411 for ; Fri, 18 Jan 2013 15:17:16 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id 62FB5EFC for ; Fri, 18 Jan 2013 15:17:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=In-Reply-To:Message-Id:From:Mime-Version:Date:References:Subject:To:Content-Type; bh=+CTXW3J+6nDt8B1ALMKX4+D7hvMTHe20zAhXlL4DW6A=; b=RULViNeCZY/QgdhfR55BdLctdYeXw9FQ0GVKBHdMCYF+GHrkpGrc/46abbVSV/J0TebTR9jVLOmmtUFyjXjMiwMhE2MYdKa+JEd8mJdRayQMADrE2J14nUf7jaG/TIGh; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1TwDgy-0008vp-5U; Fri, 18 Jan 2013 09:17:16 -0600 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpsa id 1358522210-12155-89420/5/1; Fri, 18 Jan 2013 15:16:50 +0000 Content-Type: text/plain; format=flowed; delsp=yes To: freebsd-fs@freebsd.org, Dani Subject: Re: Enable UNMAP in ZFS References: Date: Fri, 18 Jan 2013 09:16:50 -0600 Mime-Version: 1.0 From: Mark Felder Message-Id: In-Reply-To: User-Agent: Opera Mail/12.12 (FreeBSD) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 15:17:16 -0000 On Fri, 18 Jan 2013 07:07:19 -0600, Dani wrote: > > I have installed FreeBSD 9.1 and have created a ZFS pool with SAS disks. > How can I enable UNMAP on SSDs devices used as cache and log devices? By UNMAP do you mean TRIM? 
From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 16:20:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E570FF4E for ; Fri, 18 Jan 2013 16:20:14 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vc0-f175.google.com (mail-vc0-f175.google.com [209.85.220.175]) by mx1.freebsd.org (Postfix) with ESMTP id 9F4C4311 for ; Fri, 18 Jan 2013 16:20:14 +0000 (UTC) Received: by mail-vc0-f175.google.com with SMTP id fw7so1100534vcb.6 for ; Fri, 18 Jan 2013 08:20:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=rIRTqlKhtT5UIxfIX8loXRz8bC96UlBEIAIjFFMi9ug=; b=LFFVWTakfpOAdW4YlFCqtfYfaBxN2Tba6SIoCGdbBkoGW/yIb6grhhUXM5MN0rPbDz QiovAzUUMqJvzb+u0XTL+MuUcqniLs1qazUSmxxUHnt1crd3OhGDlCRj7IUZKHVaPshA lLAgDkmUvKBgaCXY3u+MmgsvRx8R2G/CVVS/pFSil/uCH1XIzptW9D36+xpq3mKj5JR0 HU1Gh7sSdvy3ps00dyL+YJanb21djpMkcaQCgRjM9OO2Z0xi6dT7fYRKu+CiFn6XVAhR EvdLMFquQqTLRO+wrcUWkNUldSCf/xAzM/CuaVCABsDxHbpZJyl0GpvK3Au9PxnRsqAi Dq3Q== MIME-Version: 1.0 X-Received: by 10.52.74.38 with SMTP id q6mr9240240vdv.17.1358526008608; Fri, 18 Jan 2013 08:20:08 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.122.196 with HTTP; Fri, 18 Jan 2013 08:20:08 -0800 (PST) In-Reply-To: <20130118112630.GA41074@mid.pc5.i.0x5.de> References: <20130114094010.GA75529@mid.pc5.i.0x5.de> <20130114195148.GA20540@mid.pc5.i.0x5.de> <20130114214652.GA76779@mid.pc5.i.0x5.de> <20130115224556.GA41774@mid.pc5.i.0x5.de> <20130116073759.GA47781@mid.pc5.i.0x5.de> <20130118112630.GA41074@mid.pc5.i.0x5.de> Date: Fri, 18 Jan 2013 08:20:08 -0800 X-Google-Sender-Auth: 70rmbILCuVZFPnvxl2-xn-17Xew Message-ID: Subject: Re: slowdown of zfs (tx->tx) From: Artem Belevich To: Nicolas Rachinsky Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 16:20:15 -0000 On Fri, Jan 18, 2013 at 3:26 AM, Nicolas Rachinsky wrote: > * Artem Belevich [2013-01-16 00:45 -0800]: >> On Tue, Jan 15, 2013 at 11:37 PM, Nicolas Rachinsky >> wrote: >> >> You may want to update your system to very recent FreeBSD as quite a >> >> few fixes were recently imported from illumos. Hopefully it will deal >> >> with the issue. I'm out of ideas otherwise. Sorry. >> > >> > Do you mean -CURRENT or -STABLE with very recent? Or just 9.1? >> >> -HEAD or -STABLE (-8 or -9). > > I have now updated the machine to stable/8 r245541. I have not updated > the zpool. > > But the problem still occurs. Should I update the pool? Or try other > things first? Updating the pool is an irreversible operation. In general I'd suggest trying less drastic options first. Other people have suggested that the problem may be just a side effect of an almost-full filesystem. ZFS needs a fair amount of unfragmented free space in order to work efficiently. If that's what's causing your problem, then one thing to try would be to free up enough space. The gotcha there is that you need to free up enough contiguous space. Removing a bunch of recently written files may not help, as those writes would have happened on an already fragmented FS. Removing files written when the FS had a lot of free space has a better chance of freeing contiguous space. Old snapshots are good candidates for this. Other than that I'm out of ideas. Sorry.
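To illustrate the "destroy old snapshots first" idea, a minimal sketch. The pool and snapshot names below are made up, and the `zfs destroy` commands are only printed for review, never executed; on a real system the listing would come from `zfs list -H -t snapshot -o name -s creation`:

```shell
#!/bin/sh
# Select the N oldest snapshots from a listing that is already sorted
# oldest-first (which "zfs list -s creation" guarantees).
oldest_snapshots() {
    head -n "$1"
}

# Hypothetical sample listing standing in for real zfs(8) output:
printf 'tank@2012-06\ntank@2012-12\ntank@2013-01\n' |
oldest_snapshots 2 |
while read -r snap; do
    # Dry run: print the commands instead of running them.
    echo "zfs destroy $snap"
done
```

Reviewing the printed commands before running them matters here, since destroying a snapshot, like upgrading a pool, is irreversible.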
--Artem From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 17:13:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7A0205A2; Fri, 18 Jan 2013 17:13:18 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 500B388F; Fri, 18 Jan 2013 17:13:18 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id C3F63B993; Fri, 18 Jan 2013 12:13:17 -0500 (EST) From: John Baldwin To: Bruce Evans Subject: Re: [PATCH] Use vfs_timestamp() instead of getnanotime() in NFS Date: Fri, 18 Jan 2013 12:12:41 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <460209850.2108683.1358475815866.JavaMail.root@erie.cs.uoguelph.ca> <20130118165934.K1042@besplex.bde.org> In-Reply-To: <20130118165934.K1042@besplex.bde.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301181212.41321.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 18 Jan 2013 12:13:17 -0500 (EST) Cc: freebsd-fs@freebsd.org, Rick Macklem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 17:13:18 -0000 On Friday, January 18, 2013 1:19:29 am Bruce Evans wrote: > On Thu, 17 Jan 2013, Rick Macklem wrote: > > > John Baldwin wrote: > >> On Tuesday, January 15, 2013 2:58:42 pm John Baldwin wrote: > >>> Fixing NFS to properly use vfs_timestamp() seems to be a larger > >>> project. > >> > >> Actually, I have a patch that I think does this below. 
It builds, have > >> not > >> yet booted it (but will do so in a bit). > >> > >> Index: fs/nfsclient/nfs_clstate.c > >> =================================================================== > >> --- fs/nfsclient/nfs_clstate.c (revision 245225) > >> +++ fs/nfsclient/nfs_clstate.c (working copy) > >> @@ -4611,7 +4611,7 @@ > >> } > >> dp = nfscl_finddeleg(clp, np->n_fhp->nfh_fh, np->n_fhp->nfh_len); > >> if (dp != NULL && (dp->nfsdl_flags & NFSCLDL_WRITE)) { > >> - NFSGETNANOTIME(&dp->nfsdl_modtime); > >> + vfs_timestamp(&dp->nfsdl_modtime); > >> dp->nfsdl_flags |= NFSCLDL_MODTIMESET; > >> } > >> NFSUNLOCKCLSTATE(); > > Not sure about this case. Although nfsdl_modtime is being set for local > > use, it replaces the mtime returned by the NFS server while the delegation > > is in use. Ideally it would be the same resolution as the NFS server, but > > that resolution isn't known to the client. (It is often better than 1sec, > > which is the default for vfs_timestamp().) > > The patch seems about right except for this. > > > I'd be tempted to leave it (although the function used by the macro might > > need to be changed, since Bruce mentions getnanotime() isn't supposed to > > be used?). > > For maximal precision and accuracy, it nanotime() should be used. I'm > not sure if you need to be at least as precise and accurate as the server. > Having them synced to nanoseconds accuracy is impossible, but > getnanotime() gives <= 1/HZ of accuracy and it is easy for them to be > synced with more accuracy than that. Then the extra accuracy can be > seen in server timestamps if the server is FreeBSD and uses vfs_timestamp() > with a either microtime() or nanotime(). I've certainly seen NFS servers use much more finely-grained VFS timestamps (e.g. Isilon nodes run with vfs.timestamp_precision of 2 or 3 so they give more precise timestamps than just getnanotime()). OTOH, clock drift between the client and server could easily screw this up. 
I will leave this as-is for now and just commit the vfs_timestamp() changes first. > Further style fixes: > - remove the NFSGETNANOTIME() macro. It is only used in the above, and > in 3 other instances where its use is bogus because only the seconds > part is used. The `time_second' global gives seconds part with the > same (in)accuracy as getnanotime(). If you want maximal accuracy > for just the seconds part, then bintime() should be used (this is > slightly faster than microtime() and nanotime(). > (get*time()'s seconds part is the same as time_second. This > inaccurate since it lags bintime()'s seconds part by up to 1/HZ > seconds (so it differs by a full second for an everage of 1 one > in every HZ readings). The difference is visible if one reader, > say make(1) reads the time using bintime() while another reader, > say vfs_timestamp() reads the time using getbintime().) Yes, I wondered if I could replace those with time_second. The same is true of the remaining uses of NFSGETTIME() as they also only use the seconds portion. 
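For reference, the server-side precision mentioned above is a runtime tunable. A sketch of checking and (commented out) raising it; this is FreeBSD-specific, so the script degrades to a note elsewhere, and the 0-3 meanings are as I recall them from vfs_timestamp():

```shell
#!/bin/sh
# Inspect the vfs.timestamp_precision sysctl discussed above.
# 0 = seconds, 1 = seconds + 1/HZ, 2 = microseconds, 3 = nanoseconds.
if sysctl -n vfs.timestamp_precision >/dev/null 2>&1; then
    sysctl vfs.timestamp_precision
    # Raising it to nanosecond resolution would be:
    #   sysctl vfs.timestamp_precision=3
else
    echo "vfs.timestamp_precision not available (not a FreeBSD kernel?)"
fi
```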
-- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 17:34:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 52768F9E for ; Fri, 18 Jan 2013 17:34:56 +0000 (UTC) (envelope-from dppascual@gmail.com) Received: from mail-ia0-x235.google.com (mail-ia0-x235.google.com [IPv6:2607:f8b0:4001:c02::235]) by mx1.freebsd.org (Postfix) with ESMTP id 26E05B04 for ; Fri, 18 Jan 2013 17:34:56 +0000 (UTC) Received: by mail-ia0-f181.google.com with SMTP id k25so1612311iah.26 for ; Fri, 18 Jan 2013 09:34:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=yAs92Pf1RYhw4xjC1znAMcZusfhaW0X7rzO/JHNdxZo=; b=OLDFxdNJDF1eOODchXq3yX9K4uLtwKHQDDTKMh931fb4DDuow9071/O9g1iAWn6h5+ LOdkw0D2VVMzSXR9kkm8LRcpPgkoXCRvPNnZrc3Op/pWJV/yKqmbznW9+NF/WxSndp8F XLzbtLQiieryYQxj907+D93kNuIFROJ1OhZsAA2T5Bt31xKeZqeSY2GLveQKHsCeKQCr VUFJ/grA8SIbVYI6s+LdxjMLxpSUtVjG0c5Rut+sljSl+8b6I1TTST1PxdZjr0BRtAIG 7pzQor3ii91WKgtZSGwTlYVw9OXRlTitFx+3VXF5jAlRQ8gGvKXfowph5kvrAT5kwJJC 6QYQ== MIME-Version: 1.0 X-Received: by 10.42.98.80 with SMTP id r16mr6556503icn.45.1358530495865; Fri, 18 Jan 2013 09:34:55 -0800 (PST) Received: by 10.50.153.168 with HTTP; Fri, 18 Jan 2013 09:34:55 -0800 (PST) Received: by 10.50.153.168 with HTTP; Fri, 18 Jan 2013 09:34:55 -0800 (PST) In-Reply-To: References: Date: Fri, 18 Jan 2013 18:34:55 +0100 Message-ID: Subject: Re: Enable UNMAP in ZFS From: Dani To: Mark Felder Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Fri, 18 Jan 2013 17:34:56 -0000 Hi all, For SATA the command is called TRIM, for SAS the command is called UNMAP. Regards El 18/01/2013 16:17, "Mark Felder" escribió: > On Fri, 18 Jan 2013 07:07:19 -0600, Dani wrote: > > >> I have installed FreeBSD 9.1 and have created a ZFS pool with SAS disks. >> How can I enable UNMAP on SSDs devices used as cache and log devices? >> > > By UNMAP do you mean TRIM? > From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 18:13:02 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id AB56C6E5 for ; Fri, 18 Jan 2013 18:13:02 +0000 (UTC) (envelope-from c.kworr@gmail.com) Received: from mail-lb0-f179.google.com (mail-lb0-f179.google.com [209.85.217.179]) by mx1.freebsd.org (Postfix) with ESMTP id 30AFDDB0 for ; Fri, 18 Jan 2013 18:13:01 +0000 (UTC) Received: by mail-lb0-f179.google.com with SMTP id gm13so2864472lbb.24 for ; Fri, 18 Jan 2013 10:13:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; bh=NqvEgT0z3dXu1S4xhgTkhvadPz3+d0KGAPA65ZlgwGA=; b=PBJEVYg+Ty/u20Z5WOhV+6HR4stfAR98Bjf3z6SGpJDXYYGC2YHyMMigEPX3ngR/ZJ rVCIftCxKwU5SjHQC6HH33oFnCRSls3gkmpabQznX0e1zuEJvn6Nj5VLcQtSMJ9wG7jr MOh3XpWVI0eRR0/wGNwcqfs4BJNSPtXcTbw+icUWo5guq5mJGZoecc2yIfIYE0inu2QK 6iOalB2tVKhRk8KUeFZNl49/ol2dymvgCxXh29gwOLje732ubpjau6uD1gmTTXf4RAKU zeTdb5oby4F0gVQkfujRDVnQQddebIIihTYB/Xrk6Sqw+Pogv6c4kiH6BhOBdcUS9zcP 4/TA== X-Received: by 10.112.26.169 with SMTP id m9mr4187766lbg.116.1358532780675; Fri, 18 Jan 2013 10:13:00 -0800 (PST) Received: from [192.168.1.130] (mau.donbass.com.
[92.242.127.250]) by mx.google.com with ESMTPS id j9sm1878629lbd.13.2013.01.18.10.12.57 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 18 Jan 2013 10:12:58 -0800 (PST) Message-ID: <50F990A9.1030305@gmail.com> Date: Fri, 18 Jan 2013 20:12:57 +0200 From: Volodymyr Kostyrko User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:18.0) Gecko/20100101 Firefox/18.0 SeaMonkey/2.15 MIME-Version: 1.0 To: freebsd-fs@FreeBSD.org Subject: lz4 support for ZFS Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 18:13:02 -0000 Hi all. I see LZ4 is now supported in head. Can I ask is there any plans MFC'ing it to stable? -- Sphinx of black quartz, judge my vow. From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 23:29:41 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id BCC5D87B for ; Fri, 18 Jan 2013 23:29:41 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id 94D89DE0 for ; Fri, 18 Jan 2013 23:29:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=Message-Id:Cc:To:Date:From:Subject:Content-Type:Mime-Version:References:In-Reply-To; bh=zLLjRUWU+o4bf88d1fwLrgLLUusiw4+FIittDc+uuv0=; b=JAB3p+ApMf3GLPWrQrN4/U9jamoI6c9OrG5N8TrvJwZOj0K+JdjSHgnFoXMq5Sbo1MKlUuLt6QtcavGKD0sGO8FiTJdOqeO3+Mx7W1BscrSEyLQZhzvOlSkRRtpopl01; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1TwLNU-0005rb-BP; Fri, 18 Jan 2013 17:29:40 -0600 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpsa 
id 1358551774-40632-89420/5/3; Fri, 18 Jan 2013 23:29:34 +0000 User-Agent: K-9 Mail for Android In-Reply-To: References: Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Subject: Re: Enable UNMAP in ZFS From: Mark Felder Date: Fri, 18 Jan 2013 11:59:37 -0600 To: Dani Message-Id: Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 23:29:41 -0000 I believe this requires 9-STABLE or 10 From owner-freebsd-fs@FreeBSD.ORG Fri Jan 18 23:36:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3A8F7956; Fri, 18 Jan 2013 23:36:27 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id C38ADE18; Fri, 18 Jan 2013 23:36:26 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAJDb+VCDaFvO/2dsb2JhbABEhkW0D4N8c4IeAQEFIwRSGw4KAgINGQJZBogsqlKRaYEjjwOBEwOIYY0riU2GfIMTggY X-IronPort-AV: E=Sophos;i="4.84,495,1355115600"; d="scan'208";a="12685634" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 18 Jan 2013 18:36:20 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 2D267B3F0D; Fri, 18 Jan 2013 18:36:20 -0500 (EST) Date: Fri, 18 Jan 2013 18:36:20 -0500 (EST) From: Rick Macklem To: John Baldwin Message-ID: <1309255502.2141506.1358552180122.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <201301181212.41321.jhb@freebsd.org> Subject: Re: [PATCH] Use vfs_timestamp() instead of getnanotime() in NFS MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit 
X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org, Rick Macklem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Jan 2013 23:36:27 -0000 John Baldwin wrote: > On Friday, January 18, 2013 1:19:29 am Bruce Evans wrote: > > On Thu, 17 Jan 2013, Rick Macklem wrote: > > > > > John Baldwin wrote: > > >> On Tuesday, January 15, 2013 2:58:42 pm John Baldwin wrote: > > >>> Fixing NFS to properly use vfs_timestamp() seems to be a larger > > >>> project. > > >> > > >> Actually, I have a patch that I think does this below. It builds, > > >> have > > >> not > > >> yet booted it (but will do so in a bit). > > >> > > >> Index: fs/nfsclient/nfs_clstate.c > > >> =================================================================== > > >> --- fs/nfsclient/nfs_clstate.c (revision 245225) > > >> +++ fs/nfsclient/nfs_clstate.c (working copy) > > >> @@ -4611,7 +4611,7 @@ > > >> } > > >> dp = nfscl_finddeleg(clp, np->n_fhp->nfh_fh, np->n_fhp->nfh_len); > > >> if (dp != NULL && (dp->nfsdl_flags & NFSCLDL_WRITE)) { > > >> - NFSGETNANOTIME(&dp->nfsdl_modtime); > > >> + vfs_timestamp(&dp->nfsdl_modtime); > > >> dp->nfsdl_flags |= NFSCLDL_MODTIMESET; > > >> } > > >> NFSUNLOCKCLSTATE(); > > > Not sure about this case. Although nfsdl_modtime is being set for > > > local > > > use, it replaces the mtime returned by the NFS server while the > > > delegation > > > is in use. Ideally it would be the same resolution as the NFS > > > server, but > > > that resolution isn't known to the client. (It is often better > > > than 1sec, > > > which is the default for vfs_timestamp().) > > > > The patch seems about right except for this. 
> > > > > I'd be tempted to leave it (although the function used by the > > > macro might > > > need to be changed, since Bruce mentions getnanotime() isn't > > > supposed to > > > be used?). > > > > For maximal precision and accuracy, it nanotime() should be used. > > I'm > > not sure if you need to be at least as precise and accurate as the > > server. > > Having them synced to nanoseconds accuracy is impossible, but > > getnanotime() gives <= 1/HZ of accuracy and it is easy for them to > > be > > synced with more accuracy than that. Then the extra accuracy can be > > seen in server timestamps if the server is FreeBSD and uses > > vfs_timestamp() > > with a either microtime() or nanotime(). > > I've certainly seen NFS servers use much more finely-grained VFS > timestamps > (e.g. Isilon nodes run with vfs.timestamp_precision of 2 or 3 so they > give > more precise timestamps than just getnanotime()). OTOH, clock drift > between > the client and server could easily screw this up. I will leave this > as-is > for now and just commit the vfs_timestamp() changes first. > > > Further style fixes: > > - remove the NFSGETNANOTIME() macro. It is only used in the above, > > and > > in 3 other instances where its use is bogus because only the > > seconds > > part is used. The `time_second' global gives seconds part with > > the > > same (in)accuracy as getnanotime(). If you want maximal accuracy > > for just the seconds part, then bintime() should be used (this is > > slightly faster than microtime() and nanotime(). > > (get*time()'s seconds part is the same as time_second. This > > inaccurate since it lags bintime()'s seconds part by up to 1/HZ > > seconds (so it differs by a full second for an everage of 1 one > > in every HZ readings). The difference is visible if one reader, > > say make(1) reads the time using bintime() while another > > reader, > > say vfs_timestamp() reads the time using getbintime().) > > Yes, I wondered if I could replace those with time_second. 
The same > is true of the remaining uses of NFSGETTIME() as they also only use > the > seconds portion. > Those macros are just cruft left over from when the code was written to be portable between various BSDen. And what they were mapped to was just something that seemed to work. Feel free to replace them with whatever makes sense, rick > -- > John Baldwin From owner-freebsd-fs@FreeBSD.ORG Sat Jan 19 13:34:17 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2B8E81E4 for ; Sat, 19 Jan 2013 13:34:17 +0000 (UTC) (envelope-from prvs=17316559e7=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id C5CE0BE8 for ; Sat, 19 Jan 2013 13:34:16 +0000 (UTC) Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50001761944.msg for ; Sat, 19 Jan 2013 13:34:09 +0000 X-Spam-Processed: mail1.multiplay.co.uk, Sat, 19 Jan 2013 13:34:09 +0000 (not processed: message from valid local sender) X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=17316559e7=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-fs@freebsd.org Message-ID: From: "Steven Hartland" To: "Dani" , References: Subject: Re: Enable UNMAP in ZFS Date: Sat, 19 Jan 2013 13:34:38 -0000 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 19 Jan 2013 13:34:17 
-0000 As others have said ZFS TRIM support, which can be backed by SATA TRIM or SCSI UNMAP, is supported in head. All these require driver support for the underlying card, which currently means either CAM ata or scsi supported controllers. I do plan to MFC ZFS TRIM to stable but it could be a few weeks before I get to it. I also have a fairly extensive patch for cam which improves cam trim, unmap support if you want to test it. Regards Steve ----- Original Message ----- From: "Dani" To: Sent: Friday, January 18, 2013 1:07 PM Subject: Enable UNMAP in ZFS > Hi all, > > I have installed FreeBSD 9.1 and have created a ZFS pool with SAS disks. > How can I enable UNMAP on SSDs devices used as cache and log devices? > > Thanks you. Regards. > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
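For anyone testing the head code Steven mentions, a sketch of probing whether a kernel exposes the ZFS TRIM machinery. Both sysctl OID names below are assumptions based on the 10-CURRENT code of the time and may differ by revision:

```shell
#!/bin/sh
# Probe for ZFS TRIM sysctls; the OID names here are assumptions
# and may not match every head revision.
for oid in vfs.zfs.trim.enabled kstat.zfs.misc.zio_trim.success; do
    if v=$(sysctl -n "$oid" 2>/dev/null); then
        echo "$oid = $v"
    else
        echo "$oid not present (kernel without ZFS TRIM?)"
    fi
done
```

On a non-FreeBSD system, or a kernel built without the TRIM code, both probes simply report the OID as absent.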
From owner-freebsd-fs@FreeBSD.ORG Sat Jan 19 17:01:07 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F26E8F3D; Sat, 19 Jan 2013 17:01:06 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 96B057A5; Sat, 19 Jan 2013 17:01:05 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA00114; Sat, 19 Jan 2013 19:00:57 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1Twbmq-0004Mx-Pt; Sat, 19 Jan 2013 19:00:56 +0200 Message-ID: <50FAD145.10906@FreeBSD.org> Date: Sat, 19 Jan 2013 19:00:53 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Gavin Atkinson Subject: Re: ZFS lock up 9-stable r244911 (Jan) References: In-Reply-To: X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 19 Jan 2013 17:01:07 -0000 on 18/01/2013 17:01 Gavin Atkinson said the following: > > Hi all, > > I have a machine on which ZFS appears to have locked up, and any processes > that attempt to access the ZFS filesystem. This machine is running > 9-stable amd64 r244911 (though from cvs, not SVN), and therefore I believe > has all of avg's ZFS deadlock patches. > > This machine has both UFS and ZFS filesystems. 
> All of the "system" filesystems are on UFS, and as a result the machine
> itself is responsive and I can investigate state using kgdb against the
> live kernel. I've included all thread backtraces, a couple of other bits
> relating to held locks, and ps/sysctl output at
> http://people.freebsd.org/~gavin/tay-zfs-hang.txt
> http://people.freebsd.org/~gavin/tay-sysctl-a.txt
> http://people.freebsd.org/~gavin/tay-ps-auxwwwH.txt
>
> This machine was in use as a pkgng package builder, using poudriere.
> Poudriere makes heavy use of zfs filesystems within jails, including "zfs
> get", "zfs set", "zfs snapshot", "zfs rollback", "zfs diff" and other
> commands, although there do not appear to be any instances of the zfs
> process running currently. At the time of the hang 16 parallel builds
> were in progress.
>
> The underlying disk subsystem is a single hardware RAID-10 on a twa
> controller, and the zpool is on a single partition of this device. The
> RAID-10 itself is intact, and the controller reports no errors. There is
> no L2ARC or separate ZIL. The UFS filesystems (which still seem to be
> fully functional) are on separate partitions on the same underlying
> device, so I do not believe the underlying storage is having issues. I
> can "dd" from the underlying ZFS partition without problem. Nothing has
> been logged to /var/log/messages.

Based on the above information, plus some additional debugging information that Gavin has kindly provided to me, I have developed the following "theory" to explain this deadlock.

I believe that there was very high (overwhelmingly high) disk load before the deadlock occurred. Further, I think that there was a substantial number of high-priority writes. Under those conditions the number of in-progress/pending zio-s was constantly at zfs_vdev_max_pending (by default 10).
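To make the throttling behaviour concrete, here is a toy model (not ZFS code; the class and method names are invented for illustration) of a vdev I/O queue where at most zfs_vdev_max_pending operations may be in flight, so under sustained overload everything else piles up in the queue:

```python
from collections import deque

class VdevQueueModel:
    """Toy model of a vdev I/O queue with a cap on in-flight operations,
    mirroring zfs_vdev_max_pending = 10. Purely illustrative."""

    def __init__(self, max_pending=10):
        self.max_pending = max_pending
        self.pending = 0       # zio-s issued to the device
        self.queued = deque()  # zio-s waiting behind the cap

    def issue(self, zio):
        # A new zio starts immediately only if we are under the cap;
        # otherwise it waits, like the entries seen in vq_write_tree.
        if self.pending < self.max_pending:
            self.pending += 1
        else:
            self.queued.append(zio)

    def complete_one(self):
        # When a zio finishes, the next queued one is issued, so under
        # constant overload 'pending' stays pinned at max_pending.
        if self.pending:
            self.pending -= 1
        if self.queued and self.pending < self.max_pending:
            self.queued.popleft()
            self.pending += 1

q = VdevQueueModel()
for i in range(126):  # roughly the 10 pending + 116 queued zio-s seen here
    q.issue(f"zio-{i}")
print(q.pending, len(q.queued))  # 10 116
```

This matches the numbers in the vdev_queue dump that follows: vq_pending_tree pinned at the cap and over a hundred zio-s queued behind it.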
The number of queued zio-s was above one hundred:

vdev_queue = {
    vq_deadline_tree = {avl_root = 0xfffffe0338dbb248, avl_compar = 0xffffffff816855b0, avl_offset = 584, avl_numnodes = 116, avl_size = 896},
    vq_read_tree = {avl_root = 0xfffffe019d0b65b0, avl_compar = 0xffffffff81685600, avl_offset = 560, avl_numnodes = 8, avl_size = 896},
    vq_write_tree = {avl_root = 0xfffffe03e3d19230, avl_compar = 0xffffffff81685600, avl_offset = 560, avl_numnodes = 108, avl_size = 896},
    vq_pending_tree = {avl_root = 0xfffffe025e32c230, avl_compar = 0xffffffff81685600, avl_offset = 560, avl_numnodes = 10, avl_size = 896},
    vq_lock = {lock_object = {lo_name = 0xffffffff8172afc9 "vq->vq_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}},
vdev_cache = {
    vc_offset_tree = {avl_root = 0x0, avl_compar = 0xffffffff81681740, avl_offset = 24, avl_numnodes = 0, avl_size = 88},
    vc_lastused_tree = {avl_root = 0x0, avl_compar = 0xffffffff81681760, avl_offset = 48, avl_numnodes = 0, avl_size = 88}

Apparently processing of zio-s was lagging so far behind that some completed zio-s triggered the "late arrival" condition.
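Walls of near-identical kernel backtraces like the ones that follow are easier to digest once collapsed. A small hypothetical helper (the column layout, i.e. how many leading identification fields to skip, is an assumption for illustration; real procstat output can differ):

```python
from collections import Counter

def collapse_backtraces(lines, skip_cols=4):
    """Count how many threads share a call chain in 'procstat -kk'-style
    output. skip_cols = number of leading identification columns
    (PID, TID, command, thread name) -- an assumed layout."""
    counts = Counter()
    for line in lines:
        chain = " ".join(line.split()[skip_cols:])
        counts[chain] += 1
    return counts

# Three of the identical zio_write_intr_high traces, abbreviated:
traces = [
    "0 100432 kernel zio_write_intr_h mi_switch+0x186 txg_rele_to_sync+0x36",
    "0 100433 kernel zio_write_intr_h mi_switch+0x186 txg_rele_to_sync+0x36",
    "0 100434 kernel zio_write_intr_h mi_switch+0x186 txg_rele_to_sync+0x36",
]
for chain, n in collapse_backtraces(traces).most_common():
    print(f"{n} threads: {chain}")
```

Run against the full dump, this kind of summary shows at a glance that all five zio_write_intr_high threads are stuck in the same place.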
My incomplete understanding shows here - I am not sure what exactly triggers the condition or what is so special about it, but from the following stack traces we can see that all five of the zio_write_intr_high taskqueue threads were executing dmu_sync_late_arrival_done():

0 100432 kernel zio_write_intr_h mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_rele_to_sync+0x36 dmu_tx_commit+0xf1 dmu_sync_late_arrival_done+0x52 zio_done+0x353 zio_execute+0xc3 zio_done+0x3d0 zio_execute+0xc3 taskqueue_run_locked+0x74 taskqueue_thread_loop+0x46 fork_exit+0x11f fork_trampoline+0xe
0 100433 kernel zio_write_intr_h mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_rele_to_sync+0x36 dmu_tx_commit+0xf1 dmu_sync_late_arrival_done+0x52 zio_done+0x353 zio_execute+0xc3 zio_done+0x3d0 zio_execute+0xc3 taskqueue_run_locked+0x74 taskqueue_thread_loop+0x46 fork_exit+0x11f fork_trampoline+0xe
0 100434 kernel zio_write_intr_h mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_rele_to_sync+0x36 dmu_tx_commit+0xf1 dmu_sync_late_arrival_done+0x52 zio_done+0x353 zio_execute+0xc3 zio_done+0x3d0 zio_execute+0xc3 taskqueue_run_locked+0x74 taskqueue_thread_loop+0x46 fork_exit+0x11f fork_trampoline+0xe
0 100435 kernel zio_write_intr_h mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_rele_to_sync+0x36 dmu_tx_commit+0xf1 dmu_sync_late_arrival_done+0x52 zio_done+0x353 zio_execute+0xc3 zio_done+0x3d0 zio_execute+0xc3 taskqueue_run_locked+0x74 taskqueue_thread_loop+0x46 fork_exit+0x11f fork_trampoline+0xe
0 100436 kernel zio_write_intr_h mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_rele_to_sync+0x36 dmu_tx_commit+0xf1 dmu_sync_late_arrival_done+0x52 zio_done+0x353 zio_execute+0xc3 zio_done+0x3d0 zio_execute+0xc3 taskqueue_run_locked+0x74 taskqueue_thread_loop+0x46 fork_exit+0x11f fork_trampoline+0xe

In addition to the above, the taskqueue associated with these threads has another
9 pending tasks.

As you can see, the "late arrival" code path involves txg_rele_to_sync(), where an instance of tc_lock is acquired. Further, it looks like tc_lock instances are held by the following threads:

64998 101921 pkg initial thread mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_delay+0x9d dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_create+0x310 VOP_CREATE_APV+0x31 vn_open_cred+0x4b7 kern_openat+0x20a amd64_syscall+0x540 Xfast_syscall+0xf7
66152 102491 pkg initial thread mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_delay+0x9d dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_write+0x45b VOP_WRITE_APV+0xb2 vn_write+0x37e vn_io_fault+0x90 dofilewrite+0x85 kern_writev+0x6c sys_write+0x64 amd64_syscall+0x540 Xfast_syscall+0xf7
75803 101638 find - mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_delay+0x9d dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_inactive+0x1b7 zfs_freebsd_inactive+0x1a vinactive+0x86 vputx+0x2d8 sys_fchdir+0x356 amd64_syscall+0x540 Xfast_syscall+0xf7
75809 102932 find - mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_delay+0x9d dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_inactive+0x1b7 zfs_freebsd_inactive+0x1a vinactive+0x86 vputx+0x2d8 sys_fchdir+0x356 amd64_syscall+0x540 Xfast_syscall+0xf7
75813 101515 find - mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_delay+0x9d dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_inactive+0x1b7 zfs_freebsd_inactive+0x1a vinactive+0x86 vputx+0x2d8 sys_fchdir+0x356 amd64_syscall+0x540 Xfast_syscall+0xf7
77468 101412 update-mime-databas initial thread mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 _sx_xlock+0x51 txg_delay+0x9d
dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_write+0x45b VOP_WRITE_APV+0xb2 vn_write+0x37e vn_io_fault+0x90 dofilewrite+0x85 kern_writev+0x6c sys_write+0x64 amd64_syscall+0x540 Xfast_syscall+0xf7

These threads call txg_delay, also because of the high load. In the code we see that dmu_tx_assign first grabs an instance of tc_lock and then calls dsl_dir_tempreserve_space. We also see that txg_delay tries to acquire tx_sync_lock, and that is where all of these threads are blocked.

Then we see that txg_sync_thread holds tx_sync_lock, but it in turn is blocked waiting for its zio:

1552 100544 zfskern txg_thread_enter mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61 dbuf_read+0x5e5 dmu_buf_hold+0xe0 zap_lockdir+0x58 zap_lookup_norm+0x45 zap_lookup+0x2e feature_get_refcount+0x4b spa_feature_is_active+0x52 dsl_scan_active+0x63 txg_sync_thread+0x20d fork_exit+0x11f fork_trampoline+0xe

So, to summarize:

- For some completed zio-s, their zio_done routines are blocked in dmu_sync_late_arrival_done -> txg_rele_to_sync on tc_lock.
- The tc_locks are held by threads in dmu_tx_assign -> ... -> txg_delay, where txg_delay is blocked on tx_sync_lock.
- tx_sync_lock is held by txg_sync_thread, which waits for its zio to be processed.
- That zio is held on the queue and is not getting processed, because the vdev already has too many pending/in-progress zio-s.

This theory looks plausible to me, but I'd like to hear what the experts think. An even more important question is how this situation can be avoided.
-- 
Andriy Gapon
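The summary describes a circular wait, which can be made explicit by walking a hand-built wait-for graph. The node names below paraphrase the summary, and the graph is constructed by hand for illustration, not extracted from the kernel:

```python
def find_cycle(waits_for, start):
    """Follow 'X waits for Y' edges from `start`; return the cycle as a
    list (first element repeated at the end), or None if the chain ends."""
    seen = []
    node = start
    while node not in seen:
        seen.append(node)
        node = waits_for.get(node)
        if node is None:
            return None  # chain terminates without a cycle
    return seen[seen.index(node):] + [node]

# The deadlock described above, as a wait-for graph:
waits_for = {
    "zio_done (late arrival)": "tc_lock",
    "tc_lock": "tx_sync_lock",            # holders are stuck in txg_delay
    "tx_sync_lock": "txg_sync_thread zio",
    "txg_sync_thread zio": "vdev queue slot",
    # Queue slots never free up: the done routines that would free them
    # are themselves the blocked "late arrival" zio_done calls.
    "vdev queue slot": "zio_done (late arrival)",
}
cycle = find_cycle(waits_for, "zio_done (late arrival)")
print(" -> ".join(cycle))
```

Since every resource in the chain waits, directly or transitively, on itself, no thread can make progress, which matches the observed lock-up.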