From owner-freebsd-fs@FreeBSD.ORG  Sun Aug 14 10:27:28 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 92830106566B
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 10:27:28 +0000 (UTC)
	(envelope-from quazi@bk.ru)
Received: from fallback5.mail.ru (fallback5.mail.ru [94.100.176.59])
	by mx1.freebsd.org (Postfix) with ESMTP id 277408FC0A
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 10:27:27 +0000 (UTC)
Received: from smtp3.mail.ru (smtp3.mail.ru [94.100.176.131])
	by fallback5.mail.ru (mPOP.Fallback_MX) with ESMTP id 4583E597C9F9
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 14:11:50 +0400 (MSD)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru;
	s=mail; 
	h=Content-Type:Subject:To:MIME-Version:From:Date:Message-ID;
	bh=r20huLYXuIN9578oJ3hEbYoRBu5XQWfYiqyObciweS4=; 
	b=AMLXnpjpLBZOiP96D1dfxh1LXaC87PgwBec2z64HlGXGCP9SdiGJba4rbKIPnixz9DpvX3wbdDi/Z4YK08Q8hsrTCMzV9Ze7uY2GvtNaOvovV5HlZgIimQK6WqaDZr9u;
Received: from [178.126.178.244] (port=43747 helo=QUAZIS.SNNLAN.local)
	by smtp3.mail.ru with asmtp id 1QsXfS-0005BI-00
	for freebsd-fs@freebsd.org; Sun, 14 Aug 2011 14:11:42 +0400
Message-ID: <4E47A065.1070709@bk.ru>
Date: Sun, 14 Aug 2011 13:16:05 +0300
From: Ruslan Yakovlev <quazi@bk.ru>
User-Agent: Mozilla/5.0 (X11; FreeBSD i386;
	rv:5.0) Gecko/20110804 Thunderbird/5.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
X-Spam: Not detected
X-Mras: Ok
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: ZFS: i/o error all block copies unavailable
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Aug 2011 10:27:28 -0000

Hi all,
After the power went down on FreeBSD 8.2-STABLE #6 (since updated to #7,
but the problem persists) I can't boot from ZFS v28.
gptzfsboot prints
   boot: ZFS: i/o error all block copies unavailable
instead of
   boot: qroot:/boot/kernel/kernel
I downloaded the FreeBSD 9.0-BETA1 image and booted from it. From there I
can mount my ZFS storage. I copied /boot from the ZFS storage to a flash
drive, and now the kernel boots from flash fine; after that the ZFS
storage is mounted as / and everything works. zpool scrub doesn't detect
any problems, and zpool status reports "No known data errors".
But booting this way is too slow, and I want to boot normally from the
ZFS storage without loading the kernel from flash. How can I fix
"ZFS: i/o error all block copies unavailable"?

Now I have
FreeBSD QUAZIS.SNNLAN.local 8.2-STABLE FreeBSD 8.2-STABLE #7: Fri Aug 12 
23:27:33 EEST 2011 root@QUAZIS.SNNLAN.local:/usr/obj/usr/src/sys/main8 i386

=>       34  976770988  ad4  GPT  (465G)
         34        256    1  freebsd-boot  (128k)
        290   16777216    2  freebsd-swap  (8.0G)
   16777506  959993516    3  freebsd-zfs  (457G)
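
The partition table above can be sanity-checked with a little
arithmetic: each partition should start at the sector where the previous
one ends, and 256 sectors reported as 128k implies 512-byte sectors. A
minimal sketch, assuming the plain-text layout of the listing above (the
function name check_layout is made up for illustration):

```sh
# Check that each partition starts where the previous one ended.
# Input rows: "start size index type (human-size)", as in the listing above.
# Assumption: 512-byte sectors (256 sectors are shown as 128k).
# Usage on a live system would be: gpart show ad4 | check_layout
check_layout() {
    awk '
        $1 == "=>" { next }                 # skip the disk summary row
        NF >= 4 {
            if (prev_end != "" && $1 != prev_end)
                printf "gap or overlap before partition %s\n", $3
            prev_end = $1 + $2              # first sector after this partition
            printf "%s: sectors %d-%d, %d KiB\n", $4, $1, prev_end - 1, $2 * 512 / 1024
        }
    '
}
```

Fed the four rows above, it reports no gap or overlap, so the layout
itself looks consistent.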


From owner-freebsd-fs@FreeBSD.ORG  Sun Aug 14 11:38:08 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2142C106564A
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 11:38:08 +0000 (UTC)
	(envelope-from rubyneko@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id A10118FC13
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 11:38:07 +0000 (UTC)
Received: by bkat8 with SMTP id t8so3255543bka.13
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 04:38:06 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=mime-version:date:message-id:subject:from:to:cc:content-type;
	bh=0KD9wo8lbUAnOyK06acazdv6F6roAH6y50np5cCLSGI=;
	b=XZ2FtRu55tiuC5UuimR9pXDeCR7rQ1wkhKspTPXW7vaClY/N9zKGvBnNThE7h6J9b3
	wxs8mNZWlaj80u5DlzemkNIqQ0pFZTxWThkTmXV/k/K36Hjmw5KVisEeH3rYK1rL+0Ay
	TVyu8V1L/jpAOuesOjIi5IbhYuH+vLUctaI8Q=
MIME-Version: 1.0
Received: by 10.204.200.193 with SMTP id ex1mr550416bkb.39.1313320236023; Sun,
	14 Aug 2011 04:10:36 -0700 (PDT)
Received: by 10.204.102.7 with HTTP; Sun, 14 Aug 2011 04:10:35 -0700 (PDT)
Date: Sun, 14 Aug 2011 14:10:35 +0300
Message-ID: <CAFu=DfTWz2jYY6FSZ1T7j0V7umpjOMXGXrtP5dh5sad3o9VUVg@mail.gmail.com>
From: rubyneko neko <rubyneko@gmail.com>
To: Ruslan Yakovlev <quazi@bk.ru>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS: i/o error all block copies unavailable
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Aug 2011 11:38:08 -0000

I have the same problem.
Currently I'm running from kernel.old.

Running
   gpart bootcode -p /boot/gptzfsboot -i 1 ad4
did not help me.

Any ideas?
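
For reference, the protective MBR and the ZFS-aware boot stage are often
written together in one gpart call. A minimal sketch, assuming the disk
(ad4) and boot partition index (1) mentioned in this thread; the run
helper and the DRY_RUN variable are made up for illustration and only
print the command unless DRY_RUN=0 is set:

```sh
# Print a command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Write /boot/pmbr to the disk and /boot/gptzfsboot to the freebsd-boot
# partition. Disk name and partition index are assumptions from this thread.
reinstall_bootcode() {
    disk=$1
    index=$2
    run gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i "$index" "$disk"
}

reinstall_bootcode ad4 1
```

The boot blocks need to understand the pool version in use (v28 here),
so an older gptzfsboot may not be able to read a newer pool.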


From owner-freebsd-fs@FreeBSD.ORG  Sun Aug 14 12:23:35 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EB4391065672
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 12:23:35 +0000 (UTC)
	(envelope-from quazi@bk.ru)
Received: from smtp13.mail.ru (smtp13.mail.ru [94.100.176.90])
	by mx1.freebsd.org (Postfix) with ESMTP id 554008FC08
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 12:23:35 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru;
	s=mail; 
	h=Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID;
	bh=8oYgFd0dT0Jf1AULUqz0RHz7vKGh3cnwUVhMr9bst/U=; 
	b=tmeVWDhK21wo7W1UKCvXBSyyAzeptdj7XrQj8X5F8rMe3CjzJAG2iAYlmWTYqBUvztj7+0o+laD7SXHZQXr8zINMjIQYKn/3AoO+Mw368vuI41IiGlGUzof/d15PrvJI;
Received: from [178.126.178.244] (port=18963 helo=QUAZIS.SNNLAN.local)
	by smtp13.mail.ru with asmtp id 1QsZj2-00064s-00
	for freebsd-fs@freebsd.org; Sun, 14 Aug 2011 16:23:33 +0400
Message-ID: <4E47BF5B.3010102@bk.ru>
Date: Sun, 14 Aug 2011 15:28:11 +0300
From: Ruslan Yakovlev <quazi@bk.ru>
User-Agent: Mozilla/5.0 (X11; FreeBSD i386;
	rv:5.0) Gecko/20110804 Thunderbird/5.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
References: <CAFu=DfTWz2jYY6FSZ1T7j0V7umpjOMXGXrtP5dh5sad3o9VUVg@mail.gmail.com>
In-Reply-To: <CAFu=DfTWz2jYY6FSZ1T7j0V7umpjOMXGXrtP5dh5sad3o9VUVg@mail.gmail.com>
X-Spam: Not detected
X-Mras: Ok
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Re: ZFS: i/o error all block copies unavailable
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Aug 2011 12:23:36 -0000

I don't think it is a bootcode problem: I did not modify my bootcode when
the power went down. It looks like a problem in ZFS itself.
When I tried to import the ZFS pool (under 9.0-BETA1), it reported that
the pool was busy; only zpool import -f worked.
After that I changed the mountpoint, listed my files, set the mountpoint
back to / and rebooted. Now at boot it prints many errors (many lines of
"ZFS: i/o error..") together with file names.
The first one is /boot/kernel/kernel.
When I try to list files from the bootcode, I can see only / and /boot;
for /boot/kernel it prints "ZFS: i/o error..".
But the copy of /boot/kernel on flash still works fine.
I did
# cp -r /boot /boot.new
# mv /boot /boot.broken
# mv /boot.new /boot
Now the kernel boots, but it stops when it tries to mount the ZFS storage
as root. If I choose the loader prompt from the boot menu and run
"load /boot/kernel/zfs.ko", it prints "ZFS: i/o error..".
Installing pmbr and gptzfsboot from 9.0-BETA1 doesn't change anything;
the problem remains.
I also can't boot kernel.old (it prints "ZFS: i/o error.." too).

I think that if I copied all my files to other storage and rebuilt the
ZFS pool, the problem would go away, but right now I don't have any other
storage for all my data.



From owner-freebsd-fs@FreeBSD.ORG  Sun Aug 14 12:54:04 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E15F8106564A
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 12:54:03 +0000 (UTC)
	(envelope-from ronald-freebsd8@klop.yi.org)
Received: from fep34.mx.upcmail.net (fep34.mx.upcmail.net [62.179.121.52])
	by mx1.freebsd.org (Postfix) with ESMTP id 66F778FC0A
	for <freebsd-fs@freebsd.org>; Sun, 14 Aug 2011 12:54:03 +0000 (UTC)
Received: from edge02.upcmail.net ([192.168.13.237]) by viefep11-int.chello.at
	(InterMail vM.8.01.02.02 201-2260-120-106-20100312) with ESMTP
	id <20110814122600.ELYY1647.viefep11-int.chello.at@edge02.upcmail.net>; 
	Sun, 14 Aug 2011 14:26:00 +0200
Received: from pinky ([95.96.138.26]) by edge02.upcmail.net with edge
	id LCRy1h0090aMTqv02CRzvh; Sun, 14 Aug 2011 14:26:00 +0200
X-SourceIP: 95.96.138.26
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: freebsd-fs@freebsd.org, "Ruslan Yakovlev" <quazi@bk.ru>
References: <CAFu=DfTWz2jYY6FSZ1T7j0V7umpjOMXGXrtP5dh5sad3o9VUVg@mail.gmail.com>
	<4E47BF5B.3010102@bk.ru>
Date: Sun, 14 Aug 2011 14:26:03 +0200
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: "Ronald Klop" <ronald-freebsd8@klop.yi.org>
Message-ID: <op.vz7d5pg68527sy@pinky>
In-Reply-To: <4E47BF5B.3010102@bk.ru>
User-Agent: Opera Mail/11.50 (Win32)
X-Cloudmark-Analysis: v=1.1 cv=8aHJgfg0GQPVAsFhHUWrXuSEk7IPywT3HfAl6KezIcg=
	c=1 sm=0 a=jSLzLkXI7GEA:10 a=eO4J7RWVLuUA:10 a=bgpUlknNv7MA:10
	a=kj9zAlcOel0A:10 a=6I5d2MoRAAAA:8 a=ECAJpNFI83C1cmKNUt0A:9
	a=6NtR6f4NIeGg6he39AYA:7 a=CjuIK1q_8ugA:10 a=SV7veod9ZcQA:10
	a=HpAAvcLHHh0Zw7uRqdWCyQ==:117
Cc: 
Subject: Re: ZFS: i/o error all block copies unavailable
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Aug 2011 12:54:04 -0000

Is /boot/zfs/zpool.cache correct for the current setup?

Ronald.
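
One way to regenerate the cache file from a live system is to import the
pool with an explicit cachefile and copy the result into the pool's
/boot/zfs. A minimal sketch, assuming the pool is called qroot (taken
from the boot prompt earlier in the thread) and that /mnt is free to use
as an alternate root; the run helper and the DRY_RUN variable are made up
for illustration and only print the commands unless DRY_RUN=0 is set:

```sh
# Print a command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Import the pool under an alternate root, writing a fresh cache file to
# /tmp, then copy that file to where the booted system looks for it.
# Pool name and alternate root are assumptions, not confirmed in the thread.
regenerate_cache() {
    pool=$1
    altroot=$2
    # -f forces the import of a pool that was not exported cleanly.
    run zpool import -f -o cachefile=/tmp/zpool.cache -o altroot="$altroot" "$pool"
    run cp /tmp/zpool.cache "$altroot/boot/zfs/zpool.cache"
}

regenerate_cache qroot /mnt
```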


From owner-freebsd-fs@FreeBSD.ORG  Mon Aug 15 11:07:01 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8A6521065675
	for <freebsd-fs@FreeBSD.org>; Mon, 15 Aug 2011 11:07:01 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 6F59F8FC18
	for <freebsd-fs@FreeBSD.org>; Mon, 15 Aug 2011 11:07:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p7FB71vl014720
	for <freebsd-fs@FreeBSD.org>; Mon, 15 Aug 2011 11:07:01 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p7FB707I014718
	for freebsd-fs@FreeBSD.org; Mon, 15 Aug 2011 11:07:00 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 15 Aug 2011 11:07:00 GMT
Message-Id: <201108151107.p7FB707I014718@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-fs@FreeBSD.org
Cc: 
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Aug 2011 11:07:01 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).
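
A small helper can turn a tracker id from the listing into that URL. A
minimal sketch; the function name pr_url is made up for illustration, and
it assumes that only the numeric part of an id such as kern/159402 goes
into the query:

```sh
# Build the query URL for a PR id such as "kern/159402" or "159402".
pr_url() {
    number=${1##*/}     # drop an optional "category/" prefix
    echo "http://www.freebsd.org/cgi/query-pr.cgi?pr=${number}"
}

pr_url kern/159402
```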

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.
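
The listing that follows is plain text with one PR per row, so it can be
filtered with standard tools. A minimal sketch that prints the ids of all
PRs carrying a given bracketed tag; the function name prs_with_tag is
made up for illustration, and it assumes the row layout "state id resp.
description" used in this listing:

```sh
# Print the ids of PRs whose description carries a given tag, e.g. "zfs".
# Rows are expected to look like: "o kern/159402  fs  [zfs][loader] ..."
prs_with_tag() {
    awk -v tag="[$1]" '
        # Only rows whose second field looks like "category/number" are PRs;
        # the header row and separator line are skipped.
        $2 ~ /^[a-z0-9]+\/[0-9]+$/ && index($0, tag) > 0 { print $2 }
    '
}
```

For example, piping this message through "prs_with_tag zfs" would list
only the ZFS-related reports.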


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/159418  fs         [tmpfs] [panic] tmpfs kernel panic: recursing on non r
o kern/159402  fs         [zfs][loader] symlinks cause I/O errors
o kern/159357  fs         [zfs] ZFS MAXNAMELEN macro has confusing name (off-by-
o kern/159356  fs         [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s
o kern/159351  fs         [nfs] [patch] - divide by zero in mountnfs()
o kern/159251  fs         [zfs] [request]: add FLETCHER4 as DEDUP hash option
o kern/159233  fs         [ext2fs] [patch] fs/ext2fs: finish reallocblk implemen
o kern/159232  fs         [ext2fs] [patch] fs/ext2fs: merge ext2_readwrite into 
o kern/159077  fs         [zfs] Can't cd .. with latest zfs version
o kern/159048  fs         [smbfs] smb mount corrupts large files
o kern/159045  fs         [zfs] [hang] ZFS scrub freezes system
o kern/158839  fs         [zfs] ZFS Bootloader Fails if there is a Dead Disk
o kern/158802  fs         [amd] amd(8) ICMP storm and unkillable process.
o kern/158711  fs         [ffs] [panic] panic in ffs_blkfree and ffs_valloc
o kern/158231  fs         [nullfs] panic on unmounting nullfs mounted over ufs o
f kern/157929  fs         [nfs] NFS slow read
o kern/157728  fs         [zfs] zfs (v28) incremental receive may leave behind t
o kern/157722  fs         [geli] unable to newfs a geli encrypted partition
o kern/157399  fs         [zfs] trouble with: mdconfig force delete && zfs strip
o kern/157179  fs         [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov
o kern/156933  fs         [zfs] ZFS receive after read on readonly=on filesystem
o kern/156797  fs         [zfs] [panic] Double panic with FreeBSD 9-CURRENT and 
o kern/156781  fs         [zfs] zfs is losing the snapshot directory,
p kern/156545  fs         [ufs] mv could break UFS on SMP systems
o kern/156193  fs         [ufs] [hang] UFS snapshot hangs && deadlocks processes
o kern/156168  fs         [nfs] [panic] Kernel panic under concurrent access ove
o kern/156039  fs         [nullfs] [unionfs] nullfs + unionfs do not compose, re
o kern/155615  fs         [zfs] zfs v28 broken on sparc64 -current
o kern/155587  fs         [zfs] [panic] kernel panic with zfs
o kern/155411  fs         [regression] [8.2-release] [tmpfs]: mount: tmpfs : No 
o kern/155199  fs         [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104   fs         [zfs][patch] use /dev prefix by default when importing
o kern/154930  fs         [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828  fs         [msdosfs] Unable to create directories on external USB
o kern/154491  fs         [smbfs] smb_co_lock: recursive lock for object 1
o kern/154447  fs         [zfs] [panic] Occasional panics - solaris assert somew
p kern/154228  fs         [md] md getting stuck in wdrain state
o kern/153996  fs         [zfs] zfs root mount error while kernel is not located
o kern/153847  fs         [nfs] [panic] Kernel panic from incorrect m_free in nf
o kern/153753  fs         [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716  fs         [zfs] zpool scrub time remaining is incorrect
o kern/153695  fs         [patch] [zfs] Booting from zpool created on 4k-sector 
o kern/153680  fs         [xfs] 8.1 failing to mount XFS partitions
o kern/153520  fs         [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable
o kern/153418  fs         [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351  fs         [zfs] locking directories/files in ZFS
o bin/153258   fs         [patch][zfs] creating ZVOLs requires `refreservation' 
s kern/153173  fs         [zfs] booting from a gzip-compressed dataset doesn't w
o kern/153126  fs         [zfs] vdev failure, zpool=peegel type=vdev.too_small
p kern/152488  fs         [tmpfs] [patch] mtime of file updated when only inode 
o kern/152022  fs         [nfs] nfs service hangs with linux client [regression]
o kern/151942  fs         [zfs] panic during ls(1) zfs snapshot directory
o kern/151905  fs         [zfs] page fault under load in /sbin/zfs
o kern/151845  fs         [smbfs] [patch] smbfs should be upgraded to support Un
o bin/151713   fs         [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648  fs         [zfs] disk wait bug
o kern/151629  fs         [fs] [patch] Skip empty directory entries during name 
o kern/151330  fs         [zfs] will unshare all zfs filesystem after execute a 
o kern/151326  fs         [nfs] nfs exports fail if netgroups contain duplicate 
o kern/151251  fs         [ufs] Can not create files on filesystem with heavy us
o kern/151226  fs         [zfs] can't delete zfs snapshot
o kern/151111  fs         [zfs] vnodes leakage during zfs unmount
o kern/150503  fs         [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501  fs         [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390  fs         [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336  fs         [nfs] mountd/nfsd became confused; refused to reload n
o kern/150207  fs         zpool(1): zpool import -d /dev tries to open weird dev
o kern/149208  fs         mksnap_ffs(8) hang/deadlock
o kern/149173  fs         [patch] [zfs] make OpenSolaris <sys/nvpair.h> installa
o kern/149015  fs         [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014  fs         [zfs] [patch] declarations in ZFS libraries/utilities 
o kern/149013  fs         [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504  fs         [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490  fs         [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368  fs         [zfs] ZFS hanging forever on 8.1-PRERELEASE
o bin/148296   fs         [zfs] [loader] [patch] Very slow probe in /usr/src/sys
o kern/148204  fs         [nfs] UDP NFS causes overload
o kern/148138  fs         [zfs] zfs raidz pool commands freeze
o kern/147903  fs         [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881  fs         [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147790  fs         [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147560  fs         [zfs] [boot] Booting 8.1-PRERELEASE raidz system take 
o kern/147420  fs         [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt 
o kern/146941  fs         [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786  fs         [zfs] zpool import hangs with checksum errors
o kern/146708  fs         [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528  fs         [zfs] Severe memory leak in ZFS on i386
o kern/146502  fs         [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712  fs         [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411  fs         [xfs] [panic] Kernel panics shortly after mounting an 
o bin/145309   fs         bsdlabel: Editing disk label invalidates the whole dev
o kern/145272  fs         [zfs] [panic] Panic during boot when accessing zfs on 
o kern/145246  fs         [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238  fs         [zfs] [panic] kernel panic on zpool clear tank
o kern/145229  fs         [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189  fs         [nfs] nfsd performs abysmally under load
o kern/144929  fs         [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447  fs         [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416  fs         [panic] Kernel panic on online filesystem optimization
s kern/144415  fs         [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234  fs         [zfs] Cannot boot machine with recent gptzfsboot code 
o kern/143825  fs         [nfs] [panic] Kernel panic on NFS client
o bin/143572   fs         [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212  fs         [nfs] NFSv4 client strange work ...
o kern/143184  fs         [zfs] [lor] zfs/bufwait LOR
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two 
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140661  fs         [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640  fs         [zfs] snapshot crash
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
p bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/138662  fs         [panic] ffs_blkfree: freeing free block
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
p kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133174  fs         [msdosfs] [patch] msdosfs must support multibyte inter
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor"  on the mount poin
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210  fs         [nullfs] Error by check nullfs
f kern/130133  fs         [panic] [zfs] 'kmem_map too small' caused by make clea
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: 
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/127787  fs         [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
f kern/127375  fs         [zfs] If vm.kmem_size_max>"1073741823" then write spee
o bin/127270   fs         fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
f kern/126703  fs         [panic] [zfs] _mtx_lock_sleep: recursed on non-recursi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895  fs         [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
o kern/123939  fs         [msdosfs] corrupts new files
f sparc/123566 fs         [zfs] zpool import issue: EOVERFLOW
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
f kern/120210  fs         [zfs] [panic] reboot after panic: solaris assert: arc_
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o bin/118249   fs         [ufs] mv(1): moving a directory changes its mtime
o kern/118126  fs         [nfs] [patch] Poor NFS server write performance
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o kern/117954  fs         [ufs] dirhash on very large directories blocks the mac
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931  fs         lack of fsck_cd9660 prevents mounting iso images with 
o kern/116583  fs         [ffs] [hang] System freezes for short time when using 
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala 
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show 
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created 
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107  fs         [ufs] left-over fsck_snapshot after unfinished backgro
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear 
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
s bin/97498    fs         [request] newfs(8) has no option to clear the first 12
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [cd9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o bin/94810    fs         fsck(8) incorrectly reports 'file system marked clean'
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time 
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88555   fs         [panic] ffs_blkfree: freeing free frag on AMD 64
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o bin/87966    fs         [patch] newfs(8): introduce -A flag for newfs to enabl
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o bin/85494    fs         fsck_ffs: unchecked use of cg_inosused macro etc.
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o bin/74779    fs         Background-fsck checks one filesystem twice and omits 
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o bin/70600    fs         fsck(8) throws files away when it can't grow lost+foun
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/33464   fs         [ufs] soft update inconsistencies after system crash
o bin/27687    fs         fsck(8) wrapper is not properly passing options to fsc
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

244 problems total.


From owner-freebsd-fs@FreeBSD.ORG  Mon Aug 15 17:46:18 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 82151106566C
	for <freebsd-fs@freebsd.org>; Mon, 15 Aug 2011 17:46:18 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 5691D8FC16
	for <freebsd-fs@freebsd.org>; Mon, 15 Aug 2011 17:46:18 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id F34AC46B32;
	Mon, 15 Aug 2011 13:46:17 -0400 (EDT)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 7EA0F8A02F;
	Mon, 15 Aug 2011 13:46:17 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-fs@freebsd.org
Date: Mon, 15 Aug 2011 13:43:14 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110617; KDE/4.5.5; amd64; ; )
References: <1687823014.1491995.1312757266327.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <1687823014.1491995.1312757266327.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <201108151343.14655.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6
	(bigwig.baldwin.cx); Mon, 15 Aug 2011 13:46:17 -0400 (EDT)
Cc: onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Aug 2011 17:46:18 -0000

On Sunday, August 07, 2011 6:47:46 pm Rick Macklem wrote:
> A recent PR (kern/159351) noted that the following
> calculation results in a divide-by-zero when
> desiredvnodes < 1000.
> 
> 	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> 
> Just fixing the divide-by-zero is easy enough, but I'm not
> sure what this calculation is trying to do. Making it a fraction
> of "hibufspace" makes sense (nm_wcommitsize is the maximum # of
> bytes of uncommitted data in the NFS client's buffer cache blocks,
> if I understand it correctly), but why divide it by
> 
>                 (desiredvnodes / 1000) ??
> 
> Maybe thinking that fewer vnodes means sharing it with fewer
> other file systems or ???
> 
> Anyhow, it seems to me that the formula is bogus for small
> values of desiredvnodes (for example desiredvnodes == 1500
> implies nm_wcommitsize == hibufspace, which sounds too large
> to me).
> 
> I'm thinking that putting an upper limit of 10% of hibufspace
> might make sense. ie. Change the above to:
> 
> 	if (desiredvnodes >= 11000)
> 		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> 	else
> 		nmp->nm_wcommitsize = hibufspace / 10;
> 
> Anyone have comments or insight into this calculation?
> 
> rick
> ps: jhb, I hope you don't mind. I emailed you first and then
>     thought others might have some ideas, too.

Oh no, this is fine.  A broader discussion is probably warranted.  I honestly 
don't know what the goal is.  I do think it is an attempt to share with other 
file systems, but I'm not sure how desiredvnodes / 1000 is useful for that.
It also seems that we can end up setting this woefully low as well.  That is, 
I wonder if we need a minimum of 10% of hibufspace so that it can scale 
between 10% and 90% of hibufspace (but I'm not sure what you would use to
pick the scaling factor sanely).  To my mind what you really want to do is 
something like 'hibufspace / (number of active mounts)', but that will not 
really work correctly unless we recalculate the value on each mount and 
unmount operation.
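
A rough userland sketch of that idea, combined with the 10%/90% bounds
mentioned above.  The function name, the nmounts parameter, and the choice
to clamp are all assumptions for illustration; nothing like this exists in
the tree, and the kernel would need to keep a count of active NFS mounts
and recompute on each mount and unmount:

```c
#include <stdint.h>

/*
 * Hypothetical helper: give each active NFS mount an equal share of
 * hibufspace, clamped to the range [10%, 90%] of hibufspace.
 * nmounts is an assumed counter of active mounts; values below 1 are
 * treated as 1 so the division is always safe.
 */
int64_t
wcommit_per_mount(int64_t hibufspace, int nmounts)
{
	int64_t lo = hibufspace / 10;		/* 10% floor */
	int64_t hi = (hibufspace / 10) * 9;	/* 90% ceiling */
	int64_t share;

	if (nmounts < 1)
		nmounts = 1;
	share = hibufspace / nmounts;
	if (share < lo)
		return (lo);
	if (share > hi)
		return (hi);
	return (share);
}
```

With a single mount the share is capped at 90%; with more than ten mounts
every mount falls back to the 10% floor, so the sum can still exceed
hibufspace.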

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Mon Aug 15 22:58:15 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C57281065678;
	Mon, 15 Aug 2011 22:58:15 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca
	[131.104.91.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 6DBAC8FC1A;
	Mon, 15 Aug 2011 22:58:15 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AsMAADqjSU6DaFvO/2dsb2JhbABBhEiUAJBFgUABAQUjBFIbDgoCAg0ZAlkGLrEgkVuBLIQLgRAEkxKREQ
X-IronPort-AV: E=Sophos;i="4.67,376,1309752000"; d="scan'208";a="134489408"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 15 Aug 2011 18:58:14 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 85B90B4010;
	Mon, 15 Aug 2011 18:58:14 -0400 (EDT)
Date: Mon, 15 Aug 2011 18:58:14 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: John Baldwin <jhb@freebsd.org>
Message-ID: <1730399830.175988.1313449094531.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <201108151343.14655.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.201]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Aug 2011 22:58:16 -0000

John Baldwin wrote:
> On Sunday, August 07, 2011 6:47:46 pm Rick Macklem wrote:
> > A recent PR (kern/159351) noted that the following
> > calculation results in a divide-by-zero when
> > desiredvnodes < 1000.
> >
> > 	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> >
> > Just fixing the divide-by-zero is easy enough, but I'm not
> > sure what this calculation is trying to do. Making it a fraction
> > of "hibufspace" makes sense (nm_wcommitsize is the maximum # of
> > bytes of uncommitted data in the NFS client's buffer cache blocks,
> > if I understand it correctly), but why divide it by
> >
> >                 (desiredvnodes / 1000) ??
> >
> > Maybe thinking that fewer vnodes means sharing it with fewer
> > other file systems or ???
> >
> > Anyhow, it seems to me that the formula is bogus for small
> > values of desiredvnodes (for example desiredvnodes == 1500
> > implies nm_wcommitsize == hibufspace, which sounds too large
> > to me).
> >
> > I'm thinking that putting an upper limit of 10% of hibufspace
> > might make sense. ie. Change the above to:
> >
> > 	if (desiredvnodes >= 11000)
> > 		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > 	else
> > 		nmp->nm_wcommitsize = hibufspace / 10;
> >
> > Anyone have comments or insight into this calculation?
> >
> > rick
> > ps: jhb, I hope you don't mind. I emailed you first and then
> >     thought others might have some ideas, too.
> 
> Oh no, this is fine. A broader discussion is probably warranted. I
> honestly
> don't know what the goal is. I do think it is an attempt to share with
> other
> file systems, but I'm not sure how desiredvnodes / 1000 is useful for
> that.
> It also seems that we can end up setting this woefully low as well.
> That is,
> I wonder if we need a minimum of 10% of hibufspace so that it can
> scale
> between 10% and 90% of hibufspace (but I'm not sure what you would use
> to
> pick the scaling factor sanely). To my mind what you really want to do
> is
> something like 'hibufspace / (number of active mounts)', but that will
> not
> really work correctly unless we recalculate the value on each mount
> and
> unmount operation.
> 
> --
> John Baldwin
Btw, this was done by r147280 6.5 years ago, so the formula doesn't seem
to be causing a lot of grief. Also of some interest is the fact that
wcommitsize appears to have been settable on a per-mount-point basis until
mount_nfs(8) was converted to nmount(2). { There is no nmount option to set it. }

Btw, when nm_wcommitsize is exceeded, writes become synchronous, so it affects
how much write behind happens. This, in turn, affects how bursty (is this a real
word? hopefully you get what I mean?) the write traffic to the server is.

What I'm not sure about is what happens when multiple mounts use up the entire
buffer cache with write behinds. I'll try a little experiment to see if I
can find that out. (If making it large isn't detrimental, then I tend to
agree that the above sets nm_wcommitsize very small.)

Since "desiredvnodes" will seldom be less than 1000, I'm not going to
rush to a solution.
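
For anyone who wants to check the integer arithmetic, here is a small
userland sketch of the current formula and the capped variant quoted above.
hibufspace and desiredvnodes are kernel globals in the real code; passing
them as parameters, the function names, and the -1 return standing in for
the divide-by-zero trap are assumptions made for illustration:

```c
#include <stdint.h>

/*
 * Current formula.  Returns -1 where the kernel code would divide by
 * zero (desiredvnodes < 1000 makes the integer quotient 0).
 */
int64_t
wcommit_original(int64_t hibufspace, int desiredvnodes)
{
	int divisor = desiredvnodes / 1000;	/* truncating division */

	if (divisor == 0)
		return (-1);
	return (hibufspace / divisor);
}

/*
 * Capped variant from the quoted proposal: the result is never more
 * than 10% of hibufspace, and the divisor is at least 11 whenever the
 * division is performed.
 */
int64_t
wcommit_capped(int64_t hibufspace, int desiredvnodes)
{
	if (desiredvnodes >= 11000)
		return (hibufspace / (desiredvnodes / 1000));
	return (hibufspace / 10);
}
```

With desiredvnodes == 1500 the divisor truncates to 1, so the current
formula returns all of hibufspace, while the capped variant returns
hibufspace / 10.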

Anyone who has insight into what this formula should be, please let us know.

rick

From owner-freebsd-fs@FreeBSD.ORG  Tue Aug 16 02:26:00 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 25EDB106566C
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 02:26:00 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta02.westchester.pa.mail.comcast.net
	(qmta02.westchester.pa.mail.comcast.net [76.96.62.24])
	by mx1.freebsd.org (Postfix) with ESMTP id C2DC18FC0A
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 02:25:58 +0000 (UTC)
Received: from omta17.westchester.pa.mail.comcast.net ([76.96.62.89])
	by qmta02.westchester.pa.mail.comcast.net with comcast
	id Lpjz1h0031vXlb852qRztJ; Tue, 16 Aug 2011 02:25:59 +0000
Received: from koitsu.dyndns.org ([67.180.84.87])
	by omta17.westchester.pa.mail.comcast.net with comcast
	id LqRw1h00c1t3BNj3dqRxQs; Tue, 16 Aug 2011 02:25:58 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id E60D2102C1A; Mon, 15 Aug 2011 19:25:54 -0700 (PDT)
Date: Mon, 15 Aug 2011 19:25:54 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Message-ID: <20110816022554.GA6018@icarus.home.lan>
References: <201108151343.14655.jhb@freebsd.org>
	<1730399830.175988.1313449094531.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1730399830.175988.1313449094531.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Aug 2011 02:26:00 -0000

On Mon, Aug 15, 2011 at 06:58:14PM -0400, Rick Macklem wrote:
> John Baldwin wrote:
> > On Sunday, August 07, 2011 6:47:46 pm Rick Macklem wrote:
> > > A recent PR (kern/159351) noted that the following
> > > calculation results in a divide-by-zero when
> > > desiredvnodes < 1000.
> > >
> > > 	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > >
> > > Just fixing the divide-by-zero is easy enough, but I'm not
> > > sure what this calculation is trying to do. Making it a fraction
> > > of "hibufspace" makes sense (nm_wcommitsize is the maximum # of
> > > bytes of uncommitted data in the NFS client's buffer cache blocks,
> > > if I understand it correctly), but why divide it by
> > >
> > >                 (desiredvnodes / 1000) ??
> > >
> > > Maybe thinking that fewer vnodes means sharing it with fewer
> > > other file systems or ???
> > >
> > > Anyhow, it seems to me that the formula is bogus for small
> > > values of desiredvnodes (for example desiredvnodes == 1500
> > > implies nm_wcommitsize == hibufspace, which sounds too large
> > > to me).
> > >
> > > I'm thinking that putting an upper limit of 10% of hibufspace
> > > might make sense. ie. Change the above to:
> > >
> > > 	if (desiredvnodes >= 11000)
> > > 		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > > 	else
> > > 		nmp->nm_wcommitsize = hibufspace / 10;
> > >
> > > Anyone have comments or insight into this calculation?
> > >
> > > rick
> > > ps: jhb, I hope you don't mind. I emailed you first and then
> > >     thought others might have some ideas, too.
> > 
> > Oh no, this is fine. A broader discussion is probably warranted. I
> > honestly
> > don't know what the goal is. I do think it is an attempt to share with
> > other
> > file systems, but I'm not sure how desiredvnodes / 1000 is useful for
> > that.
> > It also seems that we can end up setting this woefully low as well.
> > That is,
> > I wonder if we need a minimum of 10% of hibufspace so that it can
> > scale
> > between 10% and 90% of hibufspace (but I'm not sure what you would use
> > to
> > pick the scaling factor sanely). To my mind what you really want to do
> > is
> > something like 'hibufspace / (number of active mounts)', but that will
> > not
> > really work correctly unless we recalculate the value on each mount
> > and
> > unmount operation.
> > 
> > --
> > John Baldwin
> Btw, this was done by r147280 6.5 years ago, so the formula doesn't seem
> to be causing a lot of grief. Also of some interest is the fact that
> wcommitsize appears to have been settable on a per-mount-point basis until
> mount_nfs(8) was converted to nmount(2). { There is no nmount option to set it. }
> 
> Btw, when nm_wcommitsize is exceeded, writes become synchronous, so it affects
> how much write behind happens. This, in turn, affects how bursty (is this a real
> word? hopefully you get what I mean?) the write traffic to the server is.
> 
> What I'm not sure about is what happens when multiple mounts use up the entire
> buffer cache with write behinds. I'll try a little experiment to see if I
> can find that out. (If making it large isn't detrimental, then I tend to
> agree that the above sets nm_wcommitsize very small.)
> 
> Since "desiredvnodes" will seldom be less than 1000, I'm not going to
> rush to a solution.
> 
> Anyone who has insight into what this formula should be, please let us know.

The commit message tries to explain it, but it's more than just a
one-line change.

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/nfsclient/nfs_vfsops.c#rev1.177

There's also an associated PR:

http://www.freebsd.org/cgi/query-pr.cgi?pr=79208

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Tue Aug 16 04:05:20 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 26E25106566C;
	Tue, 16 Aug 2011 04:05:20 +0000 (UTC) (envelope-from jwd@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id F240A8FC12;
	Tue, 16 Aug 2011 04:05:19 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p7G45J7v058575;
	Tue, 16 Aug 2011 04:05:19 GMT
	(envelope-from jwd@freefall.freebsd.org)
Received: (from jwd@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p7G45JxW058574;
	Tue, 16 Aug 2011 04:05:19 GMT (envelope-from jwd)
Date: Tue, 16 Aug 2011 04:05:19 +0000
From: John <jwd@freebsd.org>
To: freebsd-fs@freebsd.org
Message-ID: <20110816040519.GA49864@FreeBSD.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-current@freebsd.org
Subject: Three LOR with latest -current
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Aug 2011 04:05:20 -0000

Hi folks,

   I'm seeing 3 lock order reversals with an up-to-date -current
system. Stock system, GENERIC kernel. They show up in the dmesg just
from booting the system. Let me know if this isn't enough information.

Thanks,
John


lock order reversal:
 1st 0xfffffe0289627db8 ufs (ufs) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:425
 2nd 0xffffff9f0db49778 bufwait (bufwait) @ /usr/src.2011-08-14_10.53pm_EDT/sys/kern/vfs_bio.c:2658
 3rd 0xfffffe00404a8098 ufs (ufs) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:546
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x807
__lockmgr_args() at __lockmgr_args+0xdc6
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
ffs_snapshot() at ffs_snapshot+0x1c31
ffs_mount() at ffs_mount+0xa24
vfs_donmount() at vfs_donmount+0xddc
nmount() at nmount+0x63
syscallenter() at syscallenter+0x1aa
syscall() at syscall+0x4c
Xfast_syscall() at Xfast_syscall+0xdd
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x800abde1c, rsp = 0x7fffffffd968, rbp = 0x801008130 ---


lock order reversal:
 1st 0xffffff9f0db49778 bufwait (bufwait) @ /usr/src.2011-08-14_10.53pm_EDT/sys/kern/vfs_bio.c:2658
 2nd 0xfffffe004034dcb0 snaplk (snaplk) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:818
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x807
__lockmgr_args() at __lockmgr_args+0xdc6
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
ffs_snapshot() at ffs_snapshot+0x1b0c
ffs_mount() at ffs_mount+0xa24
vfs_donmount() at vfs_donmount+0xddc
nmount() at nmount+0x63
syscallenter() at syscallenter+0x1aa
syscall() at syscall+0x4c
Xfast_syscall() at Xfast_syscall+0xdd
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x800abde1c, rsp = 0x7fffffffd968, rbp = 0x801008130 ---


lock order reversal:
 1st 0xfffffe004034dcb0 snaplk (snaplk) @ /usr/src.2011-08-14_10.53pm_EDT/sys/kern/vfs_vnops.c:301
 2nd 0xfffffe0289627db8 ufs (ufs) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:1620
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x807
__lockmgr_args() at __lockmgr_args+0xdc6
ffs_snapremove() at ffs_snapremove+0xe7
ffs_truncate() at ffs_truncate+0x302
ufs_inactive() at ufs_inactive+0x260
vinactive() at vinactive+0x72
vputx() at vputx+0x386
vn_close() at vn_close+0x118
vn_closefile() at vn_closefile+0x5a
_fdrop() at _fdrop+0x23
closef() at closef+0x5c
kern_close() at kern_close+0x121
syscallenter() at syscallenter+0x1aa
syscall() at syscall+0x4c
Xfast_syscall() at Xfast_syscall+0xdd
--- syscall (6, FreeBSD ELF64, close), rip = 0x800b5e2bc, rsp = 0x7fffffffd968, rbp = 0 ---


From owner-freebsd-fs@FreeBSD.ORG  Tue Aug 16 10:12:32 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E8EDF106564A
	for <freebsd-fs@FreeBSD.org>; Tue, 16 Aug 2011 10:12:31 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 670408FC19
	for <freebsd-fs@FreeBSD.org>; Tue, 16 Aug 2011 10:12:31 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101]
	helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>)
	id 1QtGdK-0001gW-8M; Tue, 16 Aug 2011 13:12:30 +0300
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id DF6711CC21; Tue, 16 Aug 2011 13:12:29 +0300 (EEST)
Date: Tue, 16 Aug 2011 13:12:29 +0300
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: Martin Birgmeier <Martin.Birgmeier@aon.at>
Message-ID: <20110816101229.GA2012@pm513-1.comsys.ntu-kpi.kiev.ua>
References: <4E4657BD.2090803@aon.at>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4E4657BD.2090803@aon.at>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua
X-Authenticator: plain
X-Sender-Verify: SUCCEEDED (sender exists & accepts mail)
X-Exim-Version: 4.63 (build at 10-Dec-2010 16:36:10)
X-Date: 2011-08-16 13:12:30
X-Connected-IP: 10.18.52.101:58923
X-Message-Linecount: 96
X-Body-Linecount: 80
X-Message-Size: 3854
X-Body-Size: 3209
Cc: freebsd-fs@FreeBSD.org
Subject: Re: Does nfse support specifying multiple exports for one mount
 point?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Aug 2011 10:12:32 -0000

On Sat, Aug 13, 2011 at 12:53:49PM +0200, Martin Birgmeier wrote:
> See http://www.freebsd.org/cgi/query-pr.cgi?pr=147881 - can I specify 
> multiple exports with nfse?
> 
> I am using the patch proposed in PR 147881, even though I believe it is 
> incomplete (I read that somewhere). For me, it is working fine; for 
> example, I have
> 
> [0]# zfs list -o name,sharenfs hal.1/backup/dumps
> NAME                SHARENFS
> hal.1/backup/dumps  -network 192.168.0.0 -mask 255.255.0.0;-network 
> fec0:0:0:4d42::/56
> [0]#
> 
> which in /etc/zfs/exports translates to
> 
> /z/backup/dumps -network 192.168.0.0 -mask 255.255.0.0
> /z/backup/dumps -network fec0:0:0:4d42::/56
> 
> How can I specify this using nfse?
> 

PR/147881 proposes a way to specify different options for different
address specifications in one line, e.g. different -mapall options for
different hosts in one line.

According to the nfs.exports(5) manual page, after any address specification
an option that was already specified in the same line can be given again; its
new value overrides the previous value and is used for the following address
specifications.

Such options are: -mapall and -maproot, -ro and -rw, -sec.  It is possible
to create reverse logic options for -no_* and -mnt_export_brief options
as well.

The ``*'' hostname represents the default export and can be used in a line
with other address specifications.

Example:

% cat exports
/fs -ro -mapall 1:2:3 1.1.1.1 -sec krb5 -maproot 2:3:4 * 2.2.2.2 -rw 3.3.3.3
% nfse -t exports
configure: reading file exports

Pathname /fs
    Export specifications:
        -rw -sec krb5 -maproot 2:3:4 -host 3.3.3.3
        -ro -sec krb5 -maproot 2:3:4 -host 2.2.2.2
        -ro -sec sys -mapall 1:2:3 -host 1.1.1.1
        -ro -sec krb5 -maproot 2:3:4 *

The exports(5) manual page says that address specifications must be specified
after options.  The nfs.exports(5) file format allows options to be used after
address specifications, so they can override previously specified options.

If you applied cddl.diff patch, then you can use zfs share/unshare to change
ZFS NFS exports, and of course they will be changed atomically and changes
will be applied only for one file system in a time.  As a result if one used
zfs share/unshare for ZFS file system, then exports settings from other
exports files for this file system will be flushed.

The -alldirs option is also supported by the "zfs share" command, but
its logic does not follow the logic described in nfs.exports(5).  If the
-alldirs option is used, then nfse will create two exports: "/fs ..." and
"/fs -subdir -alldirs ...".  This is because of how zfs share/unshare
works:

1. "zfs sharenfs ..." is not incremental.  When we run "zfs sharenfs"
   for some file system, it completely replaces that file system's settings.

2. There is no way to pass several settings for one file system, at least
   for mountd.

3. "zfs sharenfs" does not allow subdirectories to be exported.

If you are unsure about the configuration logic for nfse, then just call
"nfse -t" and verify its output.  At any time you can run "nfse -c show" to
verify the current NFS export settings.  If you prefer to run nfse(8) in
mountd(8)-compatible mode, then run it with the -C switch.

From owner-freebsd-fs@FreeBSD.ORG  Tue Aug 16 13:50:53 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F3E5C106564A
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 13:50:52 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id B59B48FC0C
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 13:50:52 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id 2E93F46B23;
	Tue, 16 Aug 2011 09:50:52 -0400 (EDT)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id C4D838A037;
	Tue, 16 Aug 2011 09:50:51 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
Date: Tue, 16 Aug 2011 09:31:35 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110617; KDE/4.5.5; amd64; ; )
References: <201108151343.14655.jhb@freebsd.org>
	<1730399830.175988.1313449094531.JavaMail.root@erie.cs.uoguelph.ca>
	<20110816022554.GA6018@icarus.home.lan>
In-Reply-To: <20110816022554.GA6018@icarus.home.lan>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201108160931.35626.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6
	(bigwig.baldwin.cx); Tue, 16 Aug 2011 09:50:51 -0400 (EDT)
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Aug 2011 13:50:53 -0000

On Monday, August 15, 2011 10:25:54 pm Jeremy Chadwick wrote:
> On Mon, Aug 15, 2011 at 06:58:14PM -0400, Rick Macklem wrote:
> > John Baldwin wrote:
> > > On Sunday, August 07, 2011 6:47:46 pm Rick Macklem wrote:
> > > > A recent PR (kern/159351) noted that the following
> > > > calculation results in a divide-by-zero when
> > > > desiredvnodes < 1000.
> > > >
> > > > 	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > > >
> > > > Just fixing the divide-by-zero is easy enough, but I'm not
> > > > sure what this calculation is trying to do. Making it a fraction
> > > > of "hibufspace" makes sense (nm_wcommitsize is the maximum # of
> > > > bytes of uncommitted data in the NFS client's buffer cache blocks,
> > > > if I understand it correctly), but why divide it by
> > > >
> > > >                 (desiredvnodes / 1000) ??
> > > >
> > > > Maybe thinking that fewer vnodes means sharing it with fewer
> > > > other file systems or ???
> > > >
> > > > Anyhow, it seems to me that the formulae is bogus for small
> > > > values of desiredvnodes (for example desiredvnodes == 1500
> > > > implies nm_wcommitsize == hibufspace, which sounds too large
> > > > to me).
> > > >
> > > > I'm thinking that putting an upper limit of 10% of hibufspace
> > > > might make sense. ie. Change the above to:
> > > >
> > > > 	if (desiredvnodes >= 11000)
> > > > 		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > > > 	else
> > > > 		nmp->nm_wcommitsize = hibufspace / 10;
> > > >
> > > > Anyone have comments or insight into this calculation?
> > > >
> > > > rick
> > > > ps: jhb, I hope you don't mind. I emailed you first and then
> > > >     thought others might have some ideas, too.
> > > 
> > > Oh no, this is fine. A broader discussion is probably warranted. I
> > > honestly
> > > don't know what the goal is. I do think it is an attempt to share with
> > > other
> > > file systems, but I'm not sure how desiredvnodes / 1000 is useful for
> > > that.
> > > It also seems that we can end up setting this woefully low as well.
> > > That is,
> > > I wonder if we need a minimum of 10% of hibufspace so that it can
> > > scale
> > > between 10% and 90% of hibufspace (but I'm not sure what you would use
> > > to
> > > pick the scaling factor sanely). To my mind what you really want to do
> > > is
> > > something like 'hibufspace / (number of active mounts)', but that will
> > > not
> > > really work correctly unless we recalculate the value on each mount
> > > and
> > > unmount operation.
> > > 
> > > --
> > > John Baldwin
> > Btw, this was done by r147280 6.5years ago, so the formula doesn't seem
> > to be causing a lot of grief. Also of some interest is the fact that
> > wcommitsize appears to have been setable on a per-mount-point-basis until
> > mount_nfs(8) was converted to nmount(2). { There is no nmount option to set it. }
> > 
> > Btw, when nm_wcommitsize is exceeded, writes become synchronous, so it affects
> > how much write behind happens. This, in turn, affects how bursty (is this a real
> > word? hopefully you get what I mean?) the write traffic to the server is.
> > 
> > What I'm not sure about is what happens when multiple mounts use up the entire
> > buffer cache with write behinds. I'll try a little experiment to see if I
> > can find that out. (If making it large isn't detrimental, then I tend to
> > agree that the above sets nm_wcommitsize very small.)
> > 
> > Since "desiredvnodes" will seldom be less than 1000, I'm not going to
> > rush to a solution.
> > 
> > Anyone who has insight into what this formula should be, please let us know.
> 
> The commit message tries to explain it, but it's more than just a
> one-line change.
> 
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/nfsclient/nfs_vfsops.c#rev1.177
> 
> There's also an associated PR:
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=79208

The commit added the limit, which is sensible, but it doesn't explain the logic
for how the limit is computed (that is, why it uses desiredvnodes / 1000).
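
For anyone who wants to see the integer-division behaviour discussed in
this thread, here is a rough standalone sketch.  It is not the kernel code
and not a committed fix: the function names are made up for illustration,
and returning 0 where the original formula would divide by zero is an
assumption made only so that the function is safe to call.

```c
#include <stdint.h>

/*
 * Formula quoted in the thread.  The inner division is integer
 * division, so the divisor is 0 for desiredvnodes < 1000 and 1 for
 * 1000 <= desiredvnodes < 2000 (which gives the whole of hibufspace).
 * Assumption: return 0 instead of faulting when the divisor is 0.
 */
int64_t
wcommit_original(int64_t hibufspace, int64_t desiredvnodes)
{
	int64_t divisor = desiredvnodes / 1000;

	if (divisor == 0)
		return (0);
	return (hibufspace / divisor);
}

/*
 * Variant proposed in the thread: use the formula only when it yields
 * less than 10% of hibufspace, otherwise fall back to exactly 10%.
 * The else branch also covers desiredvnodes < 1000, so no division
 * by zero can occur.
 */
int64_t
wcommit_capped(int64_t hibufspace, int64_t desiredvnodes)
{
	if (desiredvnodes >= 11000)
		return (hibufspace / (desiredvnodes / 1000));
	return (hibufspace / 10);
}
```

With hibufspace fixed, the capped variant is flat at 10% up to
desiredvnodes == 10999 and only then starts to shrink, which shows that
the vnode count acts purely as a divisor and says nothing about how many
mounts are sharing the buffer cache.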

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Tue Aug 16 16:22:24 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4AA1B106566C
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 16:22:24 +0000 (UTC)
	(envelope-from tdb@carrick.bishnet.net)
Received: from carrick.bishnet.net (carrick.bishnet.net
	[IPv6:2a01:348:132:1::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 136718FC18
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 16:22:23 +0000 (UTC)
Received: from carrick-users.bishnet.net ([2a01:348:132:51::10])
	by carrick.bishnet.net with esmtps (TLSv1:AES256-SHA:256)
	(Exim 4.76 (FreeBSD)) (envelope-from <tdb@carrick.bishnet.net>)
	id 1QtMPH-000DFv-CF
	for freebsd-fs@freebsd.org; Tue, 16 Aug 2011 17:22:23 +0100
Received: (from tdb@localhost)
	by carrick-users.bishnet.net (8.14.4/8.14.4/Submit) id p7GGMN8b050958
	for freebsd-fs@freebsd.org; Tue, 16 Aug 2011 17:22:23 +0100 (BST)
	(envelope-from tdb)
Date: Tue, 16 Aug 2011 17:22:23 +0100
From: Tim Bishop <tim@bishnet.net>
To: freebsd-fs@freebsd.org
Message-ID: <20110816162223.GG7564@carrick-users.bishnet.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-PGP-Key: 0x5AE7D984, http://www.bishnet.net/tim/tim-bishnet-net.asc
X-PGP-Fingerprint: 1453 086E 9376 1A50 ECF6  AE05 7DCE D659 5AE7 D984
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Did zpool offline, then reboot, now "freebsd zfs i/o error - all
 block copies unavailable"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Aug 2011 16:22:24 -0000

I suspected one of my disks (in a zpool mirror) had a problem, so I
thought I'd test this out by offlining the disk and rebooting.

Unfortunately this failed during the reboot with the following error:

freebsd zfs i/o error - all block copies unavailable

A search of the archive suggests people have hit this before, but not in
the same circumstance (I think).

I figured the pool being degraded might be the issue, so I booted from a
livecd and detached the offlined disk, leaving the pool in a good state.
Rebooted again but the same problem occurred.

I'm now back in the livecd reattaching the disk, but this'll take a long
time (ETA 24h). Fingers crossed it works!

Whilst I wait, does anybody have any ideas what went wrong?

Tim.

-- 
Tim Bishop
http://www.bishnet.net/tim/
PGP Key: 0x5AE7D984

From owner-freebsd-fs@FreeBSD.ORG  Tue Aug 16 17:04:57 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 70115106564A
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 17:04:57 +0000 (UTC)
	(envelope-from tdb@carrick.bishnet.net)
Received: from carrick.bishnet.net (carrick.bishnet.net
	[IPv6:2a01:348:132:1::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 36E318FC16
	for <freebsd-fs@freebsd.org>; Tue, 16 Aug 2011 17:04:57 +0000 (UTC)
Received: from carrick-users.bishnet.net ([2a01:348:132:51::10])
	by carrick.bishnet.net with esmtps (TLSv1:AES256-SHA:256)
	(Exim 4.76 (FreeBSD)) (envelope-from <tdb@carrick.bishnet.net>)
	id 1QtN4S-000Gr9-LG
	for freebsd-fs@freebsd.org; Tue, 16 Aug 2011 18:04:56 +0100
Received: (from tdb@localhost)
	by carrick-users.bishnet.net (8.14.4/8.14.4/Submit) id p7GH4ucs064741
	for freebsd-fs@freebsd.org; Tue, 16 Aug 2011 18:04:56 +0100 (BST)
	(envelope-from tdb)
Date: Tue, 16 Aug 2011 18:04:56 +0100
From: Tim Bishop <tim@bishnet.net>
To: freebsd-fs@freebsd.org
Message-ID: <20110816170456.GH7564@carrick-users.bishnet.net>
References: <20110816162223.GG7564@carrick-users.bishnet.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110816162223.GG7564@carrick-users.bishnet.net>
X-PGP-Key: 0x5AE7D984, http://www.bishnet.net/tim/tim-bishnet-net.asc
X-PGP-Fingerprint: 1453 086E 9376 1A50 ECF6  AE05 7DCE D659 5AE7 D984
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: Did zpool offline, then reboot, now "freebsd zfs i/o error - all
 block copies unavailable"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Aug 2011 17:04:57 -0000

On Tue, Aug 16, 2011 at 05:22:23PM +0100, Tim Bishop wrote:
> I suspected one of my disks (in a zpool mirror) had a problem, so I
> thought I'd test this out by offlining the disk and rebooting.
> 
> Unfortunately this failed during the reboot with the following error:
> 
> freebsd zfs i/o error - all block copies unavailable
> 
> A search of the archive suggests people have hit this before, but not in
> the same circumstance (I think).
> 
> I figured the pool being degraded might be the issue, so I booted from a
> livecd and detached the offlined disk, leaving the pool in a good state.
> Rebooted again but the same problem occurred.
> 
> I'm now back in the livecd reattaching the disk, but this'll take a long
> time (ETA 24h). Fingers crossed it works!
> 
> Whilst I wait, does anybody have any ideas what went wrong?

Doh - simple I think. The cache file didn't match the real state of the
system. Copied the one from the livecd's /boot/zfs to the single disk
pool and it now boots fine.

Tim.

-- 
Tim Bishop
http://www.bishnet.net/tim/
PGP Key: 0x5AE7D984

From owner-freebsd-fs@FreeBSD.ORG  Wed Aug 17 01:28:38 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CAD96106566C
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 01:28:38 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta03.emeryville.ca.mail.comcast.net
	(qmta03.emeryville.ca.mail.comcast.net [76.96.30.32])
	by mx1.freebsd.org (Postfix) with ESMTP id AFBF68FC1C
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 01:28:38 +0000 (UTC)
Received: from omta01.emeryville.ca.mail.comcast.net ([76.96.30.11])
	by qmta03.emeryville.ca.mail.comcast.net with comcast
	id MDDQ1h0070EPchoA3DUaVs; Wed, 17 Aug 2011 01:28:34 +0000
Received: from koitsu.dyndns.org ([67.180.84.87])
	by omta01.emeryville.ca.mail.comcast.net with comcast
	id MDU91h00G1t3BNj8MDUPmC; Wed, 17 Aug 2011 01:28:39 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id 50382102C1A; Tue, 16 Aug 2011 18:28:06 -0700 (PDT)
Date: Tue, 16 Aug 2011 18:28:06 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: John Baldwin <jhb@freebsd.org>
Message-ID: <20110817012806.GA29555@icarus.home.lan>
References: <201108151343.14655.jhb@freebsd.org>
	<1730399830.175988.1313449094531.JavaMail.root@erie.cs.uoguelph.ca>
	<20110816022554.GA6018@icarus.home.lan>
	<201108160931.35626.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201108160931.35626.jhb@freebsd.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Aug 2011 01:28:38 -0000

On Tue, Aug 16, 2011 at 09:31:35AM -0400, John Baldwin wrote:
> On Monday, August 15, 2011 10:25:54 pm Jeremy Chadwick wrote:
> > On Mon, Aug 15, 2011 at 06:58:14PM -0400, Rick Macklem wrote:
> > > John Baldwin wrote:
> > > > On Sunday, August 07, 2011 6:47:46 pm Rick Macklem wrote:
> > > > > A recent PR (kern/159351) noted that the following
> > > > > calculation results in a divide-by-zero when
> > > > > desiredvnodes < 1000.
> > > > >
> > > > > 	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > > > >
> > > > > Just fixing the divide-by-zero is easy enough, but I'm not
> > > > > sure what this calculation is trying to do. Making it a fraction
> > > > > of "hibufspace" makes sense (nm_wcommitsize is the maximum # of
> > > > > bytes of uncommitted data in the NFS client's buffer cache blocks,
> > > > > if I understand it correctly), but why divide it by
> > > > >
> > > > >                 (desiredvnodes / 1000) ??
> > > > >
> > > > > Maybe thinking that fewer vnodes means sharing it with fewer
> > > > > other file systems or ???
> > > > >
> > > > > Anyhow, it seems to me that the formulae is bogus for small
> > > > > values of desiredvnodes (for example desiredvnodes == 1500
> > > > > implies nm_wcommitsize == hibufspace, which sounds too large
> > > > > to me).
> > > > >
> > > > > I'm thinking that putting an upper limit of 10% of hibufspace
> > > > > might make sense. ie. Change the above to:
> > > > >
> > > > > 	if (desiredvnodes >= 11000)
> > > > > 		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > > > > 	else
> > > > > 		nmp->nm_wcommitsize = hibufspace / 10;
> > > > >
> > > > > Anyone have comments or insight into this calculation?
> > > > >
> > > > > rick
> > > > > ps: jhb, I hope you don't mind. I emailed you first and then
> > > > >     thought others might have some ideas, too.
> > > > 
> > > > Oh no, this is fine. A broader discussion is probably warranted. I
> > > > honestly
> > > > don't know what the goal is. I do think it is an attempt to share with
> > > > other
> > > > file systems, but I'm not sure how desiredvnodes / 1000 is useful for
> > > > that.
> > > > It also seems that we can end up setting this woefully low as well.
> > > > That is,
> > > > I wonder if we need a minimum of 10% of hibufspace so that it can
> > > > scale
> > > > between 10% and 90% of hibufspace (but I'm not sure what you would use
> > > > to
> > > > pick the scaling factor sanely). To my mind what you really want to do
> > > > is
> > > > something like 'hibufspace / (number of active mounts)', but that will
> > > > not
> > > > really work correctly unless we recalculate the value on each mount
> > > > and
> > > > unmount operation.
> > > > 
> > > > --
> > > > John Baldwin
> > > Btw, this was done by r147280 6.5years ago, so the formula doesn't seem
> > > to be causing a lot of grief. Also of some interest is the fact that
> > > wcommitsize appears to have been setable on a per-mount-point-basis until
> > > mount_nfs(8) was converted to nmount(2). { There is no nmount option to set it. }
> > > 
> > > Btw, when nm_wcommitsize is exceeded, writes become synchronous, so it affects
> > > how much write behind happens. This, in turn, affects how bursty (is this a real
> > > word? hopefully you get what I mean?) the write traffic to the server is.
> > > 
> > > What I'm not sure about is what happens when multiple mounts use up the entire
> > > buffer cache with write behinds. I'll try a little experiment to see if I
> > > can find that out. (If making it large isn't detrimental, then I tend to
> > > agree that the above sets nm_wcommitsize very small.)
> > > 
> > > Since "desiredvnodes" will seldom be less than 1000, I'm not going to
> > > rush to a solution.
> > > 
> > > Anyone who has insight into what this formula should be, please let us know.
> > 
> > The commit message tries to explain it, but it's more than just a
> > one-line change.
> > 
> > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/nfsclient/nfs_vfsops.c#rev1.177
> > 
> > There's also an associated PR:
> > 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=79208
> 
> The commit added the limit which is sensible, but it doesn't explain the logic
> for how the limit is computed (that is, why it uses desiredvnodes / 1000).

Understood -- what I was getting at was that the individuals responsible
for the commit (several people reviewed it) could be contacted
and inquiries submitted.  :-)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


From owner-freebsd-fs@FreeBSD.ORG  Wed Aug 17 09:32:01 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 17C1B106564A
	for <freebsd-fs@FreeBSD.ORG>; Wed, 17 Aug 2011 09:32:01 +0000 (UTC)
	(envelope-from prvs=1210f20b9f=killing@multiplay.co.uk)
Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
	by mx1.freebsd.org (Postfix) with ESMTP id 8E2348FC08
	for <freebsd-fs@FreeBSD.ORG>; Wed, 17 Aug 2011 09:32:00 +0000 (UTC)
X-MDAV-Processed: mail1.multiplay.co.uk, Wed, 17 Aug 2011 10:20:16 +0100
X-Spam-Processed: mail1.multiplay.co.uk, Wed, 17 Aug 2011 10:20:16 +0100
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	mail1.multiplay.co.uk
X-Spam-Level: 
X-Spam-Status: No, score=-5.0 required=6.0 tests=USER_IN_WHITELIST
	shortcircuit=ham autolearn=disabled version=3.2.5
Received: from r2d2 ([188.220.16.49])
	by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23])
	(MDaemon PRO v10.0.4) with ESMTP id md50014632193.msg
	for <freebsd-fs@FreeBSD.ORG>; Wed, 17 Aug 2011 10:20:14 +0100
X-MDRemoteIP: 188.220.16.49
X-Return-Path: prvs=1210f20b9f=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
X-MDaemon-Deliver-To: freebsd-fs@FreeBSD.ORG
Message-ID: <F7A3C3E95ADC4193ABBF02AE64BBF3E1@multiplay.co.uk>
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Jeremy Chadwick" <freebsd@jdc.parodius.com>
References: <20110728012437.GA23430@icarus.home.lan><FD3A11BEFD064193AA24C1DF09EDD719@multiplay.co.uk><20110728103234.GA33275@icarus.home.lan><A6828B6CE6764E13A44B1ABF61CF3FED@multiplay.co.uk><20110728145917.GA37805@icarus.home.lan><2A07CD8AE6AE49A5BAED59A7E547D1F9@multiplay.co.uk><2D117F9F212A4CCBA6B7F51E8705BDB7@multiplay.co.uk><20110805033001.GA47366@icarus.home.lan><20110805044725.GA48395@icarus.home.lan><C8CAF8EE63BF41B9A71C987F6A6ACB4C@multiplay.co.uk><20110806041822.GA11439@icarus.home.lan>
	<42162705FC5E4E748A1A57285AECA49A@multiplay.co.uk>
Date: Wed, 17 Aug 2011 10:20:51 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
	reply-type=response
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6109
Cc: freebsd-fs@FreeBSD.ORG
Subject: Re: Questions about erasing an ssd to restore performance
	underFreeBSD
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Aug 2011 09:32:01 -0000

----- Original Message ----- 
From: "Steven Hartland" <killing@multiplay.co.uk>

All our tests have now been successful so I've now
submitted this patch as a PR, which I hope can
be included in a future release, 9.0 maybe if it's
not too late :)
http://www.freebsd.org/cgi/query-pr.cgi?pr=159833

    Regards
    Steve



From owner-freebsd-fs@FreeBSD.ORG  Wed Aug 17 13:15:16 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E1F7210656B3;
	Wed, 17 Aug 2011 13:15:16 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca
	[131.104.91.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 53DDE8FC0C;
	Wed, 17 Aug 2011 13:15:16 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AsAAADm+S06DaFvO/2dsb2JhbABChEiUSpBNgUABAQQBIwRSGwcHCgICDRkCWQYcEAKHVQSkSpFzgSyEDIEQBJMTkRE
X-IronPort-AV: E=Sophos;i="4.68,240,1312171200"; d="scan'208";a="134665502"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 17 Aug 2011 09:15:15 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 48B8DB3F1E;
	Wed, 17 Aug 2011 09:15:15 -0400 (EDT)
Date: Wed, 17 Aug 2011 09:15:15 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
Message-ID: <1313769356.247298.1313586915280.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20110817012806.GA29555@icarus.home.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.201]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Aug 2011 13:15:17 -0000

Jeremy Chadwick wrote:
> On Tue, Aug 16, 2011 at 09:31:35AM -0400, John Baldwin wrote:
> > On Monday, August 15, 2011 10:25:54 pm Jeremy Chadwick wrote:
> > > On Mon, Aug 15, 2011 at 06:58:14PM -0400, Rick Macklem wrote:
> > > > John Baldwin wrote:
> > > > > On Sunday, August 07, 2011 6:47:46 pm Rick Macklem wrote:
> > > > > > A recent PR (kern/159351) noted that the following
> > > > > > calculation results in a divide-by-zero when
> > > > > > desiredvnodes < 1000.
> > > > > >
> > > > > > 	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > > > > >
> > > > > > Just fixing the divide-by-zero is easy enough, but I'm not
> > > > > > sure what this calculation is trying to do. Making it a
> > > > > > fraction
> > > > > > of "hibufspace" makes sense (nm_wcommitsize is the maximum #
> > > > > > of
> > > > > > bytes of uncommitted data in the NFS client's buffer cache
> > > > > > blocks,
> > > > > > if I understand it correctly), but why divide it by
> > > > > >
> > > > > >                 (desiredvnodes / 1000) ??
> > > > > >
> > > > > > Maybe thinking that fewer vnodes means sharing it with fewer
> > > > > > other file systems or ???
> > > > > >
> > > > > > Anyhow, it seems to me that the formulae is bogus for small
> > > > > > values of desiredvnodes (for example desiredvnodes == 1500
> > > > > > implies nm_wcommitsize == hibufspace, which sounds too large
> > > > > > to me).
> > > > > >
> > > > > > I'm thinking that putting an upper limit of 10% of
> > > > > > hibufspace
> > > > > > might make sense. ie. Change the above to:
> > > > > >
> > > > > > 	if (desiredvnodes >= 11000)
> > > > > > 		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
> > > > > > 	else
> > > > > > 		nmp->nm_wcommitsize = hibufspace / 10;
> > > > > >
> > > > > > Anyone have comments or insight into this calculation?
> > > > > >
> > > > > > rick
> > > > > > ps: jhb, I hope you don't mind. I emailed you first and then
> > > > > >     thought others might have some ideas, too.
> > > > >
> > > > > Oh no, this is fine. A broader discussion is probably
> > > > > warranted. I
> > > > > honestly
> > > > > don't know what the goal is. I do think it is an attempt to
> > > > > share with
> > > > > other
> > > > > file systems, but I'm not sure how desiredvnodes / 1000 is
> > > > > useful for
> > > > > that.
> > > > > It also seems that we can end up setting this woefully low as
> > > > > well.
> > > > > That is,
> > > > > I wonder if we need a minimum of 10% of hibufspace so that it
> > > > > can
> > > > > scale
> > > > > between 10% and 90% of hibufspace (but I'm not sure what you
> > > > > would use
> > > > > to
> > > > > pick the scaling factor sanely). To my mind what you really
> > > > > want to do
> > > > > is
> > > > > something like 'hibufspace / (number of active mounts)', but
> > > > > that will
> > > > > not
> > > > > really work correctly unless we recalculate the value on each
> > > > > mount
> > > > > and
> > > > > unmount operation.
> > > > >
> > > > > --
> > > > > John Baldwin
> > > > Btw, this was done by r147280 6.5years ago, so the formula
> > > > doesn't seem
> > > > to be causing a lot of grief. Also of some interest is the fact
> > > > that
> > > > wcommitsize appears to have been setable on a
> > > > per-mount-point-basis until
> > > > mount_nfs(8) was converted to nmount(2). { There is no nmount
> > > > option to set it. }
> > > >
> > > > Btw, when nm_wcommitsize is exceeded, writes become synchronous,
> > > > so it affects
> > > > how much write behind happens. This, in turn, affects how bursty
> > > > (is this a real
> > > > word? hopefully you get what I mean?) the write traffic to the
> > > > server is.
> > > >
> > > > What I'm not sure about is what happens when multiple mounts use
> > > > up the entire
> > > > buffer cache with write behinds. I'll try a little experiment to
> > > > see if I
> > > > can find that out. (If making it large isn't detrimental, then I
> > > > tend to
> > > > agree that the above sets nm_wcommitsize very small.)
> > > >
> > > > Since "desiredvnodes" will seldom be less than 1000, I'm not
> > > > going to
> > > > rush to a solution.
> > > >
> > > > Anyone who has insight into what this formula should be, please
> > > > let us know.
> > >
> > > The commit message tries to explain it, but it's more than just a
> > > one-line change.
> > >
> > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/nfsclient/nfs_vfsops.c#rev1.177
> > >
> > > There's also an associated PR:
> > >
> > > http://www.freebsd.org/cgi/query-pr.cgi?pr=79208
> >
> > The commit added the limit which is sensible, but it doesn't explain
> > the logic
> > for how the limit is computed (that is, why it uses desiredvnodes /
> > 1000).
> 
> Understood -- what I was getting at was that the individuals
> responsible
> for the commit (there were multiple reviewers) could be
> contacted
> and inquiries submitted. :-)
> 
I did email the original committer and have not heard back. (I didn't
try the reviewer(s).)

I'm going to start doing a little experimentation with this and will
report back when I have something that might be of interest.

I think that any fraction of hibufspace should be sufficient to avoid
the deadlock. Also, since the buffer cache code doesn't use vnode locking
these days, I'm not even sure if write backs are blocked by the write
vnode op in progress. (ie. I'm not sure the deadlock it originally fixed
would still happen without it.)

rick

> --
> | Jeremy Chadwick jdc at parodius.com |
> | Parodius Networking http://www.parodius.com/ |
> | UNIX Systems Administrator Mountain View, CA, US |
> | Making life hard for others since 1977. PGP 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG  Wed Aug 17 13:52:38 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 34F6E106564A
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 13:52:37 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id 6BCFF8FC0A
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 13:52:36 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua
	[10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p7HDqVXv042198
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 17 Aug 2011 16:52:31 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id
	p7HDqUSG091356; Wed, 17 Aug 2011 16:52:30 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p7HDqUU3091354; 
	Wed, 17 Aug 2011 16:52:30 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Wed, 17 Aug 2011 16:52:30 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Message-ID: <20110817135230.GW17489@deviant.kiev.zoral.com.ua>
References: <20110817012806.GA29555@icarus.home.lan>
	<1313769356.247298.1313586915280.JavaMail.root@erie.cs.uoguelph.ca>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="DTWYra+TaQJg/4dl"
Content-Disposition: inline
In-Reply-To: <1313769356.247298.1313586915280.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Aug 2011 13:52:38 -0000


--DTWYra+TaQJg/4dl
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Aug 17, 2011 at 09:15:15AM -0400, Rick Macklem wrote:
>
> I think that any fraction of hibufspace should be sufficient to avoid
> the deadlock. Also, since the buffer cache code doesn't use vnode locking
> these days, I'm not even sure if write backs are blocked by the write
> vnode op in progress. (ie. I'm not sure the deadlock it originally fixed
> would still happen without it.)

bufdaemon definitely acquires the vnode lock when flushing a dirty buffer;
this was a problem on its own. I think you are referring to the nfsiod operation.

There is another op that is performed without holding the vnode lock
consistently from (old)nfs code, namely, truncation. It would be useful
to fix this. Please see r188386.

--DTWYra+TaQJg/4dl
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEARECAAYFAk5Lx54ACgkQC3+MBN1Mb4irygCgntPbEsHt+JVa1uL9BfJzv4Lz
EBkAoORNKVitJTHM8xVseUyPQvSzpi1N
=FgyM
-----END PGP SIGNATURE-----

--DTWYra+TaQJg/4dl--

From owner-freebsd-fs@FreeBSD.ORG  Wed Aug 17 20:51:07 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6C28B106566B
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 20:51:07 +0000 (UTC)
	(envelope-from Martin.Birgmeier@aon.at)
Received: from email.aon.at (smtpout03.highway.telekom.at [195.3.96.115])
	by mx1.freebsd.org (Postfix) with ESMTP id BA25A8FC1B
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 20:51:06 +0000 (UTC)
Received: (qmail 15674 invoked from network); 17 Aug 2011 20:51:04 -0000
X-Spam-Checker-Version: SpamAssassin 3.2.0 (2007-05-01) on
	WARSBL604.highway.telekom.at
X-Spam-Level: 
Received: from 188-23-212-142.adsl.highway.telekom.at (HELO gandalf.xyzzy)
	([188.23.212.142]) (envelope-sender <Martin.Birgmeier@aon.at>)
	by smarthub77.res.a1.net (qmail-ldap-1.03) with AES256-SHA encrypted
	SMTP for <freebsd-fs@freebsd.org>; 17 Aug 2011 20:51:03 -0000
Received: from atpcdvvc.xyzzy (atpcdvvc.xyzzy [IPv6:fec0:0:0:4d42::84])
	by gandalf.xyzzy (8.14.4/8.14.4) with ESMTP id p7HKp33j024521
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 22:51:03 +0200 (CEST)
	(envelope-from Martin.Birgmeier@aon.at)
Message-ID: <4E4C29B7.3010806@aon.at>
Date: Wed, 17 Aug 2011 22:51:03 +0200
From: Martin Birgmeier <Martin.Birgmeier@aon.at>
User-Agent: Mozilla/5.0 (X11; FreeBSD i386;
	rv:5.0) Gecko/20110708 Thunderbird/5.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
References: <4E4657BD.2090803@aon.at>
	<20110816101229.GA2012@pm513-1.comsys.ntu-kpi.kiev.ua>
In-Reply-To: <20110816101229.GA2012@pm513-1.comsys.ntu-kpi.kiev.ua>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Does nfse support specifying multiple exports for one mount
	point?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Aug 2011 20:51:07 -0000

Thank you Andrey.

Regards,

Martin

On 08/16/11 12:12, Andrey Simonenko wrote:
> On Sat, Aug 13, 2011 at 12:53:49PM +0200, Martin Birgmeier wrote:
>> See http://www.freebsd.org/cgi/query-pr.cgi?pr=147881 - can I specify
>> multiple exports with nfse?
>>
>> I am using the patch proposed in PR 147881, even though I believe it is
>> incomplete (I read that somewhere). For me, it is working fine; for
>> example, I have
>>
>> [0]# zfs list -o name,sharenfs hal.1/backup/dumps
>> NAME                SHARENFS
>> hal.1/backup/dumps  -network 192.168.0.0 -mask 255.255.0.0;-network
>> fec0:0:0:4d42::/56
>> [0]#
>>
>> which in /etc/zfs/exports translates to
>>
>> /z/backup/dumps -network 192.168.0.0 -mask 255.255.0.0
>> /z/backup/dumps -network fec0:0:0:4d42::/56
>>
>> How can I specify this using nfse?
>>
> PR/147881 proposes a way to specify different options for different
> address specifications in one line, e.g. different -mapall options for
> different hosts in one line.
>
> From the nfs.exports(5) manual page: after any address specification it is
> possible to use an already specified option again in the same line; its value
> will override the previous value and will be used for the next address
> specification.
>
> Such options are: -mapall and -maproot, -ro and -rw, -sec.  It is possible
> to create reverse logic options for -no_* and -mnt_export_brief options
> as well.
>
> The ``*'' hostname represents default export and can be used in a line with
> other address specification.
>
> Example:
>
> % cat exports
> /fs -ro -mapall 1:2:3 1.1.1.1 -sec krb5 -maproot 2:3:4 * 2.2.2.2 -rw 3.3.3.3
> % nfse -t exports
> configure: reading file exports
>
> Pathname /fs
>      Export specifications:
>          -rw -sec krb5 -maproot 2:3:4 -host 3.3.3.3
>          -ro -sec krb5 -maproot 2:3:4 -host 2.2.2.2
>          -ro -sec sys -mapall 1:2:3 -host 1.1.1.1
>          -ro -sec krb5 -maproot 2:3:4 *
>
> The exports(5) manual page says that address specifications must be specified
> after options.  The nfs.exports(5) file format allows options after
> address specifications, so they can override previously specified options.
>
> If you applied the cddl.diff patch, then you can use zfs share/unshare to change
> ZFS NFS exports, and of course they will be changed atomically and changes
> will be applied to only one file system at a time.  As a result, if one uses
> zfs share/unshare for a ZFS file system, then exports settings from other
> exports files for this file system will be flushed.
>
> The -alldirs option is also supported by the "zfs share" command, but
> its logic does not follow the logic described in nfs.exports(5).  If the
> -alldirs option is used, then nfse will create two exports: "/fs ..." and
> "/fs -subdir -alldirs ...".  This is because of how zfs share/unshare
> works:
>
> 1. "zfs sharenfs ..." is not incremental.  When we run "zfs sharenfs"
>     for some file system, it completely replaces its settings.
>
> 2. There is no way to pass several settings for one file system, at least
>     for mountd.
>
> 3. "zfs sharenfs" does not allow exporting subdirectories.
>
> If you are unsure about the configuration logic for nfse, then just call "nfse -t"
> and verify its output.  At any time, run "nfse -c show" and verify the current
> NFS exports settings.  If you prefer to run nfse(8) in a mode compatible with
> mountd(8), then run it with the -C switch.

From owner-freebsd-fs@FreeBSD.ORG  Wed Aug 17 22:18:55 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 75C87106566C
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 22:18:55 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 3154C8FC17
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 22:18:54 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap4EAO49TE6DaFvO/2dsb2JhbABChEmlIIFAAQEFIwRSGxgCAg0ZAlkGLK0KkVeBLIQMgRAEkxOREQ
X-IronPort-AV: E=Sophos;i="4.68,241,1312171200"; d="scan'208";a="131408281"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 17 Aug 2011 18:18:53 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id B3BB3B402E;
	Wed, 17 Aug 2011 18:18:53 -0400 (EDT)
Date: Wed, 17 Aug 2011 18:18:53 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Kostik Belousov <kostikbel@gmail.com>
Message-ID: <1632122286.297610.1313619533702.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20110817135230.GW17489@deviant.kiev.zoral.com.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Aug 2011 22:18:55 -0000

Kostik Belousov wrote:
> On Wed, Aug 17, 2011 at 09:15:15AM -0400, Rick Macklem wrote:
> >
> > I think that any fraction of hibufspace should be sufficient to
> > avoid
> > the deadlock. Also, since the buffer cache code doesn't use vnode
> > locking
> > these days, I'm not even sure if write backs are blocked by the
> > write
> > vnode op in progress. (ie. I'm not sure the deadlock it originally
> > fixed
> > would still happen without it.)
> 
> bufdaemon definitely acquires vnode lock when flushing dirty buffer,
> this was a problem on its own. I think you refer to the nfsiod
> operation.
> 
Ok, so I think this means that the deadlock can still occur.
I haven't yet played with the code, but I now think I might understand
the logic behind dividing by "(desiredvnodes / 1000)".

If a single large write is happening to one NFS vnode, setting
nm_wcommitsize to any fraction of hibufspace should avoid the deadlock,
I think. (If I understand it correctly, the deadlock occurs when an
NFS VOP_WRITE() runs out of buffer cache and no buffer cache blocks
can be cleaned out because it is holding a lock on the vnode.)

But, what happens if K processes concurrently do large writes on K
NFS vnodes?
- It seems to me that they all could deadlock when the buffer cache
  becomes exhausted, since they all hold locks on their respective
  vnodes and, therefore, none of the dirty buffers can be flushed.
  - If this is correct, then I think the only "safe" answer is:
     nm_wcommitsize = hibufspace / desiredvnodes;
    since it is possible that almost all vnodes could be assigned to
    NFS files being written concurrently with large writes.
  However, this would result in an absurdly low value for nm_wcommitsize.

--> My best guess is the original author assumed that 0.1% of all vnodes
    would be a reasonable upper bound on the number being written by NFS
    concurrently with large writes.

By the way, since nm_wcommitsize is applied to a single write, it only
affects a single write(2) syscall of more than nm_wcommitsize bytes of
data. (The PR refers to a writev() of 60Mbytes in size.)
I honestly have no idea how many apps. do write() syscalls of megabytes
in size, so I'm not sure how important it would be to make it larger
than "hibufspace / (desiredvnodes / 1000)", which is about 2Mbytes on
the 256Mbyte laptop I have here without any tuning tweaks.
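
To make the arithmetic concrete, here is a minimal C sketch of the formula
"hibufspace / (desiredvnodes / 1000)" discussed above. It is not the kernel
code: the function name wcommitsize_limit() and the clamp on the divisor are
assumptions made for illustration. The clamp is there because the inner
division is an integer division, so it yields 0 whenever desiredvnodes is
below 1000, which is the divide-by-zero case mentioned in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the FreeBSD sources: computes a
 * commit-size limit as hibufspace / (desiredvnodes / 1000).
 */
static long
wcommitsize_limit(long hibufspace, long desiredvnodes)
{
	/* Integer division: this is 0 when desiredvnodes < 1000. */
	long divisor = desiredvnodes / 1000;

	/* Assumed guard so that a small desiredvnodes cannot divide by 0. */
	if (divisor < 1)
		divisor = 1;
	return (hibufspace / divisor);
}
```

With illustrative values (not measured on any real system) of 32 Mbytes for
hibufspace and 16000 for desiredvnodes, the divisor is 16 and the function
returns 2097152, i.e. 2 Mbytes. With desiredvnodes set to 999 the guard
applies and the whole of hibufspace is returned.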

I think there might be a better way to do this than calculating a
fixed "guestimate" for nm_wcommitsize and then using it for the life
of the NFS mount.
- The NFS VOP_WRITE() can keep track of a running total of how many
  bytes are being written:
  - add uio_resid to this running total at the beginning of the VOP_WRITE()
    and subtract it back out at the end of VOP_WRITE().
  - if this running total exceeds something like 80% of hibufspace, then
    do synchronous writes (ie. use that test instead of
        if (nm_wcommitsize < uio->uio_resid) to make the decision).

Does this sound reasonable to others?
(This is actually getting interesting. Who would have guessed that a
 divide by zero bug report would lead to this...)

rick
> There is another op that is performed without holding the vnode lock
> consistently from (old)nfs code, namely, truncation. It would be
> useful
> to fix this. Please see r188386.

From owner-freebsd-fs@FreeBSD.ORG  Wed Aug 17 23:56:58 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 76F201065673
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 23:56:58 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 30F658FC12
	for <freebsd-fs@freebsd.org>; Wed, 17 Aug 2011 23:56:58 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap0EADJUTE6DaFvO/2dsb2JhbABBhEmlIIFAAQYjVhsaAg0ZAlkGrQ+RUoEshAyBEASTE5ER
X-IronPort-AV: E=Sophos;i="4.68,242,1312171200"; d="scan'208";a="131414575"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 17 Aug 2011 19:56:57 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 7FC75B3F64;
	Wed, 17 Aug 2011 19:56:57 -0400 (EDT)
Date: Wed, 17 Aug 2011 19:56:57 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Kostik Belousov <kostikbel@gmail.com>
Message-ID: <1075004291.300557.1313625417507.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20110817135230.GW17489@deviant.kiev.zoral.com.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Aug 2011 23:56:58 -0000

Just to correct myself...

- The NFS VOP_WRITE() can keep track of a running total of how many
  bytes are being written:
  - add uio_resid to this running total at the beginning of the VOP_WRITE()
    and subtract it back out at the end of VOP_WRITE().
This was incorrectly stated. The value should be subtracted back out when
the write rpc completes (ie. buffer has been flushed), since the running
total needs to be "how many unwritten NFS bytes are in the buffer cache".
At least that was what I was/am thinking...
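
The corrected idea can be sketched in C roughly as follows. This is only an
illustration under stated assumptions: struct nfs_dirty_acct and the two
functions are hypothetical names, not anything in the NFS client, and the
sketch does no locking, which real kernel code would need (a mutex or atomic
operations) since many writers would update the total concurrently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accounting state; none of these names are from the kernel. */
struct nfs_dirty_acct {
	size_t	hibufspace;	/* upper bound on buffer cache space */
	size_t	dirty_bytes;	/* unwritten NFS bytes now in the cache */
};

/*
 * Called at the beginning of a write of `resid` bytes: adds the bytes to
 * the running total and returns true if the total now exceeds 80% of
 * hibufspace, meaning the caller should write synchronously.
 */
static bool
acct_write_start(struct nfs_dirty_acct *a, size_t resid)
{
	a->dirty_bytes += resid;
	return (a->dirty_bytes > a->hibufspace / 5 * 4);
}

/*
 * Called when the write RPC for `len` bytes completes (the buffer has
 * been flushed): subtracts the bytes, never going below zero.
 */
static void
acct_write_done(struct nfs_dirty_acct *a, size_t len)
{
	a->dirty_bytes -= (len < a->dirty_bytes) ? len : a->dirty_bytes;
}
```

The point of subtracting in acct_write_done() rather than at the end of the
write call is that dirty_bytes then tracks how many unwritten NFS bytes sit
in the buffer cache, across all vnodes, instead of being a fixed per-write
limit chosen at mount time.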

rick

From owner-freebsd-fs@FreeBSD.ORG  Thu Aug 18 03:00:16 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D850E106566C
	for <freebsd-fs@freebsd.org>; Thu, 18 Aug 2011 03:00:16 +0000 (UTC)
	(envelope-from kaduk@mit.edu)
Received: from dmz-mailsec-scanner-5.mit.edu (DMZ-MAILSEC-SCANNER-5.MIT.EDU
	[18.7.68.34]) by mx1.freebsd.org (Postfix) with ESMTP id 674CA8FC0C
	for <freebsd-fs@freebsd.org>; Thu, 18 Aug 2011 03:00:16 +0000 (UTC)
X-AuditID: 12074422-b7ba7ae000000a14-99-4e4c803872bb
Received: from mailhub-auth-2.mit.edu ( [18.7.62.36])
	by dmz-mailsec-scanner-5.mit.edu (Symantec Messaging Gateway) with SMTP
	id 55.C9.02580.8308C4E4; Wed, 17 Aug 2011 23:00:08 -0400 (EDT)
Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])
	by mailhub-auth-2.mit.edu (8.13.8/8.9.2) with ESMTP id p7I30Fm8011862; 
	Wed, 17 Aug 2011 23:00:15 -0400
Received: from multics.mit.edu (MULTICS.MIT.EDU [18.187.1.73])
	(authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU)
	by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id p7I30DVZ001892
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);
	Wed, 17 Aug 2011 23:00:15 -0400 (EDT)
Received: (from kaduk@localhost) by multics.mit.edu (8.12.9.20060308)
	id p7I30Cva009453; Wed, 17 Aug 2011 23:00:12 -0400 (EDT)
Date: Wed, 17 Aug 2011 23:00:12 -0400 (EDT)
From: Benjamin Kaduk <kaduk@MIT.EDU>
To: John <jwd@freebsd.org>
In-Reply-To: <20110816040519.GA49864@FreeBSD.org>
Message-ID: <alpine.GSO.1.10.1108172256550.7526@multics.mit.edu>
References: <20110816040519.GA49864@FreeBSD.org>
User-Agent: Alpine 1.10 (GSO 962 2008-03-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrLIsWRmVeSWpSXmKPExsUixG6nomvR4ONncK5Ly2LOmw9MFsce/2Sz
	WL/yDasDs8eMT/NZAhijuGxSUnMyy1KL9O0SuDKW3f3DWHCDu6Jz3w+mBsZXnF2MnBwSAiYS
	rX3/mSBsMYkL99azdTFycQgJ7GOUeLPwIpSzgVFi4rmpLBDOASaJM92n2CGcBkaJ7surWEH6
	WQS0Jf7dPA42i01ARWLmm41sILaIgJTE0zmXWUBsZgFziacfloHVCAPVL305H6yGU8BQ4uvd
	bcxdjBwcvAIOEk/PRIKEhQQMJI6vWAhWIiqgI7F6/xSwMbwCghInZz6BGmkp8W/tL9YJjIKz
	kKRmIUktYGRaxSibklulm5uYmVOcmqxbnJyYl5dapGuql5tZopeaUrqJERyqLko7GH8eVDrE
	KMDBqMTDa/jK20+INbGsuDL3EKMkB5OSKO/eWh8/Ib6k/JTKjMTijPii0pzU4kOMEhzMSiK8
	bQpAOd6UxMqq1KJ8mJQ0B4uSOC/XTgc/IYH0xJLU7NTUgtQimKwMB4eSBG9zHVCjYFFqempF
	WmZOCUKaiYMTZDgP0HC2KpDhxQWJucWZ6RD5U4yKUuK8nvVACQGQREZpHlwvLJW8YhQHekWY
	dy5IFQ8wDcF1vwIazAQ0+NYuD5DBJYkIKakGxi6TxNkzqvIW/uN6FKPW8KbM4sd+q93+mdpd
	fxcZaoZb5tVdlr13t9PJYPvTLoUnnh+mXNvWbMSZ0R+er5ARKcuV1nz5J8u97TlntnNMbtsv
	oz7DsuhekPMfy2ld22/wBz69Pl3+w2WhQ0cv6DZ2GLvN0H+d78q04uBBrukPJWK7iux+y6l4
	KLEUZyQaajEXFScCAL6rYnwAAwAA
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: Three LOR with latest -current
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Aug 2011 03:00:16 -0000

Hello John,

These seem to be well-known, per
http://ipv4.sources.zabbadoz.net/freebsd/lor.html

On Tue, 16 Aug 2011, John wrote:

> Hi folks,
>
>   I'm seeing 3 lock order reversals with an up-to-date -current
> system. Stock system, GENERIC kernel. Let me know if this isn't
> enough information. Just booting the system and the dmesg.
>
> Thanks,
> John
>
>
> lock order reversal:
> 1st 0xfffffe0289627db8 ufs (ufs) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:425
> 2nd 0xffffff9f0db49778 bufwait (bufwait) @ /usr/src.2011-08-14_10.53pm_EDT/sys/kern/vfs_bio.c:2658
> 3rd 0xfffffe00404a8098 ufs (ufs) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:546

This looks like #285.

>
>
> lock order reversal:
> 1st 0xffffff9f0db49778 bufwait (bufwait) @ /usr/src.2011-08-14_10.53pm_EDT/sys/kern/vfs_bio.c:2658
> 2nd 0xfffffe004034dcb0 snaplk (snaplk) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:818

The line numbers are a bit off, but this could be #269.

>
>
> lock order reversal:
> 1st 0xfffffe004034dcb0 snaplk (snaplk) @ /usr/src.2011-08-14_10.53pm_EDT/sys/kern/vfs_vnops.c:301
> 2nd 0xfffffe0289627db8 ufs (ufs) @ /usr/src.2011-08-14_10.53pm_EDT/sys/ufs/ffs/ffs_snapshot.c:1620

And this would be #240.

Since they are so commonly reported (but no deadlocks have been attributed 
to them), it seems likely that they are harmless.
Perhaps we should tell WITNESS to not warn about them ...

-Ben Kaduk

From owner-freebsd-fs@FreeBSD.ORG  Thu Aug 18 12:58:53 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 91A21106566B
	for <freebsd-fs@freebsd.org>; Thu, 18 Aug 2011 12:58:53 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id 2D3C68FC08
	for <freebsd-fs@freebsd.org>; Thu, 18 Aug 2011 12:58:52 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua
	[10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p7ICwnag045817
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Thu, 18 Aug 2011 15:58:49 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id
	p7ICwnfA011247; Thu, 18 Aug 2011 15:58:49 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p7ICwnFt011246; 
	Thu, 18 Aug 2011 15:58:49 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Thu, 18 Aug 2011 15:58:49 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Message-ID: <20110818125849.GE17489@deviant.kiev.zoral.com.ua>
References: <20110817135230.GW17489@deviant.kiev.zoral.com.ua>
	<1632122286.297610.1313619533702.JavaMail.root@erie.cs.uoguelph.ca>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="ADZ8S6Yea/b683e6"
Content-Disposition: inline
In-Reply-To: <1632122286.297610.1313619533702.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: freebsd-fs@freebsd.org, onwahe@gmail.com
Subject: Re: NFS calculation of max commit size
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Aug 2011 12:58:53 -0000


--ADZ8S6Yea/b683e6
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Aug 17, 2011 at 06:18:53PM -0400, Rick Macklem wrote:
> Kostik Belousov wrote:
> > On Wed, Aug 17, 2011 at 09:15:15AM -0400, Rick Macklem wrote:
> > >
> > > I think that any fraction of hibufspace should be sufficient to
> > > avoid
> > > the deadlock. Also, since the buffer cache code doesn't use vnode
> > > locking
> > > these days, I'm not even sure if write backs are blocked by the
> > > write
> > > vnode op in progress. (ie. I'm not sure the deadlock it originally
> > > fixed
> > > would still happen without it.)
> >
> > bufdaemon definitely acquires vnode lock when flushing dirty buffer,
> > this was a problem on its own. I think you refer to the nfsiod
> > operation.
> >
> Ok, so I think this means that the deadlock can still occur.
> I haven't yet played with the code, but I now think I might understand
> the logic behind dividing by "(desiredvnodes / 1000)".
>
> If a single large write is happening to one NFS vnode, setting
> nm_wcommitsize to any fraction of hibufspace should avoid the deadlock,
> I think. (If I understand it correctly, the deadlock occurs when an
> NFS VOP_WRITE() runs out of buffer cache and no buffer cache blocks
> can be cleaned out because it is holding a lock on the vnode.)
No, if the nfs write vop tries to allocate a new buffer, then vfs_bio.c
will note that an attempt is being made to allocate with the vnode lock held,
and will do a pass over the dirty cache, flushing buffers owned by the vnode.
See the call to buf_do_flush() from getnewbuf() and the buf_do_flush() code
itself.

This is what I referred to as 'a problem on its own'. The change helped
to fix the bufdaemon deadlock you described, which indeed happened relatively
often.

>
> But, what happens if K processes concurrently do large writes on K
> NFS vnodes?
> - It seems to me that they all could deadlock when the buffer cache
>   becomes exhausted, since they all hold locks on their respective
>   vnodes and, therefore, none of the dirty buffers can be flushed.
>   - If this is correct, then I think the only "safe" answer is:
>      nm_wcommitsize = hibufspace / desiredvnodes;
>     since it is possible that almost all vnodes could be assigned to
>     NFS files being written concurrently with large writes.
>   However, this would result in an absurdly low value for nm_wcommitsize.
>
> --> My best guess is the original author assumed that 0.1% of all vnodes
>     would be a reasonable upper bound on the number being written by NFS
>     concurrently with large writes.
>
> By the way, since nm_wcommitsize is applied to a single write, it only
> affects a single write(2) syscall of more than nm_wcommitsize bytes of
> data. (The PR refers to a writev() of 60Mbytes in size.)
> I honestly have no idea how many apps. do write() syscalls of megabytes
> in size, so I'm not sure how important it would be to make it larger
> than "hibufspace / (desiredvnodes / 1000)", which is about 2Mbytes on
> the 256Mbyte laptop I have here without any tuning tweaks?
>
> I think there might be a better way to do this than calculating a
> fixed "guestimate" for nm_wcommitsize and then using it for the life
> of the NFS mount.
> - The NFS VOP_WRITE() can keep track of a running total of how many
>   bytes are being written:
>   - add uio_resid to this running total at the beginning of the VOP_WRITE()
>     and subtract it back out at the end of VOP_WRITE().
>   - if this running total exceeds something like 80% of hibufspace, then
>     do synchronous writes (ie. use that test instead of
>         if (nm_wcommitsize < uio->uio_resid) to make the decision).
>
> Does this sound reasonable to others?
> (This is actually getting interesting. Who would have guessed that a
>  divide by zero bug report would lead to this...)
>
> rick
> > There is another op that is performed without holding the vnode lock
> > consistently from (old)nfs code, namely, truncation. It would be
> > useful
> > to fix this. Please see r188386.

--ADZ8S6Yea/b683e6
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEARECAAYFAk5NDIgACgkQC3+MBN1Mb4goIwCgsDM23cix0FchRJmbDXilSyZY
JEkAoJ6o/edVJVLaeF50bY2E88rTPoWR
=MwS4
-----END PGP SIGNATURE-----

--ADZ8S6Yea/b683e6--

From owner-freebsd-fs@FreeBSD.ORG  Fri Aug 19 17:40:33 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id 579B71065673; Fri, 19 Aug 2011 17:40:33 +0000 (UTC)
Date: Fri, 19 Aug 2011 17:40:33 +0000
From: Alexander Best <arundel@freebsd.org>
To: freebsd-fs@freebsd.org
Message-ID: <20110819174033.GA68015@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Subject: probably embarrising SUJ question
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Aug 2011 17:40:33 -0000

Hi there,

I recently saw somebody using mount -o async in combination with gjournal. I
just wanted to ask whether async can also be used with SUJ. Or will this put
me in a dangerous situation, where my fs will get hosed after a crash?

cheers.
alex

From owner-freebsd-fs@FreeBSD.ORG  Fri Aug 19 18:19:48 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 57B2E1065673;
	Fri, 19 Aug 2011 18:19:48 +0000 (UTC)
	(envelope-from mckusick@mckusick.com)
Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235])
	by mx1.freebsd.org (Postfix) with ESMTP id 380258FC1D;
	Fri, 19 Aug 2011 18:19:48 +0000 (UTC)
Received: from chez.mckusick.com (localhost [127.0.0.1])
	by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p7JHqHEt030978;
	Fri, 19 Aug 2011 10:52:17 -0700 (PDT)
	(envelope-from mckusick@chez.mckusick.com)
Message-Id: <201108191752.p7JHqHEt030978@chez.mckusick.com>
To: Alexander Best <arundel@freebsd.org>
In-reply-to: <20110819174033.GA68015@freebsd.org> 
Date: Fri, 19 Aug 2011 10:52:17 -0700
From: Kirk McKusick <mckusick@mckusick.com>
X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY
	autolearn=failed version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com
Cc: freebsd-fs@freebsd.org
Subject: Re: probably embarrassing SUJ question 
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Aug 2011 18:19:48 -0000

> Date: Fri, 19 Aug 2011 17:40:33 +0000
> From: Alexander Best <arundel@freebsd.org>
> To: freebsd-fs@freebsd.org
> Subject: probably embarrassing SUJ question
> 
> Hi there,
> 
> I recently saw somebody using mount -o async in combination with
> gjournal. I just wanted to ask, whether async can also be used with
> SUJ? or will this put me in a dangerous situation, where my fs will
> get hosed after a crash?
> 
> cheers.
> alex

The async flag is incompatible with soft updates or journaled
soft updates. But not to fear, we added a seatbelt in ffs_mount:

	/*
	 * Soft updates is incompatible with "async",
	 * so if we are doing softupdates stop the user
	 * from setting the async flag in an update.
	 * Softdep_mount() clears it in an initial mount
	 * or ro->rw remount.
	 */
	if (MOUNTEDSOFTDEP(mp)) {
		MNT_ILOCK(mp);
		mp->mnt_flag &= ~MNT_ASYNC;
		MNT_IUNLOCK(mp);
	}

So, nothing bad will happen.
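
For anyone who wants to poke at the logic outside the kernel, here is
a stand-alone model of the seatbelt. It is a sketch only: the X_ bit
values are illustrative (the real ones come from <sys/mount.h>), the
helper name is made up, and a plain flag test stands in for
MOUNTEDSOFTDEP() with no mount interlock taken.

```c
#include <stdint.h>

/* Illustrative bit values only; the real ones live in <sys/mount.h>. */
#define X_MNT_ASYNC	UINT64_C(0x0040)
#define X_MNT_SOFTDEP	UINT64_C(0x200000)

/*
 * Return the mount flags with the async bit stripped whenever soft
 * updates are active.  Other flag combinations pass through unchanged.
 */
static uint64_t
x_apply_seatbelt(uint64_t mnt_flag)
{
	if (mnt_flag & X_MNT_SOFTDEP)
		mnt_flag &= ~X_MNT_ASYNC;
	return (mnt_flag);
}
```

A request for async on a soft updates mount simply comes back without
the async bit, while the same request on a plain mount is left alone.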

	Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG  Fri Aug 19 18:54:26 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id 9C0A9106566C; Fri, 19 Aug 2011 18:54:26 +0000 (UTC)
Date: Fri, 19 Aug 2011 18:54:26 +0000
From: Alexander Best <arundel@freebsd.org>
To: Kirk McKusick <mckusick@mckusick.com>
Message-ID: <20110819185426.GA77630@freebsd.org>
References: <20110819174033.GA68015@freebsd.org>
	<201108191752.p7JHqHEt030978@chez.mckusick.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201108191752.p7JHqHEt030978@chez.mckusick.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: probably embarrassing SUJ question
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Aug 2011 18:54:26 -0000

On Fri Aug 19 11, Kirk McKusick wrote:
> > Date: Fri, 19 Aug 2011 17:40:33 +0000
> > From: Alexander Best <arundel@freebsd.org>
> > To: freebsd-fs@freebsd.org
> > Subject: probably embarrassing SUJ question
> > 
> > Hi there,
> > 
> > I recently saw somebody using mount -o async in combination with
> > gjournal. I just wanted to ask, whether async can also be used with
> > SUJ? or will this put me in a dangerous situation, where my fs will
> > get hosed after a crash?
> > 
> > cheers.
> > alex
> 
> The async flag is incompatible with soft updates or journaled
> soft updates. But not to fear, we added a seatbelt in ffs_mount:
> 
> 	/*
> 	 * Soft updates is incompatible with "async",
> 	 * so if we are doing softupdates stop the user
> 	 * from setting the async flag in an update.
> 	 * Softdep_mount() clears it in an initial mount
> 	 * or ro->rw remount.
> 	 */
> 	if (MOUNTEDSOFTDEP(mp)) {
> 		MNT_ILOCK(mp);
> 		mp->mnt_flag &= ~MNT_ASYNC;
> 		MNT_IUNLOCK(mp);
> 	}
> 
> So, nothing bad will happen.

Ah, thanks a lot. Good thing you provide this kind of seatbelt. :)

cheers.
alex

> 
> 	Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG  Fri Aug 19 19:03:24 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id 527311065747; Fri, 19 Aug 2011 19:03:24 +0000 (UTC)
Date: Fri, 19 Aug 2011 19:03:24 +0000
From: Alexander Best <arundel@freebsd.org>
To: freebsd-fs@freebsd.org
Message-ID: <20110819190324.GA78837@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Subject: touch(1) not working on directories in an msdosfs(5) envirement
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Aug 2011 19:03:24 -0000

hi there,

can somebody confirm this issue? is it already known?

otaku% ll|grep HELL
drwxr-xr-x  1 arundel  arundel     16384 19 Aug 19:57 HELLO
-rw-r--r--  1 arundel  arundel         0 19 Aug 20:13 HELLO2
otaku% touch HELLO*
otaku% ll|grep HELL
drwxr-xr-x  1 arundel  arundel     16384 19 Aug 19:57 HELLO
-rw-r--r--  1 arundel  arundel         0 19 Aug 20:55 HELLO2

doing the same on a UFS2 partition works as expected.
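
The same check can be made without ls and touch. A minimal sketch,
assuming utimes(2) and stat(2) are enough to reproduce what the shell
session shows; the helper name x_mtime_sticks is made up for this
example, and the 2-second slack is there because FAT keeps
modification times in 2-second units.

```c
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>

/*
 * Set atime and mtime of path to "want", then read the mtime back.
 * Returns 1 if the new mtime stuck, 0 if the calls succeeded but the
 * stored mtime is something else, and -1 if utimes() or stat() failed.
 */
static int
x_mtime_sticks(const char *path, time_t want)
{
	struct timeval tv[2];
	struct stat sb;
	long diff;

	tv[0].tv_sec = want;	/* access time */
	tv[0].tv_usec = 0;
	tv[1].tv_sec = want;	/* modification time */
	tv[1].tv_usec = 0;
	if (utimes(path, tv) != 0 || stat(path, &sb) != 0)
		return (-1);

	/* Allow 2 seconds of slack for file systems with coarse stamps. */
	diff = (long)(sb.st_mtime - want);
	return (diff >= -2 && diff <= 2);
}
```

A result of 0 for the directory and 1 for the regular file would match
the listing above: the call reports success, but the directory's time
stays where it was.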


cheers.
alex

From owner-freebsd-fs@FreeBSD.ORG  Fri Aug 19 19:40:33 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2C763106564A
	for <freebsd-fs@freebsd.org>; Fri, 19 Aug 2011 19:40:33 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id DAFEE8FC1E
	for <freebsd-fs@freebsd.org>; Fri, 19 Aug 2011 19:40:32 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap8EAG+7Tk6DaFvO/2dsb2JhbABBhEukOYFAAQEBAQMBAQEgKyALGw4KAgINGQIpAQkYAQ0GCAcEARwEh1SnNZE7gSyEDIEQBJEFgg6REQ
X-IronPort-AV: E=Sophos;i="4.68,252,1312171200"; d="scan'208";a="131640050"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 19 Aug 2011 15:40:31 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id B962BB3F2B;
	Fri, 19 Aug 2011 15:40:31 -0400 (EDT)
Date: Fri, 19 Aug 2011 15:40:31 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Alexander Best <arundel@freebsd.org>
Message-ID: <1092971110.92110.1313782831745.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20110819190324.GA78837@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: touch(1) not working on directories in an msdosfs(5) envirement
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Aug 2011 19:40:33 -0000

Alexander Best wrote:
> hi there,
> 
> can somebody confirm this issue? is it already known?
> 
> otaku% ll|grep HELL
> drwxr-xr-x 1 arundel arundel 16384 19 Aug 19:57 HELLO
> -rw-r--r-- 1 arundel arundel 0 19 Aug 20:13 HELLO2
> otaku% touch HELLO*
> otaku% ll|grep HELL
> drwxr-xr-x 1 arundel arundel 16384 19 Aug 19:57 HELLO
> -rw-r--r-- 1 arundel arundel 0 19 Aug 20:55 HELLO2
> 
Yes, FAT file systems do not maintain a directory modify
time. (The original FAT12,16 structure didn't even have a
modify time for the root dir.)

Just like Windows.

This causes issues when a FAT fs is exported via NFS. Someone was
going to experiment with an "in memory only" modify time for dirs,
to minimize caching issues, but I haven't heard back from them
lately.

Apparently Mac OS X chooses to update the modify time that
exists on FAT32 file systems, but that isn't Windows compatible.

rick
> doing the same on a UFS2 partition works as expected.
> 
> 
> cheers.
> alex
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG  Fri Aug 19 22:11:07 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 66848106566C
	for <freebsd-fs@freebsd.org>; Fri, 19 Aug 2011 22:11:07 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 1F9518FC1D
	for <freebsd-fs@freebsd.org>; Fri, 19 Aug 2011 22:11:06 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1QuXHL-00036w-US
	for freebsd-fs@freebsd.org; Sat, 20 Aug 2011 00:11:03 +0200
Received: from 208.88.188.90.adsl.tomsknet.ru ([90.188.88.208])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Sat, 20 Aug 2011 00:11:03 +0200
Received: from vadim_nuclight by 208.88.188.90.adsl.tomsknet.ru with local
	(Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Sat, 20 Aug 2011 00:11:03 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Vadim Goncharov <vadim_nuclight@mail.ru>
Date: Fri, 19 Aug 2011 22:10:49 +0000 (UTC)
Organization: Nuclear Lightning @ Tomsk, TPU AVTF Hostel
Lines: 26
Message-ID: <slrnj4tnr9.2di5.vadim_nuclight@kernblitz.nuclight.avtf.net>
References: <20110819190324.GA78837@freebsd.org>
	<1092971110.92110.1313782831745.JavaMail.root@erie.cs.uoguelph.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: 208.88.188.90.adsl.tomsknet.ru
X-Comment-To: Rick Macklem
User-Agent: slrn/0.9.9p1 (FreeBSD)
Subject: Re: touch(1) not working on directories in an msdosfs(5) envirement
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: vadim_nuclight@mail.ru
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Aug 2011 22:11:07 -0000

Hi Rick Macklem! 

On Fri, 19 Aug 2011 15:40:31 -0400 (EDT); Rick Macklem wrote about 'Re: touch(1) not working on directories in an msdosfs(5) envirement':

> Yes, FAT file systems do not maintain a directory modify
> time. (The original FAT12,16 structure didn't even have a
> modify time for the root dir.)

> Just like Windows.

> This causes issues when a FAT fs is exported via NFS and
> someone was going to experiment with an "in memory only"
> modify time for dirs, to minimize caching issues, but I
> haven't heard back from them lately.

> Apparently Mac OS X chooses to update the modify time that
> exists on FAT32 file systems, but that isn't Windows compatible.

What? I've just now created a test directory and changed its modify time
in Far Manager on Windows 2000, on a FAT32 partition. In fact, it allows
changing all three directory times (creation and access, too). So I
conclude that FAT supports it.

-- 
WBR, Vadim Goncharov. ICQ#166852181       mailto:vadim_nuclight@mail.ru
[Anti-Greenpeace][Sober FreeBSD zealot][http://nuclight.livejournal.com]


From owner-freebsd-fs@FreeBSD.ORG  Fri Aug 19 22:58:56 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 647691065670
	for <freebsd-fs@freebsd.org>; Fri, 19 Aug 2011 22:58:56 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 24A858FC12
	for <freebsd-fs@freebsd.org>; Fri, 19 Aug 2011 22:58:56 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqAEAGbqTk6DaFvO/2dsb2JhbABBhEukOoFAAQEBAQMBAQEgBCcgCxsYAgINFgMCKQEJFQMBDQYIBwQBHASHVKc2kSmBLIQMgRAEkQWCDpER
X-IronPort-AV: E=Sophos;i="4.68,253,1312171200"; d="scan'208";a="131656528"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 19 Aug 2011 18:58:55 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 5BC15B3F9F;
	Fri, 19 Aug 2011 18:58:55 -0400 (EDT)
Date: Fri, 19 Aug 2011 18:58:55 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: vadim nuclight <vadim_nuclight@mail.ru>
Message-ID: <1303085986.99226.1313794735324.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <slrnj4tnr9.2di5.vadim_nuclight@kernblitz.nuclight.avtf.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.203]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: touch(1) not working on directories in an msdosfs(5) envirement
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Aug 2011 22:58:56 -0000

Vadim Goncharov wrote:
> Hi Rick Macklem!
> 
> On Fri, 19 Aug 2011 15:40:31 -0400 (EDT); Rick Macklem wrote about
> 'Re: touch(1) not working on directories in an msdosfs(5) envirement':
> 
> > Yes, FAT file systems do not maintain a directory modify
> > time. (The original FAT12,16 structure didn't even have a
> > modify time for the root dir.)
> 
> > Just like Windows.
> 
> > This causes issues when a FAT fs is exported via NFS and
> > someone was going to experiment with an "in memory only"
> > modify time for dirs, to minimize caching issues, but I
> > haven't heard back from them lately.
> 
> > Apparently Mac OS X chooses to update the modify time that
> > exists on FAT32 file systems, but that isn't Windows compatible.
> 
> What? I've just now created a test directory and changed its modify
> time in Far Manager on Windows 2000, in a FAT32 partition. In fact it
> allows to change all three directory times, creation and access, too.
> So, I conclude, the FAT supports it.
> 
Well, FAT32 (not the root dir of FAT12 or FAT16) does have a modify
time stored on disk for the directory entry for a directory.

The case I was thinking of (because that was what affected NFS client
caching) was the case where an entry is added to a directory. I just
checked that and it does not change the directory's modify time when
an entry is added to a directory (at least for Windows7 personal...).

I'm not enough of a Windows guy to even know what "Far Manager" is,
but I'm not surprised that there is a tool that can change it.

msdosfs_setattr() in sys/fs/msdosfs/msdosfs_vnops.c definitely only
does it for non-directories:
		if (vp->v_type != VDIR) {
			if ((pmp->pm_flags & MSDOSFSMNT_NOWIN95) == 0 &&
			    vap->va_atime.tv_sec != VNOVAL) {
				dep->de_flag &= ~DE_ACCESS;
				timespec2fattime(&vap->va_atime, 0,
				    &dep->de_ADate, NULL, NULL);
			}
			if (vap->va_mtime.tv_sec != VNOVAL) {
				dep->de_flag &= ~DE_UPDATE;
				timespec2fattime(&vap->va_mtime, 0,
				    &dep->de_MDate, &dep->de_MTime, NULL);
			}
			dep->de_Attributes |= ATTR_ARCHIVE;
			dep->de_flag |= DE_MODIFIED;
		}

I'm not the author of the above, but I had assumed that it was
because Windows doesn't normally update it. Obviously, the above
code could easily be changed (although I haven't tested that), if
that is now considered correct behaviour. (It might have been
because the msdosfs is meant to work for all FAT variants.)
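
For reference, the de_MDate and de_MTime fields filled in by the
snippet above are 16-bit packed values in the standard FAT directory
entry layout. A minimal sketch of that packing, assuming the caller
already has broken-down local time; x_fat_pack and struct x_fat_stamp
are made-up names, and this does not reproduce what
timespec2fattime() does about time zones or sub-2-second resolution.

```c
#include <stdint.h>

struct x_fat_stamp {
	uint16_t date;	/* bits 15-9 year-1980, 8-5 month, 4-0 day */
	uint16_t time;	/* bits 15-11 hour, 10-5 minute, 4-0 seconds/2 */
};

/*
 * Pack a broken-down local time into FAT date and time words.
 * Expected ranges: year 1980..2107, mon 1..12, day 1..31,
 * hour 0..23, min 0..59, sec 0..59.  No range checking is done.
 */
static struct x_fat_stamp
x_fat_pack(int year, int mon, int day, int hour, int min, int sec)
{
	struct x_fat_stamp s;

	s.date = (uint16_t)(((year - 1980) << 9) | (mon << 5) | day);
	s.time = (uint16_t)((hour << 11) | (min << 5) | (sec / 2));
	return (s);
}
```

The 5-bit seconds field is why FAT modification times only have
2-second resolution.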

rick
> --
> WBR, Vadim Goncharov. ICQ#166852181 mailto:vadim_nuclight@mail.ru
> [Anti-Greenpeace][Sober FreeBSD zealot][http://nuclight.livejournal.com]

From owner-freebsd-fs@FreeBSD.ORG  Sat Aug 20 04:11:05 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 621E7106564A;
	Sat, 20 Aug 2011 04:11:05 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 3A5AE8FC13;
	Sat, 20 Aug 2011 04:11:05 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p7K4B5Vd041462;
	Sat, 20 Aug 2011 04:11:05 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p7K4B5Ae041453;
	Sat, 20 Aug 2011 04:11:05 GMT (envelope-from linimon)
Date: Sat, 20 Aug 2011 04:11:05 GMT
Message-Id: <201108200411.p7K4B5Ae041453@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-amd64@FreeBSD.org, freebsd-fs@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/159930: [ufs] [panic] kernel core
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Aug 2011 04:11:05 -0000

Old Synopsis: kernel core
New Synopsis: [ufs] [panic] kernel core

Responsible-Changed-From-To: freebsd-amd64->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Sat Aug 20 04:10:35 UTC 2011
Responsible-Changed-Why: 
attempt to reclassify.

http://www.freebsd.org/cgi/query-pr.cgi?pr=159930

From owner-freebsd-fs@FreeBSD.ORG  Sat Aug 20 07:00:24 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2895D106566B
	for <freebsd-fs@hub.freebsd.org>; Sat, 20 Aug 2011 07:00:24 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 182208FC08
	for <freebsd-fs@hub.freebsd.org>; Sat, 20 Aug 2011 07:00:24 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p7K70Nqb000173
	for <freebsd-fs@freefall.freebsd.org>; Sat, 20 Aug 2011 07:00:23 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p7K70NDo000172;
	Sat, 20 Aug 2011 07:00:23 GMT (envelope-from gnats)
Date: Sat, 20 Aug 2011 07:00:23 GMT
Message-Id: <201108200700.p7K70NDo000172@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Sergey Kandaurov <pluknet@gmail.com>
Cc: 
Subject: Re: kern/159930: [ufs] [panic] kernel core
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Sergey Kandaurov <pluknet@gmail.com>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Aug 2011 07:00:24 -0000

The following reply was made to PR kern/159930; it has been noted by GNATS.

From: Sergey Kandaurov <pluknet@gmail.com>
To: bug-followup@FreeBSD.org, nospam@ofloo.net
Cc:  
Subject: Re: kern/159930: [ufs] [panic] kernel core
Date: Sat, 20 Aug 2011 10:58:29 +0400

 Do you use "options QUOTA" ?
 How often do you experience this crash?
 Can you show the exact way to reproduce it?
 Can you check if the following patch helps you?
 Thanks.
 
 --- sys/ufs/ffs/ffs_inode.c      2010-06-14 06:09:06.000000000 +0400
 +++ sys/ufs/ffs/ffs_inode.c 2010-12-09 15:25:28.000000000 +0300
 @@ -215,7 +215,7 @@
                         osize = ip->i_din2->di_extsize;
                         ip->i_din2->di_blocks -= extblocks;
  #ifdef QUOTA
 -                       (void) chkdq(ip, -extblocks, NOCRED, 0);
 +                       (void) chkdq(ip, -extblocks, NOCRED, FORCE);
  #endif
                         vinvalbuf(vp, V_ALT, 0, 0);
                         ffs_pages_remove(vp,
 @@ -290,7 +290,7 @@
                         UFS_UNLOCK(ump);
                 } else {
  #ifdef QUOTA
 -                       (void) chkdq(ip, -datablocks, NOCRED, 0);
 +                       (void) chkdq(ip, -datablocks, NOCRED, FORCE);
  #endif
                         softdep_setup_freeblocks(ip, length, needextclean ?
                             IO_EXT | IO_NORMAL : IO_NORMAL);
 @@ -526,7 +526,7 @@
                 DIP_SET(ip, i_blocks, 0);
         ip->i_flag |= IN_CHANGE;
  #ifdef QUOTA
 -       (void) chkdq(ip, -blocksreleased, NOCRED, 0);
 +       (void) chkdq(ip, -blocksreleased, NOCRED, FORCE);
  #endif
         return (allerror);
  }
 
 -- 
 wbr,
 pluknet

From owner-freebsd-fs@FreeBSD.ORG  Sat Aug 20 07:14:03 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E5F6F1065674
	for <freebsd-fs@freebsd.org>; Sat, 20 Aug 2011 07:14:02 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au
	[211.29.132.185])
	by mx1.freebsd.org (Postfix) with ESMTP id 81AF98FC13
	for <freebsd-fs@freebsd.org>; Sat, 20 Aug 2011 07:14:01 +0000 (UTC)
Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au
	(c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191])
	by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	p7K7DxtU022662
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 20 Aug 2011 17:13:59 +1000
Date: Sat, 20 Aug 2011 17:13:59 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Rick Macklem <rmacklem@uoguelph.ca>
In-Reply-To: <1303085986.99226.1313794735324.JavaMail.root@erie.cs.uoguelph.ca>
Message-ID: <20110820164559.Q872@besplex.bde.org>
References: <1303085986.99226.1313794735324.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: vadim nuclight <vadim_nuclight@mail.ru>, freebsd-fs@freebsd.org
Subject: Re: touch(1) not working on directories in an msdosfs(5) envirement
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Aug 2011 07:14:03 -0000

On Fri, 19 Aug 2011, Rick Macklem wrote:

> Vadim Goncharov wrote:
>> On Fri, 19 Aug 2011 15:40:31 -0400 (EDT); Rick Macklem wrote about
>> ...
>>> Apparently Mac OS X chooses to update the modify time that
>>> exists on FAT32 file systems, but that isn't Windows compatible.
>>
>> What? I've just now created a test directory and changed its modify
>> time in Far Manager on Windows 2000, in a FAT32 partition. In fact it
>> allows to change all three directory times, creation and access, too.
>> So, I conclude, the FAT supports it.
>>
> Well, FAT32 (not the root dir of FAT12 or FAT16) does have a modify
> time stored on disk for the directory entry for a directory.

In a previous reply, I might have misremembered the limitations of
old FAT on directories.  Now ISTR something before (?) FAT12 which only
had the root directory.

> The case I was thinking of (because that was what affected NFS client
> caching) was the case where an entry is added to a directory. I just
> checked that and it does not change the directory's modify time when
> an entry is added to a directory (at least for Windows7 personal...).

This is the intentional part of msdosfs's compatibility.

> I'm not enough of a Windows guy to even know what "Far Manager" is,
> but I'm not surprised that there is a tool that can change it.

Me neither, but I know cygwin can do most things (but it is so slow that
it is faster to reboot to run FreeBSD utilities to do anything involving
more than a few hundred files -- even a simple find -name takes 10-100
times longer in cygwin).

> msdosfs_setattr() in sys/fs/msdosfs/msdosfs_vnops.c definitely only
> does it for non-directories:
> 		if (vp->v_type != VDIR) {
> 			if ((pmp->pm_flags & MSDOSFSMNT_NOWIN95) == 0 &&
> 			    vap->va_atime.tv_sec != VNOVAL) {
> 				dep->de_flag &= ~DE_ACCESS;
> 				timespec2fattime(&vap->va_atime, 0,
> 				    &dep->de_ADate, NULL, NULL);
> 			}
> 			if (vap->va_mtime.tv_sec != VNOVAL) {
> 				dep->de_flag &= ~DE_UPDATE;
> 				timespec2fattime(&vap->va_mtime, 0,
> 				    &dep->de_MDate, &dep->de_MTime, NULL);
> 			}
> 			dep->de_Attributes |= ATTR_ARCHIVE;
> 			dep->de_flag |= DE_MODIFIED;
> 		}

Yes, the special case for directories is just a bug (except for ATTR_ARCHIVE).

> I'm not the author of the above, but I had assumed that it was
> because Windows doesn't normally update it. Obviously, the above
> code could easily be changed (although I haven't tested that), if
> that is now considered correct behaviour. (It might have been
> because the msdosfs is meant to work for all FAT variants.)

But this is msdosfs_setattr(), whose purpose is to update it.  The
non-update for directory changes seems to be only here in detrunc():

% 	/*
% 	 * Write out the updated directory entry.  Even if the update fails
% 	 * we free the trailing clusters.
% 	 */
% 	dep->de_FileSize = length;
% 	if (!isadir)
% 		dep->de_flag |= DE_UPDATE|DE_MODIFIED;
% 	allerror = vtruncbuf(DETOV(dep), cred, td, length, pmp->pm_bpcluster);

I don't quite understand how extensions or changes in place either set or
avoid setting DE_MODIFIED -- grep didn't seem to show any relevant settings
-- maybe all cases go through detrunc().
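
One reading of the above is that an explicit time change should apply
to directories too, with only the ATTR_ARCHIVE update skipped for
them. Below is a user-space model of that reading, not a tested
kernel patch: struct x_denode is a cut-down stand-in for the real
denode, the X_ flag values are illustrative, and the time is kept as
plain seconds instead of FAT date/time words.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; the real definitions are in the msdosfs headers. */
#define X_ATTR_ARCHIVE	0x20u
#define X_DE_UPDATE	0x01u
#define X_DE_MODIFIED	0x08u

/* Cut-down, hypothetical stand-in for struct denode. */
struct x_denode {
	uint8_t		attributes;
	unsigned	flag;
	long		mtime;	/* seconds; the kernel stores FAT date/time */
};

static void
x_set_mtime(struct x_denode *dep, bool isadir, long mtime)
{
	/* An explicit time replaces any pending "stamp with now" request. */
	dep->flag &= ~X_DE_UPDATE;
	dep->mtime = mtime;

	/* Directories keep their archive bit untouched. */
	if (!isadir)
		dep->attributes |= X_ATTR_ARCHIVE;

	/* Either way the entry has to be written back. */
	dep->flag |= X_DE_MODIFIED;
}
```

Whether the implicit update on directory changes (the detrunc() case)
should stay Windows compatible is a separate question that this model
does not touch.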

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Sat Aug 20 07:50:11 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5E646106566B
	for <freebsd-fs@hub.freebsd.org>; Sat, 20 Aug 2011 07:50:11 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 4E7048FC12
	for <freebsd-fs@hub.freebsd.org>; Sat, 20 Aug 2011 07:50:11 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p7K7oBJL069225
	for <freebsd-fs@freefall.freebsd.org>; Sat, 20 Aug 2011 07:50:11 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p7K7oBKE069224;
	Sat, 20 Aug 2011 07:50:11 GMT (envelope-from gnats)
Date: Sat, 20 Aug 2011 07:50:11 GMT
Message-Id: <201108200750.p7K7oBKE069224@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: dfilter@FreeBSD.ORG (dfilter service)
Cc: 
Subject: Re: kern/157728: commit references a PR
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: dfilter service <dfilter@FreeBSD.ORG>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Aug 2011 07:50:11 -0000

The following reply was made to PR kern/157728; it has been noted by GNATS.

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/157728: commit references a PR
Date: Sat, 20 Aug 2011 07:43:25 +0000 (UTC)

 Author: mm
 Date: Sat Aug 20 07:43:10 2011
 New Revision: 225022
 URL: http://svn.freebsd.org/changeset/base/225022
 
 Log:
   MFC r224814, r224855:
   
   MFC r224814 [1]:
   Fix race between dmu_objset_prefetch() invoked from
   zfs_ioc_dataset_list_next() and dsl_dir_destroy_check() indirectly
   invoked from dmu_recv_existing_end() via dsl_dataset_destroy() by not
   prefetching temporary clones, as these count as always inconsistent.
   In addition, do not prefetch hidden datasets at all as we are not
   going to process these later.
   
   Filed as Illumos Bug #1346
   
   MFC r224855:
   zfs_ioctl.c: improve code readability in zfs_ioc_dataset_list_next()
   
   zvol.c: fix calling of dmu_objset_prefetch() in zvol_create_minors()
   by passing full instead of relative dataset name and prefetching all
   visible datasets to be processed later instead of just the pool name
   
   PR:		kern/157728 [1]
   Tested by:	Borja Marcos <borjam@sarenet.es> [1], mm
   Reviewed by:	pjd
 
 Modified:
   stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
   stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
 
 Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
 ==============================================================================
 --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	Sat Aug 20 06:08:31 2011	(r225021)
 +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	Sat Aug 20 07:43:10 2011	(r225022)
 @@ -1963,8 +1963,10 @@ top:
  		uint64_t cookie = 0;
  		int len = sizeof (zc->zc_name) - (p - zc->zc_name);
  
 -		while (dmu_dir_list_next(os, len, p, NULL, &cookie) == 0)
 -			(void) dmu_objset_prefetch(zc->zc_name, NULL);
 +		while (dmu_dir_list_next(os, len, p, NULL, &cookie) == 0) {
 +			if (!dataset_name_hidden(zc->zc_name))
 +				(void) dmu_objset_prefetch(zc->zc_name, NULL);
 +		}
  	}
  
  	do {
 
 Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 ==============================================================================
 --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Sat Aug 20 06:08:31 2011	(r225021)
 +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Sat Aug 20 07:43:10 2011	(r225022)
 @@ -2200,11 +2200,11 @@ zvol_create_minors(const char *name)
  	p = osname + strlen(osname);
  	len = MAXPATHLEN - (p - osname);
  
 -	if (strchr(name, '/') == NULL) {
 -		/* Prefetch only for pool name. */
 -		cookie = 0;
 -		while (dmu_dir_list_next(os, len, p, NULL, &cookie) == 0)
 -			(void) dmu_objset_prefetch(p, NULL);
 +	/* Prefetch the datasets. */
 +	cookie = 0;
 +	while (dmu_dir_list_next(os, len, p, NULL, &cookie) == 0) {
 +		if (!dataset_name_hidden(osname))
 +			(void) dmu_objset_prefetch(osname, NULL);
  	}
  
  	cookie = 0;
 

From owner-freebsd-fs@FreeBSD.ORG  Sat Aug 20 08:59:08 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AED9D106564A
	for <freebsd-fs@freebsd.org>; Sat, 20 Aug 2011 08:59:08 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from fallbackmx09.syd.optusnet.com.au
	(fallbackmx09.syd.optusnet.com.au [211.29.132.242])
	by mx1.freebsd.org (Postfix) with ESMTP id 3054B8FC0A
	for <freebsd-fs@freebsd.org>; Sat, 20 Aug 2011 08:59:07 +0000 (UTC)
Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au
	[211.29.132.188])
	by fallbackmx09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	p7K6j4Y0016467
	for <freebsd-fs@freebsd.org>; Sat, 20 Aug 2011 16:45:05 +1000
Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au
	(c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191])
	by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	p7K6ix48003992
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 20 Aug 2011 16:45:00 +1000
Date: Sat, 20 Aug 2011 16:44:59 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Rick Macklem <rmacklem@uoguelph.ca>
In-Reply-To: <1092971110.92110.1313782831745.JavaMail.root@erie.cs.uoguelph.ca>
Message-ID: <20110820145112.Y872@besplex.bde.org>
References: <1092971110.92110.1313782831745.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@freebsd.org, Alexander Best <arundel@freebsd.org>
Subject: Re: touch(1) not working on directories in an msdosfs(5) envirement
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Aug 2011 08:59:08 -0000

On Fri, 19 Aug 2011, Rick Macklem wrote:

> Alexander Best wrote:
>> can somebody confirm this issue? is it already known?
>>
>> otaku% ll|grep HELL
>> drwxr-xr-x 1 arundel arundel 16384 19 Aug 19:57 HELLO
>> -rw-r--r-- 1 arundel arundel 0 19 Aug 20:13 HELLO2
>> otaku% touch HELLO*
>> otaku% ll|grep HELL
>> drwxr-xr-x 1 arundel arundel 16384 19 Aug 19:57 HELLO
>> -rw-r--r-- 1 arundel arundel 0 19 Aug 20:55 HELLO2

This is fixed (hacked around to keep the diffs small) in my version:

% Index: msdosfs_vnops.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v
% retrieving revision 1.147
% diff -u -2 -r1.147 msdosfs_vnops.c
% --- msdosfs_vnops.c	4 Feb 2004 21:52:53 -0000	1.147
% +++ msdosfs_vnops.c	12 Nov 2007 21:47:48 -0000
% @@ -457,5 +457,7 @@
%  		    (error = VOP_ACCESS(ap->a_vp, VWRITE, cred, ap->a_td))))
%  			return (error);
% +#if 0
%  		if (vp->v_type != VDIR) {
% +#endif
%  			if ((pmp->pm_flags & MSDOSFSMNT_NOWIN95) == 0 &&
%  			    vap->va_atime.tv_sec != VNOVAL) {

The main part of the fix is to just remove the special case for directories
which just breaks utimes() on directories.  Even DOS and Windows never
had this brokenness.  What DOS and Windows do specially for directories
is not update their modification time when their contents are changed.
FreeBSD is compatible with this, and the above special case is apparently
the result of trying too hard to be compatible with DOS and Windows.

% @@ -463,4 +465,7 @@
%  				unix2dostime(&vap->va_atime, &dep->de_ADate,
%  				    NULL, NULL);
% +				if (vp->v_type != VDIR)
% +					dep->de_Attributes |= ATTR_ARCHIVE;
% +				dep->de_flag |= DE_MODIFIED;
%  			}
%  			if (vap->va_mtime.tv_sec != VNOVAL) {

Now that setting times on directories is unbroken, we have to be more
careful with the archive bit.  In DOS and Windows, the archive bit is
meaningless for FAT* directories and is not set by any syscall that I
know of (unlike for ffs IIRC).  The above avoids setting it for directories.
Not setting DE_MODIFIED is an unrelated micro-optimization (try harder not
to set it when we didn't change anything).

% @@ -468,8 +473,11 @@
%  				unix2dostime(&vap->va_mtime, &dep->de_MDate,
%  				    &dep->de_MTime, NULL);
% +				if (vp->v_type != VDIR)
% +					dep->de_Attributes |= ATTR_ARCHIVE;
% +				dep->de_flag |= DE_MODIFIED;

Similarly for the mtime.

%  			}
% -			dep->de_Attributes |= ATTR_ARCHIVE;
% -			dep->de_flag |= DE_MODIFIED;

This was moved earlier.

% +#if 0
%  		}
% +#endif

Finish hacking away the special case for directories.

%  	}
%  	/*
% @@ -494,5 +502,5 @@
%  		}
%  	}
% -	return (deupdat(dep, 1));
% +	return (deupdat(dep, 0));

Remove an unrelated pessimization (a synchronous update where even an
asynchronous update is more than what is needed).  Even ffs in Net/2
didn't have the full pessimization here -- it pessimized SETATTR on
times but not on the more important ownership and permission attributes.
ffs still had the pessimization for times in 4.4BSD-Lite2, but FreeBSD
fixed it in 1998 (ufs_vnops.c 1.79; the fix was buried in a mega-commit
with a content-free log message :-(), and I fixed it in my version of
ffs long before that.  msdosfs_setattr() is a little different from
ufs_setattr(); in particular, it does the (previously synchronous) update
for all successful calls while ufs_setattr() only ever did it for times,
so the pessimization has a wider scope in msdosfs.

%  }
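
Taken together, the hunks of the diff amount to roughly the following logic.
This is a minimal userland sketch, not the kernel code: the struct, the
MODEL_* names and the DE_MODIFIED value are stand-ins assumed for
illustration, and the real function works on struct vattr and struct denode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; names and values are assumptions of this sketch. */
#define MODEL_ATTR_ARCHIVE 0x20   /* FAT archive attribute bit */
#define MODEL_DE_MODIFIED  0x08   /* "entry needs writing" flag, value illustrative */
#define MODEL_NOVAL        (-1L)  /* plays the role of VNOVAL */

struct model_denode {
	bool     is_dir;
	uint8_t  attributes;
	unsigned flag;
	long     atime_sec;
	long     mtime_sec;
};

/*
 * Apply requested times.  Directories are no longer skipped, the archive
 * bit is set only for non-directories, and the modified flag is set only
 * when a time was actually changed.
 */
void
model_set_times(struct model_denode *dep, long atime_sec, long mtime_sec,
    bool atime_supported)
{
	if (atime_supported && atime_sec != MODEL_NOVAL) {
		dep->atime_sec = atime_sec;
		if (!dep->is_dir)
			dep->attributes |= MODEL_ATTR_ARCHIVE;
		dep->flag |= MODEL_DE_MODIFIED;
	}
	if (mtime_sec != MODEL_NOVAL) {
		dep->mtime_sec = mtime_sec;
		if (!dep->is_dir)
			dep->attributes |= MODEL_ATTR_ARCHIVE;
		dep->flag |= MODEL_DE_MODIFIED;
	}
}
```

In this model, setting only the mtime on a directory changes the mtime and
marks the entry modified but leaves the archive bit clear; a request that
changes nothing leaves the flag clear too.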

The directory special case is only the least serious of the bugs in
msdosfs_setattr() :-(.
With the above fix, plain touch works as well as possible -- it cannot
work perfectly since setting of atimes is not always supported.  But
touch -r and more importantly, cp -p only work as well as possible for
root, since they use utimes() without the null timeptr arg that allows
plain touch to work.  A non-null timeptr arg ends up normally requiring
root permissions for msdosfs where it normally doesn't require extra
permissions for ffs, because ownership requirements for the non-null case
cannot be satisfied by file systems that don't really support ownerships.
We fudge the ownerships and use weak checks on them in most places, but
for utimes() we use strict checks that almost always fail: from my old
version:

% 	if (vap->va_flags != VNOVAL) {
% 		if (vp->v_mount->mnt_flag & MNT_RDONLY)
% 			return (EROFS);
% 		if (cred->cr_uid != pmp->pm_uid &&
% 		    (error = suser_cred(cred, PRISON_ROOT)))
% 			return (error);

The implementation of this check has changed significantly in -current,
but its semantics and result haven't.  The file must be owned by
someone (pmp->pm_uid), and no one else except root has permission for
utimes() with a non-null timeptr.  This works right in ffs because the
file can normally be owned by its rightful owner, but in msdosfs the
owner is faked and there can be only one owner for the whole file
system.  I use owner root and group msdosfs for all msdosfs file
systems.  The group permissions allow non-root users in group msdosfs
to do almost everything except this unimportant utimes() operation
to almost all msdosfs files.
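
The rule described above can be sketched as a small predicate.  This is a
simplified model, assuming that "root" means uid 0 (the kernel uses a
privilege check instead) and that write access has already been evaluated;
the function and type names are hypothetical.

```c
#include <stdbool.h>

typedef unsigned int model_uid;   /* stand-in for uid_t in this sketch */

/*
 * May the caller set file times?
 *  - root and the (faked) per-mount owner always may;
 *  - anyone else may only when utimes() was called with a null timeptr
 *    ("set to now") and they have write access to the file.
 */
bool
model_utimes_allowed(model_uid caller, model_uid mount_owner,
    bool null_timeptr, bool has_write_access)
{
	if (caller == 0 || caller == mount_owner)
		return (true);
	return (null_timeptr && has_write_access);
}
```

With owner root and group msdosfs as described, a non-root group member
passes only in the null-timeptr case, which is why plain touch works but
touch -r and cp -p do not.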

% 		/*
% 		 * We are very inconsistent about handling unsupported
% 		 * attributes.  We ignored the access time and the
% 		 * read and execute bits.  We were strict for the other
% 		 * attributes.
% 		 *
% 		 * Here we are strict, stricter than ufs in not allowing
% 		 * users to attempt to set SF_SETTABLE bits or anyone to
% 		 * set unsupported bits.  However, we ignore attempts to
% 		 * set ATTR_ARCHIVE for directories `cp -pr' from a more
% 		 * sensible filesystem attempts it a lot.
% 		 */

This comment is partly about the problem as it affects non-time attributes.
There is a problem with cp -p for almost all attributes, since msdosfs
doesn't really support them so cp -p from another file system that supports
more of them must either fail, or the failures must be silently or
unsilently ignored.  cp -p unsilently ignores EPERM errors for *chown(),
but doesn't ignore any (?) other error (exit status != 0).  This gives the
rather silly handling for typical errors for cp -p to msdosfs: chown()
usually fails at the syscall level but succeeds with a warning at the
shell level, but the less important utimes() usually fails at both levels.

There is a related problem with file time granularity.  It is the usual
case that the file system has a different granularity than the system
and other file systems.  When the target has coarser granularity than the
source, it is usually impossible to duplicate the times.  Having utimes()
fail when the times cannot be duplicated would be too strict in general,
but sometimes you would like to be strict.  POSIX has only recently
started addressing this problem.  (Old?) FAT with its 2-second granularity
has always been unable to represent the default 1-second granularity, and
has always handled this by silently truncating to the granularity that it
supports.
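
The truncation follows from the on-disk encoding: the 16-bit FAT time word
has only 5 bits for seconds, which therefore count 2-second units.  A
minimal sketch of the packing (the function names are hypothetical; the bit
layout is the standard FAT one):

```c
#include <stdint.h>

/* Pack a wall-clock time into a FAT time word: hhhhh mmmmmm sssss(/2). */
uint16_t
fat_pack_time(unsigned hour, unsigned minute, unsigned second)
{
	return ((uint16_t)((hour << 11) | (minute << 5) | (second / 2)));
}

/* Pack a date into a FAT date word: years since 1980, month, day. */
uint16_t
fat_pack_date(unsigned year, unsigned month, unsigned day)
{
	return ((uint16_t)(((year - 1980) << 9) | (month << 5) | day));
}

/* Recover the seconds field; odd seconds were lost when packing. */
unsigned
fat_unpack_seconds(uint16_t fat_time)
{
	return ((fat_time & 0x1fu) * 2);
}
```

Packing 16:14:29 and unpacking the seconds gives 28, so an odd second is
silently rounded down to the even second below it.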

My test directory for testing this mail shows another granularity problem:
(the file system is FAT32 with Win95 long names): after mkdir of it, a
stat utility on it gives:

% file=z
% ...
% atime=Sat Aug 20 00:00:00 2011 (1313762400.0)
% ctime=Sat Aug 20 16:14:29 2011 (1313820869.740000000)
% mtime=Sat Aug 20 16:14:28 2011 (1313820868.0)

This has the expected 2-second granularity for the mtime, but the other
times are strange:
- the atime is far in the past, and according to other tests has a
   granularity of at least 200 seconds
- the ctime has a granularity of 100 msec.  This differs significantly
   from the mtime's granularity, so the ctime is up to 1.99 seconds in
   advance of the mtime.  This is probably a local bug -- I probably
   don't have the fix for confusion between the ctime and the creation
   time (birthtime).  msdosfs only has a creation time so the ctime must
   be faked and should usually be the same as the mtime.  But how does
   the creation time have more precision?
In other tests, creat() of a file sets the mtime and ctime reasonably,
but the atime remains with a fixed value far in the past.  touch
advances the mtime correctly, but doesn't update the ctime.  This is
consistent with displayed ctime actually being the creation time.
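
One possible answer to the precision question, stated as an assumption
rather than something verified on this system: the directory entry layout
used with Win95 long names is usually documented as having an extra byte
next to the creation time that counts 10 ms units in the range 0-199, on
top of the 2-second time word.  A sketch of the decoding under that
assumption (names are hypothetical):

```c
struct model_timespec {
	long sec;
	long nsec;
};

/*
 * Combine a creation time already decoded to an even second with the
 * fine-resolution byte, assumed here to count hundredths of a second
 * (0-199).
 */
struct model_timespec
fat_creation_time(long even_sec, unsigned hundredths)
{
	struct model_timespec ts;

	ts.sec = even_sec + (long)(hundredths / 100);
	ts.nsec = (long)(hundredths % 100) * 10000000L;
	return (ts);
}
```

Under this assumption, a base of 1313820868 with a byte value of 174
decodes to 1313820869.740000000, which matches the ctime shown and would
explain why it can run up to 1.99 seconds ahead of the mtime.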

> Yes, FAT file systems do not maintain a directory modify
> time.

Er, yes they do...

> (The original FAT12,16 structure didn't even have a
> modify time for the root dir.)

... except for the root directory, they don't, and this doesn't depend on
the version -- there is no directory entry for the root directory, so
the modification time can't be stored in the usual place, for root
directories only.

> Just like Windows.

I normally use cygwin for managing file times on Windows, and touch and
cp -p work reasonably well with it.  In particular, touch updates directory
times.  15-25 years ago, I used DOS utilities to manage file times and
don't remember any problems with touch.  Even DOS 2.1 (?) has a syscall
like utimes().

> This causes issues when a FAT fs is exported via NFS and
> someone was going to experiment with an "in memory only"
> modify time for dirs, to minimize caching issues, but I
> haven't heard back from them lately.

"Memory only" times must never escape to userland.  Linux has (had?)
file times in vfs, which makes many things easy, but old versions of
Linux did let them escape to userland.  I ran fussy POSIX conformance
tests for times.  The tests would succeed for non-POSIX file systems,
but only due to the times being in memory, and then only while the
files were cached in memory.

> Apparently Mac OS X chooses to update the modify time that
> exists on FAT32 file systems, but that isn't Windows compatible.

Yes, it's a bug in Mac OS to be incompatible.  However, I sometimes
wish for a mount option to control this.  Similarly for weakening
of checking for attributes that cannot be set.  Also, for file
times, there is another annoying problem which might be best handled
by a mount option: msdosfs file times change twice a year with daylight
saving.  I sometimes back up msdosfs files to ffs where their times
don't change like this, and would like an easy way to stop the changes.
Moving across timezones might cause even more frequent changes, but this
doesn't affect me.
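
The twice-yearly shift can be illustrated with a small sketch.  It assumes
that FAT stores local wall-clock time and that the conversion to UTC uses
whatever offset is in effect when the file is looked at, not the offset
that applied when the stamp was written; the function name is hypothetical.

```c
/*
 * Convert a stored FAT local time (seconds) to UTC using the offset in
 * effect at the time of the conversion.
 */
long
fat_local_to_utc(long stored_local_sec, long utc_offset_sec)
{
	return (stored_local_sec - utc_offset_sec);
}
```

The same stored value converted with a +10:00 offset and later with a
+11:00 daylight saving offset yields UTC times one hour apart, although
nothing on disk changed.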

Bruce

From owner-freebsd-fs@FreeBSD.ORG  Sat Aug 20 09:42:54 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id 3600E1065674; Sat, 20 Aug 2011 09:42:54 +0000 (UTC)
Date: Sat, 20 Aug 2011 09:42:54 +0000
From: Alexander Best <arundel@freebsd.org>
To: Bruce Evans <brde@optusnet.com.au>
Message-ID: <20110820094254.GA66130@freebsd.org>
References: <1092971110.92110.1313782831745.JavaMail.root@erie.cs.uoguelph.ca>
	<20110820145112.Y872@besplex.bde.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110820145112.Y872@besplex.bde.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: touch(1) not working on directories in an msdosfs(5) envirement
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Aug 2011 09:42:54 -0000

On Sat Aug 20 11, Bruce Evans wrote:
> On Fri, 19 Aug 2011, Rick Macklem wrote:
> 
> >Alexander Best wrote:
> >>can somebody confirm this issue? is it already known?
> >>
> >>otaku% ll|grep HELL
> >>drwxr-xr-x 1 arundel arundel 16384 19 Aug 19:57 HELLO
> >>-rw-r--r-- 1 arundel arundel 0 19 Aug 20:13 HELLO2
> >>otaku% touch HELLO*
> >>otaku% ll|grep HELL
> >>drwxr-xr-x 1 arundel arundel 16384 19 Aug 19:57 HELLO
> >>-rw-r--r-- 1 arundel arundel 0 19 Aug 20:55 HELLO2
> 
> This is fixed (hacked around to keep the diffs small) in my version:

WOW! such a lot of detailed info on the subject and useful lines of code. you
should really get back to being an active committer, so we all get the
benefit of your code. ;)

> 
> % Index: msdosfs_vnops.c
> % ===================================================================
> % RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v
> % retrieving revision 1.147
> % diff -u -2 -r1.147 msdosfs_vnops.c
> % --- msdosfs_vnops.c	4 Feb 2004 21:52:53 -0000	1.147
> % +++ msdosfs_vnops.c	12 Nov 2007 21:47:48 -0000
> % @@ -457,5 +457,7 @@
> %  		    (error = VOP_ACCESS(ap->a_vp, VWRITE, cred, ap->a_td))))
> %  			return (error);
> % +#if 0
> %  		if (vp->v_type != VDIR) {
> % +#endif
> %  			if ((pmp->pm_flags & MSDOSFSMNT_NOWIN95) == 0 &&
> %  			    vap->va_atime.tv_sec != VNOVAL) {
> 
> The main part of the fix is to just remove the special case for directories
> which just breaks utimes() on directories.  Even DOS and Windows never
> had this brokenness.  What DOS and Windows do specially for directories
> is not update their modification time when their contents is changed.
> FreeBSD is compatible with this, and the above special case is apparently
> the result of trying too hard to be compatible with DOS and Windows.
> 
> % @@ -463,4 +465,7 @@
> %  				unix2dostime(&vap->va_atime, &dep->de_ADate,
> %  				    NULL, NULL);
> % +				if (vp->v_type != VDIR)
> % +					dep->de_Attributes |= ATTR_ARCHIVE;
> % +				dep->de_flag |= DE_MODIFIED;
> %  			}
> %  			if (vap->va_mtime.tv_sec != VNOVAL) {
> 
> Now that setting times on directories is unbroken, we have to be more
> careful with the archive bit.  In DOS and Windows, the archive bit is
> meaningless for FAT* directories and is not set by any syscall that I
> know of (unlike for ffs IIRC).  The above avoids setting it for directories.
> Not setting DE_MODIFIED is an unrelated micro-optimization (try harder not
> to set it when we didn't change anything).
> 
> % @@ -468,8 +473,11 @@
> %  				unix2dostime(&vap->va_mtime, &dep->de_MDate,
> %  				    &dep->de_MTime, NULL);
> % +				if (vp->v_type != VDIR)
> % +					dep->de_Attributes |= ATTR_ARCHIVE;
> % +				dep->de_flag |= DE_MODIFIED;
> 
> Similarly for the mtime.
> 
> %  			}
> % -			dep->de_Attributes |= ATTR_ARCHIVE;
> % -			dep->de_flag |= DE_MODIFIED;
> 
> This was moved earlier.
> 
> % +#if 0
> %  		}
> % +#endif
> 
> Finish hacking away the special case for directories.
> 
> %  	}
> %  	/*
> % @@ -494,5 +502,5 @@
> %  		}
> %  	}
> % -	return (deupdat(dep, 1));
> % +	return (deupdat(dep, 0));
> 
> Remove an unrelated pessimization (a synchronous update where even an
> asynchronous update is more than what is needed).  Even ffs in Net/2
> didn't have the full pessimization here -- it pessimized SETATTR on
> times but not on the more important ownership and permission attributes.
> ffs still had the pessimization for times in 4.4BSD-Lite2, but FreeBSD
> fixed it in 1998 (ufs_vnops.c 1.79; the fix was buried in a mega-commit
> with a content-free log message :-(), and I fixed it in my version of
> ffs long before that.  msdosfs_setattr() is a little different from
> ufs_setattr(); in particular, it does the (previously synchronous) update
> for all successful calls while ufs_setattr() only ever did it for times,
> so the pessimization has a wider scope in msdosfs.
> 
> %  }
> 
> The above is only the least serious of the bugs in msdosfs_setattr() :-(.
> With the above fix, plain touch works as well as possible -- it cannot
> work perfectly since setting of atimes is not always supported.  But
> touch -r and more importantly, cp -p only work as well as possible for
> root, since they use utimes() without the null timeptr arg that allows
> plain touch to work.  A non-null timeptr arg ends up normally requiring
> root permissions for msdosfs where it normally doesn't require extra
> permissions for ffs, because ownership requirements for the non-null case
> cannot be satisfied by file systems that don't really support ownerships.
> We fudge the ownerships and use weak checks on them in most places, but
> for utimes() we use strict checks that almost always fail: from my old
> version:
> 
> % 	if (vap->va_flags != VNOVAL) {
> % 		if (vp->v_mount->mnt_flag & MNT_RDONLY)
> % 			return (EROFS);
> % 		if (cred->cr_uid != pmp->pm_uid &&
> % 		    (error = suser_cred(cred, PRISON_ROOT)))
> % 			return (error);
> 
> The implementation of this check has changed significantly in -current,
> but its semantics and result haven't.  The file must be owned by
> someone (pmp->pm_uid), and no one else except root has permission for
> utimes() with a non-null timeptr.  This works right in ffs because the
> file can normally be owned by its rightful owner, but in msdosfs the
> owner is faked and there can be only one owner for the whole file
> system.  I use owner root and group msdosfs for all msdosfs file
> systems.  The group permissions allow non-root users in group msdosfs
> to do almost everything except this unimportant utimes() operation
> to almost all msdosfs files.
> 
> % 		/*
> % 		 * We are very inconsistent about handling unsupported
> % 		 * attributes.  We ignored the access time and the
> % 		 * read and execute bits.  We were strict for the other
> % 		 * attributes.
> % 		 *
> % 		 * Here we are strict, stricter than ufs in not allowing
> % 		 * users to attempt to set SF_SETTABLE bits or anyone to
> % 		 * set unsupported bits.  However, we ignore attempts to
> % 		 * set ATTR_ARCHIVE for directories `cp -pr' from a more
> % 		 * sensible filesystem attempts it a lot.
> % 		 */
> 
> This comment is partly about the problem as it affects non-time attributes.
> There is a problem with cp -p for almost all attributes, since msdosfs
> doesn't really support them so cp -p from another file system that supports
> more of them must either fail, or the failures must be silently or
> unsilently ignored.  cp -p unsilently ignores EPERM errors for *chown(),
> but doesn't ignore any (?) other error (exit status != 0).  This gives the
> rather silly handling for typical errors for cp -p to msdosfs: chown()
> usually fails at the syscall level but succeeds with a warning at the
> shell level, but the less important utimes() usually fails at both levels.
> 
> There is a related problem with file time granularity.  It is the usual
> case that the file system has a different granularity than the system
> and other file systems.  When the target has coarser granularity than the
> source, it is usually impossible to duplicate the times.  Having utimes()
> fail when the times cannot be duplicated would be too strict in general,
> but sometimes you would like to be strict.  POSIX has only recently
> started addressing this problem.  (Old?) FAT with its 2-second granularity
> has always been unable to represent the default 1-second granularity, and
> has always handled this by silently truncating to the granularity that it
> supports.
> 
> My test directory for testing this mail shows another granularity problem:
> (the file system is FAT32 with Win95 long names): after mkdir of it, a
> stat utility on it gives:
> 
> % file=z
> % ...
> % atime=Sat Aug 20 00:00:00 2011 (1313762400.0)
> % ctime=Sat Aug 20 16:14:29 2011 (1313820869.740000000)
> % mtime=Sat Aug 20 16:14:28 2011 (1313820868.0)
> 
> This has the expected 2-second granularity for the mtime, but the other
> times are strange:
> - the atime is far in the past, and according to other tests has a
>   granularity of at least 200 seconds
> - the ctime has a granularity of 100 msec.  This differs significantly
>   from the mtime's granularity, so the ctime is up to 1.99 seconds in
>   advance of the mtime.  This is probably a local bug -- I probably
>   don't have the fix for confusion between the ctime and the creation
>   time (birthtime).  msdosfs only has a creation time so the ctime must
>   be faked and should usually be the same as the mtime.  But how does
>   the creation time have more precision?
> In other tests, creat() of a file sets the mtime and ctime reasonably,
> but the atime remains with a fixed value far in the past.  touch
> advances the mtime correctly, but doesn't update the ctime.  This is
> consistent with displayed ctime actually being the creation time.
> 
> >Yes, FAT file systems do not maintain a directory modify
> >time.
> 
> Er, yes they do...
> 
> >(The original FAT12,16 structure didn't even have a
> >modify time for the root dir.)
> 
> ... except for the root directory, they don't, and this doesn't depend on
> the version -- there is no directory entry for the root directory, so
> the modification time can't be stored in the usual place, for root
> directories only.
> 
> >Just like Windows.
> 
> I normally use cygwin for managing file times on Windows, and touch and
> cp -p work reasonably well with it.  In particular, touch updates directory
> times.  15-25 years ago, I used DOS utilities to manage file times and
> don't remember any problems with touch.  Even DOS 2.1 (?) has a syscall
> like utimes().
> 
> >This causes issues when a FAT fs is exported via NFS and
> >someone was going to experiment with an "in memory only"
> >modify time for dirs, to minimize caching issues, but I
> >haven't heard back from them lately.
> 
> "Memory only" times must never escape to userland.  Linux has (had?)
> file times in vfs, which makes many things easy, but old versions of
> Linux did let them escape to userland.  I ran fussy POSIX conformance
> tests for times.  The tests would succeed for non-POSIX file systems,
> but only due to the times being in memory, and then only while the
> files were cached in memory.
> 
> >Apparently Mac OS X chooses to update the modify time that
> >exists on FAT32 file systems, but that isn't Windows compatible.
> 
> Yes, it's a bug in Mac OS to be incompatible.  However, I sometimes
> wish for a mount option to control this.  Similarly for weakening
> of checking for attributes that cannot be set.  Also, for file
> times, there is another annoying problem which might be best handled
> by a mount option: msdosfs file times change twice a year with daylight
> saving.  I sometimes back up msdosfs files to ffs where their times
> don't change like this, and would like an easy way to stop the changes.
> Moving across timezones might cause even more frequent changes, but this
> doesn't affect me.
> 
> Bruce