From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 01:31:59 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2D6F116A417;
	Sun, 14 Oct 2007 01:31:59 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com
	[72.36.161.186])
	by mx1.freebsd.org (Postfix) with ESMTP id EB5D313C46A;
	Sun, 14 Oct 2007 01:31:58 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from proton.local
	(r74-193-81-203.pfvlcmta01.grtntx.tl.dh.suddenlink.net
	[74.193.81.203]) (authenticated bits=0)
	by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9E1Vw8w008356;
	Sat, 13 Oct 2007 20:31:58 -0500 (CDT)
	(envelope-from anderson@freebsd.org)
Message-ID: <47117184.3030309@freebsd.org>
Date: Sat, 13 Oct 2007 20:31:48 -0500
From: Eric Anderson <anderson@freebsd.org>
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: Ivan Voras <ivoras@freebsd.org>
References: <fel7fn$k6d$1@sea.gmane.org>
In-Reply-To: <fel7fn$k6d$1@sea.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com
Cc: freebsd-geom@freebsd.org
Subject: Re: Disk mounting in recent Linuxes
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2007 01:31:59 -0000

Ivan Voras wrote:
> Hi,
> 
> I've installed a Linux (openSUSE) on a laptop and this is what I got by
> default in fstab:
> 
> /dev/disk/by-id/scsi-SATA_Hitachi_HTS5412_HP0400BEG1922A-part2 /
>             ext3       acl,user_xattr        1 1
> /dev/disk/by-id/scsi-SATA_Hitachi_HTS5412_HP0400BEG1922A-part4 /data
>             ext3       acl,user_xattr        1 2
> /dev/disk/by-id/scsi-SATA_Hitachi_HTS5412_HP0400BEG1922A-part3 swap
>             swap       defaults              0 0
> 
> A similar option (to use a device by id instead of location) also exists
> for network cards.
> 
> (This is just a "FYI" post, I'm not complaining :) ).


I was actually wondering if we should start labeling our filesystems at 
newfs time in the installer for this type of setup.  There are a handful 
of potential 'gotchas' though.

Eric

From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 13:35:03 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0BE4D16A41A;
	Sun, 14 Oct 2007 13:35:03 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id 2F5A013C459;
	Sun, 14 Oct 2007 13:35:02 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773130-1922481 
	for multiple; Sun, 14 Oct 2007 15:22:59 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: <freebsd-stable@freebsd.org>,
	<freebsd-geom@freebsd.org>
Date: Sun, 14 Oct 2007 15:22:32 +0200
Message-ID: <008801c80e65$47cbe650$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcgOZUbPq0zqvOG2QwSFpRt2OPaAhw==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Cc: 
Subject: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-List-Received-Date: Sun, 14 Oct 2007 13:35:03 -0000

We are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed as of
2007-10-09.

The mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x
Opteron 2216; da3 is on a 3ware 9550-12.

We are seeing this error:
g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
on a 12 GB Hyperdrive.

The offset changes sometimes, but it is always 81064794xxxxxxxxx and well
outside the 12 GB range.
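
For anyone who wants to check such console lines mechanically, here is a
minimal Python sketch that splits the g_vfs_done() message into its fields
and compares the offset with the device size. The regular expression, the
helper name parse_g_vfs_done and the constant ASSUMED_MEDIA_SIZE are
illustrative assumptions, not anything taken from the FreeBSD sources; in
particular, "12 GB" is assumed to mean 12 * 2**30 bytes.

```python
import re

# Pattern for the g_vfs_done() console line quoted above (hypothetical helper).
G_VFS_DONE = re.compile(
    r"g_vfs_done\(\):(?P<dev>[^\[]+)\[(?P<op>\w+)\("
    r"offset=(?P<offset>-?\d+),\s*length=(?P<length>\d+)\)\]"
    r"\s*error\s*=\s*(?P<error>\d+)"
)

# Assumption: "12 GB" means 12 * 2**30 bytes; the real media size may differ.
ASSUMED_MEDIA_SIZE = 12 * 2**30


def parse_g_vfs_done(line):
    """Return the fields of a g_vfs_done() line as a dict, or None if no match."""
    m = G_VFS_DONE.search(line)
    if m is None:
        return None
    return {
        "dev": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "error": int(m.group("error")),
    }


line = "g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5"
fields = parse_g_vfs_done(line)
print(fields)
# How many times larger than the assumed device size the requested offset is.
print("offset / assumed media size:", fields["offset"] // ASSUMED_MEDIA_SIZE)
```

With the assumed size, the printed ratio shows how far beyond the end of the
device the request lies, whatever the exact capacity turns out to be.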

We previously had the Hyperdrive connected directly to the mainboard's SATA0
(ad4), with similar errors.
Before the Hyperdrive we used an md (memory disk), which also produced
similar errors.

The block size on the partition is 8192 (newfs -b 8192 ..).
We had a block size of 65536 before, but after some hours (sometimes
days) the machine would become unresponsive, with "newbuf" as the wait
message in top, and had to be hard-reset.
Regarding "newbuf", as well as nbufkv and nbufbs, I will write a separate
message to the list.

According to systat -vm, da3 does tps > 500 (yes, that's a lot).

This leads us to assume that the error has to do with very high I/Os per
second on an SMP machine.
The system disk is a RAID1 on an ICP 5805. All other disks (51) are
combined into 20 gstripe'd partitions.

Any hint on how to diagnose or fix the problem is much appreciated.

Cheers,

Dieter


From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 14:09:02 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 92BDA16A417;
	Sun, 14 Oct 2007 14:09:02 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id B409413C48E;
	Sun, 14 Oct 2007 14:09:01 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773204-1922481 
	for multiple; Sun, 14 Oct 2007 16:09:11 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: "'Scott Long'" <scottl@samsco.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <47121F9F.7050900@samsco.org>
Date: Sun, 14 Oct 2007 16:08:44 +0200
Message-ID: <008d01c80e6b$bb95b7e0$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcgOa5ciuX2g50BFT+K6MBnLZJ0DxQAASbtQ
In-Reply-To: <47121F9F.7050900@samsco.org>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-List-Received-Date: Sun, 14 Oct 2007 14:09:02 -0000

> > We are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed as
> > of 2007-10-09.
> >
> > The mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and
> > 2 x Opteron 2216; da3 is on a 3ware 9550-12.
> >
> > We are seeing this error:
> > g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> > on a 12 GB Hyperdrive.
> >
> > The offset changes sometimes, but it is always 81064794xxxxxxxxx and
> > well outside the 12 GB range.
> >
> > We previously had the Hyperdrive connected directly to the mainboard's
> > SATA0 (ad4), with similar errors.
> > Before the Hyperdrive we used an md (memory disk), which also produced
> > similar errors.
> >
> > The block size on the partition is 8192 (newfs -b 8192 ..).
> > We had a block size of 65536 before, but after some hours (sometimes
> > days) the machine would become unresponsive, with "newbuf" as the wait
> > message in top, and had to be hard-reset.
> > Regarding "newbuf", as well as nbufkv and nbufbs, I will write a
> > separate message to the list.
> >
> > According to systat -vm, da3 does tps > 500 (yes, that's a lot).
> >
> > This leads us to assume that the error has to do with very high I/Os
> > per second on an SMP machine.
> > The system disk is a RAID1 on an ICP 5805. All other disks (51) are
> > combined into 20 gstripe'd partitions.
> >
> > Any hint on how to diagnose or fix the problem is much appreciated.
> >
> > Cheers,
> >
> > Dieter
> >
>
> I can generate 30,000 I/Os per second for hours on end on several
> types of storage hardware on FreeBSD SMP, and have no problems.  Since
> you're seeing this problem both when connected to a 3ware controller
> and when connected to a simple ATA/SATA controller (both of which have
> also been observed to do high amounts of I/O with no problems), I
> suspect that the problem is with your disk device, not with FreeBSD.
> I don't know anything about a "hyperdrive" though, so more information
> might help.
>
> Scott

Well, how about this:
> > We used to have a md instead of the hyperdrive before, coming up with
> > similar errors.

Here is some info about the Hyperdrive:
http://www.hyperossystems.co.uk/

We could go back to the md (memory disk) and try again.

What exactly does the "offset" in the error message mean? Isn't that like a
seek on the disk? And what does "error = 5" mean?

Sure, the whole thing could be a problem with the application we are running.
It's diablo 5. The history file (dhistory), about 2 GB in size, resides on the
Hyperdrive.

Dieter


From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 14:15:27 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EDA8416A41A
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 14:15:27 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.freebsd.org (Postfix) with ESMTP id 6B26313C447
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 14:15:27 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11])
	(authenticated bits=0)
	by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9EDtAlH059643;
	Sun, 14 Oct 2007 07:55:10 -0600 (MDT)
	(envelope-from scottl@samsco.org)
Message-ID: <47121F9F.7050900@samsco.org>
Date: Sun, 14 Oct 2007 07:54:39 -0600
From: Scott Long <scottl@samsco.org>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US;
	rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4
MIME-Version: 1.0
To: d_elbracht <d_elbracht@ecngs.de>
References: <008801c80e65$47cbe650$639049d9@EC1a>
In-Reply-To: <008801c80e65$47cbe650$639049d9@EC1a>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by
	milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]);
	Sun, 14 Oct 2007 07:55:10 -0600 (MDT)
X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400,
 length=8192)]error = 5
X-List-Received-Date: Sun, 14 Oct 2007 14:15:28 -0000

d_elbracht wrote:
> We are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed as of
> 2007-10-09.
>
> The mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x
> Opteron 2216; da3 is on a 3ware 9550-12.
>
> We are seeing this error:
> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> on a 12 GB Hyperdrive.
>
> The offset changes sometimes, but it is always 81064794xxxxxxxxx and well
> outside the 12 GB range.
>
> We previously had the Hyperdrive connected directly to the mainboard's
> SATA0 (ad4), with similar errors.
> Before the Hyperdrive we used an md (memory disk), which also produced
> similar errors.
>
> The block size on the partition is 8192 (newfs -b 8192 ..).
> We had a block size of 65536 before, but after some hours (sometimes
> days) the machine would become unresponsive, with "newbuf" as the wait
> message in top, and had to be hard-reset.
> Regarding "newbuf", as well as nbufkv and nbufbs, I will write a separate
> message to the list.
>
> According to systat -vm, da3 does tps > 500 (yes, that's a lot).
>
> This leads us to assume that the error has to do with very high I/Os per
> second on an SMP machine.
> The system disk is a RAID1 on an ICP 5805. All other disks (51) are
> combined into 20 gstripe'd partitions.
>
> Any hint on how to diagnose or fix the problem is much appreciated.
>
> Cheers,
>
> Dieter
> 

I can generate 30,000 I/Os per second for hours on end on several types
of storage hardware on FreeBSD SMP, and have no problems.  Since you're
seeing this problem both when connected to a 3ware controller and when
connected to a simple ATA/SATA controller (both of which have also been
observed to do high amounts of I/O with no problems), I suspect that the
problem is with your disk device, not with FreeBSD.  I don't know
anything about a "hyperdrive" though, so more information might help.

Scott

From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 14:33:44 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1C39816A41A
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 14:33:44 +0000 (UTC)
	(envelope-from arne_woerner@yahoo.com)
Received: from web30304.mail.mud.yahoo.com (web30304.mail.mud.yahoo.com
	[209.191.69.66])
	by mx1.freebsd.org (Postfix) with SMTP id A086E13C447
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 14:33:43 +0000 (UTC)
	(envelope-from arne_woerner@yahoo.com)
Received: (qmail 78278 invoked by uid 60001); 14 Oct 2007 14:27:02 -0000
Received: from [84.141.122.18] by web30304.mail.mud.yahoo.com via HTTP;
	Sun, 14 Oct 2007 07:27:01 PDT
Date: Sun, 14 Oct 2007 07:27:01 -0700 (PDT)
From: Arne "Wörner" <arne_woerner@yahoo.com>
To: d_elbracht <d_elbracht@ecngs.de>
In-Reply-To: <008d01c80e6b$bb95b7e0$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Message-ID: <956094.77414.qm@web30304.mail.mud.yahoo.com>
Cc: freebsd-geom@freebsd.org
Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-List-Received-Date: Sun, 14 Oct 2007 14:33:44 -0000

That 64kB block size problem is quite an old bug... etc@fluffles.net reported it
some months ago... It is somehow due to memory fragmentation and a deadlock...

--- d_elbracht <d_elbracht@ecngs.de> wrote:
> We could go back to the md (memory-disk) to try again. 
> 
A memory disk (md) should never deliver an EIO (5)... So you must have done
something other than just reading from and writing to an md...

But you can take a look at the source code and search for "EIO"...
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/md/md.c

A possible cause of that EIO is that the file system parameters are somehow
bad, so that it reads from negative or too-large offsets...
http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/geom/geom_io.c?rev=1.75;content-type=text%2Fplain
	if (bp->bio_offset > pp->mediasize)
		return (EIO);
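
To make the effect of that comparison concrete, here is a small Python
sketch that mimics only the two quoted lines; the real function in
geom_io.c does more than this. The function name check_offset and the
12 * 2**30 media size are assumptions made for illustration.

```python
import errno


def check_offset(bio_offset, mediasize):
    """Mimic only the quoted comparison: EIO if the offset lies past the media size."""
    if bio_offset > mediasize:
        return errno.EIO
    return 0


# Assumption: the 12 GB device is 12 * 2**30 bytes; the real size may differ.
ASSUMED_MEDIASIZE = 12 * 2**30

reported = check_offset(81064794762854400, ASSUMED_MEDIASIZE)  # offset from the report
in_range = check_offset(8192, ASSUMED_MEDIASIZE)               # an ordinary small offset
print(reported, in_range)
```

Under these assumptions the reported offset fails the check while an ordinary
offset passes, which fits the idea that the request was already wrong before
it reached the device.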

-Arne




From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 14:42:13 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2DA3D16A41A;
	Sun, 14 Oct 2007 14:42:13 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id EAF7D13C457;
	Sun, 14 Oct 2007 14:42:11 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773237-1922481 
	for multiple; Sun, 14 Oct 2007 16:42:32 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: <freebsd-geom@freebsd.org>,
	<freebsd-stable@freebsd.org>
Date: Sun, 14 Oct 2007 16:42:06 +0200
Message-ID: <008e01c80e70$64c92910$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcgOcGQ+rp1Mn0RqSFKuKB0DklKcNw==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Cc: 
Subject: newbuf, nbufkv, nbufbs
X-List-Received-Date: Sun, 14 Oct 2007 14:42:13 -0000

We have 2 machines involved with this problem.

machine1, SMP, i386, 4 GB RAM, was recently upgraded from 5.4 to 6.2, cvsup'ed
2007-10-10.

A partition of about 2.5 TB (gstripe -s 1048576) was newfs'ed with a block
size of 65536 and a fragment size of 8192.
On 5.4, this ran for months with no problem.

On 6.2, after a few hours of high throughput (network tx and rx 400-500 Mbit
each), it became unresponsive, with top showing a lot of processes with the
wait message newbuf.

So: reset, fsck, etc., and it ran again, but after a few hours it became
unresponsive again, this time showing processes with nbufkv and nbufbs.

This time I did newfs with a block size of 32768 and a fragment size of 4096,
and it's running. Throughput has decreased to 300-400 Mbit.

Note that it NEVER showed the problem on 5.4.


machine2, SMP, amd64, 16 GB RAM, 6.2 cvsup'ed 2007-10-09
20 partitions involving 51 disks, all gstripe -s 1048576, newfs -b 65536 -f
8192
1 partition of 12 GB (da3s1a), newfs -b 65536 -f 8192
After a few hours, top shows newbuf and the machine is unresponsive.
tps on da3s1a is > 500; the others are < 100.
I did newfs -b 8192 -f 1024 /dev/da3s1a and it's running without the problem
(so far).


The problem seems to be related to -b 65536 and lots of IOPS on 6.2.

Any clue? E.g. increase BKVASIZE to 131072 and kern.nbuf to 32768?
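
As a rough sanity check on those two numbers, the following Python sketch
multiplies them out. It assumes, as a simplification, that the buffer cache
reserves about nbuf * BKVASIZE bytes of kernel virtual address space; the
kernel's real sizing logic is more involved, and the helper name
buffer_kva_bytes is made up for this example.

```python
def buffer_kva_bytes(nbuf, bkvasize):
    """Rough estimate (assumption): buffer KVA is about nbuf * BKVASIZE bytes."""
    return nbuf * bkvasize


proposed = buffer_kva_bytes(nbuf=32768, bkvasize=131072)

# A 32-bit machine can address 2**32 bytes in total, kernel and user together.
ADDRESS_SPACE_32BIT = 2**32

print("proposed buffer KVA in GiB:", proposed / 2**30)
print("fits in a 32-bit address space:", proposed < ADDRESS_SPACE_32BIT)
```

If the simplification holds, that combination would already need the whole
32-bit address space on the i386 machine, so it looks more plausible for the
amd64 box than for machine1.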

Cheers,

Dieter


From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 14:46:23 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 194F916A46D
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 14:46:23 +0000 (UTC)
	(envelope-from arne_woerner@yahoo.com)
Received: from web30308.mail.mud.yahoo.com (web30308.mail.mud.yahoo.com
	[209.191.69.70])
	by mx1.freebsd.org (Postfix) with SMTP id 9A72C13C465
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 14:46:22 +0000 (UTC)
	(envelope-from arne_woerner@yahoo.com)
Received: (qmail 24282 invoked by uid 60001); 14 Oct 2007 14:19:40 -0000
Received: from [84.141.122.18] by web30308.mail.mud.yahoo.com via HTTP;
	Sun, 14 Oct 2007 07:19:40 PDT
Date: Sun, 14 Oct 2007 07:19:40 -0700 (PDT)
From: Arne "Wörner" <arne_woerner@yahoo.com>
To: Scott Long <scottl@samsco.org>, d_elbracht <d_elbracht@ecngs.de>
In-Reply-To: <47121F9F.7050900@samsco.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Message-ID: <847856.24179.qm@web30308.mail.mud.yahoo.com>
Cc: freebsd-geom@freebsd.org
Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-List-Received-Date: Sun, 14 Oct 2007 14:46:23 -0000

--- Scott Long <scottl@samsco.org> wrote:
> I can generate 30,000 I/Os per second for hours on end on several types
> of storage hardware on FreeBSD SMP, and have no problems.  Since you're
> seeing this problem both when connected to a 3ware controller and when
> connected to a simple ATA/SATA controller (both of which have also been
> observed to do high amounts of I/O with no problems), I suspect that the
> problem is with your disk device, not with FreeBSD.  I don't know
> anything about a "hyperdrive" though, so more information might help.
> 
> Scott
> 
I would say so, too...

Especially because errno 5 is EIO:
http://www.freebsd.org/cgi/man.cgi?query=errno&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html
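
The same lookup can be done without the man page, for example with Python's
standard errno module. This is only a convenience sketch: errno numbers are
platform-specific, so it reports what the machine it runs on uses, and the
variable names are illustrative.

```python
import errno
import os

code = 5  # the value printed after "error =" in the kernel message

name = errno.errorcode.get(code)  # symbolic name on this platform, or None
text = os.strerror(code)          # human-readable description

print(code, name, text)
```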

-Arne



From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 15:59:45 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1122416A417
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 15:59:45 +0000 (UTC)
	(envelope-from lars@larseighner.com)
Received: from mail.team1internet.com (mail.team1internet.com [216.110.13.10])
	by mx1.freebsd.org (Postfix) with ESMTP id E348C13C480
	for <freebsd-geom@freebsd.org>; Sun, 14 Oct 2007 15:59:44 +0000 (UTC)
	(envelope-from lars@larseighner.com)
Received: by mail.team1internet.com (Postfix, from userid 12346)
	id 51A5416B751; Sun, 14 Oct 2007 10:38:43 -0500 (CDT)
Received: from larseighner.com (unknown [216.110.13.72])
	by mail.team1internet.com (Postfix) with SMTP
	id 876C116B747; Sun, 14 Oct 2007 10:38:41 -0500 (CDT)
Received: by larseighner.com (nbSMTP-1.00) for uid 1001
	lars@larseighner.com; Sun, 14 Oct 2007 10:37:39 -0500 (CDT)
Date: Sun, 14 Oct 2007 10:37:38 -0500 (CDT)
From: Lars Eighner <stableuser@larseighner.com>
X-X-Sender: lars@debranded.6dollardialup.com
To: d_elbracht <d_elbracht@ecngs.de>
In-Reply-To: <008801c80e65$47cbe650$639049d9@EC1a>
Message-ID: <20071014103129.W19754@qroenaqrq.6qbyyneqvnyhc.pbz>
References: <008801c80e65$47cbe650$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Sanitizer: Anomy and SpamAssassin mail filter - see
	http://www.6dollardialup.com/support/spaminfo.html
X-Spam-Status: No, hits=-3.2 required=10.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,J_CHICKENPOX_52,OACYS_SINGLE,
	QUOTED_EMAIL_TEXT,REFERENCES,RM_sl_Parens, TO_LOCALPART_EQ_REAL
	version=2.43
X-Spam-Level: 
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400,
 length=8192)]error = 5
X-List-Received-Date: Sun, 14 Oct 2007 15:59:45 -0000

On Sun, 14 Oct 2007, d_elbracht wrote:

> We are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed as of
> 2007-10-09.
>
> The mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x
> Opteron 2216; da3 is on a 3ware 9550-12.
>
> We are seeing this error:
> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> on a 12 GB Hyperdrive.

I trashed a perfectly good disk drive before learning that there is a serious bug
in g_vfs.  Apparently it is one of those things which shows up in some
configurations and not others.  Although I am told they are unable to
isolate the problem, all the reports I've seen were from people using AMD
systems.


From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 16:09:21 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CD3B916A46B;
	Sun, 14 Oct 2007 16:09:21 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id 144D113C457;
	Sun, 14 Oct 2007 16:09:20 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773356-1922481 
	for multiple; Sun, 14 Oct 2007 18:09:25 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: =?iso-8859-1?Q?'Arne_W=F6rner'?= <arne_woerner@yahoo.com>,
	"'Scott Long'" <scottl@samsco.org>
References: <47121F9F.7050900@samsco.org>
	<847856.24179.qm@web30308.mail.mud.yahoo.com>
Date: Sun, 14 Oct 2007 18:08:57 +0200
Message-ID: <008f01c80e7c$876c89b0$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcgOcRU1cBtJiEb5TsWoeT7TKFDfDAACwHKw
In-Reply-To: <847856.24179.qm@web30308.mail.mud.yahoo.com>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-List-Received-Date: Sun, 14 Oct 2007 16:09:21 -0000

> --- Scott Long <scottl@samsco.org> wrote:
> > I can generate 30,000 I/Os per second for hours on end on several
> > types of storage hardware on FreeBSD SMP, and have no problems.  Since
> > you're seeing this problem both when connected to a 3ware controller
> > and when connected to a simple ATA/SATA controller (both of which have
> > also been observed to do high amounts of I/O with no problems), I
> > suspect that the problem is with your disk device, not with FreeBSD.
> > I don't know anything about a "hyperdrive" though, so more information
> > might help.
> >
> > Scott
> >
> I would say so, too...
>
> Especially because errno 5 is EIO:
> http://www.freebsd.org/cgi/man.cgi?query=errno&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html
>
> -Arne

I would agree with you on that, if the error (EIO) is NOT caused by the
READ going wrong in the first place.

As I understand it, the offset 81064794762854400 is NOT within the 12 GB
of the drive anymore. Or does the offset mean something else?
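
One way to look at that number more closely is to split it into its upper
and lower 32 bits, as in the Python sketch below. This is plain arithmetic
on the reported value and not a diagnosis; the 12 * 2**30 device size and
the variable names are assumptions for illustration.

```python
offset = 81064794762854400  # value from the g_vfs_done() message

high = offset >> 32          # upper 32 bits of the 64-bit offset
low = offset & 0xFFFFFFFF    # lower 32 bits of the 64-bit offset

# Assumption: the 12 GB device is 12 * 2**30 bytes; the real size may differ.
ASSUMED_MEDIA_SIZE = 12 * 2**30

print("offset:", hex(offset))
print("high 32 bits:", hex(high))
print("low 32 bits:", hex(low), "=", low, "bytes")
print("low part inside assumed device:", low < ASSUMED_MEDIA_SIZE)
print("low part aligned to 8192:", low % 8192 == 0)
```

The lower half on its own is a block-aligned offset that would fit on the
device, while the upper half carries a few extra bits. Whether that points
at a damaged block pointer in the file system or at something else entirely
cannot be told from the number alone.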

Dieter


From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 18:45:29 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 856CF16A41A;
	Sun, 14 Oct 2007 18:45:29 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.freebsd.org (Postfix) with ESMTP id DAAED13C44B;
	Sun, 14 Oct 2007 18:45:28 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11])
	(authenticated bits=0)
	by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9EIFPer060995;
	Sun, 14 Oct 2007 12:15:25 -0600 (MDT)
	(envelope-from scottl@samsco.org)
Message-ID: <47125C9E.1040109@samsco.org>
Date: Sun, 14 Oct 2007 12:14:54 -0600
From: Scott Long <scottl@samsco.org>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US;
	rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4
MIME-Version: 1.0
To: Lars Eighner <stableuser@larseighner.com>
References: <008801c80e65$47cbe650$639049d9@EC1a>
	<20071014103129.W19754@qroenaqrq.6qbyyneqvnyhc.pbz>
In-Reply-To: <20071014103129.W19754@qroenaqrq.6qbyyneqvnyhc.pbz>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by
	milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]);
	Sun, 14 Oct 2007 12:15:26 -0600 (MDT)
X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org
Cc: freebsd-stable@freebsd.org, d_elbracht <d_elbracht@ecngs.de>,
	freebsd-geom@freebsd.org
Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400,
 length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2007 18:45:29 -0000

Lars Eighner wrote:
> On Sun, 14 Oct 2007, d_elbracht wrote:
> 
>> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
>> 2007-10-09
>>
>> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x 
>> Opteron
>> 2216, da3 is on a 3ware 9550-12
>>
>> we are seeing this error:
>> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
>> on a 12 GB Hyperdrive
> 
> I trashed a perfectly good disk drive before learning that there is a
> serious bug in g_vfs.  Apparently it is one of those things which shows up
> in some configurations and not others.  Although I am told they are unable
> to isolate the problem, all the reports I've seen were from people using
> AMD systems.
> 

Are you talking about problems with ATA controllers, AMD64 (or 
i386+PAE), and more than 4GB of RAM?  Or something else?

Scott


From owner-freebsd-geom@FreeBSD.ORG  Sun Oct 14 23:22:35 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 629A916A468;
	Sun, 14 Oct 2007 23:22:35 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.freebsd.org (Postfix) with ESMTP id 0B32013C50D;
	Sun, 14 Oct 2007 23:22:34 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11])
	(authenticated bits=0)
	by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9ENMSTf062135;
	Sun, 14 Oct 2007 17:22:29 -0600 (MDT)
	(envelope-from scottl@samsco.org)
Message-ID: <4712A494.30803@samsco.org>
Date: Sun, 14 Oct 2007 17:21:56 -0600
From: Scott Long <scottl@samsco.org>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US;
	rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4
MIME-Version: 1.0
To: Ivan Voras <ivoras@freebsd.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
In-Reply-To: <feu58o$5uo$1@ger.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by
	milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]);
	Sun, 14 Oct 2007 17:22:29 -0600 (MDT)
X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400,
 length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2007 23:22:35 -0000

Ivan Voras wrote:
> d_elbracht wrote:
>> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
>> 2007-10-09
>>
>> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x Opteron
>> 2216, da3 is on a 3ware 9550-12
>>
>> we are seeing this error:
>> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
>> on a 12 GB Hyperdrive
>>
>> the offset changes sometimes, but it is always 81064794xxxxxxxxx and well
>> out the 12GB range.
> 
> Yes.
> 
>> According to systat -vm, da3 does tps > 500 (yes, that's a lot)
> 
> That's not a lot :) That's actually low for a modern solid state drive.
> 
>> This leads to an assumption, the error has to do with very high IOs per
>> second on a SMP machine.
> 
> Either that or file system errors. Does fsck run ok or does it say
> anything unusual?
> 

No, filesystem corruption has nothing to do with g_vfs_done messages.

Scott

From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 00:04:57 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1916316A46C
	for <freebsd-geom@freebsd.org>; Mon, 15 Oct 2007 00:04:57 +0000 (UTC)
	(envelope-from janm@transactionware.com)
Received: from mail.transactionware.com (mail.transactionware.com
	[203.14.245.7]) by mx1.freebsd.org (Postfix) with SMTP id 1A3D613C480
	for <freebsd-geom@freebsd.org>; Mon, 15 Oct 2007 00:04:55 +0000 (UTC)
	(envelope-from janm@transactionware.com)
Received: (qmail 90464 invoked from network); 14 Oct 2007 23:38:35 -0000
Received: from midgard.transactionware.com (192.168.1.55)
	by dm.transactionware.com with SMTP; 14 Oct 2007 23:38:35 -0000
Received: (qmail 20180 invoked by uid 907); 14 Oct 2007 23:38:12 -0000
Received: from [192.168.1.51] (HELO janmxp) (192.168.1.51)
	by midgard.transactionware.com (qpsmtpd/0.32) with ESMTP;
	Mon, 15 Oct 2007 09:38:12 +1000
From: "Jan Mikkelsen" <janm@transactionware.com>
To: "'Scott Long'" <scottl@samsco.org>,
	"'Ivan Voras'" <ivoras@freebsd.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
	<4712A494.30803@samsco.org>
In-Reply-To: <4712A494.30803@samsco.org>
Date: Mon, 15 Oct 2007 09:38:12 +1000
Organization: Transactionware
Message-ID: <000a01c80ebb$49227f90$db677eb0$@com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 12.0
Thread-Index: AcgOuTwXgPkltFJRQheLCsRVqNf9bwAAU00w
Content-Language: en-au
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: RE: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 00:04:57 -0000

Scott Long wrote:
> Ivan Voras wrote:
> > Either that or file system errors. Does fsck run ok or does it say
> > anything unusual?
> >
> 
> No, filesystem corruption has nothing to do with g_vfs_done
> messages.

Well, perhaps not directly but I think filesystem corruption can
indirectly cause g_vfs_done messages.

If a filesystem is corrupt, the filesystem might attempt to read an
out-of-range block, leading to a g_vfs_done error.  This was the
case for some of the arcmsr problems last year.

In this case, I think the original poster said that the block
number was out of range for the device.

Regards,

Jan Mikkelsen
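
The mechanism described above can be pictured with a toy model. The sketch
below is not GEOM code: submit_read and the 12 GiB media size are assumptions
made for illustration. It only shows how a bad block pointer handed down by a
filesystem could come back as EIO (errno 5) from a range check in the layer
below.

```python
import errno

MEDIA_SIZE = 12 * 2**30  # assumed device size in bytes (12 GiB)


def submit_read(offset: int, length: int, media_size: int = MEDIA_SIZE) -> int:
    """Toy stand-in for a block layer: return 0 on success or an errno value."""
    if offset < 0 or length < 0 or offset + length > media_size:
        # The request points outside the device, so it fails with an I/O error.
        return errno.EIO
    return 0


# A sane request, then the offset reported in this thread.
print(submit_read(0, 8192))                  # 0
print(submit_read(81064794762854400, 8192))  # 5 (EIO)
```

In this picture the device and driver can be perfectly healthy; the error
comes purely from the request being out of range.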


From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 00:14:05 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1199F16A419;
	Mon, 15 Oct 2007 00:14:05 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.freebsd.org (Postfix) with ESMTP id A924A13C455;
	Mon, 15 Oct 2007 00:14:04 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11])
	(authenticated bits=0)
	by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9F0DvlP062319;
	Sun, 14 Oct 2007 18:13:58 -0600 (MDT)
	(envelope-from scottl@samsco.org)
Message-ID: <4712B0A6.1050408@samsco.org>
Date: Sun, 14 Oct 2007 18:13:26 -0600
From: Scott Long <scottl@samsco.org>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US;
	rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4
MIME-Version: 1.0
To: Jan Mikkelsen <janm@transactionware.com>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
	<4712A494.30803@samsco.org> <000a01c80ebb$49227f90$db677eb0$@com>
In-Reply-To: <000a01c80ebb$49227f90$db677eb0$@com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by
	milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]);
	Sun, 14 Oct 2007 18:13:58 -0600 (MDT)
X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org
Cc: freebsd-stable@freebsd.org, 'Ivan Voras' <ivoras@freebsd.org>,
	freebsd-geom@freebsd.org
Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400,
 length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 00:14:05 -0000

Jan Mikkelsen wrote:
> Scott Long wrote:
>> Ivan Voras wrote:
>>> Either that or file system errors. Does fsck run ok or does it say
>>> anything unusual?
>>>
>> No, filesystem corruption has nothing to do with g_vfs_done
>> messages.
> 
> Well, perhaps not directly but I think filesystem corruption can
> indirectly cause g_vfs_done messages.
> 
> If a filesystem is corrupt, the filesystem might attempt to read an
> out-of-range block, leading to a g_vfs_done error.  This was the
> case for some of the arcmsr problems last year.
> 
> In this case, I think the original poster said that the block
> number was out of range for the device.
> 
> Regards,
> 
> Jan Mikkelsen
> 
> 

Yeah, you're right, the block number is absurd, and it could well be 
caused by a bad block pointer in the filesystem.  It sounds like he's
getting this problem even on fresh installs, which ordinarily would
point to a bad driver.  If it's happening with both TWA and ATA, it's
hard to blame both of those drivers.

Scott


From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 08:21:06 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6461616A46B;
	Mon, 15 Oct 2007 08:21:06 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id 7547B13C447;
	Mon, 15 Oct 2007 08:21:04 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1774348-1922481 
	for multiple; Mon, 15 Oct 2007 10:21:26 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: "'Ivan Voras'" <ivoras@freebsd.org>,
	<freebsd-stable@freebsd.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
Date: Mon, 15 Oct 2007 10:20:57 +0200
Message-ID: <00cb01c80f04$50b11ed0$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcgOsevpOahtmKUeQKG7YhTDqm4A3wATlmcA
In-Reply-To: <feu58o$5uo$1@ger.gmane.org>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Cc: freebsd-geom@freebsd.org
Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 08:21:06 -0000

> > we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
> > 2007-10-09
> > 
> > Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x
> > Opteron 2216, da3 is on a 3ware 9550-12
> > 
> > we are seeing this error:
> > g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> > on a 12 GB Hyperdrive
> > 
> > the offset changes sometimes, but it is always 81064794xxxxxxxxx and well
> > out the 12GB range.
> 
> Yes.
> 
> > According to systat -vm, da3 does tps > 500 (yes, that's a lot)
> 
> That's not a lot :) That's actually low for a modern solid 
> state drive.
> 
> > This leads to an assumption, the error has to do with very high IOs 
> > per second on a SMP machine.
> 
> Either that or file system errors. Does fsck run ok or does 
> it say anything unusual?
> 
> There are several theoretical reasons for such errors that 
> are connected with the fact you use solid state drives, but 
> all are tricky to diagnose if you don't have a certain 
> repeatable test you can try. For example:
> some SSDs optimize writes to "spread out" the IO on the 
> chips, but some do it by looking into file system structures 
> to determine where it's safe to relocate the write - 
> obviously this works only with a known and supported file 
> system. This is a really wild guess, but maybe the SSD 
> firmware has error somewhere in this area, trying to 
> interpret UFS as it was FAT? If you manage to get a 
> repeatable failure test, you can try formatting the drive as 
> FAT32 and trying it on that.
> 
> Or maybe it's just a bad drive...
> 
> > The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20
> > gstripe'd partitions.
> 
> 51 drives and 20 partitions?
> 
According to the manufacturer, the drive handles any filesystem. In other
words, it's as transparent as any hard disk would be.
Also, as written before, we have seen error=5 with weird offsets on an
md (memory disk) before, too.
fsck on the disk does NOT show any errors.

Yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann). That's
for hashfeed from diablo.

One basic question to ask: where does the value for offset= in g_vfs_done()
come from?
Judging by the times at which the error shows up in syslog, I believe the
error only happens when a file gets appended.

Dieter


From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 09:05:55 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id BB36416A417
	for <freebsd-geom@freebsd.org>; Mon, 15 Oct 2007 09:05:55 +0000 (UTC)
	(envelope-from ivoras@gmail.com)
Received: from nz-out-0506.google.com (nz-out-0506.google.com [64.233.162.235])
	by mx1.freebsd.org (Postfix) with ESMTP id 6C66313C467
	for <freebsd-geom@freebsd.org>; Mon, 15 Oct 2007 09:05:55 +0000 (UTC)
	(envelope-from ivoras@gmail.com)
Received: by nz-out-0506.google.com with SMTP id l8so798493nzf
	for <freebsd-geom@freebsd.org>; Mon, 15 Oct 2007 02:05:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta;
	h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth;
	bh=WIClknsSnPfxISdvAzWuHmvEhZq1qoSb1UyLu/2nYo8=;
	b=XM8x7V1ck80TXgwvNPkcQy4OazVx8cqK8TN4IG+JN9WPYxAlwYcG/SyqUjeQVbSe6Rxze01iY8LCa4tAI4Dt+5mE6XM+GhF01S7HPJiMcHLnmT63Gg1yFcsT5s6ftfB0asn9UQh7S/0ChfPCFNJJrG1zhjSUUsO3rYKLEM86BqM=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta;
	h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth;
	b=oZzwQI0cTs+xE8G+rM1iG2sudtZzb//PZAKxHg10kbytWT7rfOCg/HL7u1192PFP5bSWNyWhNF2hXuHhxYwQwM7+vpT4jKy8o+c+PMQZbMWdiilCVXoHBH+EYabLKTQ/UeRpp3d8Zjun6SnhSG8NoGJzLXRxXnPR9NMy/rLMRBU=
Received: by 10.141.15.19 with SMTP id s19mr2574331rvi.1192439154081;
	Mon, 15 Oct 2007 02:05:54 -0700 (PDT)
Received: by 10.141.211.5 with HTTP; Mon, 15 Oct 2007 02:05:54 -0700 (PDT)
Message-ID: <9bbcef730710150205o7c344432kc8bc828da64bff1f@mail.gmail.com>
Date: Mon, 15 Oct 2007 11:05:54 +0200
From: "Ivan Voras" <ivoras@freebsd.org>
Sender: ivoras@gmail.com
To: d_elbracht <d_elbracht@ecngs.de>
In-Reply-To: <00cb01c80f04$50b11ed0$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
	<00cb01c80f04$50b11ed0$639049d9@EC1a>
X-Google-Sender-Auth: 11c2e076073077f8
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 09:05:55 -0000

On 15/10/2007, d_elbracht <d_elbracht@ecngs.de> wrote:

> One basic question to ask: where does the value for offset= in g_vfs_done()
> come from ?

Either from the file system or from bugs in the code. I don't remember
seeing similar reports before so the probability of there being bugs
in the code is fairly small.

This is all on raw hardware, not vmware, right?

> From the time the error shows up in syslog I believe, the error only
> happens, when a file get's appended.

From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 09:14:36 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D3C4D16A419;
	Mon, 15 Oct 2007 09:14:36 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id E604213C478;
	Mon, 15 Oct 2007 09:14:35 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1774497-1922481 
	for multiple; Mon, 15 Oct 2007 11:14:57 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: "'Ivan Voras'" <ivoras@freebsd.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
	<00cb01c80f04$50b11ed0$639049d9@EC1a>
	<9bbcef730710150205o7c344432kc8bc828da64bff1f@mail.gmail.com>
Date: Mon, 15 Oct 2007 11:14:29 +0200
Message-ID: <00cc01c80f0b$cafa7e50$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcgPCq+WuEiplNxCQ/aZVM3L84OMoQAANokw
In-Reply-To: <9bbcef730710150205o7c344432kc8bc828da64bff1f@mail.gmail.com>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 09:14:36 -0000

> > One basic question to ask: where does the value for offset= in 
> > g_vfs_done() come from ?
> 
> Either from the file system or from bugs in the code. I don't 
> remember seeing similar reports before so the probability of 
> there being bugs in the code is fairly small.
> 
> This is all on raw hardware, not vmware, right?
> 
> > From the time the error shows up in syslog I believe, the error only
> > happens, when a file get's appended.

Here is a similar one:
http://www.nabble.com/g_vfs_done():mfid1-ERROR-when-writing-to-18TB-MFI-RAID-volume-t4590438.html

it's all raw hardware, no vmware

Dieter


From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 11:06:15 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 60BF616A475
	for <freebsd-geom@hub.freebsd.org>;
	Mon, 15 Oct 2007 11:06:15 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 322C413C4A6
	for <freebsd-geom@hub.freebsd.org>;
	Mon, 15 Oct 2007 11:06:15 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.1/8.14.1) with ESMTP id l9FB6EXs080436
	for <freebsd-geom@FreeBSD.org>; Mon, 15 Oct 2007 11:06:14 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.1/8.14.1/Submit) id l9FB6Eub080434
	for freebsd-geom@FreeBSD.org; Mon, 15 Oct 2007 11:06:14 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 15 Oct 2007 11:06:14 GMT
Message-Id: <200710151106.l9FB6Eub080434@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-geom@FreeBSD.org
Cc: 
Subject: Current problem reports assigned to you
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 11:06:15 -0000


From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 14:17:21 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5E56016A418;
	Mon, 15 Oct 2007 14:17:21 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com
	[72.36.161.186])
	by mx1.freebsd.org (Postfix) with ESMTP id 310B813C467;
	Mon, 15 Oct 2007 14:17:20 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net
	[209.163.168.124]) (authenticated bits=0)
	by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9FEGRLq005947;
	Mon, 15 Oct 2007 09:16:30 -0500 (CDT)
	(envelope-from anderson@freebsd.org)
Message-ID: <47137634.1010703@freebsd.org>
Date: Mon, 15 Oct 2007 09:16:20 -0500
From: Eric Anderson <anderson@freebsd.org>
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: d_elbracht <d_elbracht@ecngs.de>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
	<00cb01c80f04$50b11ed0$639049d9@EC1a>
In-Reply-To: <00cb01c80f04$50b11ed0$639049d9@EC1a>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com
Cc: 'Ivan Voras' <ivoras@freebsd.org>, freebsd-geom@freebsd.org
Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
 length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 14:17:21 -0000

d_elbracht wrote:
>>> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
>>> 2007-10-09
>>>
>>> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x
>>> Opteron 2216, da3 is on a 3ware 9550-12
>>>
>>> we are seeing this error:
>>> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
>>> on a 12 GB Hyperdrive
>>>
>>> the offset changes sometimes, but it is always 81064794xxxxxxxxx and well
>>> out the 12GB range.
>> Yes.
>>
>>> According to systat -vm, da3 does tps > 500 (yes, that's a lot)
>> That's not a lot :) That's actually low for a modern solid 
>> state drive.
>>
>>> This leads to an assumption, the error has to do with very high IOs 
>>> per second on a SMP machine.
>> Either that or file system errors. Does fsck run ok or does 
>> it say anything unusual?
>>
>> There are several theoretical reasons for such errors that 
>> are connected with the fact you use solid state drives, but 
>> all are tricky to diagnose if you don't have a certain 
>> repeatable test you can try. For example:
>> some SSDs optimize writes to "spread out" the IO on the 
>> chips, but some do it by looking into file system structures 
>> to determine where it's safe to relocate the write - 
>> obviously this works only with a known and supported file 
>> system. This is a really wild guess, but maybe the SSD 
>> firmware has error somewhere in this area, trying to 
>> interpret UFS as it was FAT? If you manage to get a 
>> repeatable failure test, you can try formatting the drive as 
>> FAT32 and trying it on that.

Solid state drives don't behave much differently than a regular drive 
from FreeBSD's point of view.  The huge difference most people notice is 
that they perform best at their page size (or maybe what the SSD 
manufacturer might call a block size, which is not a sector size), which 
is often 128K or 256K.  IO smaller than the page size suffers a big 
penalty since most SSD devices do not have a cache onboard (although 
some do now).
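
To put a rough number on that penalty, here is a small sketch. It assumes a
128K page (one of the two sizes mentioned above) and assumes the device always
handles whole pages; both are simplifications, and the function names are
hypothetical.

```python
PAGE_SIZE = 128 * 1024  # assumed SSD page size; 128K or 256K are mentioned above


def pages_touched(offset: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Count the device pages that a request [offset, offset + length) overlaps."""
    if length <= 0:
        return 0
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return last - first + 1


def amplification(offset: int, length: int, page_size: int = PAGE_SIZE) -> float:
    """Bytes the device handles per byte requested, if it works in whole pages."""
    return pages_touched(offset, length, page_size) * page_size / length


print(amplification(0, 8192))        # 16.0: one 128K page for an 8K request
print(amplification(0, 128 * 1024))  # 1.0: request matches the page size
```

With these assumptions an 8192-byte request, like the one in the error
message, costs the device sixteen times the data actually asked for, and more
if the request straddles a page boundary.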

>> Or maybe it's just a bad drive...

I doubt it's a bad device..

>>> The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20
>>> gstripe'd partitions.
>> 51 drives and 20 partitions?
>>
> According to the manufaturer, the drive handles any filesystem. In other
> words, it's as transparent as any harddisk would be.
> Also, as written before, we have seen the error=5 with weird offsets on an
> md (memory disk) before too.
> fsck on the disk does NOT show any error.
> 
> yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann). That's
> for hashfeed from diablo.
> 
> One basic question to ask: where does the value for offset= in g_vfs_done()
> come from ? 
> From the time the error shows up in syslog I believe, the error only
> happens, when a file get's appended.

I wonder if (wild guess follows) there's a 32/64 bit conversion problem 
somewhere, like a 32bit number cast as 64bit or something.

I'd like to see a full trace to see what path it takes.  Maybe putting a
panic in the error path would be worth doing.

Eric
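
One cheap way to look at the 32/64-bit guess is to split the reported offset
into its two 32-bit halves. The sketch below only does that arithmetic; it
assumes "12 GB" means 12 GiB, and split_halves is a name invented here.

```python
OFFSET = 81064794762854400  # offset from the g_vfs_done() message in this thread
MEDIA_SIZE = 12 * 2**30     # assumption: "12 GB" taken as 12 GiB


def split_halves(value: int):
    """Split a 64-bit value into its (high, low) 32-bit halves."""
    return value >> 32, value & 0xFFFFFFFF


high, low = split_halves(OFFSET)
print(f"offset = {OFFSET:#018x}")
print(f"high   = {high:#010x}")
print(f"low    = {low:#010x} ({low} bytes)")
print("low half inside device:", low < MEDIA_SIZE)
```

The low half on its own is a multiple of 8192 and falls inside a 12 GiB
device, while the high half is nonzero. That is compatible with the guess
above, but it proves nothing by itself; a trace of the actual code path would
still be needed.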


From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 15:06:40 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5C30816A418
	for <freebsd-geom@freebsd.org>; Mon, 15 Oct 2007 15:06:40 +0000 (UTC)
	(envelope-from phk@critter.freebsd.dk)
Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222])
	by mx1.freebsd.org (Postfix) with ESMTP id E031B13C442
	for <freebsd-geom@freebsd.org>; Mon, 15 Oct 2007 15:06:39 +0000 (UTC)
	(envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (unknown [192.168.61.3])
	by phk.freebsd.dk (Postfix) with ESMTP id 95CF517105;
	Mon, 15 Oct 2007 15:06:37 +0000 (UTC)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.14.1/8.14.1) with ESMTP id l9FF6abM048314;
	Mon, 15 Oct 2007 15:06:36 GMT (envelope-from phk@critter.freebsd.dk)
To: Eric Anderson <anderson@freebsd.org>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Mon, 15 Oct 2007 09:16:20 EST."
	<47137634.1010703@freebsd.org> 
Date: Mon, 15 Oct 2007 15:06:36 +0000
Message-ID: <48313.1192460796@critter.freebsd.dk>
Sender: phk@critter.freebsd.dk
Cc: d_elbracht <d_elbracht@ecngs.de>, 'Ivan Voras' <ivoras@freebsd.org>,
	freebsd-geom@freebsd.org
Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5 
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 15:06:40 -0000

In message <47137634.1010703@freebsd.org>, Eric Anderson writes:

>Solid state drives don't behave much differently than a regular drive 
>from FreeBSD's point of view.

Yes and no.  The effective lack of seek time has the potential to expose
a lot of flawed reasoning in filesystems with respect to ordering and
duration of I/O requests.

It might be a good idea to have a GEOM module that could implement a
seek-time sort of behaviour, just to be able to falsify that
theory.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 16:52:35 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7DAF116A419;
	Mon, 15 Oct 2007 16:52:35 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com
	[72.36.161.186])
	by mx1.freebsd.org (Postfix) with ESMTP id 5104613C467;
	Mon, 15 Oct 2007 16:52:35 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net
	[209.163.168.124]) (authenticated bits=0)
	by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9FGqF4U080757;
	Mon, 15 Oct 2007 11:52:16 -0500 (CDT)
	(envelope-from anderson@freebsd.org)
Message-ID: <47139AB8.9060602@freebsd.org>
Date: Mon, 15 Oct 2007 11:52:08 -0500
From: Eric Anderson <anderson@freebsd.org>
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
References: <48313.1192460796@critter.freebsd.dk>
In-Reply-To: <48313.1192460796@critter.freebsd.dk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com
Cc: d_elbracht <d_elbracht@ecngs.de>, 'Ivan Voras' <ivoras@freebsd.org>,
	freebsd-geom@freebsd.org
Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
 length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 16:52:35 -0000

Poul-Henning Kamp wrote:
> In message <47137634.1010703@freebsd.org>, Eric Anderson writes:
> 
>> Solid state drives don't behave much differently than a regular drive 
>>from FreeBSD's point of view.
> 
> Yes and no.  The effective lack of seek time has the potential to expose
> a lot of flawed reasoning in filesystems with respect to ordering and
> duration of I/O requests.
> 
> It might be a good idea to have a GEOM module that could implement a
> seek-time sort of behaviour, just to be able to falsify that
> theory.
> 


Or an option to gnop?

Eric


From owner-freebsd-geom@FreeBSD.ORG  Mon Oct 15 17:47:08 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 06D1C16A474
	for <freebsd-geom@hub.freebsd.org>;
	Mon, 15 Oct 2007 17:47:08 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id DDDC113C46E
	for <freebsd-geom@hub.freebsd.org>;
	Mon, 15 Oct 2007 17:47:07 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.1/8.14.1) with ESMTP id l9FHl7hF014949
	for <freebsd-geom@FreeBSD.org>; Mon, 15 Oct 2007 17:47:07 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.1/8.14.1/Submit) id l9FHl7lO014945
	for freebsd-geom@FreeBSD.org; Mon, 15 Oct 2007 17:47:07 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 15 Oct 2007 17:47:07 GMT
Message-Id: <200710151747.l9FHl7lO014945@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-geom@FreeBSD.org
Cc: 
Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2007 17:47:08 -0000

Current FreeBSD problem reports
Critical problems
Serious problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/73177   geom       kldload geom_* causes panic due to memory exhaustion
o kern/76538   geom       [gbde] nfs-write on gbde partition stalls and continue
o kern/83464   geom       [geom] [patch] Unhandled malloc failures within libgeo
o kern/84556   geom       [geom] GBDE-encrypted swap causes panic at shutdown
o kern/87544   geom       [gbde] mmaping large files on a gbde filesystem deadlo
o kern/89102   geom       [geom_vfs] [panic] panic when forced unmount FS from u
o bin/90093    geom       fdisk(8) incapable of altering in-core geometry
o kern/90582   geom       [geom_mirror] [panic] Restore cause panic string (ffs_
o kern/98034   geom       [geom] dereference of NULL pointer in acd_geom_detach 
o kern/104389  geom       [geom] [patch] sys/geom/geom_dump.c doesn't encode XML
o kern/113419  geom       [geom] geom fox multipathing not failing back
o misc/113543  geom       [geom] [patch] geom(8) utilities don't work inside the
o kern/113957  geom       [gmirror] gmirror is intermittently reporting a degrad
o kern/115572  geom       [gbde] gbde partitions fail at 28bit/48bit LBA address

14 problems total.

Non-critical problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o bin/78131    geom       gbde "destroy" not working.
o kern/79251   geom       [2TB] newfs fails on 2.6TB gbde device
o kern/94632   geom       [geom] Kernel output resets input while GELI asks for 
f kern/105390  geom       [geli] filesystem on a md backed by sparse file with s
o kern/107707  geom       [geom] [patch] add new class geom_xbox360 to slice up 
p bin/110705   geom       gmirror control utility does not exit with correct exi
o kern/113837  geom       [geom] unable to access 1024 sector size storage
o kern/113885  geom       [geom] [patch] improved gmirror balance algorithm
o kern/114532  geom       GEOM_MIRROR shows up in kldstat even if compiled in th
o kern/115547  geom       [geom] [patch] for GEOM Eli to get password from stdin

10 problems total.


From owner-freebsd-geom@FreeBSD.ORG  Tue Oct 16 10:05:35 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 598B516A417;
	Tue, 16 Oct 2007 10:05:35 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id 1F6FA13C45D;
	Tue, 16 Oct 2007 10:05:33 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1777227-1922481 
	for multiple; Tue, 16 Oct 2007 12:05:54 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: "'Eric Anderson'" <anderson@freebsd.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
	<00cb01c80f04$50b11ed0$639049d9@EC1a>
	<47137634.1010703@freebsd.org>
Date: Tue, 16 Oct 2007 12:05:23 +0200
Message-ID: <000b01c80fdc$12582f10$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
In-Reply-To: <47137634.1010703@freebsd.org>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Thread-Index: AcgPN6GqbK/sjYCaTtGAOMP49/u7+wAo/O0w
Cc: freebsd-stable@freebsd.org, 'Ivan Voras' <ivoras@freebsd.org>,
	freebsd-geom@freebsd.org
Subject: AW:  Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2007 10:05:35 -0000

> > One basic question to ask: where does the value for offset= in 
> > g_vfs_done() come from ?
> > From the time the error shows up in syslog, I believe the error only
> > happens when a file gets appended.
> 
> I wonder if (wild guess follows) there's a 32/64 bit 
> conversion problem somewhere, like a 32bit number cast as 
> 64bit or something.
> 
> I'd like to see a full trace to see what path it takes.  
> Maybe putting a
>   panic in the error path would be worth doing.
> 

Can you please give me some hints on how to do this? I'm willing to try just
about everything to get this problem nailed down.

Dieter


From owner-freebsd-geom@FreeBSD.ORG  Tue Oct 16 14:48:20 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1375E16A420;
	Tue, 16 Oct 2007 14:48:20 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com
	[217.20.163.9])
	by mx1.freebsd.org (Postfix) with ESMTP id D05CC13C46A;
	Tue, 16 Oct 2007 14:48:19 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from localhost (localhost [127.0.0.1])
	by falcon.cybervisiontech.com (Postfix) with ESMTP id E8EB9744009;
	Tue, 16 Oct 2007 17:14:06 +0300 (EEST)
X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com
Received: from falcon.cybervisiontech.com ([127.0.0.1])
	by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new,
	port 10024)
	with ESMTP id nuJ0nV7ZYjJR; Tue, 16 Oct 2007 17:14:06 +0300 (EEST)
Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by falcon.cybervisiontech.com (Postfix) with ESMTP id 52AFE744008;
	Tue, 16 Oct 2007 17:14:06 +0300 (EEST)
Message-ID: <4714C724.6000809@icyb.net.ua>
Date: Tue, 16 Oct 2007 17:13:56 +0300
From: Andriy Gapon <avg@icyb.net.ua>
User-Agent: Thunderbird 2.0.0.6 (X11/20070803)
MIME-Version: 1.0
To: d_elbracht <d_elbracht@ecngs.de>
References: <1192382586.00813930.1192369201@10.7.7.3>
	<1192414981.00814129.1192401601@10.7.7.3>
	<1192447399.00814254.1192437006@10.7.7.3>
	<1192468999.00814418.1192458001@10.7.7.3>
	<1192540986.00814865.1192529403@10.7.7.3>
In-Reply-To: <1192540986.00814865.1192529403@10.7.7.3>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-geom@freebsd.org, freebsd-stable@freebsd.org,
	'Ivan Voras' <ivoras@freebsd.org>
Subject: Re: AW:  Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, 
 length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2007 14:48:20 -0000


Just a wild shot here: I have seen a similar message recently when I
played with my disks. I re-arranged some partitions (and filesystems)
within a slice and it so happened (and I almost know why) that there was
some discrepancy between on-disk and in-memory label of that slice. I
ran newfs on one of the new partitions and apparently it used one label
to determine its size, but after the reboot the other label was used. As
a result I had a UFS2 filesystem with size larger than a partition that
hosted it. And after that I saw the messages similar to the one in the
subject.

All of the above is a result of my understanding of how these things
work, so it may be incorrect. But making sure that disklabels match
(that is, there is only one disklabel) and re-newfs-ing the filesystems
did help me.

So I would compare, just in case, the size given after '-s' in the output
of 'dumpfs -m' with the partition size in the disklabel output.

Just my 2 bits.


P.S. example of the error that I had:
g_vfs_done():ad4s1e[READ(offset=20420280320, length=16384)]error = 5
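
The check behind that comparison can be sketched like this: a request is
only sane if offset + length fits inside the partition. This is a rough
Python sketch, not anything the kernel does verbatim; within_partition is
a hypothetical helper, 512-byte sectors are assumed, and the partition
size is an assumed example value rather than the real one from my disk.

```python
SECTOR_SIZE = 512  # assumption: sizes in the label are counted in 512-byte sectors

def within_partition(offset, length, partition_sectors, sector_size=SECTOR_SIZE):
    """Return True if the byte range [offset, offset + length) fits in the partition."""
    partition_bytes = partition_sectors * sector_size
    return offset >= 0 and offset + length <= partition_bytes

# Hypothetical partition size (16G worth of sectors), chosen only for illustration.
partition_sectors = 33554432

# The read from the error message above, checked against the assumed size.
print(within_partition(20420280320, 16384, partition_sectors))

# A read at the very start of the partition, for comparison.
print(within_partition(0, 16384, partition_sectors))
```

If the filesystem believes it is larger than the partition really is,
reads near its end fail the first kind of check and come back with
error = 5.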

-- 
Andriy Gapon

From owner-freebsd-geom@FreeBSD.ORG  Tue Oct 16 14:56:15 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 76BE216A421;
	Tue, 16 Oct 2007 14:56:15 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com
	[72.36.161.186])
	by mx1.freebsd.org (Postfix) with ESMTP id 6633413C4A5;
	Tue, 16 Oct 2007 14:56:15 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net
	[209.163.168.124]) (authenticated bits=0)
	by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9GEtrpe084402;
	Tue, 16 Oct 2007 09:55:55 -0500 (CDT)
	(envelope-from anderson@freebsd.org)
Message-ID: <4714D0F1.2000903@freebsd.org>
Date: Tue, 16 Oct 2007 09:55:45 -0500
From: Eric Anderson <anderson@freebsd.org>
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: d_elbracht <d_elbracht@ecngs.de>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
	<00cb01c80f04$50b11ed0$639049d9@EC1a>
	<47137634.1010703@freebsd.org>
	<000b01c80fdc$12582f10$639049d9@EC1a>
In-Reply-To: <000b01c80fdc$12582f10$639049d9@EC1a>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com
Cc: 'Ivan Voras' <ivoras@freebsd.org>, freebsd-geom@freebsd.org
Subject: Re: AW:  Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, 
 length=8192)]error = 5
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2007 14:56:15 -0000

d_elbracht wrote:
>>> One basic question to ask: where does the value for offset= in 
>>> g_vfs_done() come from ?
>>> From the time the error shows up in syslog, I believe the error only
>>> happens when a file gets appended.
>> I wonder if (wild guess follows) there's a 32/64 bit 
>> conversion problem somewhere, like a 32bit number cast as 
>> 64bit or something.
>>
>> I'd like to see a full trace to see what path it takes.  
>> Maybe putting a
>>   panic in the error path would be worth doing.
>>
> 
> can you give me some hints please how to do this ? I'm willing to try about
> everything to get this problem nailed down.


I would add debugging to your kernel config, and then around here:

http://fxr.googlebit.com/source/sys/geom/geom_vfs.c?v=8-CURRENT#L77

change the printf to a panic(), and recompile your kernel.  Also, don't 
forget to set up a dump partition (swap).  You can find out how to do 
the debugging parts and dump partition in the Handbook.

Eric

From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 08:12:06 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B415016A419
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 08:12:06 +0000 (UTC)
	(envelope-from avg@icyb.net.ua)
Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com
	[217.20.163.9])
	by mx1.freebsd.org (Postfix) with ESMTP id 35D2613C458
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 08:12:05 +0000 (UTC)
	(envelope-from avg@icyb.net.ua)
Received: from localhost (localhost [127.0.0.1])
	by falcon.cybervisiontech.com (Postfix) with ESMTP id 7854843C315;
	Wed, 17 Oct 2007 11:12:02 +0300 (EEST)
X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com
Received: from falcon.cybervisiontech.com ([127.0.0.1])
	by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new,
	port 10024)
	with ESMTP id w1QngMTGcRYH; Wed, 17 Oct 2007 11:12:02 +0300 (EEST)
Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by falcon.cybervisiontech.com (Postfix) with ESMTP id E449743C28E;
	Wed, 17 Oct 2007 11:12:01 +0300 (EEST)
Message-ID: <4715C3D1.3070308@icyb.net.ua>
Date: Wed, 17 Oct 2007 11:12:01 +0300
From: Andriy Gapon <avg@icyb.net.ua>
User-Agent: Thunderbird 2.0.0.6 (X11/20070803)
MIME-Version: 1.0
To: freebsd-geom@freebsd.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Subject: gjournal: FLUSHCACHE timed out
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 08:12:06 -0000


A couple of days ago I started using gjournal on FreeBSD 6.2 using a patch
from here:
http://people.freebsd.org/~pjd/patches/gjournal6.patch

I actually had to make 4 minor and obvious tweaks to the patch to make
it apply cleanly to my src.
I started to get the following messages sometimes:

kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
vvvvvvvvv this one is unusual and is found only once
kernel: handle_workitem_freeblocks: block count
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
kernel: ad4: FAILURE - FLUSHCACHE timed out
kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.

ad4s1ge (please don't pay attention to its slightly unusual name, this
is for historic reasons) is a journal partition/consumer for my /var
filesystem/partition/provider.
Size of /var is 16G, size of the journal is slightly less than 1G (1G -
32 sectors actually). /var is UFS2 with softupdates enabled.

I noticed that I get these messages only when I run 'dump' on any of my
filesystems. I think that dump is using /tmp or /var/tmp for some
temporary data and in my setup both of those are in /var filesystem.

So my guess is that /var is being written "too" actively and I have to
tune some parameters to make things smooth.

More information:
$ uname -srm
FreeBSD 6.2-RELEASE-p6 amd64

$ sysctl -a | fgrep journal
kern.geom.journal.debug: 0
kern.geom.journal.switch_time: 10
kern.geom.journal.parallel_flushes: 16
kern.geom.journal.accept_immediately: 64
kern.geom.journal.parallel_copies: 16
kern.geom.journal.record_entries: 20
kern.geom.journal.optimize: 0
kern.geom.journal.cache.used: 16384
kern.geom.journal.cache.limit: 209715200
kern.geom.journal.cache.divisor: 2
kern.geom.journal.cache.switch: 90
kern.geom.journal.cache.misses: 0
kern.geom.journal.cache.alloc_failures: 0
kern.geom.journal.stats.skipped_bytes: 241266688
kern.geom.journal.stats.combined_ios: 62184
kern.geom.journal.stats.switches: 24144
kern.geom.journal.stats.wait_for_copy: 0
kern.geom.journal.stats.low_mem: 287
 journal_data     4    18K       -   624220  512,2048,4096

$ dmesg | fgrep ad4 | head -1
ad4: 286168MB <WDC WD3000JS-19PDB0 21.00M21> at ata2-master SATA300

$ dmesg | fgrep -B1 ata2 | head -2
atapci1: <nVidia nForce MCP51 SATA300 controller> port
0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xe000-0xe00f mem
0xfe02d000-0xfe02dfff irq 20 at device 14.0 on pci0
ata2: <ATA channel 0> on atapci1

$ geom journal list
Geom name: gjournal 4283925943
ID: 4283925943
Providers:
1. Name: ad4s1e.journal
   Mediasize: 17179868672 (16G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: ad4s1e
   Mediasize: 17179869184 (16G)
   Sectorsize: 512
   Mode: r1w1e1
   Role: Data
2. Name: ad4s1ge
   Mediasize: 1073733632 (1.0G)
   Sectorsize: 512
   Mode: r1w1e1
   Jend: 1073733120
   Jstart: 0
   Role: Journal

$ mount | fgrep var
/dev/ad4s1e.journal on /var (ufs, local, soft-updates)

$ bsdlabel /dev/ad4s1g | fgrep e:
  e:  2097136  6291456      swap

-- 
Andriy Gapon

From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 11:41:49 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3135C16A4E6;
	Wed, 17 Oct 2007 11:41:49 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com
	[72.36.161.186])
	by mx1.freebsd.org (Postfix) with ESMTP id 79D6813C46A;
	Wed, 17 Oct 2007 11:41:45 +0000 (UTC)
	(envelope-from anderson@freebsd.org)
Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net
	[209.163.168.124]) (authenticated bits=0)
	by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9HBfhas071547;
	Wed, 17 Oct 2007 06:41:44 -0500 (CDT)
	(envelope-from anderson@freebsd.org)
Message-ID: <4715F4EE.9020104@freebsd.org>
Date: Wed, 17 Oct 2007 06:41:34 -0500
From: Eric Anderson <anderson@freebsd.org>
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: Andriy Gapon <avg@icyb.net.ua>
References: <4715C3D1.3070308@icyb.net.ua>
In-Reply-To: <4715C3D1.3070308@icyb.net.ua>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham
	version=3.1.8
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com
Cc: Pawel Jakub Dawidek <pjd@freebsd.org>, freebsd-geom@freebsd.org
Subject: Re: gjournal: FLUSHCACHE timed out
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 11:41:49 -0000

Andriy Gapon wrote:
> A couple of days ago I started using gjournal on FreeBSD 6.2 using a patch
> from here:
> http://people.freebsd.org/~pjd/patches/gjournal6.patch
> 
> I actually had to make 4 minor and obvious tweaks to the patch to make
> it apply cleanly to my src.
> I started to get the following messages sometimes:
> 
> kernel: ad4: FAILURE - FLUSHCACHE timed out
> kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
> kernel: ad4: FAILURE - FLUSHCACHE timed out
> kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
> kernel: ad4: FAILURE - FLUSHCACHE timed out
> kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5.
> vvvvvvvvv this one is unusual and is found only once
> kernel: handle_workitem_freeblocks: block count
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Ok, that's interesting.. Other threads are talking about a similar 
warning, not related to gjournal.


> ad4s1ge (please don't pay attention to its slightly unusual name, this
> is for historic reasons) is a journal partition/consumer for my /var
> filesystem/partition/provider.
> Size of /var is 16G, size of the journal is slightly less than 1G (1G -
> 32 sectors actually). /var is UFS2 with softupdates enabled.


Pawel, correct me if I'm wrong here - but I think you really need to 
turn *off* softupdates on gjournaled file systems.


> I noticed that I get these messages only when I run 'dump' on any of my
> filesystems. I think that dump is using /tmp or /var/tmp for some
> temporary data and in my setup both of those are in /var filesystem.
> 
> So my guess is that /var is being written "too" actively and I have to
> tune some parameters to make things smooth.

A few things to note:

- you can turn on 'async' option for your gjournaled file system, and 
get better performance
- you might be able to add the 'noatime' option to your file system mount 
also
- You might try turning your journal switch time from 10 down to 5, and 
see if it alleviates some pressure on your disk.


Eric


From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 14:44:39 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 720F316A421;
	Wed, 17 Oct 2007 14:44:39 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com
	[217.20.163.9])
	by mx1.freebsd.org (Postfix) with ESMTP id 7DAC113C48A;
	Wed, 17 Oct 2007 14:44:38 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from localhost (localhost [127.0.0.1])
	by falcon.cybervisiontech.com (Postfix) with ESMTP id 7413B74400D;
	Wed, 17 Oct 2007 17:44:37 +0300 (EEST)
X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com
Received: from falcon.cybervisiontech.com ([127.0.0.1])
	by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new,
	port 10024)
	with ESMTP id DWxzRFf0hDCo; Wed, 17 Oct 2007 17:44:37 +0300 (EEST)
Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by falcon.cybervisiontech.com (Postfix) with ESMTP id 0666074400A;
	Wed, 17 Oct 2007 17:44:36 +0300 (EEST)
Message-ID: <47161FD1.5010501@icyb.net.ua>
Date: Wed, 17 Oct 2007 17:44:33 +0300
From: Andriy Gapon <avg@icyb.net.ua>
User-Agent: Thunderbird 2.0.0.6 (X11/20070803)
MIME-Version: 1.0
To: Eric Anderson <anderson@freebsd.org>
References: <4715C3D1.3070308@icyb.net.ua> <4715F4EE.9020104@freebsd.org>
In-Reply-To: <4715F4EE.9020104@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: Pawel Jakub Dawidek <pjd@freebsd.org>, freebsd-geom@freebsd.org
Subject: Re: gjournal: FLUSHCACHE timed out
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 14:44:39 -0000

on 17/10/2007 14:41 Eric Anderson said the following:
> Andriy Gapon wrote:
>> ad4s1ge (please don't pay attention to its slightly unusual name, this
>> is for historic reasons) is a journal partition/consumer for my /var
>> filesystem/partition/provider.
>> Size of /var is 16G, size of the journal is slightly less than 1G (1G -
>> 32 sectors actually). /var is UFS2 with softupdates enabled.
> 
> 
> Pawel, correct me if I'm wrong here - but I think you really need to 
> turn *off* softupdates on gjournaled file systems.

I was under the mistaken impression that softupdates had to be enabled
for snapshots to work. Now that I know I was wrong, I will turn
softupdates off. Still, it seems that nothing would preclude, _in
principle_, combining softupdates with gjournal. Anyway, I ask only
out of curiosity.

>> I noticed that I get these messages only when I run 'dump' on any of my
>> filesystems. I think that dump is using /tmp or /var/tmp for some
>> temporary data and in my setup both of those are in /var filesystem.
>>
>> So my guess is that /var is being written "too" actively and I have to
>> tune some parameters to make things smooth.
> 
> A few things to note:
> 
> - you can turn on 'async' option for your gjournaled file system, and 
> get better performance

will do

> - you might be able to add the 'noatime' option to your file system mount 
> also

probably will do as well

> - You might try turning your journal switch time from 10 down to 5, and 
> see if it alleviates some pressure on your disk.

I already did this and it helped! I don't see the messages anymore.
Thank you!
I will try to set this back to 10 after I do away with softupdates and
see what happens.
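
For reference, the tuning steps discussed above might look roughly like
this on the command line. This is only a sketch and is not taken from the
thread: it assumes the journal switch time is exposed as the
kern.geom.journal.switch_time sysctl and that /var is the journaled file
system, so adjust the names for your own setup.

```shell
# Sketch only; run as root on a FreeBSD system with gjournal loaded.
# Assumption: the switch time is the kern.geom.journal.switch_time sysctl.

# Show the current journal switch interval (seconds), then lower it to 5.
sysctl kern.geom.journal.switch_time
sysctl kern.geom.journal.switch_time=5

# Remount /var with the async and noatime options without unmounting it.
mount -u -o async,noatime /var

# Soft updates are toggled with tunefs, which needs the file system
# unmounted or mounted read-only, e.g. from single-user mode:
#   tunefs -n disable /var
```

A sysctl set this way does not survive a reboot; putting the same
name=value pair in /etc/sysctl.conf is the usual way to make it stick.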

Thank you very much again.

-- 
Andriy Gapon

From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 18:41:41 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AA85116A46C
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 18:41:41 +0000 (UTC)
	(envelope-from kurtseel@primetime.com)
Received: from mail.primetime.com (mail.primetime.com [146.145.135.164])
	by mx1.freebsd.org (Postfix) with ESMTP id 14D5413C48D
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 18:41:40 +0000 (UTC)
	(envelope-from kurtseel@primetime.com)
Received: from [10.200.1.130] (deca.khome.utcorp.net [10.200.1.130])
	by mail.primetime.com (Postfix) with ESMTP id C61FEF9C425
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 13:20:40 -0400 (EDT)
Message-ID: <471650AA.30903@primetime.com>
Date: Wed, 17 Oct 2007 14:12:58 -0400
From: kurtseel <kurtseel@primetime.com>
User-Agent: Thunderbird 2.0.0.5 (X11/20070724)
MIME-Version: 1.0
To: freebsd-geom@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: gmirror + ggated question
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 18:41:41 -0000


 I built a mirror of a local drive and a ggated-backed device. I ran
iozone on it, and it runs fine up to a certain point, then slows down to
a near standstill. It doesn't break the mirror or crash the system, but
it does slow the whole system down to a near stop.
 If I kill iozone, a short time later I can log in again, and then:

# df
Filesystem            1K-blocks    Used    Avail Capacity  Mounted on
/dev/mirror/thinkcs1a   1012974  155780   776158    17%    /
devfs                         1       1        0   100%    /dev
/dev/mirror/thinkcs1e  85469448 1163474 77468420     1%    /usr
/dev/mirror/thinkcs1d   4058062   40426  3692992     1%    /var
[root@ ~/temp]# gmirror status
         Name    Status  Components
mirror/thinkc  COMPLETE  ad0
                         ggate0

 And all seems normal again. Seems like it has to do with big files ...
This is the same configuration I used in :
http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network
 This is where the iozone gets stuck :

# /usr/bin/time iozone -b ${DR}/data.xls \
 >         -az -i 0 -i 1 -i 2 -i 3 -i 4 -i 5 -i 6 -i 7 -i 8 -i 9 -i 10 -i 11 -i 12 \
 >                 | tee ${DR}/data.txt
        Iozone: Performance Test of File I/O
                Version $Revision: 3.283 $
                Compiled for 32 bit mode.
                Build: freebsd

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
                     Erik Habbinga, Kris Strecker, Walter Wong.

        Run began: Wed Oct 17 08:20:44 2007

        Auto Mode
        Cross over of record size disabled.
        Selected test not available on the version.
        Command line used: iozone -b /root/temp/data.xls -az -i 0 -i 1 -i 2 -i 3 -i 4 -i 5 -i 6 -i 7 -i 8 -i 9 -i 10 -i 11 -i 12
        Output is in Kbytes/sec
        Time Resolution = 0.000003 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                       random   random     bkwd   record   stride
      KB  reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
      64       4   250776   615089   853755  1067689   753149   703784   604016   788548   940502   263581   640020   534288   829997
      64       8   324818   800303  1145118  1521595  1306782   985384   772660   660492  1387858   338755   790871   667057  1046870
      64      16   488582   943809  1332734  1738379  1387858  1164998   902555  1332734  1556895   537497   927503   694678  1084951
      64      32   507999   890578  1424687  1884854  1603393  1332734   899531  1145118  1455588   551862   953870   556438  1067689
      64      64   488582   953870  1455588  1871711  1359737  1229003   899531  1000068  1145118   546247   940502   276049   843030
     128       4   155550   292825   913951  1039607   907770   590346   715430   961415   757845   159006   297696   749383   870953
     128       8   180012   316846  1197257  1389356  1280040  1074995   920218  1392960  1707511   195125   324115   954577  1131643
     128      16    90973   100945  1347509  1453292  1805111  1540885  1141265  2000136  2290237    82742    93151   963140  1320985
     128      32   231468   330704  1473231  1942249  1639715  1457236  1163526  1855007  2205559   242882   346946   947836  1256082
     128      64   231468   338630  1473231  1829719  2030394  1881004  1134033  1581743  1829719   241897   329082   825425  1055965
     128     128   224969   304794  1375121  1729514  1437724   324703  1074995   312968  1256082   247130   318350   367866   633537
     256       4    15884   194821   941534  1049173   892244   705287   752754   780099   764002    16097   172977   810116   914276
     256       8    15614   130080  1201837  1391908  1092959  1061621  1007813  1305592  1075444    15473   284780   853247  1164052
     256      16    16407   206949  1391908  1630792  1608801  1255226  1174236  2416067  2609861    16410   155921  1065836  1293015
     256      32    16264   293900  1179395  1580386  2047495  1571136  1261123  2483116  3012599    16522   183783  1113358  1286816
     256      64    16705   205799  1522137  1705930  1840436  1406494  1307182  2225754  2722351    16698   263921  1024155  1113358
     256     128    16776   122904  1496677  1684519  2047495    16496  1273085    16489  1983206    16526   201891   565300   790436
     256     256    16666   359541  1261123  1242157  1071152   352113  1061621   355024  1149103    16839   362087   289307   332076
     512       4    26638   185777   949618   890179   830935   621303   798791  1077828   792599    29455   260035   838069   866473
     512       8    28995   294400  1221954  1248960  1158661   175341  1055576  1808533  1068709    29700   228251  1062365  1089309
     512      16    28799   264454  1434125  1492949  1475510  1207525  1248234  2679608  1142632    29426   207638  1176434  1239588
     512      32    30206   227574  1546713  1580872  1565886  1115334  1437966  3137682  3687192    30217   177032  1240304  1302755
     512      64    32191   173080  1605694  1713303  1814646  1454523  1484691  2960343  3606687    32217   212277  1221954  1270386
     512     128    32201   164894  1605694  1701088  1784488  1028282  1446684    11813  2912169    32456   190897   775988  1068709
     512     256    32327   172788  1147517  1143241  1213667    18740  1000030    18774  1292561    32060   258844   321219   402797
     512     512    31906   362602   654637   658855   649686   366501   627475   250481   653840    32167   372929   245190   315462
    1024       4    49328   209067  1016941  1003868   820524   500496   798110  1110551   820524    49887   211133   878241   820524
    1024       8    50836   206743  1293112  1308078  1171754   297674  1159730  1943057  1029124    51235   223388  1097497  1134009
    1024      16    51824   236866  1546945  1563274  1514756   902983  1387524  3011017  1378174    52279   238299  1270543  1299372
    1024      32    52513   232519  1659943  1662514  1659302   896574  1554223  3482169  1530409    53350   241338  1319734  1356412
    1024      64    60692   229390  1738559  1729458  1622936   387303  1585199  3459729  4213159    60459   228925  1206310  1428124
    1024     128    51672   227808  1695325  1718387  1792993   875199  1602350    10321  3836789    60556   217363  1010006  1154431
    1024     256    56008   212229  1006927  1154431  1011909   638427  1008819    13892  1218977    56933   246519   372619   411063
    1024     512    54505   312374   624412   620622   601332    20017   605230    20003   618299    59928   306949   255160   296380
    1024    1024    60101   336954   613179   614935   607713   411063   602682   406665   573394    59767   412366   217506   289410
    2048       4    15041    16933  1006977   994157   818602   458878   835237  1079623   825445    15027    16914   868598   878548
    2048       8    15081    16986  1347352  1338534  1159013   613329  1139034  1972694  1153721    15066    16978  1145413  1155117
    2048      16    15131    16962  1591195  1636983  1444313   759672  1463756  3098355  1434185    15128    17045  1338743  1291444
    2048      32    15096    17044  1711000  1613916  1655598   763385  1622757  3750380  1494833    15075    16978  1392341  1423254
    2048      64    14870    17047  1705226  1809010  1708278   753739  1712364  3961384  1624292    14903    16975  1418085  1443342
    2048     128    14881    16993  1791653  1810535  1729603   535151  1801044     9637  4383923    14931    16968  1183933  1247373
    2048     256    14870    16894  1230752  1251918  1042162   635048  1021952    12290  1511403    15077    16877   392047   366692
    2048     512    14878    16698   473003   612629   618763   543686   609716    14194   642026    14931    16720   276235   292031
    2048    1024    14843    16735   627670   615174   607001    20775   544963    20762   595018    14878    16703   259401   288694
    2048    2048    14844    16751   622124   608765   614646    16729   614250    16731   603929    14851    16717   236706   292987
    4096       4    13174    13875   931793  1003472   765167   411617   849083  1152205   807726    13188    13886   818540   910412
    4096       8    13196    13869  1284436  1335145  1086479   541363  1063939  1983494  1116052    13213    13853  1115473  1120931
    4096      16    13202    13881  1542674  1655505  1407561   664510  1485708  3217555  1425665    13195    13910  1346974  1320470
    4096      32    13209    13890  1722393  1793414  1553274   721254  1607485  3999581  1641741    13208    13886  1378202  1421772
    4096      64    13109    13892  1827759  1837730  1669663   683682  1759436  3893527  1741423    13134    13882  1416263  1462936
    4096     128    13120    13888  1728284  1811760  1753510   731418  1689033     9333  1765584    13154    13891  1258375  1343918
    4096     256    13184    13815  1123717  1149584  1058433   574954  1076809    11607  1614434    13116    13841   397163   392185
    4096     512    13113    13809   608623   607525   604129   550841   606817    12407   602984    13139    13886   279759   290026
    4096    1024    13120    13847   606089   567227   591401   537250   589109    14366   598551    13109    13464   273158   291508
    4096    2048    13120    13800   593711   608084   598572    13792   567321    13773   601379    13101    13788   261675   291384
    4096    4096    13106    13800   606453   613010   602962    13797   597344    13804   599533    13103    13841   237752   293074
    8192       4    12393    11680  1027583  1030449   737896   374095   874929  1141579   694888    12388    11686   927102   865693
    8192       8    12417    11684  1346732  1341840  1052835   483698  1202591  1998540  1012116    12398    11677  1162985  1154777
    8192      16    12412    11677  1652274  1633811  1365084   627409  1542703  3236777  1399159    12425    11664  1353149  1342312
    8192      32    12420    11673  1760194  1783585  1557176   671306  1691813  4055482  1528905    12350    11687  1411227  1432645
    8192      64    12377    11673  1839739  1829746  1691146   243505  1745618  4215703  1744023    12359    11681  1465207  1461593
    8192     128    12407    11683  1852135  1819957  1691479   606822  1762813     9201  1719584    12355    11688  1332681  1361028
    8192     256    12365    11690  1222139  1160667  1021079   597483  1023725    11138  1084162    12394    11665   401195   405226
    8192     512    12357    11649   584603   585220   578978   453549   579838    11675   583600    12368    11661   290331   293830
    8192    1024    12375    11608   577917   575613   570623   529373   528209    12463   574842    12390    11666   286674   292467
    8192    2048    12372    11658   582147   544248   573825   528681   570518    11648   577354    12373    11630   276964   285734
    8192    4096    12374    11651   579427   577305   573069    11649   577587    11656   574400    12269    11644   265518   295612
    8192    8192    12387    11649   575160   578364   569441    11644   573623    11650   578822    12380    11651   241836   297231
   16384       4    11557    11686  1022909  1020297   694792    11470   837507  1162226   742909    11556    11691   919774   903937
   16384       8    11561    11701  1353134  1356954  1014557    42137  1216135  2066600  1053213    11559    11688  1178268  1170023
   16384      16    11576    11699  1619427  1631887  1338793   602488  1514364  3315870  1331503    11576    11685  1368057  1315773
   16384      32    11551    11708  1754551  1759672  1464284   646385  1628947  4129920  1528920    11574    11709  1445952  1427657
   16384      64    11577    11702  1841534  1817426  1520261   456494  1775265  4379710  1671298    11478    11692  1482447  1470362
   16384     128    11556    11702  1836956  1804209  1693957   396753  1781662     9119  1705053    11546    11692  1188724  1317691
   16384     256    11569    11693  1147018  1107710   982782   535142  1045633    11146   979434    11559    11615   418685   416019
   16384     512    11561    11680   577899   580149   569858   517106   578347    11335   572007    11559    11682   290388   290264
   16384    1024    11555    11601   572727   572803   561560   470441   566625    11687   573429    11559    11687   288674   293420
   16384    2048    11564    11673   574979   569839   569225   528073   569598    11690   572088    11556    11687   287191   294745
   16384    4096    11575    11690   574052   571883   570113   498265   560790    11676   571208    11551    11678   279681   294755
   16384    8192    11562    11680   576310   571845   568298    11685   572746    11686   572469    11471    11686   265637   295917
   16384   16384    11559    11689   553923   573587   571251    11685   567646    11688   567172    11561    11692   240498   296822
   32768       4    11199    11253  1013483  1018425   652242     3875   876784  1184967   735697    11117    11255   916611   920528
   32768       8    11200    11249  1308513  1355729   965409     9951  1203784  2062334  1029047    11187    11242  1184885  1169032
   32768      16    11201    11247  1613220  1629033  1283763    32214  1531567  3280446  1316043    11117    11257  1367437  1348201
   32768      32    11152    11257  1734301  1760696  1487556    41111  1701515  4031482  1529709    11201    11232  1453635  1436122
   32768      64    11188    11261  1836966  1827149  1622916    42493  1795521  4348175  1676198    11189    11208  1490266  1465665
   32768     128    11188    11249  1840188  1812093  1653114    43633  1803508     9080  1696852    11187    11232  1356237  1318531
   32768     256    11197    11253  1121580  1127515   997869    51577  1112024    11042  1044294    11192    11247   415553   419236
   32768     512    11186    11253   593139   599245   584370    39487   599420    11169   597442    11176    11247   285580   284614
   32768    1024    11187    11252   589162   590075   581869    49214   585625    11342   586923    11147    11256   285908   287842
   32768    2048    11180    11251   591801   590466   586345    37755   583851    11245   589382    11193    11251   284624   288549
   32768    4096    11164    11241   490670   591801   586515    63111   585174    11240   588008    11195    11247   280836   286636
   32768    8192    11181    11219   554242   555361   552078    34415   549706    11252   556134    11190    11211   274837   287928
   32768   16384    11181    11236   556134   554414   551307    11250   551612    11254   552375    11189    11226   260926   288146
   65536       4    11084    11136   669191   673013   514350     1754   742188  1167346   563430    11099    11142   600360   599345
   65536       8    11098    11118   793723   793558   776971     2061  1017642  2097833   795476    11108    11143   692894   694244
   65536      16    11104    11148   867419   882757  1050525     9078  1292111  3307581  1002829    11101    11156   756767   742996
   65536      32    11102    11130   924396   922404  1208597     1334  1421325  4096515  1114899    11091    11150   772540   780962
   65536      64    11106    11145   941732   939989

From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 19:28:22 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 36FC116A46B
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 19:28:22 +0000 (UTC)
	(envelope-from eksffa@freebsdbrasil.com.br)
Received: from capeta.freebsdbrasil.com.br (vrrp.freebsdbrasil.com.br
	[200.210.70.30])
	by mx1.freebsd.org (Postfix) with SMTP id 4FCA713C4AC
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 19:28:20 +0000 (UTC)
	(envelope-from eksffa@freebsdbrasil.com.br)
Received: (qmail 60926 invoked from network); 17 Oct 2007 17:01:35 -0200
Received: from unknown (HELO claire.bh.freebsdbrasil.com.br) (201.78.125.207)
	by capeta.freebsdbrasil.com.br with SMTP; 17 Oct 2007 17:01:35 -0200
Message-ID: <47165C0B.7080707@freebsdbrasil.com.br>
Date: Wed, 17 Oct 2007 17:01:31 -0200
From: Patrick Tracanelli <eksffa@freebsdbrasil.com.br>
Organization: FreeBSD Brasil LTDA
User-Agent: Thunderbird 2.0.0.0 (X11/20070612)
MIME-Version: 1.0
To: kurtseel <kurtseel@primetime.com>
References: <471650AA.30903@primetime.com>
In-Reply-To: <471650AA.30903@primetime.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-geom@freebsd.org
Subject: Re: gmirror + ggated question
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 19:28:22 -0000

kurtseel wrote:
> 
> I built a mirror of a local drive and a ggated backed device. I ran
> iozone on it and it runs along fine until a certain point when it slows
> down to a near stand still. It doesn't break the mirror or crash the
> system, but it does slow the system down to a near stop.
> I kill the iozone, and a short time later I can login and then :
> 
> # df
> Filesystem            1K-blocks    Used    Avail Capacity  Mounted on
> /dev/mirror/thinkcs1a   1012974  155780   776158    17%    /
> devfs                         1       1        0   100%    /dev
> /dev/mirror/thinkcs1e  85469448 1163474 77468420     1%    /usr
> /dev/mirror/thinkcs1d   4058062   40426  3692992     1%    /var
> [root@ ~/temp]# gmirror status
>         Name    Status  Components
> mirror/thinkc  COMPLETE  ad0
>                         ggate0
> 
> And all seems normal again. Seems like it has to do with big files ...
> This is the same configuration I used in :
> http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network
> This is where the iozone gets stuck :

Did you try raising the send and receive buffers on ggated? I have been
comfortable with -S and -R at around 512k-780k. I didn't run an iozone
stress test, however, just a test under real load before going into
production.

Try raising the buffers and let us know how your tests go. TCP_NODELAY is
also worth trying.
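
As a sketch of the suggestion above: the buffer sizes are set with -R
(receive) and -S (send), which both ggated and ggatec accept as byte
counts. The host name server.example.org is a placeholder of mine;
/dev/ad10 and /etc/ggated.conf are the names used elsewhere in this
thread, and 524288 is simply 512k expressed in bytes.

```shell
# On the exporting host: 512 KB receive and send buffers.
ggated -v -R 524288 -S 524288 /etc/ggated.conf

# On the client: matching buffers when attaching the remote provider.
# "server.example.org" is a hypothetical host name.
ggatec create -o rw -R 524288 -S 524288 server.example.org /dev/ad10
```

If I read the manual pages correctly, both tools already set TCP_NODELAY
by default and -n turns it off, so it is worth checking ggated(8) and
ggatec(8) for your release before changing anything there.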

-- 
Patrick Tracanelli

FreeBSD Brasil LTDA.
(31) 3281-9633 / 3281-3547
316601@sip.freebsdbrasil.com.br
http://www.freebsdbrasil.com.br
"Long live Hanin Elias, Kim Deal!"


From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 19:50:49 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 20FF016A49A
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 19:50:49 +0000 (UTC)
	(envelope-from kurtseel@primetime.com)
Received: from mail.primetime.com (mail.primetime.com [146.145.135.164])
	by mx1.freebsd.org (Postfix) with ESMTP id E390E13C46E
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 19:50:48 +0000 (UTC)
	(envelope-from kurtseel@primetime.com)
Received: from [10.200.1.130] (unknown [10.200.1.130])
	by mail.primetime.com (Postfix) with ESMTP id 8F98AF9C423;
	Wed, 17 Oct 2007 14:49:05 -0400 (EDT)
Message-ID: <47166562.60803@primetime.com>
Date: Wed, 17 Oct 2007 15:41:22 -0400
From: kurtseel <kurtseel@primetime.com>
User-Agent: Thunderbird 2.0.0.5 (X11/20070724)
MIME-Version: 1.0
To: Patrick Tracanelli <eksffa@freebsdbrasil.com.br>
References: <471650AA.30903@primetime.com>
	<47165C0B.7080707@freebsdbrasil.com.br>
In-Reply-To: <47165C0B.7080707@freebsdbrasil.com.br>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-geom@freebsd.org
Subject: Re: gmirror + ggated question
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 19:50:49 -0000

Patrick Tracanelli wrote:
> kurtseel wrote:
>>
>> I built a mirror of a local drive and a ggated backed device. I ran
>> iozone on it and it runs along fine until a certain point when it slows
>> down to a near stand still. It doesn't break the mirror or crash the
>> system, but it does slow the system down to a near stop.
>> I kill the iozone, and a short time later I can login and then :
>>
>> # df
>> Filesystem            1K-blocks    Used    Avail Capacity  Mounted on
>> /dev/mirror/thinkcs1a   1012974  155780   776158    17%    /
>> devfs                         1       1        0   100%    /dev
>> /dev/mirror/thinkcs1e  85469448 1163474 77468420     1%    /usr
>> /dev/mirror/thinkcs1d   4058062   40426  3692992     1%    /var
>> [root@ ~/temp]# gmirror status
>>         Name    Status  Components
>> mirror/thinkc  COMPLETE  ad0
>>                         ggate0
>>
>> And all seems normal again. Seems like it has to do with big files ...
>> This is the same configuration I used in :
>> http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network
>> This is where the iozone gets stuck :
>
> Did you try raising the send and receive buffers on ggated? I have been
> comfortable with -S and -R at around 512k-780k. I didn't run an iozone
> stress test, however, just a test under real load before going into
> production.
>
> Try raising the buffer and let us know about your tests. TCP_NODELAY 
> is also worth trying.
>
 Makes sense. So now I get this :

Test (/root/benchmarks) > ggated -v -R 262144 -S 262144 /etc/ggated.conf
info: Reading exports file (/etc/ggated.conf).
debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list.
info: Exporting 1 object(s).
error: Cannot open stream socket: No buffer space available.
error: Exiting.

Test (/root/benchmarks) > ggated -v -R 524288 -S 524288 /etc/ggated.conf
info: Reading exports file (/etc/ggated.conf).
debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list.
info: Exporting 1 object(s).
error: Cannot open stream socket: No buffer space available.
error: Exiting.

 I have already raised the following, which I saw in a posting:

sysctl net.inet.tcp.sendspace=4194304
sysctl net.inet.tcp.recvspace=4194304
sysctl kern.ipc.maxsockbuf=2097152

 It even happens here:

Test (/root/benchmarks) > ggated -v -R 1 -S 1 /etc/ggated.conf
info: Reading exports file (/etc/ggated.conf).
debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list.
info: Exporting 1 object(s).
error: Cannot open stream socket: No buffer space available.
error: Exiting.
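
One possible cause, offered as an assumption rather than something
confirmed in this thread: as far as I know, FreeBSD reserves the default
send and receive space when a TCP socket is created, and that reservation
fails with ENOBUFS ("No buffer space available") when it exceeds the
limit derived from kern.ipc.maxsockbuf. With the values above, the 4 MB
TCP defaults are larger than the 2 MB ceiling, which would also explain
why -R 1 -S 1 fails. The check_sockbuf helper below is hypothetical and
only compares the three numbers; it does not query the system, and the
real limit is somewhat lower than maxsockbuf itself, so the check is
optimistic.

```shell
# check_sockbuf SENDSPACE RECVSPACE MAXSOCKBUF
# Prints one line per TCP default that exceeds the socket buffer ceiling
# and returns non-zero if any of them does.
check_sockbuf() {
    send=$1 recv=$2 max=$3 rc=0
    if [ "$send" -gt "$max" ]; then
        echo "net.inet.tcp.sendspace ($send) exceeds kern.ipc.maxsockbuf ($max)"
        rc=1
    fi
    if [ "$recv" -gt "$max" ]; then
        echo "net.inet.tcp.recvspace ($recv) exceeds kern.ipc.maxsockbuf ($max)"
        rc=1
    fi
    [ "$rc" -eq 0 ] && echo "TCP defaults fit within kern.ipc.maxsockbuf"
    return "$rc"
}

# Values quoted in the message above.
check_sockbuf 4194304 4194304 2097152 ||
    echo "-> lower the TCP defaults or raise kern.ipc.maxsockbuf"
```

On a live system the three arguments could be taken from "sysctl -n"
for each of the names involved instead of being typed in by hand.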


From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 20:13:04 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9A19816A4D1
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 20:13:04 +0000 (UTC)
	(envelope-from pjd@garage.freebsd.pl)
Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl
	[83.17.198.132])
	by mx1.freebsd.org (Postfix) with ESMTP id 4DA9313C43E
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 20:13:01 +0000 (UTC)
	(envelope-from pjd@garage.freebsd.pl)
Received: by mail.garage.freebsd.pl (Postfix, from userid 65534)
	id 149F645F56; Wed, 17 Oct 2007 22:12:58 +0200 (CEST)
Received: from localhost (154.81.datacomsa.pl [195.34.81.154])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.garage.freebsd.pl (Postfix) with ESMTP id ED5CF45F42;
	Wed, 17 Oct 2007 22:12:52 +0200 (CEST)
Date: Wed, 17 Oct 2007 22:12:35 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Andriy Gapon <avg@icyb.net.ua>
Message-ID: <20071017201235.GD50219@garage.freebsd.pl>
References: <4715C3D1.3070308@icyb.net.ua> <4715F4EE.9020104@freebsd.org>
	<47161FD1.5010501@icyb.net.ua>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="zS7rBR6csb6tI2e1"
Content-Disposition: inline
In-Reply-To: <47161FD1.5010501@icyb.net.ua>
User-Agent: Mutt/1.4.2.3i
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc
X-OS: FreeBSD 7.0-CURRENT i386
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on 
	mail.garage.freebsd.pl
X-Spam-Level: 
X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham 
	version=3.0.4
Cc: freebsd-geom@freebsd.org
Subject: Re: gjournal: FLUSHCACHE timed out
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 20:13:04 -0000


--zS7rBR6csb6tI2e1
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Oct 17, 2007 at 05:44:33PM +0300, Andriy Gapon wrote:
> on 17/10/2007 14:41 Eric Anderson said the following:
> > Andriy Gapon wrote:
> >> ad4s1ge (please don't pay attention to its slightly unusual name, this
> >> is for historic reasons) is a journal partition/consumer for my /var
> >> filesystem/partition/provider.
> >> Size of /var is 16G, size of the journal is slightly less than 1G (1G -
> >> 32 sectors actually). /var is UFS2 with softupdates enabled.
> > 
> > 
> > Pawel, correct me if I'm wrong here - but I think you really need to 
> > turn *off* softupdates on gjournaled file systems.
> 
> I was under a big mis-impression that I have to have softupdates enabled
> for snapshots to work. Now that I know that I was wrong I will turn off
> the softupdates. But it seems that there is nothing that would preclude
> _in principle_ combination of softupdates/gjournal. Anyway, I care only
> out of curiosity.

It's not that they won't work together; it's just that combining them
hurts performance and increases memory consumption.

> >> I noticed that I get these messages only when I run 'dump' on any of my
> >> filesystems. I think that dump is using /tmp or /var/tmp for some
> >> temporary data and in my setup both of those are in /var filesystem.
> >>
> >> So my I guess is that /var is being written "too" actively and I have =
to
> >> tune some parameters to make things smooth.
> >
> > A few things to note:
> >
> > - you can turn on the 'async' option for your gjournaled file system, and
> > get better performance
>
> will do
>
> > - you might be able to add the 'noatime' option to your file system mount
> > also
>
> probably will do as well
>
> > - You might try turning your journal switch time from 10 down to 5, and
> > see if it alleviates some pressure on your disk.
>
> I already did this and it helped! I don't see the messages anymore.
> Thank you!
> I will try to set this back to 10 after I do away with softupdates and
> see what happens.
>
> Thank you very much again.

You should also try this patch:

	http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/ata-disk.c.diff?r1=1.201;r2=1.202

BIO_FLUSH timeout was way too small.
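
For reference, the tuning steps discussed earlier in this thread could be
sketched roughly as follows. This is only a sketch under assumptions: the
provider name /dev/ad4s1e.journal is a hypothetical placeholder (the thread
names only the journal consumer, ad4s1ge), the run helper is an illustration
that prints commands unless DRY_RUN=0 is set, and the journal switch time is
assumed to be the kern.geom.journal.switch_time sysctl. tunefs needs the file
system unmounted or mounted read-only, so in practice this is a single-user
mode job.

```shell
#!/bin/sh
# Dry-run helper (illustrative): prints each command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical provider name; substitute the real gjournal provider for /var.
VAR_DEV="/dev/ad4s1e.journal"

# Shorten the journal switch interval from 10 seconds to 5.
run sysctl kern.geom.journal.switch_time=5

# Disable soft updates (file system must be unmounted or read-only).
run umount /var
run tunefs -n disable "$VAR_DEV"

# Remount with the options suggested in the thread.
run mount -o async,noatime "$VAR_DEV" /var
```

With the default DRY_RUN=1 the script only prints what it would do, which
makes it safe to review before running it for real.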

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--zS7rBR6csb6tI2e1
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFHFmyzForvXbEpPzQRAkScAJ9rnB4eERnKYOERIFHI2mKA+1zrIgCbBmDr
UqxNSAjWPjHzGqeL8p+ewsE=
=GbMN
-----END PGP SIGNATURE-----

--zS7rBR6csb6tI2e1--

From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 20:15:55 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E6CC916A420
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 20:15:55 +0000 (UTC)
	(envelope-from pjd@garage.freebsd.pl)
Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl
	[83.17.198.132])
	by mx1.freebsd.org (Postfix) with ESMTP id 6F4F313C44B
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 20:15:55 +0000 (UTC)
	(envelope-from pjd@garage.freebsd.pl)
Received: by mail.garage.freebsd.pl (Postfix, from userid 65534)
	id 35D5145F42; Wed, 17 Oct 2007 22:15:54 +0200 (CEST)
Received: from localhost (154.81.datacomsa.pl [195.34.81.154])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.garage.freebsd.pl (Postfix) with ESMTP id C5E2845F44;
	Wed, 17 Oct 2007 22:15:49 +0200 (CEST)
Date: Wed, 17 Oct 2007 22:15:31 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: kurtseel <kurtseel@primetime.com>
Message-ID: <20071017201531.GE50219@garage.freebsd.pl>
References: <471650AA.30903@primetime.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="C94crkcyjafcjHxo"
Content-Disposition: inline
In-Reply-To: <471650AA.30903@primetime.com>
User-Agent: Mutt/1.4.2.3i
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc
X-OS: FreeBSD 7.0-CURRENT i386
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on 
	mail.garage.freebsd.pl
X-Spam-Level: 
X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham 
	version=3.0.4
Cc: freebsd-geom@freebsd.org
Subject: Re: gmirror + ggated question
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 20:15:56 -0000


--C94crkcyjafcjHxo
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Oct 17, 2007 at 02:12:58PM -0400, kurtseel wrote:
>
> I built a mirror of a local drive and a ggated backed device. I ran
> iozone on it and it runs along fine until a certain point when it slows
> down to a near standstill. It doesn't break the mirror or crash the
> system, but it does slow the system down to a near stop.

You haven't said which FreeBSD version you use. If it's neither HEAD nor
RELENG_7, try this patch:

	http://www.freebsd.org/cgi/cvsweb.cgi/src/sbin/ggate/shared/ggate.c.diff?r1=1.8;r2=1.9

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--C94crkcyjafcjHxo
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFHFm1jForvXbEpPzQRAiNJAJ9nykuY/E5CM11wibNiM2BvChiR7wCguO6a
VoSgOoiUzwlLUhnR7T1Iluw=
=ji1E
-----END PGP SIGNATURE-----

--C94crkcyjafcjHxo--

From owner-freebsd-geom@FreeBSD.ORG  Wed Oct 17 21:16:22 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 66F4516A473
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 21:16:22 +0000 (UTC)
	(envelope-from kurtseel@primetime.com)
Received: from mail.primetime.com (mail.primetime.com [146.145.135.164])
	by mx1.freebsd.org (Postfix) with ESMTP id 414CA13C461
	for <freebsd-geom@freebsd.org>; Wed, 17 Oct 2007 21:16:22 +0000 (UTC)
	(envelope-from kurtseel@primetime.com)
Received: from [10.200.1.130] (deca.khome.utcorp.net [10.200.1.130])
	by mail.primetime.com (Postfix) with ESMTP id 5C351F9C412;
	Wed, 17 Oct 2007 16:14:35 -0400 (EDT)
Message-ID: <4716796B.9090803@primetime.com>
Date: Wed, 17 Oct 2007 17:06:51 -0400
From: kurtseel <kurtseel@primetime.com>
User-Agent: Thunderbird 2.0.0.5 (X11/20070724)
MIME-Version: 1.0
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
References: <471650AA.30903@primetime.com>
	<20071017201531.GE50219@garage.freebsd.pl>
In-Reply-To: <20071017201531.GE50219@garage.freebsd.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-geom@freebsd.org
Subject: Re: gmirror + ggated question
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2007 21:16:22 -0000

Pawel Jakub Dawidek wrote:
> On Wed, Oct 17, 2007 at 02:12:58PM -0400, kurtseel wrote:
>
>> I built a mirror of a local drive and a ggated backed device. I ran
>> iozone on it and it runs along fine until a certain point when it slows
>> down to a near standstill. It doesn't break the mirror or crash the
>> system, but it does slow the system down to a near stop.
>>
>
> You haven't said which FreeBSD version you use. If it's not HEAD nor
> RELENG_7, try this patch:
>
> 	http://www.freebsd.org/cgi/cvsweb.cgi/src/sbin/ggate/shared/ggate.c.diff?r1=1.8;r2=1.9
>
>   
Sorry.
[root@test1 /usr/src/sbin/ggate]# uname -a
FreeBSD test1.khome.utcorp.net. 6.2-RELEASE-p4 FreeBSD 6.2-RELEASE-p4 #0: Thu Apr 26 17:40:53 UTC 2007     root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  i386

I applied the patch and am resyncing the mirror now, backed by the patched
ggated. When it is done, I'll re-run iozone.

From owner-freebsd-geom@FreeBSD.ORG  Thu Oct 18 13:08:41 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 214EF16A41B
	for <freebsd-geom@freebsd.org>; Thu, 18 Oct 2007 13:08:41 +0000 (UTC)
	(envelope-from eksffa@freebsdbrasil.com.br)
Received: from capeta.freebsdbrasil.com.br (vrrp.freebsdbrasil.com.br
	[200.210.70.30])
	by mx1.freebsd.org (Postfix) with SMTP id 65C8613C459
	for <freebsd-geom@freebsd.org>; Thu, 18 Oct 2007 13:08:40 +0000 (UTC)
	(envelope-from eksffa@freebsdbrasil.com.br)
Received: (qmail 15298 invoked from network); 18 Oct 2007 11:08:44 -0200
Received: from unknown (HELO claire.bh.freebsdbrasil.com.br) (201.78.96.93)
	by capeta.freebsdbrasil.com.br with SMTP; 18 Oct 2007 11:08:44 -0200
Message-ID: <47175AD2.5080308@freebsdbrasil.com.br>
Date: Thu, 18 Oct 2007 11:08:34 -0200
From: Patrick Tracanelli <eksffa@freebsdbrasil.com.br>
Organization: FreeBSD Brasil LTDA
User-Agent: Thunderbird 2.0.0.0 (X11/20070612)
MIME-Version: 1.0
To: kurtseel <kurtseel@primetime.com>
References: <471650AA.30903@primetime.com>
	<47165C0B.7080707@freebsdbrasil.com.br>
	<47166562.60803@primetime.com>
In-Reply-To: <47166562.60803@primetime.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-geom@freebsd.org
Subject: Re: gmirror + ggated question
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2007 13:08:41 -0000

kurtseel wrote:
> Patrick Tracanelli wrote:
>> kurtseel wrote:
>>>
>>> I built a mirror of a local drive and a ggated backed device. I ran
>>> iozone on it and it runs along fine until a certain point when it slows
>>> down to a near standstill. It doesn't break the mirror or crash the
>>> system, but it does slow the system down to a near stop.
>>> I kill the iozone, and a short time later I can login and then :
>>>
>>> # df
>>> Filesystem            1K-blocks    Used    Avail Capacity  Mounted on
>>> /dev/mirror/thinkcs1a   1012974  155780   776158    17%    /
>>> devfs                         1       1        0   100%    /dev
>>> /dev/mirror/thinkcs1e  85469448 1163474 77468420     1%    /usr
>>> /dev/mirror/thinkcs1d   4058062   40426  3692992     1%    /var
>>> [root@ ~/temp]# gmirror status
>>>         Name    Status  Components
>>> mirror/thinkc  COMPLETE  ad0
>>>                         ggate0
>>>
>>> And all seems normal again. Seems like it has to do with big files ...
>>> This is the same configuration I used in :
>>> http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network
>>> This is where the iozone gets stuck :
>>
>> Did you try raising the send and receive buffers on ggated? I found myself
>> comfortable with -S and -R around 512k-780k. I did not, however, run an
>> iozone stress test, just a production test (real load) before going into
>> production.
>>
>> Try raising the buffers and let us know about your tests. TCP_NODELAY
>> is also worth trying.
>>
> Makes sense. So now I get this :
> 
> Test (/root/benchmarks) > ggated -v -R 262144 -S 262144 /etc/ggated.conf
> info: Reading exports file (/etc/ggated.conf).
> debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list.
> info: Exporting 1 object(s).
> error: Cannot open stream socket: No buffer space available.
> error: Exiting.
> 
> Test (/root/benchmarks) > ggated -v -R 524288 -S 524288 /etc/ggated.conf
> info: Reading exports file (/etc/ggated.conf).
> debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list.
> info: Exporting 1 object(s).
> error: Cannot open stream socket: No buffer space available.
> error: Exiting.
> 
> I have raised
> 
> sysctl net.inet.tcp.sendspace=4194304
> sysctl net.inet.tcp.recvspace=4194304
> sysctl kern.ipc.maxsockbuf=2097152
> 
> Which I saw in a posting ...
> 
> It even happens here :
> 
> Test (/root/benchmarks) > ggated -v -R 1 -S 1 /etc/ggated.conf
> info: Reading exports file (/etc/ggated.conf).
> debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list.
> info: Exporting 1 object(s).
> error: Cannot open stream socket: No buffer space available.
> error: Exiting.
> 

It seems that you are out of buffer space and that this is not related to
ggated, since -R 1 and -S 1 would not demand a bunch of extra memory. In any
case, tuning kern.ipc.maxsockbuf should be enough.

If I raise the buffers to 512K I run out of buffer space too, with the default
value of kern.ipc.maxsockbuf. However, just raising it solves the problem:

(eksffa@claire)~# sysctl -qw kern.ipc.maxsockbuf=`echo "524288*2" | bc `
kern.ipc.maxsockbuf: 262144 -> 1048576

(eksffa@claire)~# ggated -R 524288 -S 524288 -v
info: Reading exports file (/etc/gg.exports).
debug: Added 10.0.0.0/24 /dev/ad12 RO to exports list.
info: Exporting 1 object(s).
info: Listen on port: 3080.

And so, I can import ggate0 on the other host.
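
The sizing used above can be sketched as a small script. It is only a sketch:
the factor of two simply mirrors the 524288*2 in the example above and is an
assumption rather than a documented rule, and the variable names are
illustrative. The script prints the commands instead of running them.

```shell
#!/bin/sh
# Desired ggated send/receive buffer size in bytes (512K, as in the example above).
BUFSIZE=524288

# Assumption: allow twice the per-direction buffer size, mirroring 524288*2 above.
MAXSOCKBUF=$((BUFSIZE * 2))

# Print the commands rather than executing them, so they can be reviewed first.
echo "sysctl kern.ipc.maxsockbuf=${MAXSOCKBUF}"
echo "ggated -v -R ${BUFSIZE} -S ${BUFSIZE}"
```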

Try figuring out with netstat -m why you ran out of buffer space. Also, I
believe it can be related to the fact that you have raised recvspace and
sendspace way too high. I don't think it makes any sense to raise them over
64k on a 100Mbit network, or 128-512k on a 1Gbit network. You have raised
them up to 4MB :) Lower them to the default 32k (send) / 64k (recv)
first. If you are on 1Gbit or 10Gbit you can later try raising them in
multiples of 32K until the point where it makes sense (where it makes a
positive difference in your benchmarks).
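
Stepping through the sizes in multiples of 32K could look roughly like the
loop below. This is a sketch only: the 512K upper limit is an assumption taken
from the 1Gbit range mentioned above, send and receive are set to the same
value purely for simplicity, and the loop only prints candidate sysctl
commands. A benchmark run would go between the steps.

```shell
#!/bin/sh
# Step size (32K) and an assumed upper limit (512K) for the candidate values.
STEP=32768
LIMIT=524288

size=$STEP
count=0
while [ "$size" -le "$LIMIT" ]; do
    # Print the candidate setting; run your benchmark after applying each one.
    echo "candidate: sysctl net.inet.tcp.sendspace=${size} net.inet.tcp.recvspace=${size}"
    size=$((size + STEP))
    count=$((count + 1))
done
```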

If it is at all relevant, I run it on 6.2-STABLE, cvsupped on Sept 24th,
with the patches PJD mentioned applied.

-- 
Patrick Tracanelli

FreeBSD Brasil LTDA.
(31) 3281-9633 / 3281-3547
316601@sip.freebsdbrasil.com.br
http://www.freebsdbrasil.com.br
"Long live Hanin Elias, Kim Deal!"


From owner-freebsd-geom@FreeBSD.ORG  Fri Oct 19 18:01:21 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8E38B16A421
	for <freebsd-geom@freebsd.org>; Fri, 19 Oct 2007 18:01:21 +0000 (UTC)
	(envelope-from felipe@neuwald.biz)
Received: from itacaiunas.cepatec.org.br (itacaiunas.cepatec.org.br
	[200.152.208.51])
	by mx1.freebsd.org (Postfix) with ESMTP id 40DA013C480
	for <freebsd-geom@freebsd.org>; Fri, 19 Oct 2007 18:01:21 +0000 (UTC)
	(envelope-from felipe@neuwald.biz)
Received: from localhost (vermelho [10.0.0.5])
	by itacaiunas.cepatec.org.br (Postfix) with ESMTP id DE1DA11571A
	for <freebsd-geom@freebsd.org>; Fri, 19 Oct 2007 15:43:17 -0200 (BRST)
X-Virus-Scanned: amavisd-new at cepatec.org.br
Received: from itacaiunas.cepatec.org.br ([10.0.0.3])
	by localhost (vermelho.cepatec.org.br [10.0.0.5]) (amavisd-new,
	port 10024)
	with ESMTP id ATserKI0Mr3z for <freebsd-geom@freebsd.org>;
	Fri, 19 Oct 2007 14:43:16 -0300 (BRT)
Received: from [192.168.0.152] (unknown [200.199.198.61])
	by itacaiunas.cepatec.org.br (Postfix) with ESMTP id 0B3801154FD
	for <freebsd-geom@freebsd.org>; Fri, 19 Oct 2007 15:43:13 -0200 (BRST)
Message-ID: <4718ECB2.9050207@neuwald.biz>
Date: Fri, 19 Oct 2007 15:43:14 -0200
From: Felipe Neuwald <felipe@neuwald.biz>
User-Agent: Thunderbird 1.5.0.13 (X11/20070824)
MIME-Version: 1.0
To: freebsd-geom@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: gvinum - problem on hard disk
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2007 18:01:21 -0000

Hi folks,

I have one gvinum raid on a FreeBSD 6.1-RELEASE machine. There are 4 
disks running, as you can see:

[root@fileserver ~]# gvinum list
4 drives:
D a                     State: up       /dev/ad4        A: 0/238474 MB (0%)
D b                     State: up       /dev/ad5        A: 0/238475 MB (0%)
D c                     State: up       /dev/ad6        A: 0/238475 MB (0%)
D d                     State: up       /dev/ad7        A: 0/238475 MB (0%)

1 volume:
V data                  State: down     Plexes:       1 Size:        931 GB

1 plex:
P data.p0             S State: down     Subdisks:     4 Size:        931 GB

4 subdisks:
S data.p0.s3            State: stale    D: d            Size:        232 GB
S data.p0.s2            State: up       D: c            Size:        232 GB
S data.p0.s1            State: up       D: b            Size:        232 GB
S data.p0.s0            State: up       D: a            Size:        232 GB


But, as you can see, data.p0.s3 is "stale". What should I do to try to
recover it and get the raid up again (and recover the data)?

Thanks,

Felipe Neuwald.

From owner-freebsd-geom@FreeBSD.ORG  Fri Oct 19 20:18:36 2007
Return-Path: <owner-freebsd-geom@FreeBSD.ORG>
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5B62816A421
	for <freebsd-geom@freebsd.org>; Fri, 19 Oct 2007 20:18:36 +0000 (UTC)
	(envelope-from lulf@stud.ntnu.no)
Received: from fri.itea.ntnu.no (fri.itea.ntnu.no [129.241.7.60])
	by mx1.freebsd.org (Postfix) with ESMTP id 0CA3413C44B
	for <freebsd-geom@freebsd.org>; Fri, 19 Oct 2007 20:18:35 +0000 (UTC)
	(envelope-from lulf@stud.ntnu.no)
Received: from localhost (localhost [127.0.0.1])
	by fri.itea.ntnu.no (Postfix) with ESMTP id F076B8401;
	Fri, 19 Oct 2007 22:00:32 +0200 (CEST)
Received: from caracal.stud.ntnu.no (caracal.stud.ntnu.no [129.241.56.185])
	by fri.itea.ntnu.no (Postfix) with ESMTP;
	Fri, 19 Oct 2007 22:00:32 +0200 (CEST)
Received: by caracal.stud.ntnu.no (Postfix, from userid 2312)
	id 956396240F4; Fri, 19 Oct 2007 22:00:41 +0200 (CEST)
Date: Fri, 19 Oct 2007 22:00:41 +0200
From: Ulf Lilleengen <lulf@stud.ntnu.no>
To: Felipe Neuwald <felipe@neuwald.biz>
Message-ID: <20071019200041.GA16812@stud.ntnu.no>
References: <4718ECB2.9050207@neuwald.biz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4718ECB2.9050207@neuwald.biz>
User-Agent: Mutt/1.5.9i
X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no.
Cc: freebsd-geom@freebsd.org
Subject: Re: gvinum - problem on hard disk
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
	<freebsd-geom.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-geom>
List-Post: <mailto:freebsd-geom@freebsd.org>
List-Help: <mailto:freebsd-geom-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-geom>,
	<mailto:freebsd-geom-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2007 20:18:36 -0000

On Fri, Oct 19, 2007 at 03:43:14 -0200, Felipe Neuwald wrote:
> Hi folks,
> 
> I have one gvinum raid on a FreeBSD 6.1-RELEASE machine. There are 4 
> disks running, as you can see:
> 
> [root@fileserver ~]# gvinum list
> 4 drives:
> D a                     State: up       /dev/ad4        A: 0/238474 MB (0%)
> D b                     State: up       /dev/ad5        A: 0/238475 MB (0%)
> D c                     State: up       /dev/ad6        A: 0/238475 MB (0%)
> D d                     State: up       /dev/ad7        A: 0/238475 MB (0%)
> 
> 1 volume:
> V data                  State: down     Plexes:       1 Size:        931 GB
> 
> 1 plex:
> P data.p0             S State: down     Subdisks:     4 Size:        931 GB
> 
> 4 subdisks:
> S data.p0.s3            State: stale    D: d            Size:        232 GB
> S data.p0.s2            State: up       D: c            Size:        232 GB
> S data.p0.s1            State: up       D: b            Size:        232 GB
> S data.p0.s0            State: up       D: a            Size:        232 GB
> 
> 
> But, as you can see, the data.p0.s3 is "stale". What should I do to try 
> recover this and get the raid up again (and recover information)
> 
Hello,

Since your plex organization is RAID0 (striping), recovering after a drive
failure is a problem because you don't have any redundancy. But if you didn't
replace any drives etc., this could just be gvinum fooling around. In that
case, doing a 'gvinum setstate -f up data.p0.s3' should get the volume up
again.
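
A cautious way to apply this could be sketched as below. The setstate command
is the one given above; the rest is an assumption: the run helper is an
illustration that only prints commands unless DRY_RUN=0 is set, the volume is
assumed to show up as /dev/gvinum/data, and the file system is assumed to be
UFS so that a read-only fsck_ufs -n check makes sense before mounting.

```shell
#!/bin/sh
# Dry-run helper (illustrative): prints each command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Subdisk reported as stale in the listing above.
SUBDISK="data.p0.s3"

# Force the stale subdisk back up, then confirm the volume and plex state.
run gvinum setstate -f up "$SUBDISK"
run gvinum list

# Assumed device path and file system type: check without writing anything.
run fsck_ufs -n /dev/gvinum/data
```

If the read-only check reports serious damage, it is probably safer to copy
the data off before letting fsck modify anything.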

-- 
Ulf Lilleengen