From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 01:31:59 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D6F116A417; Sun, 14 Oct 2007 01:31:59 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com [72.36.161.186]) by mx1.freebsd.org (Postfix) with ESMTP id EB5D313C46A; Sun, 14 Oct 2007 01:31:58 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from proton.local (r74-193-81-203.pfvlcmta01.grtntx.tl.dh.suddenlink.net [74.193.81.203]) (authenticated bits=0) by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9E1Vw8w008356; Sat, 13 Oct 2007 20:31:58 -0500 (CDT) (envelope-from anderson@freebsd.org) Message-ID: <47117184.3030309@freebsd.org> Date: Sat, 13 Oct 2007 20:31:48 -0500 From: Eric Anderson User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Ivan Voras References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com Cc: freebsd-geom@freebsd.org Subject: Re: Disk mounting in recent Linuxes X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 01:31:59 -0000 Ivan Voras wrote: > Hi, > > I've installed a Linux (openSUSE) on a laptop and this is what I got by > default in fstab: > > /dev/disk/by-id/scsi-SATA_Hitachi_HTS5412_HP0400BEG1922A-part2 / > ext3 acl,user_xattr 1 1 > /dev/disk/by-id/scsi-SATA_Hitachi_HTS5412_HP0400BEG1922A-part4 /data > ext3 acl,user_xattr 1 2 > /dev/disk/by-id/scsi-SATA_Hitachi_HTS5412_HP0400BEG1922A-part3 swap > swap 
defaults 0 0 > > A similar option (to use a device by id instead of location) also exists > for network cards. > > (This is just a "FYI" post, I'm not complaining :) ). I was actually wondering if we should start labeling our filesystems at newfs time in the installer for this type of setup. There are a handful of potential 'gotchas' though. Eric From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 13:35:03 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0BE4D16A41A; Sun, 14 Oct 2007 13:35:03 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from ecngs.de (mail.ecngs.de [217.73.144.50]) by mx1.freebsd.org (Postfix) with ESMTP id 2F5A013C459; Sun, 14 Oct 2007 13:35:02 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from EC1a (ec1.elbracht.net [217.73.144.99]) by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773130-1922481 for multiple; Sun, 14 Oct 2007 15:22:59 +0200 From: "d_elbracht" To: , Date: Sun, 14 Oct 2007 15:22:32 +0200 Message-ID: <008801c80e65$47cbe650$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: AcgOZUbPq0zqvOG2QwSFpRt2OPaAhw== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 Cc: Subject: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 13:35:03 -0000 we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of 2007-10-09 Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x Opteron 2216, da3 is on a 3ware 9550-12 we are seeing this error: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 
5 on a 12 GB Hyperdrive

The offset changes sometimes, but it is always 81064794xxxxxxxxx and well outside the 12 GB range.

We did have the Hyperdrive connected directly to the mainboard's SATA0 (ad4) with similar errors.
We used to have a md instead of the Hyperdrive before, coming up with similar errors.

Blocksize on the partition is 8192 (newfs -b 8192 ..).
We did have a blocksize of 65536 before, but after some hours (sometimes days) the machine became unresponsive, with "newbuf" as the wait message in top, and had to be hard-reset.
Regarding "newbuf", as well as nbufkv and nbufbs, I will write a separate message to the list.

According to systat -vm, da3 does tps > 500 (yes, that's a lot).

This leads us to assume that the error has to do with very high IOs per second on an SMP machine.
The system disk is a RAID1 on an ICP 5805. All other disks (51) are 20 gstripe'd partitions.

Any hint to diagnose / fix the problem is much appreciated.

Cheers,

Dieter

From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 14:09:02 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 92BDA16A417; Sun, 14 Oct 2007 14:09:02 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from ecngs.de (mail.ecngs.de [217.73.144.50]) by mx1.freebsd.org (Postfix) with ESMTP id B409413C48E; Sun, 14 Oct 2007 14:09:01 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from EC1a (ec1.elbracht.net [217.73.144.99]) by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773204-1922481 for multiple; Sun, 14 Oct 2007 16:09:11 +0200 From: "d_elbracht" To: "'Scott Long'" References: <008801c80e65$47cbe650$639049d9@EC1a> <47121F9F.7050900@samsco.org> Date: Sun, 14 Oct 2007 16:08:44 +0200 Message-ID: <008d01c80e6b$bb95b7e0$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index:
AcgOa5ciuX2g50BFT+K6MBnLZJ0DxQAASbtQ In-Reply-To: <47121F9F.7050900@samsco.org> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 14:09:02 -0000 > > we are trying to diagnose errors seen on 6.2, SMP, amd64, > cvsup'ed of > > 2007-10-09 > > > > Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x > > Opteron 2216, da3 is on a 3ware 9550-12 > > > > we are seeing this error: > > g_vfs_done():da3s1a[READ(offset=81064794762854400, > length=8192)]error > > = 5 on a 12 GB Hyperdrive > > > > the offset changes sometimes, but it is always > 81064794xxxxxxxxx and > > well out the 12GB range. > > > > We did have the Hyperdrive connected directly to the > mainboards SATA0 > > (ad4) with similar errors. > > We used to have a md instead of the hyperdrive before, > coming up with > > similar errors. > > > > Blocksize on the partition is 8192 (newsfs -b 8192 ..). > > We did have a blocksize of 65536 before, but after some hours > > (sometimes days), the machine will be unresponsible with > "newbuf" as a > > waitmessage in top and has to be hard-reset. > > Regarding "newbuf", as well as nbufkv and nbufbs, I will write a > > seperate message to the list. > > > > According to systat -vm, da3 does tps > 500 (yes, that's a lot) > > > > This leads to an assumption, the error has to do with very high IOs > > per second on a SMP machine. > > The system-disk is a RAID1 on an ICP 5805. All other disks > (51) are 20 > > gstripe'd partitions. > > > > Any hint to diagnose / fix the problem is well appreciated. 
> > > > Cheers, > > > > Dieter > > > > I can geneate 30,000 I/O's per second for hours on end on > several types of storage hardware on FreeBSD SMP, and have no > problems. Since you're seeing this problem both when > connected to a 3ware controller and when connected to a > simple ATA/SATA controller (both of which have also been > observed to do high amounts of I/O with no problems), I > suspect that the problem is with your disk device, not with > FreeBSD. I don't know anything about a "hyperdrive" though, > so more information might help. > > Scott

Well, how about this:

> > We used to have a md instead of the hyperdrive before, > coming up with > > similar errors.

Here is some info about the Hyperdrive:
http://www.hyperossystems.co.uk/

We could go back to the md (memory disk) to try again.

What exactly does the "offset" in the error message mean? Isn't that like a seek on the disk?
And what does "error=5" mean?

Sure, the whole thing could be a problem with the application we are running. It's Diablo 5. The history file (dhistory), about 2 GB in size, resides on the Hyperdrive.
Dieter From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 14:15:27 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EDA8416A41A for ; Sun, 14 Oct 2007 14:15:27 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.freebsd.org (Postfix) with ESMTP id 6B26313C447 for ; Sun, 14 Oct 2007 14:15:27 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11]) (authenticated bits=0) by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9EDtAlH059643; Sun, 14 Oct 2007 07:55:10 -0600 (MDT) (envelope-from scottl@samsco.org) Message-ID: <47121F9F.7050900@samsco.org> Date: Sun, 14 Oct 2007 07:54:39 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4 MIME-Version: 1.0 To: d_elbracht References: <008801c80e65$47cbe650$639049d9@EC1a> In-Reply-To: <008801c80e65$47cbe650$639049d9@EC1a> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]); Sun, 14 Oct 2007 07:55:10 -0600 (MDT) X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 14:15:28 -0000 d_elbracht wrote: > we are trying to diagnose errors seen on 6.2, 
SMP, amd64, cvsup'ed of > 2007-10-09 > > Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x Opteron > 2216, da3 is on a 3ware 9550-12 > > we are seeing this error: > g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 > on a 12 GB Hyperdrive > > the offset changes sometimes, but it is always 81064794xxxxxxxxx and well > out the 12GB range. > > We did have the Hyperdrive connected directly to the mainboards SATA0 (ad4) > with similar errors. > We used to have a md instead of the hyperdrive before, coming up with > similar errors. > > Blocksize on the partition is 8192 (newsfs -b 8192 ..). > We did have a blocksize of 65536 before, but after some hours (sometimes > days), the machine will be unresponsible with "newbuf" as a waitmessage in > top and has to be hard-reset. > Regarding "newbuf", as well as nbufkv and nbufbs, I will write a seperate > message to the list. > > According to systat -vm, da3 does tps > 500 (yes, that's a lot) > > This leads to an assumption, the error has to do with very high IOs per > second on a SMP machine. > The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20 > gstripe'd partitions. > > Any hint to diagnose / fix the problem is well appreciated. > > Cheers, > > Dieter >

I can generate 30,000 I/Os per second for hours on end on several types of storage hardware on FreeBSD SMP, and have no problems. Since you're seeing this problem both when connected to a 3ware controller and when connected to a simple ATA/SATA controller (both of which have also been observed to do high amounts of I/O with no problems), I suspect that the problem is with your disk device, not with FreeBSD. I don't know anything about a "hyperdrive" though, so more information might help.
Scott From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 14:33:44 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1C39816A41A for ; Sun, 14 Oct 2007 14:33:44 +0000 (UTC) (envelope-from arne_woerner@yahoo.com) Received: from web30304.mail.mud.yahoo.com (web30304.mail.mud.yahoo.com [209.191.69.66]) by mx1.freebsd.org (Postfix) with SMTP id A086E13C447 for ; Sun, 14 Oct 2007 14:33:43 +0000 (UTC) (envelope-from arne_woerner@yahoo.com) Received: (qmail 78278 invoked by uid 60001); 14 Oct 2007 14:27:02 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=vfPEY0OcYQMGg30lw7rDlpzNOcBjTEW1ScJfVt8ILqEOVPdpwZOXiAVvYOtRZ4ZIFs1ay9fZEvHGXe2KJve9OxXLhgWjhkmfHGvJIY6K3yxyuQJnjQpp1zGTiKbDWEKn7F7F6s2j1S+y7WaOAPE/FGULy9eglR3DO/wk2HtYl30=; X-YMail-OSG: c0UpGFQVM1m9hWvo7E.Vo_ArdpzNu24EcH2bp8UcvlXgTlUEtLd4rDPqrmckPu_c17_kLw0nQc06Ngrb8qs4c3YHAn3qL1QqCSSo6S6x1UpYSg08Qp9XQSc5fG2QJQ-- Received: from [84.141.122.18] by web30304.mail.mud.yahoo.com via HTTP; Sun, 14 Oct 2007 07:27:01 PDT Date: Sun, 14 Oct 2007 07:27:01 -0700 (PDT) From: Arne "Wörner" To: d_elbracht In-Reply-To: <008d01c80e6b$bb95b7e0$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <956094.77414.qm@web30304.mail.mud.yahoo.com> Cc: freebsd-geom@freebsd.org Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 14:33:44 -0000 That 64kB block size problem is a quite old bug... 
etc@fluffles.net reported it some months ago... It is somehow due to memory fragmentation and a deadlock...

--- d_elbracht wrote:
> We could go back the the md (memory-disk) to try again.
>
A memory disk (md) should never deliver an EIO (5)... So you must have done something other than just reading/writing to/from an md... But you can take a look at the source code and search for "EIO"...

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/md/md.c

A possible cause for that EIO is that the file system parameters are somehow bad, so that it reads from negative or too-large offsets...

http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/geom/geom_io.c?rev=1.75;content-type=text%2Fplain

    if (bp->bio_offset > pp->mediasize)
            return (EIO);

-Arne

From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 14:42:13 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2DA3D16A41A; Sun, 14 Oct 2007 14:42:13 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from ecngs.de (mail.ecngs.de [217.73.144.50]) by mx1.freebsd.org (Postfix) with ESMTP id EAF7D13C457; Sun, 14 Oct 2007 14:42:11 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from EC1a (ec1.elbracht.net [217.73.144.99]) by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773237-1922481 for multiple; Sun, 14 Oct 2007 16:42:32 +0200 From: "d_elbracht" To: , Date: Sun, 14 Oct 2007 16:42:06 +0200 Message-ID: <008e01c80e70$64c92910$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: AcgOcGQ+rp1Mn0RqSFKuKB0DklKcNw== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 Cc: Subject:
newbuf, nbufkv, nbufbs X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 14:42:13 -0000

We have 2 machines involved with this problem.

machine1, SMP, i386, 4 GB RAM
was recently upgraded from 5.4 to 6.2, cvsup'ed 2007-10-10.
A partition of about 2.5 TB (gstripe -s 1048576) was newfs'ed with a blocksize of 65536 and a fragsize of 8192.
On 5.4, this was running for months with no problem.
On 6.2, after a few hours of high throughput (network tx and rx 400-500 Mbit each), it became unresponsive, with top showing a lot of processes with wait message newbuf.
So: reset, fsck etc., and it ran again. After only a few hours it became unresponsive again, showing processes with nbufkv and nbufbs this time.
I then did newfs with a blocksize of 32768 and a fragsize of 4096, and it's running. Throughput has decreased to 300-400 Mbit.
Note that it NEVER showed the problem on 5.4.

machine2, SMP, amd64, 16 GB RAM, 6.2 cvsup'ed 2007-10-09
20 partitions involving 51 disks, all gstripe -s 1048576, newfs -b 65536 -f 8192
1 partition of 12 GB (da3s1a), newfs -b 65536 -f 8192
After a few hours, top shows newbuf and the machine is unresponsive.
tps on da3s1a is > 500, the others are < 100.
I did newfs -b 8192 -f 1024 /dev/da3s1a and it's running without the problem (yet).

The problem seems to have to do with -b 65536 and lots of IOPS on 6.2.

Any clue? E.g. increase BKVASIZE to 131072 and kern.nbuf to 32768?
Cheers, Dieter From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 14:46:23 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 194F916A46D for ; Sun, 14 Oct 2007 14:46:23 +0000 (UTC) (envelope-from arne_woerner@yahoo.com) Received: from web30308.mail.mud.yahoo.com (web30308.mail.mud.yahoo.com [209.191.69.70]) by mx1.freebsd.org (Postfix) with SMTP id 9A72C13C465 for ; Sun, 14 Oct 2007 14:46:22 +0000 (UTC) (envelope-from arne_woerner@yahoo.com) Received: (qmail 24282 invoked by uid 60001); 14 Oct 2007 14:19:40 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=BJoF8stMqjSQT2kb4OdO0ThJRD43491QGv2cjC4TEq6mu6R7NQQ44xRSVSA/si8pa3FT/H6Jikyh40bsNFpPRQsCpuffWmlpFq9Q4DCAVBXXto3tEyzneR/fjnkZBZncD4f9Id4s7c8UlTvwcznwVKu0d4RUEEGSr2+hGKTj1OQ=; X-YMail-OSG: _cMoOjwVM1mmS7AdYeP1xkwMo_0KmFOOWP86RFOXaZGEK6hfhhYxUvd9HSY5zxX0DJYUhF3soI0VAg3_.baZG96yKVIxyJkILXfp0WHDcv05VgCvoOOxtW0Se0JgmtVJRH3yL4gQlVnvpCe7 Received: from [84.141.122.18] by web30308.mail.mud.yahoo.com via HTTP; Sun, 14 Oct 2007 07:19:40 PDT Date: Sun, 14 Oct 2007 07:19:40 -0700 (PDT) From: Arne "Wörner" To: Scott Long , d_elbracht In-Reply-To: <47121F9F.7050900@samsco.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <847856.24179.qm@web30308.mail.mud.yahoo.com> Cc: freebsd-geom@freebsd.org Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 14:46:23 -0000 --- Scott Long wrote: > I can geneate 30,000 I/O's per second 
for hours on end on several types > of storage hardware on FreeBSD SMP, and have no problems. Since you're > seeing this problem both when connected to a 3ware controller and when > connected to a simple ATA/SATA controller (both of which have also been > observed to do high amounts of I/O with no problems), I suspect that the > problem is with your disk device, not with FreeBSD. I don't know > anything about a "hyperdrive" though, so more information might help. > > Scott > I would say so, too... Especially because errno 5 is EIO: http://www.freebsd.org/cgi/man.cgi?query=errno&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html -Arne ____________________________________________________________________________________ Take the Internet to Go: Yahoo!Go puts the Internet in your pocket: mail, news, photos & more. http://mobile.yahoo.com/go?refer=1GNXIC From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 15:59:45 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1122416A417 for ; Sun, 14 Oct 2007 15:59:45 +0000 (UTC) (envelope-from lars@larseighner.com) Received: from mail.team1internet.com (mail.team1internet.com [216.110.13.10]) by mx1.freebsd.org (Postfix) with ESMTP id E348C13C480 for ; Sun, 14 Oct 2007 15:59:44 +0000 (UTC) (envelope-from lars@larseighner.com) Received: by mail.team1internet.com (Postfix, from userid 12346) id 51A5416B751; Sun, 14 Oct 2007 10:38:43 -0500 (CDT) Received: from larseighner.com (unknown [216.110.13.72]) by mail.team1internet.com (Postfix) with SMTP id 876C116B747; Sun, 14 Oct 2007 10:38:41 -0500 (CDT) Received: by larseighner.com (nbSMTP-1.00) for uid 1001 lars@larseighner.com; Sun, 14 Oct 2007 10:37:39 -0500 (CDT) Date: Sun, 14 Oct 2007 10:37:38 -0500 (CDT) From: Lars Eighner X-X-Sender: lars@debranded.6dollardialup.com To: d_elbracht In-Reply-To: <008801c80e65$47cbe650$639049d9@EC1a> Message-ID: 
<20071014103129.W19754@qroenaqrq.6qbyyneqvnyhc.pbz> References: <008801c80e65$47cbe650$639049d9@EC1a> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Sanitizer: Anomy and SpamAssassin mail filter - see http://www.6dollardialup.com/support/spaminfo.html X-Spam-Status: No, hits=-3.2 required=10.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,J_CHICKENPOX_52,OACYS_SINGLE, QUOTED_EMAIL_TEXT,REFERENCES,RM_sl_Parens, TO_LOCALPART_EQ_REAL version=2.43 X-Spam-Level: Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 15:59:45 -0000

On Sun, 14 Oct 2007, d_elbracht wrote:

> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
> 2007-10-09
>
> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x Opteron
> 2216, da3 is on a 3ware 9550-12
>
> we are seeing this error:
> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> on a 12 GB Hyperdrive

I trashed a perfectly good disk drive before learning that there is a serious bug in g_vfs. Apparently it is one of those things which shows up in some configurations and not others. Although I am told they are unable to isolate the problem, all the reports I've seen were from people using AMD systems.
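
As a quick sanity check on the offset quoted above, the arithmetic can be worked through in a few lines of Python. This is only a sketch: it assumes "12 GB" means 12 * 2**30 bytes (the exact media size is not given in the thread), and the variable names are made up for illustration. It compares the reported offset with the device size and shows the value in hexadecimal.

```python
# Offset taken from the g_vfs_done() message quoted in this thread.
offset = 81064794762854400

# Assumption: "12 GB" is read as 12 * 2**30 bytes; the real media size
# of the Hyperdrive is not stated in the thread.
media_size = 12 * 2**30

# True when the request starts beyond the end of the device.
beyond_device = offset > media_size

# Split the 64-bit value into its upper and lower 32-bit halves.
high_half = offset >> 32
low_half = offset & 0xFFFFFFFF

print(f"offset      = {offset:#018x}")
print(f"media size  = {media_size:#018x}")
print(f"beyond end  = {beyond_device}")
print(f"high 32 bit = {high_half:#010x}")
print(f"low 32 bit  = {low_half:#010x} ({low_half} bytes)")
```

Under that assumption the offset is far beyond the end of the device. The lower 32 bits on their own come to roughly 1.37 GiB, which would fit on a 12 GB device, while the upper 32 bits have only two bits set. What that says about the cause is not something the arithmetic alone can settle.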
From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 16:09:21 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CD3B916A46B; Sun, 14 Oct 2007 16:09:21 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from ecngs.de (mail.ecngs.de [217.73.144.50]) by mx1.freebsd.org (Postfix) with ESMTP id 144D113C457; Sun, 14 Oct 2007 16:09:20 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from EC1a (ec1.elbracht.net [217.73.144.99]) by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773356-1922481 for multiple; Sun, 14 Oct 2007 18:09:25 +0200 From: "d_elbracht" To: =?iso-8859-1?Q?'Arne_W=F6rner'?= , "'Scott Long'" References: <47121F9F.7050900@samsco.org> <847856.24179.qm@web30308.mail.mud.yahoo.com> Date: Sun, 14 Oct 2007 18:08:57 +0200 Message-ID: <008f01c80e7c$876c89b0$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: AcgOcRU1cBtJiEb5TsWoeT7TKFDfDAACwHKw In-Reply-To: <847856.24179.qm@web30308.mail.mud.yahoo.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 16:09:21 -0000 > --- Scott Long wrote: > > I can geneate 30,000 I/O's per second for hours on end on several > > types of storage hardware on FreeBSD SMP, and have no > problems. 
Since > > you're seeing this problem both when connected to a 3ware > controller > > and when connected to a simple ATA/SATA controller (both of > which have > > also been observed to do high amounts of I/O with no problems), I > > suspect that the problem is with your disk device, not with > FreeBSD. > > I don't know anything about a "hyperdrive" though, so more > information might help. > > > > Scott > > > I would say so, too... > > Especially because errno 5 is EIO: > http://www.freebsd.org/cgi/man.cgi?query=errno&apropos=0&sekti > on=0&manpath=FreeBSD+6.2-RELEASE&format=html > > -Arne

I would agree with you on that, if the error (EIO) is NOT because of the READ going wrong in the first place.
From my understanding, the offset 81064794762854400 is NOT within the 12 GB of the drive anymore. Or does the offset mean something else?

Dieter

From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 18:45:29 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 856CF16A41A; Sun, 14 Oct 2007 18:45:29 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.freebsd.org (Postfix) with ESMTP id DAAED13C44B; Sun, 14 Oct 2007 18:45:28 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11]) (authenticated bits=0) by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9EIFPer060995; Sun, 14 Oct 2007 12:15:25 -0600 (MDT) (envelope-from scottl@samsco.org) Message-ID: <47125C9E.1040109@samsco.org> Date: Sun, 14 Oct 2007 12:14:54 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4 MIME-Version: 1.0 To: Lars Eighner References: <008801c80e65$47cbe650$639049d9@EC1a> <20071014103129.W19754@qroenaqrq.6qbyyneqvnyhc.pbz> In-Reply-To:
<20071014103129.W19754@qroenaqrq.6qbyyneqvnyhc.pbz> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]); Sun, 14 Oct 2007 12:15:26 -0600 (MDT) X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org Cc: freebsd-stable@freebsd.org, d_elbracht , freebsd-geom@freebsd.org Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 18:45:29 -0000 Lars Eighner wrote: > On Sun, 14 Oct 2007, d_elbracht wrote: > >> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of >> 2007-10-09 >> >> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x >> Opteron >> 2216, da3 is on a 3ware 9550-12 >> >> we are seeing this error: >> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 >> on a 12 GB Hyperdrive > > I trashed a perfectly disk drive before learning that there is a serious > bug > in g_vfs. Apparently it is one of those things which shows up in some > configurations and not others. Although I am told they are unable to > isolate the problem, all the reports I've seen were from people using AMD > systems. > Are you talking about problems with ATA controllers, AMD64 (or i386+PAE), and more than 4GB of RAM? Or something else? 
Scott From owner-freebsd-geom@FreeBSD.ORG Sun Oct 14 23:22:35 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 629A916A468; Sun, 14 Oct 2007 23:22:35 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.freebsd.org (Postfix) with ESMTP id 0B32013C50D; Sun, 14 Oct 2007 23:22:34 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11]) (authenticated bits=0) by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9ENMSTf062135; Sun, 14 Oct 2007 17:22:29 -0600 (MDT) (envelope-from scottl@samsco.org) Message-ID: <4712A494.30803@samsco.org> Date: Sun, 14 Oct 2007 17:21:56 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4 MIME-Version: 1.0 To: Ivan Voras References: <008801c80e65$47cbe650$639049d9@EC1a> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]); Sun, 14 Oct 2007 17:22:29 -0600 (MDT) X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2007 23:22:35 -0000 Ivan Voras wrote: > d_elbracht wrote: >> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of >> 
2007-10-09 >> >> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x Opteron >> 2216, da3 is on a 3ware 9550-12 >> >> we are seeing this error: >> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 >> on a 12 GB Hyperdrive >> >> the offset changes sometimes, but it is always 81064794xxxxxxxxx and well >> out the 12GB range. > > Yes. > >> According to systat -vm, da3 does tps > 500 (yes, that's a lot) > > That's not a lot :) That's actually low for a modern solid state drive. > >> This leads to an assumption, the error has to do with very high IOs per >> second on a SMP machine. > > Either that or file system errors. Does fsck run ok or does it say > anything unusual? > No, filesystem corruption has nothing to do with g_vfs_done messages. Scott From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 00:04:57 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1916316A46C for ; Mon, 15 Oct 2007 00:04:57 +0000 (UTC) (envelope-from janm@transactionware.com) Received: from mail.transactionware.com (mail.transactionware.com [203.14.245.7]) by mx1.freebsd.org (Postfix) with SMTP id 1A3D613C480 for ; Mon, 15 Oct 2007 00:04:55 +0000 (UTC) (envelope-from janm@transactionware.com) Received: (qmail 90464 invoked from network); 14 Oct 2007 23:38:35 -0000 Received: from midgard.transactionware.com (192.168.1.55) by dm.transactionware.com with SMTP; 14 Oct 2007 23:38:35 -0000 Received: (qmail 20180 invoked by uid 907); 14 Oct 2007 23:38:12 -0000 Received: from [192.168.1.51] (HELO janmxp) (192.168.1.51) by midgard.transactionware.com (qpsmtpd/0.32) with ESMTP; Mon, 15 Oct 2007 09:38:12 +1000 From: "Jan Mikkelsen" To: "'Scott Long'" , "'Ivan Voras'" References: <008801c80e65$47cbe650$639049d9@EC1a> <4712A494.30803@samsco.org> In-Reply-To: <4712A494.30803@samsco.org> Date: Mon, 15 Oct 2007 09:38:12 +1000 Organization: 
Transactionware Message-ID: <000a01c80ebb$49227f90$db677eb0$@com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 12.0 Thread-Index: AcgOuTwXgPkltFJRQheLCsRVqNf9bwAAU00w Content-Language: en-au Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: RE: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 00:04:57 -0000 Scott Long wrote: > Ivan Voras wrote: > > Either that or file system errors. Does fsck run ok or does > it say > > anything unusual? > > > > No, filesystem corruption has nothing to do with g_vfs_done > messages. Well, perhaps not directly but I think filesystem corruption can indirectly cause g_vfs_done messages. If a filesystem is corrupt, the filesystem might attempt to read an out-of-range block, leading to a g_vfs_done error. This was the case for some of the arcmsr problems last year. In this case, I think the original poster said that the block number was out of range for the device. 
Regards, Jan Mikkelsen From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 00:14:05 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1199F16A419; Mon, 15 Oct 2007 00:14:05 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.freebsd.org (Postfix) with ESMTP id A924A13C455; Mon, 15 Oct 2007 00:14:04 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from phobos.samsco.home (phobos.samsco.home [192.168.254.11]) (authenticated bits=0) by pooker.samsco.org (8.13.8/8.13.8) with ESMTP id l9F0DvlP062319; Sun, 14 Oct 2007 18:13:58 -0600 (MDT) (envelope-from scottl@samsco.org) Message-ID: <4712B0A6.1050408@samsco.org> Date: Sun, 14 Oct 2007 18:13:26 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4 MIME-Version: 1.0 To: Jan Mikkelsen References: <008801c80e65$47cbe650$639049d9@EC1a> <4712A494.30803@samsco.org> <000a01c80ebb$49227f90$db677eb0$@com> In-Reply-To: <000a01c80ebb$49227f90$db677eb0$@com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (pooker.samsco.org [168.103.85.57]); Sun, 14 Oct 2007 18:13:58 -0600 (MDT) X-Spam-Status: No, score=-1.4 required=5.5 tests=ALL_TRUSTED autolearn=failed version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on pooker.samsco.org Cc: freebsd-stable@freebsd.org, 'Ivan Voras' , freebsd-geom@freebsd.org Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 
Oct 2007 00:14:05 -0000 Jan Mikkelsen wrote: > Scott Long wrote: >> Ivan Voras wrote: >>> Either that or file system errors. Does fsck run ok or does >> it say >>> anything unusual? >>> >> No, filesystem corruption has nothing to do with g_vfs_done >> messages. > > Well, perhaps not directly but I think filesystem corruption can > indirectly cause g_vfs_done messages. > > If a filesystem is corrupt, the filesystem might attempt to read an > out-of-range block, leading to a g_vfs_done error. This was the > case for some of the arcmsr problems last year. > > In this case, I think the original poster said that the block > number was out of range for the device. > > Regards, > > Jan Mikkelsen > > Yeah, you're right, the block number is absurd, and it could well be caused by a bad block pointer in the filesystem. It sounds like he's getting this problem even on fresh installs, which ordinarily would point to a bad driver. If it's happening with both TWA and ATA, it's hard to blame both of those drivers. 
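For what it's worth, the "absurd block number" observation can be checked mechanically. The sketch below is only an illustration and is not taken from the kernel: it parses a message of the form quoted in this thread and tests whether the request fits inside the device. The names parse_g_vfs_done, in_range and ASSUMED_DEVICE_BYTES are made up for this example, and the device size assumes that "12 GB" means 12 * 2^30 bytes.

```python
import re

# Matches the kernel message format quoted in this thread, e.g.
# g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
G_VFS_DONE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>[\w/.]+)\[(?P<op>\w+)"
    r"\(offset=(?P<offset>-?\d+), length=(?P<length>\d+)\)\]"
    r"error = (?P<error>\d+)"
)


def parse_g_vfs_done(line):
    """Return the fields of a g_vfs_done() message, or None if no match."""
    m = G_VFS_DONE_RE.search(line)
    if m is None:
        return None
    return {
        "dev": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "error": int(m.group("error")),
    }


def in_range(offset, length, device_bytes):
    """True if the byte range [offset, offset + length) lies on the device."""
    return 0 <= offset and offset + length <= device_bytes


# Assumption: the "12 GB Hyperdrive" is taken to be 12 * 2**30 bytes.
ASSUMED_DEVICE_BYTES = 12 * 1024**3

msg = "g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5"
req = parse_g_vfs_done(msg)
print(req["dev"], in_range(req["offset"], req["length"], ASSUMED_DEVICE_BYTES))
# prints: da3s1a False
```

Error 5 is EIO. The check says nothing about why the offset is wrong, only that the request could never have been satisfied by a device of the assumed size.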
Scott From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 08:21:06 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6461616A46B; Mon, 15 Oct 2007 08:21:06 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from ecngs.de (mail.ecngs.de [217.73.144.50]) by mx1.freebsd.org (Postfix) with ESMTP id 7547B13C447; Mon, 15 Oct 2007 08:21:04 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from EC1a (ec1.elbracht.net [217.73.144.99]) by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1774348-1922481 for multiple; Mon, 15 Oct 2007 10:21:26 +0200 From: "d_elbracht" To: "'Ivan Voras'" , References: <008801c80e65$47cbe650$639049d9@EC1a> Date: Mon, 15 Oct 2007 10:20:57 +0200 Message-ID: <00cb01c80f04$50b11ed0$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: AcgOsevpOahtmKUeQKG7YhTDqm4A3wATlmcA In-Reply-To: X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 Cc: freebsd-geom@freebsd.org Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 08:21:06 -0000 > > we are trying to diagnose errors seen on 6.2, SMP, amd64, > cvsup'ed of > > 2007-10-09 > > > > Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x > > Opteron 2216, da3 is on a 3ware 9550-12 > > > > we are seeing this error: > > g_vfs_done():da3s1a[READ(offset=81064794762854400, > length=8192)]error > > = 5 on a 12 GB Hyperdrive > > > > the offset changes sometimes, but it is always > 81064794xxxxxxxxx and > > well out the 12GB range. > > Yes. 
>
> > According to systat -vm, da3 does tps > 500 (yes, that's a lot)
>
> That's not a lot :) That's actually low for a modern solid
> state drive.
>
> > This leads to an assumption, the error has to do with very high IOs
> > per second on a SMP machine.
>
> Either that or file system errors. Does fsck run ok or does
> it say anything unusual?
>
> There are several theoretical reasons for such errors that
> are connected with the fact you use solid state drives, but
> all are tricky to diagnose if you don't have a certain
> repeatable test you can try. For example:
> some SSDs optimize writes to "spread out" the IO on the
> chips, but some do it by looking into file system structures
> to determine where it's safe to relocate the write -
> obviously this works only with a known and supported file
> system. This is a really wild guess, but maybe the SSD
> firmware has error somewhere in this area, trying to
> interpret UFS as it was FAT? If you manage to get a
> repeatable failure test, you can try formatting the drive as
> FAT32 and trying it on that.
>
> Or maybe it's just a bad drive...
>
> > The system-disk is a RAID1 on an ICP 5805. All other disks
> (51) are 20
> > gstripe'd partitions.
>
> 51 drives and 20 partitions?
>

According to the manufacturer, the drive handles any filesystem. In other
words, it's as transparent as any hard disk would be.
Also, as written before, we have seen error=5 with weird offsets on an
md (memory disk) before, too.
fsck on the disk does NOT show any errors.

Yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann). That's
for hashfeed from diablo.

One basic question to ask: where does the value for offset= in g_vfs_done()
come from?
Judging by the times at which the error shows up in syslog, I believe the
error only happens when a file gets appended to.
Dieter From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 09:05:55 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BB36416A417 for ; Mon, 15 Oct 2007 09:05:55 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from nz-out-0506.google.com (nz-out-0506.google.com [64.233.162.235]) by mx1.freebsd.org (Postfix) with ESMTP id 6C66313C467 for ; Mon, 15 Oct 2007 09:05:55 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: by nz-out-0506.google.com with SMTP id l8so798493nzf for ; Mon, 15 Oct 2007 02:05:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; bh=WIClknsSnPfxISdvAzWuHmvEhZq1qoSb1UyLu/2nYo8=; b=XM8x7V1ck80TXgwvNPkcQy4OazVx8cqK8TN4IG+JN9WPYxAlwYcG/SyqUjeQVbSe6Rxze01iY8LCa4tAI4Dt+5mE6XM+GhF01S7HPJiMcHLnmT63Gg1yFcsT5s6ftfB0asn9UQh7S/0ChfPCFNJJrG1zhjSUUsO3rYKLEM86BqM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; b=oZzwQI0cTs+xE8G+rM1iG2sudtZzb//PZAKxHg10kbytWT7rfOCg/HL7u1192PFP5bSWNyWhNF2hXuHhxYwQwM7+vpT4jKy8o+c+PMQZbMWdiilCVXoHBH+EYabLKTQ/UeRpp3d8Zjun6SnhSG8NoGJzLXRxXnPR9NMy/rLMRBU= Received: by 10.141.15.19 with SMTP id s19mr2574331rvi.1192439154081; Mon, 15 Oct 2007 02:05:54 -0700 (PDT) Received: by 10.141.211.5 with HTTP; Mon, 15 Oct 2007 02:05:54 -0700 (PDT) Message-ID: <9bbcef730710150205o7c344432kc8bc828da64bff1f@mail.gmail.com> Date: Mon, 15 Oct 2007 11:05:54 +0200 From: "Ivan Voras" Sender: ivoras@gmail.com To: d_elbracht In-Reply-To: <00cb01c80f04$50b11ed0$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; 
charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <008801c80e65$47cbe650$639049d9@EC1a> <00cb01c80f04$50b11ed0$639049d9@EC1a> X-Google-Sender-Auth: 11c2e076073077f8 Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: Re: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 09:05:55 -0000 On 15/10/2007, d_elbracht wrote: > One basic question to ask: where does the value for offset= in g_vfs_done() > come from ? Either from the file system or from bugs in the code. I don't remember seeing similar reports before so the probability of there being bugs in the code is fairly small. This is all on raw hardware, not vmware, right? > From the time the error shows up in syslog I believe, the error only > happens, when a file get's appended. 
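On the question of where the offset= value might come from: one cheap thing to look at, before any tracing, is the bit pattern of the reported number. The sketch below is purely illustrative and assumes nothing about the kernel code path. It splits the offset from the subject line into its upper and lower 32 bits. The helper name split_u64 and the 12 * 2^30 byte device size are assumptions made for this example.

```python
def split_u64(value):
    """Return the (high, low) 32-bit halves of an unsigned 64-bit value."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value does not fit in 64 bits")
    return value >> 32, value & 0xFFFFFFFF


offset = 81064794762854400  # from the g_vfs_done() message in the subject
high, low = split_u64(offset)

print(f"offset = {offset:#018x}")  # offset = 0x0120000057a14000
print(f"high   = {high:#010x}")    # high   = 0x01200000
print(f"low    = {low:#010x}")     # low    = 0x57a14000

# Assumption: a 12 GB device is 12 * 2**30 bytes, so the largest valid
# byte offset has a high half of at most 2.
ASSUMED_DEVICE_BYTES = 12 * 1024**3
max_high = (ASSUMED_DEVICE_BYTES - 1) >> 32
print("high half larger than any valid offset allows:", high > max_high)
```

This only shows which bits are set. Whether they come from an on-disk block pointer or from somewhere in the code cannot be read off the number itself.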
From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 09:14:36 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D3C4D16A419; Mon, 15 Oct 2007 09:14:36 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from ecngs.de (mail.ecngs.de [217.73.144.50]) by mx1.freebsd.org (Postfix) with ESMTP id E604213C478; Mon, 15 Oct 2007 09:14:35 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from EC1a (ec1.elbracht.net [217.73.144.99]) by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1774497-1922481 for multiple; Mon, 15 Oct 2007 11:14:57 +0200 From: "d_elbracht" To: "'Ivan Voras'" References: <008801c80e65$47cbe650$639049d9@EC1a> <00cb01c80f04$50b11ed0$639049d9@EC1a> <9bbcef730710150205o7c344432kc8bc828da64bff1f@mail.gmail.com> Date: Mon, 15 Oct 2007 11:14:29 +0200 Message-ID: <00cc01c80f0b$cafa7e50$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: AcgPCq+WuEiplNxCQ/aZVM3L84OMoQAANokw In-Reply-To: <9bbcef730710150205o7c344432kc8bc828da64bff1f@mail.gmail.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 09:14:36 -0000 > > One basic question to ask: where does the value for offset= in > > g_vfs_done() come from ? > > Either from the file system or from bugs in the code. I don't > remember seeing similar reports before so the probability of > there being bugs in the code is fairly small. > > This is all on raw hardware, not vmware, right? 
> > > From the time the error shows up in syslog I believe, the > error only > > happens, when a file get's appended. Here is a similar one: http://www.nabble.com/g_vfs_done():mfid1-ERROR-when-writing-to-18TB-MFI-RAID -volume-t4590438.html it's all raw hardware, no vmware Dieter From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 11:06:15 2007 Return-Path: Delivered-To: freebsd-geom@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60BF616A475 for ; Mon, 15 Oct 2007 11:06:15 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 322C413C4A6 for ; Mon, 15 Oct 2007 11:06:15 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.1/8.14.1) with ESMTP id l9FB6EXs080436 for ; Mon, 15 Oct 2007 11:06:14 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.1/8.14.1/Submit) id l9FB6Eub080434 for freebsd-geom@FreeBSD.org; Mon, 15 Oct 2007 11:06:14 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 15 Oct 2007 11:06:14 GMT Message-Id: <200710151106.l9FB6Eub080434@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-geom@FreeBSD.org Cc: Subject: Current problem reports assigned to you X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 11:06:15 -0000 From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 14:17:21 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org 
[IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E56016A418; Mon, 15 Oct 2007 14:17:21 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com [72.36.161.186]) by mx1.freebsd.org (Postfix) with ESMTP id 310B813C467; Mon, 15 Oct 2007 14:17:20 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net [209.163.168.124]) (authenticated bits=0) by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9FEGRLq005947; Mon, 15 Oct 2007 09:16:30 -0500 (CDT) (envelope-from anderson@freebsd.org) Message-ID: <47137634.1010703@freebsd.org> Date: Mon, 15 Oct 2007 09:16:20 -0500 From: Eric Anderson User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: d_elbracht References: <008801c80e65$47cbe650$639049d9@EC1a> <00cb01c80f04$50b11ed0$639049d9@EC1a> In-Reply-To: <00cb01c80f04$50b11ed0$639049d9@EC1a> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com Cc: 'Ivan Voras' , freebsd-geom@freebsd.org Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 14:17:21 -0000 d_elbracht wrote: >>> we are trying to diagnose errors seen on 6.2, SMP, amd64, >> cvsup'ed of >>> 2007-10-09 >>> >>> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x >>> Opteron 2216, da3 is on a 3ware 9550-12 >>> >>> we are seeing this error: >>> g_vfs_done():da3s1a[READ(offset=81064794762854400, >> length=8192)]error >>> = 5 on a 12 GB Hyperdrive >>> >>> the 
offset changes sometimes, but it is always
>> 81064794xxxxxxxxx and
>>> well out the 12GB range.
>> Yes.
>>
>>> According to systat -vm, da3 does tps > 500 (yes, that's a lot)
>> That's not a lot :) That's actually low for a modern solid
>> state drive.
>>
>>> This leads to an assumption, the error has to do with very high IOs
>>> per second on a SMP machine.
>> Either that or file system errors. Does fsck run ok or does
>> it say anything unusual?
>>
>> There are several theoretical reasons for such errors that
>> are connected with the fact you use solid state drives, but
>> all are tricky to diagnose if you don't have a certain
>> repeatable test you can try. For example:
>> some SSDs optimize writes to "spread out" the IO on the
>> chips, but some do it by looking into file system structures
>> to determine where it's safe to relocate the write -
>> obviously this works only with a known and supported file
>> system. This is a really wild guess, but maybe the SSD
>> firmware has error somewhere in this area, trying to
>> interpret UFS as it was FAT? If you manage to get a
>> repeatable failure test, you can try formatting the drive as
>> FAT32 and trying it on that.

Solid state drives don't behave much differently than a regular drive
from FreeBSD's point of view. The huge difference most people notice is
that they perform best at their page size (or maybe what the SSD
manufacturer might call a block size, which is not a sector size), which
is often 128K or 256K. IO smaller than the page size suffers a big
penalty since most SSD devices do not have a cache onboard (although
some do now).

>> Or maybe it's just a bad drive...

I doubt it's a bad device.

>>> The system-disk is a RAID1 on an ICP 5805. All other disks
>> (51) are 20
>>> gstripe'd partitions.
>> 51 drives and 20 partitions?
>>
> According to the manufaturer, the drive handles any filesystem. In other
> words, it's as transparent as any harddisk would be.
> Also, as written before, we have seen the error=5 with weird offsets on an > md (memory disk) before too. > fsck on the disk does NOT show any error. > > yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann). That's > for hashfeed from diablo. > > One basic question to ask: where does the value for offset= in g_vfs_done() > come from ? >>From the time the error shows up in syslog I believe, the error only > happens, when a file get's appended. I wonder if (wild guess follows) there's a 32/64 bit conversion problem somewhere, like a 32bit number cast as 64bit or something. I'd like to see a full trace to see what path it takes. Maybe putting a panic in the error path would be worth doing. Eric From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 15:06:40 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5C30816A418 for ; Mon, 15 Oct 2007 15:06:40 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id E031B13C442 for ; Mon, 15 Oct 2007 15:06:39 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (unknown [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 95CF517105; Mon, 15 Oct 2007 15:06:37 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.1/8.14.1) with ESMTP id l9FF6abM048314; Mon, 15 Oct 2007 15:06:36 GMT (envelope-from phk@critter.freebsd.dk) To: Eric Anderson From: "Poul-Henning Kamp" In-Reply-To: Your message of "Mon, 15 Oct 2007 09:16:20 EST." 
<47137634.1010703@freebsd.org> Date: Mon, 15 Oct 2007 15:06:36 +0000 Message-ID: <48313.1192460796@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: d_elbracht , 'Ivan Voras' , freebsd-geom@freebsd.org Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 15:06:40 -0000

In message <47137634.1010703@freebsd.org>, Eric Anderson writes:
>Solid state drives don't behave much differently that a regular drive
>from FreeBSD's point of view.

Yes and no. The effective lack of seek time has the potential to expose
a lot of flawed reasoning in filesystems with respect to ordering and
duration of I/O requests.

It might be a good idea to have a GEOM module that could implement a
seek-time sort of behaviour, just to be able to falsify that
theory.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 16:52:35 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7DAF116A419; Mon, 15 Oct 2007 16:52:35 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com [72.36.161.186]) by mx1.freebsd.org (Postfix) with ESMTP id 5104613C467; Mon, 15 Oct 2007 16:52:35 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net [209.163.168.124]) (authenticated bits=0) by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9FGqF4U080757; Mon, 15 Oct 2007 11:52:16 -0500 (CDT) (envelope-from anderson@freebsd.org) Message-ID: <47139AB8.9060602@freebsd.org> Date: Mon, 15 Oct 2007 11:52:08 -0500 From: Eric Anderson User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Poul-Henning Kamp References: <48313.1192460796@critter.freebsd.dk> In-Reply-To: <48313.1192460796@critter.freebsd.dk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com Cc: d_elbracht , 'Ivan Voras' , freebsd-geom@freebsd.org Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 16:52:35 -0000 Poul-Henning Kamp wrote: > In message <47137634.1010703@freebsd.org>, Eric Anderson writes: > >> Solid state drives don't behave much differently that a regular drive >>from FreeBSD's point of view. > > Yes and no. 
The effective lack of seek time has the potential to expose > a lot of flawed reasoning in filesystems with respect to ordering and > duration of I/O requests. > > It might be a good idea to have GEOM module that could implement a > seek-time sort of behaviour, just for being able to falsifying that > theory. > Or an option to gnop? Eric From owner-freebsd-geom@FreeBSD.ORG Mon Oct 15 17:47:08 2007 Return-Path: Delivered-To: freebsd-geom@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 06D1C16A474 for ; Mon, 15 Oct 2007 17:47:08 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id DDDC113C46E for ; Mon, 15 Oct 2007 17:47:07 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.1/8.14.1) with ESMTP id l9FHl7hF014949 for ; Mon, 15 Oct 2007 17:47:07 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.1/8.14.1/Submit) id l9FHl7lO014945 for freebsd-geom@FreeBSD.org; Mon, 15 Oct 2007 17:47:07 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 15 Oct 2007 17:47:07 GMT Message-Id: <200710151747.l9FHl7lO014945@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-geom@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2007 17:47:08 -0000 Current FreeBSD problem reports Critical problems Serious problems S Tracker Resp. 
Description
--------------------------------------------------------------------------------
o kern/73177   geom  kldload geom_* causes panic due to memory exhaustion
o kern/76538   geom  [gbde] nfs-write on gbde partition stalls and continue
o kern/83464   geom  [geom] [patch] Unhandled malloc failures within libgeo
o kern/84556   geom  [geom] GBDE-encrypted swap causes panic at shutdown
o kern/87544   geom  [gbde] mmaping large files on a gbde filesystem deadlo
o kern/89102   geom  [geom_vfs] [panic] panic when forced unmount FS from u
o bin/90093    geom  fdisk(8) incapable of altering in-core geometry
o kern/90582   geom  [geom_mirror] [panic] Restore cause panic string (ffs_
o kern/98034   geom  [geom] dereference of NULL pointer in acd_geom_detach
o kern/104389  geom  [geom] [patch] sys/geom/geom_dump.c doesn't encode XML
o kern/113419  geom  [geom] geom fox multipathing not failing back
o misc/113543  geom  [geom] [patch] geom(8) utilities don't work inside the
o kern/113957  geom  [gmirror] gmirror is intermittently reporting a degrad
o kern/115572  geom  [gbde] gbde partitions fail at 28bit/48bit LBA address

14 problems total.

Non-critical problems

S Tracker      Resp. Description
--------------------------------------------------------------------------------
o bin/78131    geom  gbde "destroy" not working.
o kern/79251   geom  [2TB] newfs fails on 2.6TB gbde device
o kern/94632   geom  [geom] Kernel output resets input while GELI asks for
f kern/105390  geom  [geli] filesystem on a md backed by sparse file with s
o kern/107707  geom  [geom] [patch] add new class geom_xbox360 to slice up
p bin/110705   geom  gmirror control utility does not exit with correct exi
o kern/113837  geom  [geom] unable to access 1024 sector size storage
o kern/113885  geom  [geom] [patch] improved gmirror balance algorithm
o kern/114532  geom  GEOM_MIRROR shows up in kldstat even if compiled in th
o kern/115547  geom  [geom] [patch] for GEOM Eli to get password from stdin

10 problems total.
From owner-freebsd-geom@FreeBSD.ORG Tue Oct 16 10:05:35 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 598B516A417; Tue, 16 Oct 2007 10:05:35 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from ecngs.de (mail.ecngs.de [217.73.144.50]) by mx1.freebsd.org (Postfix) with ESMTP id 1F6FA13C45D; Tue, 16 Oct 2007 10:05:33 +0000 (UTC) (envelope-from d_elbracht@ecngs.de) Received: from EC1a (ec1.elbracht.net [217.73.144.99]) by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1777227-1922481 for multiple; Tue, 16 Oct 2007 12:05:54 +0200 From: "d_elbracht" To: "'Eric Anderson'" References: <008801c80e65$47cbe650$639049d9@EC1a> <00cb01c80f04$50b11ed0$639049d9@EC1a> <47137634.1010703@freebsd.org> Date: Tue, 16 Oct 2007 12:05:23 +0200 Message-ID: <000b01c80fdc$12582f10$639049d9@EC1a> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 In-Reply-To: <47137634.1010703@freebsd.org> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 Thread-Index: AcgPN6GqbK/sjYCaTtGAOMP49/u7+wAo/O0w Cc: freebsd-stable@freebsd.org, 'Ivan Voras' , freebsd-geom@freebsd.org Subject: AW: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2007 10:05:35 -0000 > > One basic question to ask: where does the value for offset= in > > g_vfs_done() come from ? > >>From the time the error shows up in syslog I believe, the error only > > happens, when a file get's appended. > > I wonder if (wild guess follows) there's a 32/64 bit > conversion problem somewhere, like a 32bit number cast as > 64bit or something. 
> > I'd like to see a full trace to see what path it takes. > Maybe putting a > panic in the error path would be worth doing. > can you give me some hints please how to do this ? I'm willing to try about everything to get this problem nailed down. Dieter From owner-freebsd-geom@FreeBSD.ORG Tue Oct 16 14:48:20 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1375E16A420; Tue, 16 Oct 2007 14:48:20 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com [217.20.163.9]) by mx1.freebsd.org (Postfix) with ESMTP id D05CC13C46A; Tue, 16 Oct 2007 14:48:19 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from localhost (localhost [127.0.0.1]) by falcon.cybervisiontech.com (Postfix) with ESMTP id E8EB9744009; Tue, 16 Oct 2007 17:14:06 +0300 (EEST) X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com Received: from falcon.cybervisiontech.com ([127.0.0.1]) by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id nuJ0nV7ZYjJR; Tue, 16 Oct 2007 17:14:06 +0300 (EEST) Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by falcon.cybervisiontech.com (Postfix) with ESMTP id 52AFE744008; Tue, 16 Oct 2007 17:14:06 +0300 (EEST) Message-ID: <4714C724.6000809@icyb.net.ua> Date: Tue, 16 Oct 2007 17:13:56 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.6 (X11/20070803) MIME-Version: 1.0 To: d_elbracht References: <1192382586.00813930.1192369201@10.7.7.3> <1192414981.00814129.1192401601@10.7.7.3> <1192447399.00814254.1192437006@10.7.7.3> <1192468999.00814418.1192458001@10.7.7.3> <1192540986.00814865.1192529403@10.7.7.3> In-Reply-To: <1192540986.00814865.1192529403@10.7.7.3> Content-Type: text/plain; charset=ISO-8859-1 
Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org, freebsd-stable@freebsd.org, 'Ivan Voras' Subject: Re: AW: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2007 14:48:20 -0000 Just a wild shot here: I have seen a similar message recently when I played with my disks. I re-arranged some partitions (and filesystems) within a slice and it so happened (and I almost know why) that there was some discrepancy between on-disk and in-memory label of that slice. I ran newfs on one of the new partitions and apparently it used one label to determine its size, but after the reboot the other label was used. As a result I had a UFS2 filesystem with size larger than a partition that hosted it. And after that I saw the messages similar to the one in the subject. All of the above is a result of my understanding of how these things work, so it may be incorrect. But making sure that disklabels match (that is, there is only one disklabel) and re-newfs-ing the filesystems did help me. So I would compare, just in case, outputs of, say, 'dumpfs -m' near '-s' and disklabel output. Just my 2 bits. P.S. 
example of the error that I had: g_vfs_done():ad4s1e[READ(offset=20420280320, length=16384)]error = 5 -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Tue Oct 16 14:56:15 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 76BE216A421; Tue, 16 Oct 2007 14:56:15 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com [72.36.161.186]) by mx1.freebsd.org (Postfix) with ESMTP id 6633413C4A5; Tue, 16 Oct 2007 14:56:15 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net [209.163.168.124]) (authenticated bits=0) by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9GEtrpe084402; Tue, 16 Oct 2007 09:55:55 -0500 (CDT) (envelope-from anderson@freebsd.org) Message-ID: <4714D0F1.2000903@freebsd.org> Date: Tue, 16 Oct 2007 09:55:45 -0500 From: Eric Anderson User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: d_elbracht References: <008801c80e65$47cbe650$639049d9@EC1a> <00cb01c80f04$50b11ed0$639049d9@EC1a> <47137634.1010703@freebsd.org> <000b01c80fdc$12582f10$639049d9@EC1a> In-Reply-To: <000b01c80fdc$12582f10$639049d9@EC1a> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com Cc: 'Ivan Voras' , freebsd-geom@freebsd.org Subject: Re: AW: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2007 14:56:15 -0000 d_elbracht wrote: >>> One 
basic question to ask: where does the value for offset= in >>> g_vfs_done() come from ? >>> >From the time the error shows up in syslog I believe, the error only >>> happens, when a file get's appended. >> I wonder if (wild guess follows) there's a 32/64 bit >> conversion problem somewhere, like a 32bit number cast as >> 64bit or something. >> >> I'd like to see a full trace to see what path it takes. >> Maybe putting a >> panic in the error path would be worth doing. >> > > can you give me some hints please how to do this ? I'm willing to try about > everything to get this problem nailed down. I would add debugging to your kernel config, and then around here: http://fxr.googlebit.com/source/sys/geom/geom_vfs.c?v=8-CURRENT#L77 change the printf to a panic(), and recompile your kernel. Also, don't forget to set up a dump partition (swap). You can find out how to do the debugging parts and dump partition in the Handbook. Eric From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 08:12:06 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B415016A419 for ; Wed, 17 Oct 2007 08:12:06 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com [217.20.163.9]) by mx1.freebsd.org (Postfix) with ESMTP id 35D2613C458 for ; Wed, 17 Oct 2007 08:12:05 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from localhost (localhost [127.0.0.1]) by falcon.cybervisiontech.com (Postfix) with ESMTP id 7854843C315; Wed, 17 Oct 2007 11:12:02 +0300 (EEST) X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com Received: from falcon.cybervisiontech.com ([127.0.0.1]) by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id w1QngMTGcRYH; Wed, 17 Oct 2007 11:12:02 +0300 (EEST) Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18]) (using TLSv1 with cipher 
DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by falcon.cybervisiontech.com (Postfix) with ESMTP id E449743C28E; Wed, 17 Oct 2007 11:12:01 +0300 (EEST) Message-ID: <4715C3D1.3070308@icyb.net.ua> Date: Wed, 17 Oct 2007 11:12:01 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.6 (X11/20070803) MIME-Version: 1.0 To: freebsd-geom@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: Pawel Jakub Dawidek Subject: gjournal: FLUSHCACHE timed out X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2007 08:12:06 -0000 Couple of days ago I started using gjournal on FreeBSD 6.2 using a patch from here: http://people.freebsd.org/~pjd/patches/gjournal6.patch I actually had to make 4 minor and obvious tweaks to the patch to make it apply cleanly to my src. I started to get the following messages sometimes: kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. vvvvvvvvv this one is unusual and is found only once kernel: handle_workitem_freeblocks: block count ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. 
kernel: ad4: FAILURE - FLUSHCACHE timed out kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. ad4s1ge (please don't pay attention to its slightly unusual name, this is for historic reasons) is a journal partition/consumer for my /var filesystem/partition/provider. Size of /var is 16G, size of the journal is slightly less than 1G (1G - 32 sectors actually). /var is UFS2 with softupdates enabled. I noticed that I get these messages only when I run 'dump' on any of my filesystems. I think that dump is using /tmp or /var/tmp for some temporary data and in my setup both of those are in /var filesystem. So my I guess is that /var is being written "too" actively and I have to tune some parameters to make things smooth. More information: $ uname -srm FreeBSD 6.2-RELEASE-p6 amd64 $ sysctl -a | fgrep journal kern.geom.journal.debug: 0 kern.geom.journal.switch_time: 10 kern.geom.journal.parallel_flushes: 16 kern.geom.journal.accept_immediately: 64 kern.geom.journal.parallel_copies: 16 kern.geom.journal.record_entries: 20 kern.geom.journal.optimize: 0 kern.geom.journal.cache.used: 16384 kern.geom.journal.cache.limit: 209715200 kern.geom.journal.cache.divisor: 2 kern.geom.journal.cache.switch: 90 kern.geom.journal.cache.misses: 0 kern.geom.journal.cache.alloc_failures: 0 kern.geom.journal.stats.skipped_bytes: 241266688 kern.geom.journal.stats.combined_ios: 62184 kern.geom.journal.stats.switches: 24144 kern.geom.journal.stats.wait_for_copy: 0 kern.geom.journal.stats.low_mem: 287 journal_data 4 18K - 624220 512,2048,4096 $ dmesg | fgrep ad4 | head -1 ad4: 286168MB at ata2-master SATA300 $ dmesg | fgrep -B1 ata2 | head -2 atapci1: port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xe000-0xe00f mem 0xfe02d000-0xfe02dfff irq 20 at device 14.0 on pci0 ata2: on atapci1 $ geom journal list Geom name: gjournal 4283925943 ID: 4283925943 Providers: 1. Name: ad4s1e.journal Mediasize: 17179868672 (16G) Sectorsize: 512 Mode: r1w1e1 Consumers: 1. 
Name: ad4s1e Mediasize: 17179869184 (16G) Sectorsize: 512 Mode: r1w1e1 Role: Data 2. Name: ad4s1ge Mediasize: 1073733632 (1.0G) Sectorsize: 512 Mode: r1w1e1 Jend: 1073733120 Jstart: 0 Role: Journal $ mount | fgrep var /dev/ad4s1e.journal on /var (ufs, local, soft-updates) $ bsdlabel /dev/ad4s1g | fgrep e: e: 2097136 6291456 swap -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 11:41:49 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3135C16A4E6; Wed, 17 Oct 2007 11:41:49 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from ns.trinitel.com (186.161.36.72.static.reverse.ltdomains.com [72.36.161.186]) by mx1.freebsd.org (Postfix) with ESMTP id 79D6813C46A; Wed, 17 Oct 2007 11:41:45 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from proton.storspeed.com (209-163-168-124.static.twtelecom.net [209.163.168.124]) (authenticated bits=0) by ns.trinitel.com (8.14.1/8.14.1) with ESMTP id l9HBfhas071547; Wed, 17 Oct 2007 06:41:44 -0500 (CDT) (envelope-from anderson@freebsd.org) Message-ID: <4715F4EE.9020104@freebsd.org> Date: Wed, 17 Oct 2007 06:41:34 -0500 From: Eric Anderson User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Andriy Gapon References: <4715C3D1.3070308@icyb.net.ua> In-Reply-To: <4715C3D1.3070308@icyb.net.ua> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on ns.trinitel.com Cc: Pawel Jakub Dawidek , freebsd-geom@freebsd.org Subject: Re: gjournal: FLUSHCACHE timed out X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: 
Wed, 17 Oct 2007 11:41:49 -0000 Andriy Gapon wrote: > Couple of days ago I started using gjournal on FreeBSD 6.2 using a patch > from here: > http://people.freebsd.org/~pjd/patches/gjournal6.patch > > I actually had to make 4 minor and obvious tweaks to the patch to make > it apply cleanly to my src. > I started to get the following messages sometimes: > > kernel: ad4: FAILURE - FLUSHCACHE timed out > kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. > kernel: ad4: FAILURE - FLUSHCACHE timed out > kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. > kernel: ad4: FAILURE - FLUSHCACHE timed out > kernel: GEOM_JOURNAL: Flush cache of ad4s1ge: error=5. > vvvvvvvvv this one is unusual and is found only once > kernel: handle_workitem_freeblocks: block count > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Ok, that's interesting.. Other threads are talking about a similar warning, not related to gjournal. > ad4s1ge (please don't pay attention to its slightly unusual name, this > is for historic reasons) is a journal partition/consumer for my /var > filesystem/partition/provider. > Size of /var is 16G, size of the journal is slightly less than 1G (1G - > 32 sectors actually). /var is UFS2 with softupdates enabled. Pawel, correct me if I'm wrong here - but I think you really need to turn *off* softupdates on gjournaled file systems. > I noticed that I get these messages only when I run 'dump' on any of my > filesystems. I think that dump is using /tmp or /var/tmp for some > temporary data and in my setup both of those are in /var filesystem. > > So my I guess is that /var is being written "too" actively and I have to > tune some parameters to make things smooth. 
A few things to note:

- you can turn on the 'async' option for your gjournaled file system, and get better performance
- you might be able to add the 'noatime' option to your file system mount also
- You might try turning your journal switch time from 10 down to 5, and see if it alleviates some pressure on your disk.

Eric

From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 14:44:39 2007
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 720F316A421; Wed, 17 Oct 2007 14:44:39 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com [217.20.163.9]) by mx1.freebsd.org (Postfix) with ESMTP id 7DAC113C48A; Wed, 17 Oct 2007 14:44:38 +0000 (UTC) (envelope-from avg@icyb.net.ua)
Received: from localhost (localhost [127.0.0.1]) by falcon.cybervisiontech.com (Postfix) with ESMTP id 7413B74400D; Wed, 17 Oct 2007 17:44:37 +0300 (EEST)
X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com
Received: from falcon.cybervisiontech.com ([127.0.0.1]) by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DWxzRFf0hDCo; Wed, 17 Oct 2007 17:44:37 +0300 (EEST)
Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by falcon.cybervisiontech.com (Postfix) with ESMTP id 0666074400A; Wed, 17 Oct 2007 17:44:36 +0300 (EEST)
Message-ID: <47161FD1.5010501@icyb.net.ua>
Date: Wed, 17 Oct 2007 17:44:33 +0300
From: Andriy Gapon
User-Agent: Thunderbird 2.0.0.6 (X11/20070803)
MIME-Version: 1.0
To: Eric Anderson
References: <4715C3D1.3070308@icyb.net.ua> <4715F4EE.9020104@freebsd.org>
In-Reply-To: <4715F4EE.9020104@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: Pawel Jakub Dawidek , freebsd-geom@freebsd.org
Subject: Re: gjournal:
FLUSHCACHE timed out
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 17 Oct 2007 14:44:39 -0000

on 17/10/2007 14:41 Eric Anderson said the following:
> Andriy Gapon wrote:
>> ad4s1ge (please don't pay attention to its slightly unusual name, this
>> is for historic reasons) is a journal partition/consumer for my /var
>> filesystem/partition/provider.
>> Size of /var is 16G, size of the journal is slightly less than 1G (1G -
>> 32 sectors actually). /var is UFS2 with softupdates enabled.
>
>
> Pawel, correct me if I'm wrong here - but I think you really need to
> turn *off* softupdates on gjournaled file systems.

I was under the mistaken impression that I had to have softupdates enabled for snapshots to work. Now that I know I was wrong, I will turn off softupdates. But it seems that there is nothing that would preclude, _in principle_, combining softupdates and gjournal. Anyway, I ask only out of curiosity.

>> I noticed that I get these messages only when I run 'dump' on any of my
>> filesystems. I think that dump is using /tmp or /var/tmp for some
>> temporary data and in my setup both of those are in /var filesystem.
>>
>> So my guess is that /var is being written "too" actively and I have to
>> tune some parameters to make things smooth.
>
> A few things to note:
>
> - you can turn on the 'async' option for your gjournaled file system, and
> get better performance

will do

> - you might be able to add the 'noatime' option to your file system mount
> also

probably will do as well

> - You might try turning your journal switch time from 10 down to 5, and
> see if it alleviates some pressure on your disk.

I already did this and it helped! I don't see the messages anymore. Thank you!
I will try to set this back to 10 after I do away with softupdates and see what happens.
Thank you very much again. -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 18:41:41 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA85116A46C for ; Wed, 17 Oct 2007 18:41:41 +0000 (UTC) (envelope-from kurtseel@primetime.com) Received: from mail.primetime.com (mail.primetime.com [146.145.135.164]) by mx1.freebsd.org (Postfix) with ESMTP id 14D5413C48D for ; Wed, 17 Oct 2007 18:41:40 +0000 (UTC) (envelope-from kurtseel@primetime.com) Received: from [10.200.1.130] (deca.khome.utcorp.net [10.200.1.130]) by mail.primetime.com (Postfix) with ESMTP id C61FEF9C425 for ; Wed, 17 Oct 2007 13:20:40 -0400 (EDT) Message-ID: <471650AA.30903@primetime.com> Date: Wed, 17 Oct 2007 14:12:58 -0400 From: kurtseel User-Agent: Thunderbird 2.0.0.5 (X11/20070724) MIME-Version: 1.0 To: freebsd-geom@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: gmirror + ggated question X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2007 18:41:41 -0000 I built a mirror of a local drive and a ggated backed device. I ran iozone on it and it runs along fine until a certain point when it slows down to a near stand still. It doesn't break the mirror or crash the system, but it does slow the system down to a near stop. 
I kill the iozone, and a short time later I can login and then :

# df
Filesystem            1K-blocks    Used    Avail Capacity  Mounted on
/dev/mirror/thinkcs1a   1012974  155780   776158    17%    /
devfs                         1       1        0   100%    /dev
/dev/mirror/thinkcs1e  85469448 1163474 77468420     1%    /usr
/dev/mirror/thinkcs1d   4058062   40426  3692992     1%    /var
[root@ ~/temp]# gmirror status
         Name    Status  Components
mirror/thinkc  COMPLETE  ad0
                         ggate0

And all seems normal again. Seems like it has to do with big files ...
This is the same configuration I used in :
http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network
This is where the iozone gets stuck :

# /usr/bin/time iozone -b ${DR}/data.xls \
> -az -i 0 -i 1 -i 2 -i 3 -i 4 -i 5 -i 6 -i 7 -i 8 -i 9 -i 10 -i 11 -i 12 \
> | tee ${DR}/data.txt
        Iozone: Performance Test of File I/O
                Version $Revision: 3.283 $
                Compiled for 32 bit mode.
                Build: freebsd

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
                     Erik Habbinga, Kris Strecker, Walter Wong.

        Run began: Wed Oct 17 08:20:44 2007

        Auto Mode
        Cross over of record size disabled.
        Selected test not available on the version.
        Command line used: iozone -b /root/temp/data.xls -az -i 0 -i 1 -i 2 -i 3 -i 4 -i 5 -i 6 -i 7 -i 8 -i 9 -i 10 -i 11 -i 12
        Output is in Kbytes/sec
        Time Resolution = 0.000003 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                         random   random     bkwd   record   stride
       KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
       64        4   250776   615089   853755  1067689   753149   703784   604016   788548   940502   263581   640020   534288   829997
       64        8   324818   800303  1145118  1521595  1306782   985384   772660   660492  1387858   338755   790871   667057  1046870
       64       16   488582   943809  1332734  1738379  1387858  1164998   902555  1332734  1556895   537497   927503   694678  1084951
       64       32   507999   890578  1424687  1884854  1603393  1332734   899531  1145118  1455588   551862   953870   556438  1067689
       64       64   488582   953870  1455588  1871711  1359737  1229003   899531  1000068  1145118   546247   940502   276049   843030
      128        4   155550   292825   913951  1039607   907770   590346   715430   961415   757845   159006   297696   749383   870953
      128        8   180012   316846  1197257  1389356  1280040  1074995   920218  1392960  1707511   195125   324115   954577  1131643
      128       16    90973   100945  1347509  1453292  1805111  1540885  1141265  2000136  2290237    82742    93151   963140  1320985
      128       32   231468   330704  1473231  1942249  1639715  1457236  1163526  1855007  2205559   242882   346946   947836  1256082
      128       64   231468   338630  1473231  1829719  2030394  1881004  1134033  1581743  1829719   241897   329082   825425  1055965
      128      128   224969   304794  1375121  1729514  1437724   324703  1074995   312968  1256082   247130   318350   367866   633537
      256        4    15884   194821   941534  1049173   892244   705287   752754   780099   764002    16097   172977   810116   914276
      256        8    15614   130080  1201837  1391908  1092959  1061621  1007813  1305592  1075444    15473   284780   853247  1164052
      256       16    16407   206949  1391908  1630792  1608801  1255226  1174236  2416067  2609861    16410   155921  1065836  1293015
      256       32    16264   293900  1179395  1580386  2047495  1571136  1261123  2483116  3012599    16522   183783  1113358  1286816
      256       64    16705   205799  1522137  1705930  1840436  1406494  1307182  2225754  2722351    16698   263921  1024155  1113358
      256      128    16776   122904  1496677  1684519  2047495    16496  1273085    16489  1983206    16526   201891   565300   790436
      256      256    16666   359541  1261123  1242157  1071152   352113  1061621   355024  1149103    16839   362087   289307   332076
      512        4    26638   185777   949618   890179   830935   621303   798791  1077828   792599    29455   260035   838069   866473
      512        8    28995   294400  1221954  1248960  1158661   175341  1055576  1808533  1068709    29700   228251  1062365  1089309
      512       16    28799   264454  1434125  1492949  1475510  1207525  1248234  2679608  1142632    29426   207638  1176434  1239588
      512       32    30206   227574  1546713  1580872  1565886  1115334  1437966  3137682  3687192    30217   177032  1240304  1302755
      512       64    32191   173080  1605694  1713303  1814646  1454523  1484691  2960343  3606687    32217   212277  1221954  1270386
      512      128    32201   164894  1605694  1701088  1784488  1028282  1446684    11813  2912169    32456   190897   775988  1068709
      512      256    32327   172788  1147517  1143241  1213667    18740  1000030    18774  1292561    32060   258844   321219   402797
      512      512    31906   362602   654637   658855   649686   366501   627475   250481   653840    32167   372929   245190   315462
     1024        4    49328   209067  1016941  1003868   820524   500496   798110  1110551   820524    49887   211133   878241   820524
     1024        8    50836   206743  1293112  1308078  1171754   297674  1159730  1943057  1029124    51235   223388  1097497  1134009
     1024       16    51824   236866  1546945  1563274  1514756   902983  1387524  3011017  1378174    52279   238299  1270543  1299372
     1024       32    52513   232519  1659943  1662514  1659302   896574  1554223  3482169  1530409    53350   241338  1319734  1356412
     1024       64    60692   229390  1738559  1729458  1622936   387303  1585199  3459729  4213159    60459   228925  1206310  1428124
     1024      128    51672   227808  1695325  1718387  1792993   875199  1602350    10321  3836789    60556   217363  1010006  1154431
     1024      256    56008   212229  1006927  1154431  1011909   638427  1008819    13892  1218977    56933   246519   372619   411063
     1024      512    54505   312374   624412   620622   601332    20017   605230    20003   618299    59928   306949   255160   296380
     1024     1024    60101   336954   613179   614935   607713   411063   602682   406665   573394    59767   412366   217506   289410
     2048        4    15041    16933  1006977   994157   818602   458878   835237  1079623   825445    15027    16914   868598   878548
     2048        8    15081    16986  1347352  1338534  1159013   613329  1139034  1972694  1153721    15066    16978  1145413  1155117
     2048       16    15131    16962  1591195  1636983  1444313   759672  1463756  3098355  1434185    15128    17045  1338743  1291444
     2048       32    15096    17044  1711000  1613916  1655598   763385  1622757  3750380  1494833    15075    16978  1392341  1423254
     2048       64    14870    17047  1705226  1809010  1708278   753739  1712364  3961384  1624292    14903    16975  1418085  1443342
     2048      128    14881    16993  1791653  1810535  1729603   535151  1801044     9637  4383923    14931    16968  1183933  1247373
     2048      256    14870    16894  1230752  1251918  1042162   635048  1021952    12290  1511403    15077    16877   392047   366692
     2048      512    14878    16698   473003   612629   618763   543686   609716    14194   642026    14931    16720   276235   292031
     2048     1024    14843    16735   627670   615174   607001    20775   544963    20762   595018    14878    16703   259401   288694
     2048     2048    14844    16751   622124   608765   614646    16729   614250    16731   603929    14851    16717   236706   292987
     4096        4    13174    13875   931793  1003472   765167   411617   849083  1152205   807726    13188    13886   818540   910412
     4096        8    13196    13869  1284436  1335145  1086479   541363  1063939  1983494  1116052    13213    13853  1115473  1120931
     4096       16    13202    13881  1542674  1655505  1407561   664510  1485708  3217555  1425665    13195    13910  1346974  1320470
     4096       32    13209    13890  1722393  1793414  1553274   721254  1607485  3999581  1641741    13208    13886  1378202  1421772
     4096       64    13109    13892  1827759  1837730  1669663   683682  1759436  3893527  1741423    13134    13882  1416263  1462936
     4096      128    13120    13888  1728284  1811760  1753510   731418  1689033     9333  1765584    13154    13891  1258375  1343918
     4096      256    13184    13815  1123717  1149584  1058433   574954  1076809    11607  1614434    13116    13841   397163   392185
     4096      512    13113    13809   608623   607525   604129   550841   606817    12407   602984    13139    13886   279759   290026
     4096     1024    13120    13847   606089   567227   591401   537250   589109    14366   598551    13109    13464   273158   291508
     4096     2048    13120    13800   593711   608084   598572    13792   567321    13773   601379    13101    13788   261675   291384
     4096     4096    13106    13800   606453   613010   602962    13797   597344    13804   599533    13103    13841   237752   293074
     8192        4    12393    11680  1027583  1030449   737896   374095   874929  1141579   694888    12388    11686   927102   865693
     8192        8    12417    11684  1346732  1341840  1052835   483698  1202591  1998540  1012116    12398    11677  1162985  1154777
     8192       16    12412    11677  1652274  1633811  1365084   627409  1542703  3236777  1399159    12425    11664  1353149  1342312
     8192       32    12420    11673  1760194  1783585  1557176   671306  1691813  4055482  1528905    12350    11687  1411227  1432645
     8192       64    12377    11673  1839739  1829746  1691146   243505  1745618  4215703  1744023    12359    11681  1465207  1461593
     8192      128    12407    11683  1852135  1819957  1691479   606822  1762813     9201  1719584    12355    11688  1332681  1361028
     8192      256    12365    11690  1222139  1160667  1021079   597483  1023725    11138  1084162    12394    11665   401195   405226
     8192      512    12357    11649   584603   585220   578978   453549   579838    11675   583600    12368    11661   290331   293830
     8192     1024    12375    11608   577917   575613   570623   529373   528209    12463   574842    12390    11666   286674   292467
     8192     2048    12372    11658   582147   544248   573825   528681   570518    11648   577354    12373    11630   276964   285734
     8192     4096    12374    11651   579427   577305   573069    11649   577587    11656   574400    12269    11644   265518   295612
     8192     8192    12387    11649   575160   578364   569441    11644   573623    11650   578822    12380    11651   241836   297231
    16384        4    11557    11686  1022909  1020297   694792    11470   837507  1162226   742909    11556    11691   919774   903937
    16384        8    11561    11701  1353134  1356954  1014557    42137  1216135  2066600  1053213    11559    11688  1178268  1170023
    16384       16    11576    11699  1619427  1631887  1338793   602488  1514364  3315870  1331503    11576    11685  1368057  1315773
    16384       32    11551    11708  1754551  1759672  1464284   646385  1628947  4129920  1528920    11574    11709  1445952  1427657
    16384       64    11577    11702  1841534  1817426  1520261   456494  1775265  4379710  1671298    11478    11692  1482447  1470362
    16384      128    11556    11702  1836956  1804209  1693957   396753  1781662     9119  1705053    11546    11692  1188724  1317691
    16384      256    11569    11693  1147018  1107710   982782   535142  1045633    11146   979434    11559    11615   418685   416019
    16384      512    11561    11680   577899   580149   569858   517106   578347    11335   572007    11559    11682   290388   290264
    16384     1024    11555    11601   572727   572803   561560   470441   566625    11687   573429    11559    11687   288674   293420
    16384     2048    11564    11673   574979   569839   569225   528073   569598    11690   572088    11556    11687   287191   294745
    16384     4096    11575    11690   574052   571883   570113   498265   560790    11676   571208    11551    11678   279681   294755
    16384     8192    11562    11680   576310   571845   568298    11685   572746    11686   572469    11471    11686   265637   295917
    16384    16384    11559    11689   553923   573587   571251    11685   567646    11688   567172    11561    11692   240498   296822
    32768        4    11199    11253  1013483  1018425   652242     3875   876784  1184967   735697    11117    11255   916611   920528
    32768        8    11200    11249  1308513  1355729   965409     9951  1203784  2062334  1029047    11187    11242  1184885  1169032
    32768       16    11201    11247  1613220  1629033  1283763    32214  1531567  3280446  1316043    11117    11257  1367437  1348201
    32768       32    11152    11257  1734301  1760696  1487556    41111  1701515  4031482  1529709    11201    11232  1453635  1436122
    32768       64    11188    11261  1836966  1827149  1622916    42493  1795521  4348175  1676198    11189    11208  1490266  1465665
    32768      128    11188    11249  1840188  1812093  1653114    43633  1803508     9080  1696852    11187    11232  1356237  1318531
    32768      256    11197    11253  1121580  1127515   997869    51577  1112024    11042  1044294    11192    11247   415553   419236
    32768      512    11186    11253   593139   599245   584370    39487   599420    11169   597442    11176    11247   285580   284614
    32768     1024    11187    11252   589162   590075   581869    49214   585625    11342   586923    11147    11256   285908   287842
    32768     2048    11180    11251   591801   590466   586345    37755   583851    11245   589382    11193    11251   284624   288549
    32768     4096    11164    11241   490670   591801   586515    63111   585174    11240   588008    11195    11247   280836   286636
    32768     8192    11181    11219   554242   555361   552078    34415   549706    11252   556134    11190    11211   274837   287928
    32768    16384    11181    11236   556134   554414   551307    11250   551612    11254   552375    11189    11226   260926   288146
    65536        4    11084    11136   669191   673013   514350     1754   742188  1167346   563430    11099    11142   600360   599345
    65536        8    11098    11118   793723   793558   776971     2061  1017642  2097833   795476    11108    11143   692894   694244
    65536       16    11104    11148   867419   882757  1050525     9078  1292111  3307581  1002829    11101    11156   756767   742996
    65536       32    11102    11130   924396   922404  1208597     1334  1421325  4096515  1114899    11091    11150   772540   780962
    65536       64    11106    11145   941732   939989

From
owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 19:28:22 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 36FC116A46B for ; Wed, 17 Oct 2007 19:28:22 +0000 (UTC) (envelope-from eksffa@freebsdbrasil.com.br) Received: from capeta.freebsdbrasil.com.br (vrrp.freebsdbrasil.com.br [200.210.70.30]) by mx1.freebsd.org (Postfix) with SMTP id 4FCA713C4AC for ; Wed, 17 Oct 2007 19:28:20 +0000 (UTC) (envelope-from eksffa@freebsdbrasil.com.br) Received: (qmail 60926 invoked from network); 17 Oct 2007 17:01:35 -0200 Received: from unknown (HELO claire.bh.freebsdbrasil.com.br) (201.78.125.207) by capeta.freebsdbrasil.com.br with SMTP; 17 Oct 2007 17:01:35 -0200 Message-ID: <47165C0B.7080707@freebsdbrasil.com.br> Date: Wed, 17 Oct 2007 17:01:31 -0200 From: Patrick Tracanelli Organization: FreeBSD Brasil LTDA User-Agent: Thunderbird 2.0.0.0 (X11/20070612) MIME-Version: 1.0 To: kurtseel References: <471650AA.30903@primetime.com> In-Reply-To: <471650AA.30903@primetime.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Re: gmirror + ggated question X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2007 19:28:22 -0000 kurtseel escreveu: > > I built a mirror of a local drive and a ggated backed device. I ran > iozone on it > and it runs along fine until a certain point when it slows down to a > near stand > still. It doesn't break the mirror or crash the system, but it does slow > the system > down to a near stop. 
> I kill the iozone, and a short time later I can login and then :
>
> # df
> Filesystem            1K-blocks    Used    Avail Capacity  Mounted on
> /dev/mirror/thinkcs1a   1012974  155780   776158    17%    /
> devfs                         1       1        0   100%    /dev
> /dev/mirror/thinkcs1e  85469448 1163474 77468420     1%    /usr
> /dev/mirror/thinkcs1d   4058062   40426  3692992     1%    /var
> [root@ ~/temp]# gmirror status
>          Name    Status  Components
> mirror/thinkc  COMPLETE  ad0
>                          ggate0
>
> And all seems normal again. Seems like it has to do with big files ...
> This is the same configuration I used in :
> http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network
> This is where the iozone gets stuck :

Did you try raising the send and receive buffers on ggated? I have been comfortable with -S and -R around 512k-780k. I did not, however, run an iozone stress test, just a production test (real load) before going into production.

Try raising the buffers and let us know about your tests. TCP_NODELAY is also worth trying.

--
Patrick Tracanelli

FreeBSD Brasil LTDA.
(31) 3281-9633 / 3281-3547
316601@sip.freebsdbrasil.com.br
http://www.freebsdbrasil.com.br
"Long live Hanin Elias, Kim Deal!"
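
For readers following along: the suggestion above comes down to three per-socket settings, namely the send buffer size, the receive buffer size, and TCP_NODELAY (which disables Nagle's algorithm). The sketch below is only an illustration of those settings on an ordinary TCP socket using Python's standard socket module. It is not ggated's code, and the helper name tune_socket and the 512 KB default are assumptions made for the example. Whether the kernel grants the requested size depends on system limits, so the sketch reads the values back instead of trusting the request.

```python
import socket


def tune_socket(sock, sndbuf=512 * 1024, rcvbuf=512 * 1024):
    """Request larger socket buffers and disable Nagle on a TCP socket.

    Hypothetical helper for illustration only. Returns what the kernel
    actually reports after the requests were made.
    """
    report = {}
    for name, opt, size in (
        ("sndbuf", socket.SO_SNDBUF, sndbuf),
        ("rcvbuf", socket.SO_RCVBUF, rcvbuf),
    ):
        try:
            sock.setsockopt(socket.SOL_SOCKET, opt, size)
        except OSError as exc:
            # Some systems reject requests above their configured limit
            # (for example with ENOBUFS); record the errno and carry on.
            report[name + "_errno"] = exc.errno
        # Read back the size currently in effect; it may differ from
        # the requested value.
        report[name] = sock.getsockopt(socket.SOL_SOCKET, opt)

    # TCP_NODELAY sends small writes immediately instead of coalescing.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    report["nodelay"] = (
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
    )
    return report


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(tune_socket(s))
```

Reading the values back is a deliberate choice: a request that the kernel clamps or refuses is easy to miss otherwise, and a refused request is worth checking against the system-wide socket buffer limits before blaming the application.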
From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 19:50:49 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 20FF016A49A for ; Wed, 17 Oct 2007 19:50:49 +0000 (UTC) (envelope-from kurtseel@primetime.com) Received: from mail.primetime.com (mail.primetime.com [146.145.135.164]) by mx1.freebsd.org (Postfix) with ESMTP id E390E13C46E for ; Wed, 17 Oct 2007 19:50:48 +0000 (UTC) (envelope-from kurtseel@primetime.com) Received: from [10.200.1.130] (unknown [10.200.1.130]) by mail.primetime.com (Postfix) with ESMTP id 8F98AF9C423; Wed, 17 Oct 2007 14:49:05 -0400 (EDT) Message-ID: <47166562.60803@primetime.com> Date: Wed, 17 Oct 2007 15:41:22 -0400 From: kurtseel User-Agent: Thunderbird 2.0.0.5 (X11/20070724) MIME-Version: 1.0 To: Patrick Tracanelli References: <471650AA.30903@primetime.com> <47165C0B.7080707@freebsdbrasil.com.br> In-Reply-To: <47165C0B.7080707@freebsdbrasil.com.br> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Re: gmirror + ggated question X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2007 19:50:49 -0000 Patrick Tracanelli wrote: > kurtseel escreveu: >> >> I built a mirror of a local drive and a ggated backed device. I ran >> iozone on it >> and it runs along fine until a certain point when it slows down to a >> near stand >> still. It doesn't break the mirror or crash the system, but it does >> slow the system >> down to a near stop. 
>> I kill the iozone, and a short time later I can login and then : >> >> # df >> Filesystem 1K-blocks Used Avail Capacity Mounted on >> /dev/mirror/thinkcs1a 1012974 155780 776158 17% / >> devfs 1 1 0 100% /dev >> /dev/mirror/thinkcs1e 85469448 1163474 77468420 1% /usr >> /dev/mirror/thinkcs1d 4058062 40426 3692992 1% /var >> [root@ ~/temp]# gmirror status >> Name Status Components >> mirror/thinkc COMPLETE ad0 >> ggate0 >> >> And all seems normal again. Seems like it has to do with big files ... >> This is the same configuration I used in : >> http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network >> This is where the iozone gets stuck : > > Did you try raising send and receive buffers on ggated? I found myself > confortable with -S and -R around 512k-780k. I didnt, however, did an > iozone stress test, just a production test (real load) before going > production. > > Try raising the buffer and let us know about your tests. TCP_NODELAY > is also worth trying. > Makes sense. So now I get this : Test (/root/benchmarks) > ggated -v -R 262144 -S 262144 /etc/ggated.conf info: Reading exports file (/etc/ggated.conf). debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list. info: Exporting 1 object(s). error: Cannot open stream socket: No buffer space available. error: Exiting. Test (/root/benchmarks) > ggated -v -R 524288 -S 524288 /etc/ggated.conf info: Reading exports file (/etc/ggated.conf). debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list. info: Exporting 1 object(s). error: Cannot open stream socket: No buffer space available. error: Exiting. I have raised sysctl net.inet.tcp.sendspace=4194304 sysctl net.inet.tcp.recvspace=4194304 sysctl kern.ipc.maxsockbuf=2097152 Which I saw in a posting ... It even happens here : Test (/root/benchmarks) > ggated -v -R 1 -S 1 /etc/ggated.conf info: Reading exports file (/etc/ggated.conf). debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list. info: Exporting 1 object(s). 
error: Cannot open stream socket: No buffer space available. error: Exiting. From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 20:13:04 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9A19816A4D1 for ; Wed, 17 Oct 2007 20:13:04 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 4DA9313C43E for ; Wed, 17 Oct 2007 20:13:01 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 149F645F56; Wed, 17 Oct 2007 22:12:58 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id ED5CF45F42; Wed, 17 Oct 2007 22:12:52 +0200 (CEST) Date: Wed, 17 Oct 2007 22:12:35 +0200 From: Pawel Jakub Dawidek To: Andriy Gapon Message-ID: <20071017201235.GD50219@garage.freebsd.pl> References: <4715C3D1.3070308@icyb.net.ua> <4715F4EE.9020104@freebsd.org> <47161FD1.5010501@icyb.net.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="zS7rBR6csb6tI2e1" Content-Disposition: inline In-Reply-To: <47161FD1.5010501@icyb.net.ua> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-geom@freebsd.org Subject: Re: gjournal: FLUSHCACHE timed out X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2007 20:13:04 -0000 --zS7rBR6csb6tI2e1 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Oct 17, 2007 at 05:44:33PM +0300, Andriy Gapon wrote: > on 17/10/2007 14:41 Eric Anderson said the following: > > Andriy Gapon wrote: > >> ad4s1ge (please don't pay attention to its slightly unusual name, this > >> is for historic reasons) is a journal partition/consumer for my /var > >> filesystem/partition/provider. > >> Size of /var is 16G, size of the journal is slightly less than 1G (1G - > >> 32 sectors actually). /var is UFS2 with softupdates enabled. > > > > > > Pawel, correct me if I'm wrong here - but I think you really need to > > turn *off* softupdates on gjournaled file systems. > > I was under a big mis-impression that I have to have softupdates enabled > for snapshots to work. Now that I know that I was wrong I will turn off > the softupdates. But it seems that there is nothing that would preclude > _in principle_ combination of softupdates/gjournal. Anyway, I care only > out of curiosity. It's not that they won't work together, but it just hurts performance and memory consumption. > >> I noticed that I get these messages only when I run 'dump' on any of my > >> filesystems. I think that dump is using /tmp or /var/tmp for some > >> temporary data and in my setup both of those are in /var filesystem. > >> > >> So my guess is that /var is being written "too" actively and I have to > >> tune some parameters to make things smooth.
> > > > A few things to note: > > > > - you can turn on 'async' option for your gjournaled file system, and > > get better performance > > will do > > > - you might be able to add the 'noatime' option to your file system mount > > also > > probably will do as well > > > - You might try turning your journal switch time from 10 down to 5, and > > see if it alleviates some pressure on your disk. > > I already did this and it helped! I don't see the messages anymore. > Thank you! > I will try to set this back to 10 after I do away with softupdates and > see what happens. > > Thank you very much again. You should also try this patch: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/ata-disk.c.diff?r1=1.201;r2=1.202 BIO_FLUSH timeout was way too small. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --zS7rBR6csb6tI2e1 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFHFmyzForvXbEpPzQRAkScAJ9rnB4eERnKYOERIFHI2mKA+1zrIgCbBmDr UqxNSAjWPjHzGqeL8p+ewsE= =GbMN -----END PGP SIGNATURE----- --zS7rBR6csb6tI2e1-- From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 20:15:55 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E6CC916A420 for ; Wed, 17 Oct 2007 20:15:55 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 6F4F313C44B for ; Wed, 17 Oct 2007 20:15:55 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 35D5145F42; Wed, 17 Oct 2007 22:15:54 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA
(256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id C5E2845F44; Wed, 17 Oct 2007 22:15:49 +0200 (CEST) Date: Wed, 17 Oct 2007 22:15:31 +0200 From: Pawel Jakub Dawidek To: kurtseel Message-ID: <20071017201531.GE50219@garage.freebsd.pl> References: <471650AA.30903@primetime.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="C94crkcyjafcjHxo" Content-Disposition: inline In-Reply-To: <471650AA.30903@primetime.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-geom@freebsd.org Subject: Re: gmirror + ggated question X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2007 20:15:56 -0000 --C94crkcyjafcjHxo Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Oct 17, 2007 at 02:12:58PM -0400, kurtseel wrote: > > I built a mirror of a local drive and a ggated backed device. I ran > iozone on it > and it runs along fine until a certain point when it slows down to a > near stand > still. It doesn't break the mirror or crash the system, but it does slow > the system > down to a near stop. You haven't said which FreeBSD version you use. If it's not HEAD nor RELENG_7, try this patch: http://www.freebsd.org/cgi/cvsweb.cgi/src/sbin/ggate/shared/ggate.c.diff?r1=1.8;r2=1.9 -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
--C94crkcyjafcjHxo Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFHFm1jForvXbEpPzQRAiNJAJ9nykuY/E5CM11wibNiM2BvChiR7wCguO6a VoSgOoiUzwlLUhnR7T1Iluw= =ji1E -----END PGP SIGNATURE----- --C94crkcyjafcjHxo-- From owner-freebsd-geom@FreeBSD.ORG Wed Oct 17 21:16:22 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66F4516A473 for ; Wed, 17 Oct 2007 21:16:22 +0000 (UTC) (envelope-from kurtseel@primetime.com) Received: from mail.primetime.com (mail.primetime.com [146.145.135.164]) by mx1.freebsd.org (Postfix) with ESMTP id 414CA13C461 for ; Wed, 17 Oct 2007 21:16:22 +0000 (UTC) (envelope-from kurtseel@primetime.com) Received: from [10.200.1.130] (deca.khome.utcorp.net [10.200.1.130]) by mail.primetime.com (Postfix) with ESMTP id 5C351F9C412; Wed, 17 Oct 2007 16:14:35 -0400 (EDT) Message-ID: <4716796B.9090803@primetime.com> Date: Wed, 17 Oct 2007 17:06:51 -0400 From: kurtseel User-Agent: Thunderbird 2.0.0.5 (X11/20070724) MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <471650AA.30903@primetime.com> <20071017201531.GE50219@garage.freebsd.pl> In-Reply-To: <20071017201531.GE50219@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Re: gmirror + ggated question X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2007 21:16:22 -0000 Pawel Jakub Dawidek wrote: > On Wed, Oct 17, 2007 at 02:12:58PM -0400, kurtseel wrote: > >> I built a mirror of a local drive and a ggated backed device. 
I ran >> iozone on it >> and it runs along fine until a certain point when it slows down to a >> near stand >> still. It doesn't break the mirror or crash the system, but it does slow >> the system >> down to a near stop. >> > > You haven't said which FreeBSD version you use. If it's not HEAD nor > RELENG_7, try this patch: > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sbin/ggate/shared/ggate.c.diff?r1=1.8;r2=1.9 > > Sorry. [root@test1 /usr/src/sbin/ggate]# uname -a FreeBSD test1.khome.utcorp.net. 6.2-RELEASE-p4 FreeBSD 6.2-RELEASE-p4 #0: Thu Apr 26 17:40:53 UTC 2007 root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC i386 I applied the patch and am resyncing the mirror now, backed by the patched ggated. When it is done, I'll re-run the iozone. From owner-freebsd-geom@FreeBSD.ORG Thu Oct 18 13:08:41 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 214EF16A41B for ; Thu, 18 Oct 2007 13:08:41 +0000 (UTC) (envelope-from eksffa@freebsdbrasil.com.br) Received: from capeta.freebsdbrasil.com.br (vrrp.freebsdbrasil.com.br [200.210.70.30]) by mx1.freebsd.org (Postfix) with SMTP id 65C8613C459 for ; Thu, 18 Oct 2007 13:08:40 +0000 (UTC) (envelope-from eksffa@freebsdbrasil.com.br) Received: (qmail 15298 invoked from network); 18 Oct 2007 11:08:44 -0200 Received: from unknown (HELO claire.bh.freebsdbrasil.com.br) (201.78.96.93) by capeta.freebsdbrasil.com.br with SMTP; 18 Oct 2007 11:08:44 -0200 Message-ID: <47175AD2.5080308@freebsdbrasil.com.br> Date: Thu, 18 Oct 2007 11:08:34 -0200 From: Patrick Tracanelli Organization: FreeBSD Brasil LTDA User-Agent: Thunderbird 2.0.0.0 (X11/20070612) MIME-Version: 1.0 To: kurtseel References: <471650AA.30903@primetime.com> <47165C0B.7080707@freebsdbrasil.com.br> <47166562.60803@primetime.com> In-Reply-To: <47166562.60803@primetime.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed 
Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Re: gmirror + ggated question X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2007 13:08:41 -0000 kurtseel escreveu: > Patrick Tracanelli wrote: >> kurtseel escreveu: >>> >>> I built a mirror of a local drive and a ggated backed device. I ran >>> iozone on it >>> and it runs along fine until a certain point when it slows down to a >>> near stand >>> still. It doesn't break the mirror or crash the system, but it does >>> slow the system >>> down to a near stop. >>> I kill the iozone, and a short time later I can login and then : >>> >>> # df >>> Filesystem 1K-blocks Used Avail Capacity Mounted on >>> /dev/mirror/thinkcs1a 1012974 155780 776158 17% / >>> devfs 1 1 0 100% /dev >>> /dev/mirror/thinkcs1e 85469448 1163474 77468420 1% /usr >>> /dev/mirror/thinkcs1d 4058062 40426 3692992 1% /var >>> [root@ ~/temp]# gmirror status >>> Name Status Components >>> mirror/thinkc COMPLETE ad0 >>> ggate0 >>> >>> And all seems normal again. Seems like it has to do with big files ... >>> This is the same configuration I used in : >>> http://bsdtips.utcorp.net/mediawiki/index.php/Mirroring_over_network >>> This is where the iozone gets stuck : >> >> Did you try raising send and receive buffers on ggated? I found myself >> confortable with -S and -R around 512k-780k. I didnt, however, did an >> iozone stress test, just a production test (real load) before going >> production. >> >> Try raising the buffer and let us know about your tests. TCP_NODELAY >> is also worth trying. >> > Makes sense. So now I get this : > > Test (/root/benchmarks) > ggated -v -R 262144 -S 262144 /etc/ggated.conf > info: Reading exports file (/etc/ggated.conf). > debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list. 
> info: Exporting 1 object(s). > error: Cannot open stream socket: No buffer space available. > error: Exiting. > > Test (/root/benchmarks) > ggated -v -R 524288 -S 524288 /etc/ggated.conf > info: Reading exports file (/etc/ggated.conf). > debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list. > info: Exporting 1 object(s). > error: Cannot open stream socket: No buffer space available. > error: Exiting. > > I have raised > > sysctl net.inet.tcp.sendspace=4194304 > sysctl net.inet.tcp.recvspace=4194304 > sysctl kern.ipc.maxsockbuf=2097152 > > Which I saw in a posting ... > > It even happens here : > > Test (/root/benchmarks) > ggated -v -R 1 -S 1 /etc/ggated.conf > info: Reading exports file (/etc/ggated.conf). > debug: Added 10.200.1.200/32 /dev/ad10 RW to exports list. > info: Exporting 1 object(s). > error: Cannot open stream socket: No buffer space available. > error: Exiting. > It seems that you are out of buffer space, and that it is not related to ggated, since -R 1 and -S 1 would not demand a bunch of extra memory. In any case, tuning kern.ipc.maxsockbuf should be enough. If I raise -R and -S to 512K with the default kern.ipc.maxsockbuf, I run out of buffer space too. However, just raising the sysctl solves the problem: (eksffa@claire)~# sysctl -qw kern.ipc.maxsockbuf=`echo "524288*2" | bc ` kern.ipc.maxsockbuf: 262144 -> 1048576 (eksffa@claire)~# ggated -R 524288 -S 524288 -v info: Reading exports file (/etc/gg.exports). debug: Added 10.0.0.0/24 /dev/ad12 RO to exports list. info: Exporting 1 object(s). info: Listen on port: 3080. And so, I can import ggate0 on the other host. Try figuring out with netstat -m why you ran out of buffer space. Also, I believe it can be related to the fact that you have raised recvspace and sendspace way too high. I don't think it makes any sense to raise them over 64k on a 100Mbit network, or over 128-512k on a 1Gbit network. You have raised them up to 4MB :) Lower them to the defaults of 32k (send) / 64k (recv) first.
If you are on 1Gbit or 10Gbit you can later try raising them in multiples of 32K, up to the point where it stops making a positive difference in your benchmarks. In case it is relevant, I run this on 6.2-STABLE, cvsupped on Sept 24th, with the patches PJD mentioned applied. -- Patrick Tracanelli FreeBSD Brasil LTDA. (31) 3281-9633 / 3281-3547 316601@sip.freebsdbrasil.com.br http://www.freebsdbrasil.com.br "Long live Hanin Elias, Kim Deal!" From owner-freebsd-geom@FreeBSD.ORG Fri Oct 19 18:01:21 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E38B16A421 for ; Fri, 19 Oct 2007 18:01:21 +0000 (UTC) (envelope-from felipe@neuwald.biz) Received: from itacaiunas.cepatec.org.br (itacaiunas.cepatec.org.br [200.152.208.51]) by mx1.freebsd.org (Postfix) with ESMTP id 40DA013C480 for ; Fri, 19 Oct 2007 18:01:21 +0000 (UTC) (envelope-from felipe@neuwald.biz) Received: from localhost (vermelho [10.0.0.5]) by itacaiunas.cepatec.org.br (Postfix) with ESMTP id DE1DA11571A for ; Fri, 19 Oct 2007 15:43:17 -0200 (BRST) X-Virus-Scanned: amavisd-new at cepatec.org.br Received: from itacaiunas.cepatec.org.br ([10.0.0.3]) by localhost (vermelho.cepatec.org.br [10.0.0.5]) (amavisd-new, port 10024) with ESMTP id ATserKI0Mr3z for ; Fri, 19 Oct 2007 14:43:16 -0300 (BRT) Received: from [192.168.0.152] (unknown [200.199.198.61]) by itacaiunas.cepatec.org.br (Postfix) with ESMTP id 0B3801154FD for ; Fri, 19 Oct 2007 15:43:13 -0200 (BRST) Message-ID: <4718ECB2.9050207@neuwald.biz> Date: Fri, 19 Oct 2007 15:43:14 -0200 From: Felipe Neuwald User-Agent: Thunderbird 1.5.0.13 (X11/20070824) MIME-Version: 1.0 To: freebsd-geom@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: gvinum - problem on hard disk X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific
discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2007 18:01:21 -0000 Hi folks, I have one gvinum raid on a FreeBSD 6.1-RELEASE machine. There are 4 disks running, as you can see: [root@fileserver ~]# gvinum list 4 drives: D a State: up /dev/ad4 A: 0/238474 MB (0%) D b State: up /dev/ad5 A: 0/238475 MB (0%) D c State: up /dev/ad6 A: 0/238475 MB (0%) D d State: up /dev/ad7 A: 0/238475 MB (0%) 1 volume: V data State: down Plexes: 1 Size: 931 GB 1 plex: P data.p0 S State: down Subdisks: 4 Size: 931 GB 4 subdisks: S data.p0.s3 State: stale D: d Size: 232 GB S data.p0.s2 State: up D: c Size: 232 GB S data.p0.s1 State: up D: b Size: 232 GB S data.p0.s0 State: up D: a Size: 232 GB But, as you can see, data.p0.s3 is "stale". What should I do to try to recover it, get the raid up again, and recover the data? Thanks, Felipe Neuwald. From owner-freebsd-geom@FreeBSD.ORG Fri Oct 19 20:18:36 2007 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5B62816A421 for ; Fri, 19 Oct 2007 20:18:36 +0000 (UTC) (envelope-from lulf@stud.ntnu.no) Received: from fri.itea.ntnu.no (fri.itea.ntnu.no [129.241.7.60]) by mx1.freebsd.org (Postfix) with ESMTP id 0CA3413C44B for ; Fri, 19 Oct 2007 20:18:35 +0000 (UTC) (envelope-from lulf@stud.ntnu.no) Received: from localhost (localhost [127.0.0.1]) by fri.itea.ntnu.no (Postfix) with ESMTP id F076B8401; Fri, 19 Oct 2007 22:00:32 +0200 (CEST) Received: from caracal.stud.ntnu.no (caracal.stud.ntnu.no [129.241.56.185]) by fri.itea.ntnu.no (Postfix) with ESMTP; Fri, 19 Oct 2007 22:00:32 +0200 (CEST) Received: by caracal.stud.ntnu.no (Postfix, from userid 2312) id 956396240F4; Fri, 19 Oct 2007 22:00:41 +0200 (CEST) Date: Fri, 19 Oct 2007 22:00:41 +0200 From: Ulf Lilleengen To: Felipe Neuwald Message-ID: <20071019200041.GA16812@stud.ntnu.no>
References: <4718ECB2.9050207@neuwald.biz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4718ECB2.9050207@neuwald.biz> User-Agent: Mutt/1.5.9i X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. Cc: freebsd-geom@freebsd.org Subject: Re: gvinum - problem on hard disk X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2007 20:18:36 -0000 On Fri, Oct 19, 2007 at 03:43:14 -0200, Felipe Neuwald wrote: > Hi folks, > > I have one gvinum raid on a FreeBSD 6.1-RELEASE machine. There are 4 > disks running, as you can see: > > [root@fileserver ~]# gvinum list > 4 drives: > D a State: up /dev/ad4 A: 0/238474 MB (0%) > D b State: up /dev/ad5 A: 0/238475 MB (0%) > D c State: up /dev/ad6 A: 0/238475 MB (0%) > D d State: up /dev/ad7 A: 0/238475 MB (0%) > > 1 volume: > V data State: down Plexes: 1 Size: 931 GB > > 1 plex: > P data.p0 S State: down Subdisks: 4 Size: 931 GB > > 4 subdisks: > S data.p0.s3 State: stale D: d Size: 232 GB > S data.p0.s2 State: up D: c Size: 232 GB > S data.p0.s1 State: up D: b Size: 232 GB > S data.p0.s0 State: up D: a Size: 232 GB > > > But, as you can see, the data.p0.s3 is "stale". What should I do to try > recover this and get the raid up again (and recover information) > Hello, Since your plex organization is RAID0 (striping), recovering after a drive failure is a problem because you don't have any redundancy. But if you didn't replace any drives etc., this could just be gvinum fooling around. In that case, doing a 'gvinum setstate -f up data.p0.s3' should get the volume up again. > -- Ulf Lilleengen
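To illustrate the point about RAID0 having no redundancy, here is a simplified sketch of how a striped plex spreads data over its subdisks and why one non-"up" subdisk takes the whole plex down. This is a toy model, not gvinum's implementation: the function names are made up for illustration, and the 256 KiB stripe size is an assumption, since the listing above does not show the real one.

```python
def stripe_location(offset, stripe_size, n_subdisks):
    """Map a logical byte offset to (subdisk index, byte offset inside that subdisk)."""
    stripe_no, within = divmod(offset, stripe_size)
    # Stripes are dealt out round-robin: stripe 0 -> subdisk 0, stripe 1 -> subdisk 1, ...
    row, subdisk = divmod(stripe_no, n_subdisks)
    return subdisk, row * stripe_size + within


def plex_state(subdisk_states):
    """Toy rule: a striped plex is usable only if every subdisk is up."""
    return "up" if all(state == "up" for state in subdisk_states) else "down"


if __name__ == "__main__":
    stripe = 256 * 1024  # assumed stripe size, not taken from the listing
    for off in (0, stripe, 3 * stripe, 4 * stripe + 10):
        print(off, stripe_location(off, stripe, 4))
    # s0..s2 are up and s3 is stale, as in the listing above.
    print(plex_state(["up", "up", "up", "stale"]))
```

Because consecutive stripes land on different subdisks, practically every file larger than a few stripes has pieces on all four drives. That is why forcing the stale subdisk back up only helps if its contents are in fact still intact.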