From owner-freebsd-fs@FreeBSD.ORG Wed Jan 2 05:06:44 2008
Message-ID: <477B16BB.8070104@freebsd.org>
Date: Tue, 01 Jan 2008 22:44:43 -0600
From: Eric Anderson
To: "freebsd-fs@freebsd.org"
Subject: ZFS i/o errors - which disk is the problem?
List-Id: Filesystems
X-BeenThere: freebsd-fs@freebsd.org

I created a zpool with two new identical (500GB) SATA disks.  I rsync'ed
a bunch of data over to the new ZFS file systems, and started seeing i/o
errors.

Here's how I created the file systems:

zpool create tank mirror ad6 ad8
zfs create tank/media
zfs create tank/documents
zfs set sharenfs=on tank/media
zfs set sharenfs=on tank/documents
zfs set atime=off tank
zfs set mountpoint=/media tank/media
zfs set mountpoint=/documents tank/documents


Here's what zpool status says:

# zpool status
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed with 731 errors on Tue Jan  1 15:17:08 2008
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0 1.47K
          mirror    ONLINE       0     0 1.47K
            ad6     ONLINE       0     0 5.12K
            ad8     ONLINE       0     0 4.66K

How can I tell which drive gave the problems, or where the problem came
from?  I see several errors in /var/log/messages, like:

ZFS: zpool I/O failure, zpool=tank error=86

and many many of these:

ZFS: checksum mismatch, zpool=tank path=/dev/ad6 offset=31970426880
size=131072

for both the ad6 and ad8 devices.

I'm happy to swap the drive out, but I don't know which is the problem.
I was also wondering if it was a saturated I/O issue on the system
(it's a fairly slow and poky old box).

Any ideas/hints?

Eric

From owner-freebsd-fs@FreeBSD.ORG Wed Jan 2 07:02:02 2008
Date: Wed, 2 Jan 2008 08:01:46 +0100
From: Bernd Walter
To: Eric Anderson
Cc: "freebsd-fs@freebsd.org"
Reply-To: ticso@cicely.de
Message-ID: <20080102070146.GH49874@cicely12.cicely.de>
In-Reply-To: <477B16BB.8070104@freebsd.org>
Subject: Re: ZFS i/o errors - which disk is the problem?

On Tue, Jan 01, 2008 at 10:44:43PM -0600, Eric Anderson wrote:
> I created a zpool with two new identical (500GB) SATA disks.  I rsync'ed
> a bunch of data over to the new ZFS file systems, and started seeing i/o
> errors.
>
> Here's how I created the file systems:
>
> zpool create tank mirror ad6 ad8
> zfs create tank/media
> zfs create tank/documents
> zfs set sharenfs=on tank/media
> zfs set sharenfs=on tank/documents
> zfs set atime=off tank
> zfs set mountpoint=/media tank/media
> zfs set mountpoint=/documents tank/documents
>
>
> Here's what zpool status says:
>
> # zpool status
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed with 731 errors on Tue Jan  1 15:17:08 2008
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0 1.47K
>           mirror    ONLINE       0     0 1.47K
>             ad6     ONLINE       0     0 5.12K
>             ad8     ONLINE       0     0 4.66K
>
> How can I tell which drive gave the problems, or where the problem came
> from?  I see several errors in /var/log/messages, like:
>
> ZFS: zpool I/O failure, zpool=tank error=86

zpool status -v should tell you more details.
But it is not required, since the message below is enough.

> and many many of these:
>
> ZFS: checksum mismatch, zpool=tank path=/dev/ad6 offset=31970426880
> size=131072
>
> for both the ad6 and ad8 devices.

So you have crc errors on both drives.

> I'm happy to swap the drive out, but I don't know which is the problem.
> I was also wondering if it was a saturated I/O issue on the system
> (it's a fairly slow and poky old box).

The errors mean that data written to disk were silently not the same
when they were read back.
I doubt that these are the drives, but if they are identical it is possible
of course, since firmware bugs are not impossible.
More likely you have a problematic ata controller or maybe defective
ram.

--
B.Walter                http://www.bwct.de      http://www.fizon.de
bernd@bwct.de           info@bwct.de            support@fizon.de

From owner-freebsd-fs@FreeBSD.ORG Wed Jan 2 11:09:46 2008
Message-ID: <477B70F9.8070903@FreeBSD.org>
Date: Wed, 02 Jan 2008 12:09:45 +0100
From: Kris Kennaway
To: David Taylor, freebsd-fs@freebsd.org, pjd@freebsd.org
In-Reply-To: <20071231232319.GA90972@outcold.yadt.co.uk>
Subject: Re: [PATCH] ZFS not caching on i386 with kmem_size >1GB

David Taylor wrote:
> Hi,
>
> About 2 months ago I reported that I found ZFS extremely slow for
> some tasks (specifically upgrading ports).  This was because ZFS
> was only using the absolute minimum cache size at all times.
>
> The problem is here in /sys/contrib/opensolaris/uts/common/fs/zfs/arc.c:
>
> static int
> arc_reclaim_needed(void)
> {
> ...
>         if (kmem_used() > (kmem_size() * 4) / 5)
>                 return (1);
> }
>
> I'm running on i386 with kmem_size set to 1GB.  As a result, the
> multiplication overflows and the test becomes (kmem_used() > 0).  ZFS then
> always tries to shrink the cache, and never grows it above the absolute
> minimum size (about 30MB for each of c and p)
>
> The patch I have attached fixes the problem for me, although there is probably
> a better way to avoid the overflow (without calling kmem_size() twice).
> Best of all, portupgrade is now an order of magnitude faster!
>
> Of course, I'm now worried that my previously rock-solid settings will actually
> trigger the kmem_map too small panics when the cache actually fills up.
>

FYI, kmem_size > 1GB makes no sense unless you also increase KVA_PAGES
since the entire kernel only has 1GB of address space on i386.

Kris

From owner-freebsd-fs@FreeBSD.ORG Wed Jan 2 12:32:07 2008
Message-ID: <477B8440.1020501@freebsd.org>
Date: Wed, 02 Jan 2008 06:32:00 -0600
From: Eric Anderson
To: ticso@cicely.de
In-Reply-To: <20080102070146.GH49874@cicely12.cicely.de>
Cc: "freebsd-fs@freebsd.org"
Subject: Re: ZFS i/o errors - which disk is the problem?

Bernd Walter wrote:
> On Tue, Jan 01, 2008 at 10:44:43PM -0600, Eric Anderson wrote:
>> I created a zpool with two new identical (500GB) SATA disks.  I rsync'ed
>> a bunch of data over to the new ZFS file systems, and started seeing i/o
>> errors.
>>
>> Here's how I created the file systems:
>>
>> zpool create tank mirror ad6 ad8
>> zfs create tank/media
>> zfs create tank/documents
>> zfs set sharenfs=on tank/media
>> zfs set sharenfs=on tank/documents
>> zfs set atime=off tank
>> zfs set mountpoint=/media tank/media
>> zfs set mountpoint=/documents tank/documents
>>
>>
>> Here's what zpool status says:
>>
>> # zpool status
>>   pool: tank
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>         corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore the
>>         entire pool from backup.
>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>  scrub: scrub completed with 731 errors on Tue Jan  1 15:17:08 2008
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         tank        ONLINE       0     0 1.47K
>>           mirror    ONLINE       0     0 1.47K
>>             ad6     ONLINE       0     0 5.12K
>>             ad8     ONLINE       0     0 4.66K
>>
>> How can I tell which drive gave the problems, or where the problem came
>> from?  I see several errors in /var/log/messages, like:
>>
>> ZFS: zpool I/O failure, zpool=tank error=86
>
> zpool status -v should tell you more details.
> But it is not required, since the message below is enough.

Yes, I did that, but of course >700 files were listed, but that's about
the only difference in output, so I omitted it here.

>> and many many of these:
>>
>> ZFS: checksum mismatch, zpool=tank path=/dev/ad6 offset=31970426880
>> size=131072
>>
>> for both the ad6 and ad8 devices.
>
> So you have crc errors on both drives.
>
>> I'm happy to swap the drive out, but I don't know which is the problem.
>> I was also wondering if it was a saturated I/O issue on the system
>> (it's a fairly slow and poky old box).
>
> The errors mean that data written to disk were silently not the same
> when they were read back.
> I doubt that these are the drives, but if they are identical it is possible
> of course, since firmware bugs are not impossible.
> More likely you have a problematic ata controller or maybe defective
> ram.

I can believe a problematic SATA controller (it's an add-on PCI board),
but does anyone know of a way to ask ZFS which devices in a pool it
thinks has issues?

Eric

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 3 16:00:58 2008
Date: Thu, 3 Jan 2008 07:34:15 -0800 (PST)
From: Gore Jarold
To: freebsd-fs@freebsd.org
Message-ID: <370651.79772.qm@web63012.mail.re1.yahoo.com>
Subject: moving slots with a 3ware raid controller ... danger ?

I am running a 3ware 9650SE-16ML on FreeBSD
6.1-RELEASE.

I think that the card is in a 4x PCI-E slot, and I
think that is causing my system to crash periodically
under very high IO load.

I am planning on moving the card from the 4X slot to
an 8x slot, which is the speed recommended for it.

Before I do this, I would like a sanity check - is
there ANY reason that moving slots with this card
would be dangerous ?  Will FreeBSD care that the card
comes in on a new slot ?  Will the card ?  The arrays ?

Is there anything at all I should know about this
proposed slot move that would put my filesystems in
danger in any way ?  War stories and speculation
appreciated...

Thanks.

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 3 16:50:30 2008
From: Peter Schuller
To: freebsd-fs@freebsd.org
Cc: ticso@cicely.de
Date: Thu, 3 Jan 2008 17:50:22 +0100
Message-Id: <200801031750.31035.peter.schuller@infidyne.com>
In-Reply-To: <477B8440.1020501@freebsd.org>
Subject: Re: ZFS i/o errors - which disk is the problem?

> I can believe a problematic SATA controller (it's an add-on PCI board),
> but does anyone know of a way to ask ZFS which devices in a pool it
> thinks has issues?

That is exactly what zpool status is intended to tell you. That is, the disks
that you are seeing checksum errors on are the ones seeing the faults. In
your case both drives show checksum errors (for some reason).

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller '
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 3 16:51:21 2008
Message-ID: <477D0998.9050407@samsco.org>
Date: Thu, 03 Jan 2008 09:13:12 -0700
From: Scott Long
To: Gore Jarold
Cc: freebsd-fs@freebsd.org
In-Reply-To: <370651.79772.qm@web63012.mail.re1.yahoo.com>
Subject: Re: moving slots with a 3ware raid controller ... danger ?

Gore Jarold wrote:
> I am running a 3ware 9650SE-16ML on FreeBSD
> 6.1-RELEASE.
>
> I think that the card is in a 4x PCI-E slot, and I
> think that is causing my system to crash periodically
> under very high IO load.

Can't help ya if you don't say what the crash is.

>
> I am planning on moving the card from the 4X slot to
> an 8x slot, which is the speed recommended for it.

If this is what 3ware actually recommends, I'd find it highly amusing,
not to mention highly unlikely that they have a controller that can actually
push more than 1GB/sec of disk bandwidth, even if it had 16 disks hooked up
to it.

>
> Before I do this, I would like a sanity check - is
> there ANY reason that moving slots with this card
> would be dangerous ?  Will FreeBSD care that the card
> comes in on a new slot ?

No

> Will the card ?

I'd certainly hope not.

> The arrays?

I'd personally ask for my money back from 3ware if they did.

Scott

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 3 16:53:02 2008
From: Peter Schuller
To: freebsd-fs@freebsd.org
Cc: Gore Jarold
Date: Thu, 3 Jan 2008 17:53:04 +0100
Message-Id: <200801031753.04979.peter.schuller@infidyne.com>
In-Reply-To: <370651.79772.qm@web63012.mail.re1.yahoo.com>
Subject: Re: moving slots with a 3ware raid controller ... danger ?

> Is there anything at all I should know about this
> proposed slot move that would put my filesystems in
> danger in any way ? War stories and speculation
> appreciated...

I don't have experience with that particular card.
That said, the only likely
issue I can think of off hand would be if you had multiple cards and moving
it re-defined the order with which they, and thus the drives on them, are
detected. This would mix up drive naming, which I suppose could be classified
as putting filesystems in danger...

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller '
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 3 17:10:15 2008
Message-ID: <477D16EE.6070804@freebsd.org>
Date: Thu, 03 Jan 2008 11:10:06 -0600
From: Eric Anderson
To: Peter Schuller
Cc: freebsd-fs@freebsd.org, ticso@cicely.de
In-Reply-To: <200801031750.31035.peter.schuller@infidyne.com>
Subject: Re: ZFS i/o errors - which disk is the problem?

Peter Schuller wrote:
>> I can believe a problematic SATA controller (it's an add-on PCI board),
>> but does anyone know of a way to ask ZFS which devices in a pool it
>> thinks has issues?
>
> That is exactly what zpool status is intended to tell you. That is, the disks
> that you are seeing checksum errors on are the ones seeing the faults. In
> your case both drives show checksum errors (for some reason).
>

Yea, I suspect it's the cheesy SATA controller I stuck in the system.
I suppose I will rebuild my NFS server with different hardware :(

Thanks,
Eric

From owner-freebsd-fs@FreeBSD.ORG Thu Jan 3 17:18:31 2008
Date: Thu, 3 Jan 2008 11:18:25 -0600
From: Brooks Davis
To: Eric Anderson
Cc: freebsd-fs@freebsd.org, ticso@cicely.de
Message-ID: <20080103171825.GA28361@lor.one-eyed-alien.net>
In-Reply-To: <477D16EE.6070804@freebsd.org>
Subject: Re: ZFS i/o errors - which disk is the problem?

On Thu, Jan 03, 2008 at 11:10:06AM -0600, Eric Anderson wrote:
> Peter Schuller wrote:
>>> I can believe a problematic SATA controller (it's an add-on PCI board),
>>> but does anyone know of a way to ask ZFS which devices in a pool it
>>> thinks has issues?
>> That is exactly what zpool status is intended to tell you. That is, the
>> disks that you are seeing checksum errors on are the ones seeing the
>> faults. In your case both drives show checksum errors (for some reason).
>
> Yea, I suspect it's the cheesy SATA controller I stuck in the system.  I
> suppose I will rebuild my NFS server with different hardware :(

We've definitely seen cases where hardware changes fixed ZFS checksum errors.
In one case, a firmware upgrade on the raid controller fixed it.  In another
case, we'd been connecting to an external array with a SCSI card that didn't
have a PCI bracket and the errors went away when the replacement one arrived
and was installed.  The fact that there were significant errors caught by ZFS
was quite disturbing since we wouldn't have found them with UFS.

-- Brooks
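
The overflow David Taylor describes in arc_reclaim_needed() (quoted in
Kris Kennaway's reply above) can be reproduced outside the kernel.  What
follows is only a rough sketch, not the attached patch, which does not
appear in this archive.  It assumes that kmem_size() yields a 32-bit
unsigned value on i386 and models that with uint32_t.  The function names
threshold_as_written, threshold_widened, threshold_divide_first and
check_thresholds are made up for illustration and are not part of arc.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The 4/5 threshold as written in the quoted snippet, with the
 * (assumed) 32-bit size type modelled as uint32_t.  The multiply
 * wraps modulo 2^32 before the divide happens.
 */
uint32_t
threshold_as_written(uint32_t kmem_size)
{
	uint32_t product = kmem_size * UINT32_C(4);	/* may wrap */

	return (product / 5);
}

/*
 * Same threshold with the intermediate product held in 64 bits,
 * so it cannot wrap for any 32-bit input.
 */
uint32_t
threshold_widened(uint32_t kmem_size)
{
	return ((uint32_t)(((uint64_t)kmem_size * 4) / 5));
}

/*
 * Alternative that stays in 32 bits: divide first, then multiply.
 * The result can be a few bytes lower because the division truncates.
 */
uint32_t
threshold_divide_first(uint32_t kmem_size)
{
	return ((kmem_size / 5) * 4);
}

/* Values for kmem_size = 1GB (2^30 bytes), the case from the report. */
void
check_thresholds(void)
{
	uint32_t one_gb = UINT32_C(1) << 30;

	/* 2^30 * 4 == 2^32, which wraps to 0, so the test is "used > 0". */
	assert(threshold_as_written(one_gb) == 0);
	assert(threshold_widened(one_gb) == UINT32_C(858993459));
	assert(threshold_divide_first(one_gb) == UINT32_C(858993456));
}
```

With a 1GB kmem_size the as-written form yields a threshold of 0, which
matches the "(kmem_used() > 0)" behaviour in the report.  Either of the
other two forms keeps the threshold near 819MB.  Which one suits the real
code depends on the actual return type of kmem_size(), which this sketch
only assumes.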