From owner-freebsd-fs@FreeBSD.ORG Sun Jul 4 02:44:59 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D46C106567A for ; Sun, 4 Jul 2010 02:44:59 +0000 (UTC) (envelope-from bounces@nabble.com) Received: from kuber.nabble.com (kuber.nabble.com [216.139.236.158]) by mx1.freebsd.org (Postfix) with ESMTP id 0902E8FC0C for ; Sun, 4 Jul 2010 02:44:57 +0000 (UTC) Received: from isper.nabble.com ([192.168.236.156]) by kuber.nabble.com with esmtp (Exim 4.63) (envelope-from ) id 1OVFCT-0002i5-5o for freebsd-fs@freebsd.org; Sat, 03 Jul 2010 19:44:57 -0700 Message-ID: <29067121.post@talk.nabble.com> Date: Sat, 3 Jul 2010 19:44:57 -0700 (PDT) From: Bucarr To: freebsd-fs@freebsd.org In-Reply-To: <20100704005336.00003234@unknown> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Nabble-From: bucarr@gmail.com References: <29065362.post@talk.nabble.com> <20100704005336.00003234@unknown> Subject: Re: zfs on 4k sector disks X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Jul 2010 02:44:59 -0000 On Sat, 3 Jul 2010 11:28:03 -0700 (PDT) Bucarr wrote: > I was wondering that as well. So I set up a new FreeBSD8.0 box with > a UFS2 disk as my boot disk and two WD 2TB EARS drives ("Advanced > Format") for a zfs data array. I made no attempt to alter the WD > drives in any way and ignored the 512b vs 4096b issues. 'zpool > create tank raidz da0 da1' set up the array in a few seconds and I've > copied to/from with no troubles and no apparent performance issues > that I care about. If I had more of the WD drives lying around, I'd > add those to the array and see what gives then. I suspect the performance may be acceptable, but probably not great. I thought I didn't have any problems with RAIDZ's variable stripe size on my EARS drives either until I found some writes would suddenly take ages to finish, causing applications to hang. -- Bruce Cran I wonder if my 2 drive experiment in RAIDZ didn't simply mirror the drives so that striping wasn't invoked at all? -- View this message in context: http://old.nabble.com/zfs-on-4k-sector-disks-tp27664932p29067121.html Sent from the freebsd-fs mailing list archive at Nabble.com. 
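As an aside, the gnop(8) trick that is usually suggested for these Advanced Format drives looks roughly like this at pool-creation time; the disk names follow the example above and the whole sequence is a sketch rather than a tested recipe:

gnop create -S 4096 da0 da1        # temporary providers that report 4096-byte sectors
zpool create tank raidz da0.nop da1.nop
zpool export tank
gnop destroy da0.nop da1.nop       # the .nop devices are only needed at creation time
zpool import tank                  # the pool keeps the alignment it was created with
zdb -C tank | grep ashift          # ashift: 12 means 4k alignment, 9 means 512-byte

As for the two-disk question: a two-disk raidz vdev is not silently turned into a mirror; zpool status should still report it as raidz1, with one disk's worth of parity, so the variable stripe size question still applies.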
From owner-freebsd-fs@FreeBSD.ORG Sun Jul 4 05:21:28 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3988F106566B; Sun, 4 Jul 2010 05:21:28 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id DD2D18FC13; Sun, 4 Jul 2010 05:21:27 +0000 (UTC) Received: by iwn35 with SMTP id 35so2731722iwn.13 for ; Sat, 03 Jul 2010 22:21:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:openpgp:content-type:content-transfer-encoding; bh=gAWmtYLTzaCLBAIYL54vjvm0BLQZd7AyqKdEw6pztCo=; b=iOMJDK2DsVXiV9UtqUCFFL1F7y+XWZQBuDCDf2c+EHXYd0GcTGoKXa9QpAnXaDmp7X UgFTz4ZIww66JQqrOiRVa/v9/oxNg0vI9KM5yx6yIlHCXq/Ko3mWo4APrA82QryXNL2O Z4f0DFT2fiADgsrgJin8N+JkKYTKB+++YAWWE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:openpgp:content-type :content-transfer-encoding; b=CpqBcNBx+r8FoBUEYdET2PPntAbZezV2uvTAMn5zR5IS+9dOAtxYJjHt6H2tYlCy9F 8uTKPnNIwYurzOACEtlyik0RUJv6JhR3DQLDEoLPuiKl4c7TWAlvjod8CGT6aBeXrtuu XKvnVCGvRLBJwMfzhjpkyoPkkc5w4mRdxeTuI= Received: by 10.231.33.140 with SMTP id h12mr1331192ibd.59.1278220886873; Sat, 03 Jul 2010 22:21:26 -0700 (PDT) Received: from centel.dataix.local (adsl-99-181-128-180.dsl.klmzmi.sbcglobal.net [99.181.128.180]) by mx.google.com with ESMTPS id g31sm11567172ibh.22.2010.07.03.22.21.25 (version=SSLv3 cipher=RC4-MD5); Sat, 03 Jul 2010 22:21:25 -0700 (PDT) Sender: "J. Hellenthal" Message-ID: <4C301A54.1000405@dataix.net> Date: Sun, 04 Jul 2010 01:21:24 -0400 From: jhell User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.10) Gecko/20100626 Thunderbird MIME-Version: 1.0 To: "Sam Fourman Jr." References: In-Reply-To: X-Enigmail-Version: 1.0.1 OpenPGP: id=89D8547E Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Help with Faulted zpool X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Jul 2010 05:21:28 -0000 On 07/03/2010 17:12, Sam Fourman Jr. wrote: > Hello list, > > I have a File server that runs FreeBSD 8.1 (zfs v14) > after a poweroutage, I am unable to import my zpool named Network > my pool is made up of 6 1TB disks configured in raidz. > there is ~1.9TB of actual data on this pool. > Hi Sam, One option if you have it available is to patch your sources to version 16 fs version 4 that mm@ has made a patch for recently. This patch contains the fix to the bug that causes this situation to happen during a power outage or other unexpected quick failure. The patches mentioned here may not help you recover but might just help stop your data loss in the future if your pool can not be repaired. I am unsure if anything in these patches will actually help resolve your issue but it might be worth a shot. Martin is CC'd, maybe he has further knowledge and notation he could add to this. Two good references: http://bit.ly/aLS9XJ http://bit.ly/dnBqsZ And Ill forward mm@ update emails directly to you following this email. 
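For reference, applying one of these patches is just the usual source-rebuild dance; the patch file name and -p level below are placeholders, so follow whatever instructions accompany the actual patch:

cd /usr/src
patch -p0 < /path/to/zfs_v15.patch      # placeholder name; -p0 or -p1 depends on how the diff was made
make buildworld buildkernel KERNCONF=GENERIC
make installkernel KERNCONF=GENERIC
# reboot onto the new kernel, then:
make installworld
# only once everything looks stable, and knowing older code cannot read an upgraded pool:
zpool upgrade <poolname>
zfs upgrade -r <poolname>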
Ye Ole Patch fer v16: Ive been running this for over a week now. http://bit.ly/bBhUDz Martin, Sam: After patching I had to move a few files around in the source directory, and create two directories. Namely *.py on recent 8.1-STABLE source. Since these directories do not exist on the system it just leaves the said files in the base of the source tree. Created these directories: ( mkdir -p %%BELOW%% ) /usr/src/cddl/contrib/opensolaris/cmd/pyzfs /usr/src/cddl/contrib/opensolaris/lib/pyzfs/common/ Moved these files: mv /usr/src/pyzfs.py /usr/src/cddl/contrib/opensolaris/cmd/pyzfs/ mv /usr/src/*.py /usr/src/cddl/contrib/opensolaris/lib/pyzfs/common/ Good luck, -- +-+-+-+-+-+ |j|h|e|l|l| +-+-+-+-+-+ From owner-freebsd-fs@FreeBSD.ORG Sun Jul 4 06:45:53 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 82FF6106564A; Sun, 4 Jul 2010 06:45:53 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8E3FA8FC16; Sun, 4 Jul 2010 06:45:52 +0000 (UTC) Received: by bwz12 with SMTP id 12so2741880bwz.13 for ; Sat, 03 Jul 2010 23:45:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject:references :x-comment-to:date:in-reply-to:message-id:user-agent:mime-version :content-type; bh=xelBvkr+45KRSMAUcpJYZ3ty4G0t1Dj0ynA49Jzv9NM=; b=w9/HOAzskzBltOPxuGMUt4N+yPgXZFVRXbXy5SWWqWXaBelBcHnia4R7ONW4Zop9Eg liFhvJUPIy74Bxkp4S0DVkjiGN+GnNwVsj0bgxNOUdwNDAV2sxetcig1hkSQOU6qUuJd fEc4emDglsjdp8+ty6RSQ6sY3T4YuZ6J6udUk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=G19uxNqjFCAqOv9lrNz4zDDF18DqrNYjDIdiVD3m3XSTwfGCImDb844F3+qm5qWcXB InIQgSreNTL/hcsYHbsVhY8iRh1V2GyFYDHyr/tBwtVg+TxxW4t4DisJ8khvOQYbaaMt BDdWfi4DDaGRWHwvIKly88AEU3gh5nytiAPg8= Received: by 10.204.136.71 with SMTP id q7mr1010571bkt.111.1278225940895; Sat, 03 Jul 2010 23:45:40 -0700 (PDT) Received: from localhost ([95.69.169.55]) by mx.google.com with ESMTPS id 24sm11425120bkr.7.2010.07.03.23.45.38 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sat, 03 Jul 2010 23:45:39 -0700 (PDT) From: Mikolaj Golub To: "hiroshi\@soupacific.com" References: <4C139F9C.2090305@soupacific.com> <86iq5oc82y.fsf@kopusha.home.net> <4C14215D.9090304@soupacific.com> <20100613003635.GA60012@icarus.home.lan> <20100613074921.GB1320@garage.freebsd.pl> <4C149A5C.3070401@soupacific.com> <20100613102401.GE1320@garage.freebsd.pl> <86eigavzsg.fsf@kopusha.home.net> <20100614095044.GH1721@garage.freebsd.pl> <868w6hwt2w.fsf@kopusha.home.net> <20100614153746.GN1721@garage.freebsd.pl> <86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com> <86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com> <861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com> <86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com> <86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com> X-Comment-To: hiroshi@soupacific.com Date: Sun, 04 Jul 2010 09:45:36 +0300 In-Reply-To: <4C2F3E14.1080601@soupacific.com> (hiroshi@soupacific.com's message of "Sat, 03 Jul 2010 22:41:40 +0900") Message-ID: <86pqz3iw33.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Jul 2010 06:45:53 -0000 On Sat, 03 Jul 2010 22:41:40 +0900 hiroshi@soupacific.com wrote: >> >> You should have a setup so when the master is rebooted after the reboot it >> checks the status of other node and sets its own role accordingly (so there >> would not be two masters simultaneously). Software I use in my setup (our home >> made application) does this well. sysutils/heartbeat should work fine too. As >> for me carp might not do well for this but I am not very experienced with carp >> so I can be wrong. >> h> By CARP, ifconfig carp0 advskew {bigger value than secondary} on h> console sets CARP as secondary. h> How do you think this idea ? I think you could make a configuration so when hostB (secondary) switches to master it changes advskew to the value lower then on hostA. In this way after hostA reboot it will have higher advskew and will be forced to do as a secondary. Then after the nodes are synchronized you can switch to the initial state restoring initial value of advskew on hostB. -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Sun Jul 4 11:50:05 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9FE57106564A for ; Sun, 4 Jul 2010 11:50:05 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 2B5088FC13 for ; Sun, 4 Jul 2010 11:50:04 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1OVNhz-0007b8-Mv for freebsd-fs@freebsd.org; Sun, 04 Jul 2010 13:50:03 +0200 Received: from 193.33.173.33 ([193.33.173.33]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 04 Jul 2010 13:50:03 +0200 Received: from c.kworr by 193.33.173.33 with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 04 Jul 2010 13:50:03 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Volodymyr Kostyrko Date: Sun, 04 Jul 2010 14:38:54 +0300 Lines: 53 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: 193.33.173.33 User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; uk-UA; rv:1.9.1.10) Gecko/20100627 Thunderbird/3.0.5 In-Reply-To: Subject: Re: Help with Faulted zpool X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Jul 2010 11:50:05 -0000 04.07.2010 00:12, Sam Fourman Jr. написав(ла): > Hello list, > > I have a File server that runs FreeBSD 8.1 (zfs v14) > after a poweroutage, I am unable to import my zpool named Network > my pool is made up of 6 1TB disks configured in raidz. > there is ~1.9TB of actual data on this pool. > > FreeBSD does not have many option for restoring a pool from corrupt meta data > > > I have loaded OpenSolaris svn_134 on a septate boot disk, > in hopes of recovering my zpool. 
> > on Opensolaris 134, I am not able to import my zpool > almost everything I try gives me cannot import 'Network': I/O error > > I have done quite a bit of searching, and I found that import -fFX > Network should work > however after ~ 20 hours this hard locks Opensolaris (however it does > return a ping) > > here is a list of commands that I have run on Open Solaris > > http://www.puffybsd.com/zfsv14.txt > > if anyone could help me use zdb or mdb to recover my pool > I would very much appreciate it. > > I believe the metadata is corrupt on my zpool

Just sharing the knowledge, you know... Some time ago my primary file server suffered controller damage. When ZFS tried to repair the metadata, the controller managed to write most of it but failed to complete the full restore. Thus I ended up with a pool whose two or more most recent snapshots were broken. FreeBSD (RELENG_8) dumped core when trying to import the pool. OpenSolaris (the latest version available, I think with ZFS v22) gave the same result.

I managed to recover all the data from the pool using single-user mode. That way, by the time I got a shell, ZFS already knew about my pool, but none of the datasets were mounted. From this point it's possible to mount each dataset read-only and copy all of its data to another disk. You will face a kernel coredump only if some dataset's metadata is heavily damaged.

All you need is: 1. The original zpool.cache with which your pool was imported. 2. The ZFS module loaded. 3. To break into single-user mode.

-- Sphinx of black quartz judge my vow.

From owner-freebsd-fs@FreeBSD.ORG Sun Jul 4 13:40:28 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D381106580C for ; Sun, 4 Jul 2010 13:40:28 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id C87C88FC0A for ; Sun, 4 Jul 2010 13:40:27 +0000 (UTC) Received: from mail.cicely.de ([10.1.1.37]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id o64DdoDe019027 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sun, 4 Jul 2010 15:40:05 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by mail.cicely.de (8.14.3/8.14.3) with ESMTP id o64DdkYg043461 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 4 Jul 2010 15:39:46 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id o64DdkSL025292; Sun, 4 Jul 2010 15:39:46 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id o64Ddhv7025291; Sun, 4 Jul 2010 15:39:43 +0200 (CEST) (envelope-from ticso) Date: Sun, 4 Jul 2010 15:39:43 +0200 From: Bernd Walter To: Volodymyr Kostyrko Message-ID: <20100704133943.GG16689@cicely7.cicely.de> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED=-1, BAYES_00=-1.9, T_RP_MATCHES_RCVD=-0.01 autolearn=ham version=3.3.0 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on spamd.cicely.de Cc: freebsd-fs@freebsd.org Subject: Re: Help with Faulted zpool X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de
List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Jul 2010 13:40:28 -0000

On Sun, Jul 04, 2010 at 02:38:54PM +0300, Volodymyr Kostyrko wrote: > 04.07.2010 00:12, Sam Fourman Jr. wrote: > >Hello list, > > > >I have a File server that runs FreeBSD 8.1 (zfs v14) > >after a poweroutage, I am unable to import my zpool named Network > >my pool is made up of 6 1TB disks configured in raidz. > >there is ~1.9TB of actual data on this pool. > > > >FreeBSD does not have many option for restoring a pool from corrupt meta > >data > > > > > >I have loaded OpenSolaris svn_134 on a septate boot disk, > >in hopes of recovering my zpool. > > > >on Opensolaris 134, I am not able to import my zpool > >almost everything I try gives me cannot import 'Network': I/O error > > > >I have done quite a bit of searching, and I found that import -fFX > >Network should work > >however after ~ 20 hours this hard locks Opensolaris (however it does > >return a ping) > > > >here is a list of commands that I have run on Open Solaris > > > >http://www.puffybsd.com/zfsv14.txt > > > >if anyone could help me use zdb or mdb to recover my pool > >I would very much appreciate it. > > > >I believe the metadata is corrupt on my zpool > > Just sharing the knowledge, you know... > > Some time ago my primary file server suffers from controller damage. > When ZFS tried to fix metadata controller was able to write most of it > failing to complete full restore. Thus I've got pool with two or more > recent snapshots broken. FreeBSD (RELENG_8) dumps core when trying to > import pool. OpenSolaris (last version available, i think with ZFSv22) > was giving the same result. > > I've managed to recover all data from pool using single user mode. This > way when I got shell ZFS already knows about my pool, but no are > mounted. From this point it's possible to mount each dataset RO and copy > all data from it to another disk. You will face the kernel coredump only > in case of some dataset metadata would be heavily damaged.

I had a similar experience a long time ago. Mounting one particular FS sent the system into endless disk activity, which I let run for about a day before I gave up. But it was possible to clone an earlier snapshot of that FS and mount the clone instead.

> All you need is: > 1. Original zpool.cache with which your pool was imported. > 2. ZFS module loaded. > 3. Break into single user. > > -- > Sphinx of black quartz judge my vow. > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- B.Walter http://www.bwct.de Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD machines, and much more.
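Condensed, the single-user recovery described above (read-only mounts, with the snapshot-clone fallback) might look like this; the dataset and backup paths are placeholders, and it assumes the original /boot/zfs/zpool.cache is still in place and the ZFS module is loaded:

# boot -s at the loader prompt (zfs_load="YES" in /boot/loader.conf)
zfs list                                    # pool is known from zpool.cache; nothing is mounted yet
zfs mount -o ro Network/some/dataset        # mount datasets one at a time, read-only
cp -Rp /Network/some/dataset /backup/       # copy everything off to another disk
# if a particular dataset brings the kernel down, try an older snapshot instead:
zfs clone Network/some/dataset@older Network/rescue
zfs mount -o ro Network/rescue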
From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 02:09:18 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DC338106564A; Mon, 5 Jul 2010 02:09:18 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B4BA88FC08; Mon, 5 Jul 2010 02:09:18 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o6529I4u015798; Mon, 5 Jul 2010 02:09:18 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o6529IPL015794; Mon, 5 Jul 2010 02:09:18 GMT (envelope-from linimon) Date: Mon, 5 Jul 2010 02:09:18 GMT Message-Id: <201007050209.o6529IPL015794@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/148368: [zfs] ZFS hanging forever on 8.1-PRERELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 02:09:19 -0000 Old Synopsis: ZFS hanging forever on 8.1-PRERELEASE New Synopsis: [zfs] ZFS hanging forever on 8.1-PRERELEASE Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Jul 5 02:09:06 UTC 2010 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=148368 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 02:10:21 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6EC75106566B; Mon, 5 Jul 2010 02:10:21 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 47A898FC08; Mon, 5 Jul 2010 02:10:21 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o652ALi1015923; Mon, 5 Jul 2010 02:10:21 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o652ALg8015919; Mon, 5 Jul 2010 02:10:21 GMT (envelope-from linimon) Date: Mon, 5 Jul 2010 02:10:21 GMT Message-Id: <201007050210.o652ALg8015919@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-i386@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/148204: [nfs] UDP NFS causes overload X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 02:10:21 -0000 Old Synopsis: UDP NFS causes overload New Synopsis: [nfs] UDP NFS causes overload Responsible-Changed-From-To: freebsd-i386->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Jul 5 02:09:57 UTC 2010 Responsible-Changed-Why: reclassify. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=148204 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 03:07:33 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E199D1065672 for ; Mon, 5 Jul 2010 03:07:33 +0000 (UTC) (envelope-from joe@rewt.org.uk) Received: from smtpauth.rollernet.us (smtpauth6.rollernet.us [IPv6:2620:0:950:f000:213:72ff:fe4f:6a76]) by mx1.freebsd.org (Postfix) with ESMTP id 537CB8FC37 for ; Mon, 5 Jul 2010 03:07:33 +0000 (UTC) Received: from smtpauth.rollernet.us (localhost [127.0.0.1]) by smtpauth.rollernet.us (Postfix) with ESMTP id 5C15159401C for ; Sun, 4 Jul 2010 20:07:29 -0700 (PDT) Received: from una.stf.rewt.org.uk (una.stf.rewt.org.uk [91.208.177.42]) (Authenticated sender: unauna) by smtpauth.rollernet.us (Postfix) with ESMTPA for ; Sun, 4 Jul 2010 20:07:28 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by una.stf.rewt.org.uk (Postfix) with ESMTP id 8FD122281B for ; Mon, 5 Jul 2010 04:07:26 +0100 (BST) Date: Mon, 5 Jul 2010 04:07:26 +0100 (BST) From: Joe Holden To: freebsd-fs@freebsd.org Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII X-Rollernet-Abuse: Processed by Roller Network Mail Services. Contact abuse@rollernet.us to report violations. Abuse policy: http://rollernet.us/abuse.php X-Rollernet-Submit: Submit ID 5284.4c314c70.dc28e.0 Subject: zfs/zpool hang on HEAD and STABLE. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 03:07:34 -0000 Collective, I have a fileserver running a zpool of approx. 5TB using mirrored pairs:- While running HEAD sources from May 22nd or so, all was well, however either HEAD or STABLE sources since then cause a non-responsive userland (login/csh stops responding, ssh et al, however the kernel is still running as I can switch vty's). Running 8.0-R allows zpool [status] to return the correct info however... Even with WITNESS and INVARIANTS compiled in, no debug is presented, the machine simply stops responding as above. Machine: 1.8Ghz Celeron 430 3GB ram, 4 * 2TB, 2 * 1TB disks in zpool, 2 * 250GB disks in gmirror for OS. amd64 branch I am running with a GENERIC kernel and no custom sysctl/kernel settings. Does anyone have any suggests as to what I can do to diagnose/debug this issue, or has anyone seen similar? 
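For what it's worth, a minimal set of commands for capturing state the next time it wedges, assuming the GENERIC kernel's KDB/DDB and a configured dumpdev; this is generic debugging advice rather than a known fix:

ps -axl | grep zpool            # note the MWCHAN column (what the process is waiting on)
procstat -kk <pid-of-zpool>     # kernel stack of the stuck process, if a shell still works
sysctl debug.kdb.enter=1        # drop into DDB on the console
# in DDB:
ps
show allchains
show lockedvnods
alltrace
call doadump                    # write a crash dump (needs dumpdev set), then 'reset'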
Thanks, Joe

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 05:05:45 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 51975106566B; Mon, 5 Jul 2010 05:05:45 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 1A7278FC17; Mon, 5 Jul 2010 05:05:44 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id 662926C9BA; Mon, 5 Jul 2010 04:57:42 +0000 (UTC) Message-ID: <4C31681C.5070406@soupacific.com> Date: Mon, 05 Jul 2010 14:05:32 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1 MIME-Version: 1.0 To: Mikolaj Golub References: <4C139F9C.2090305@soupacific.com><86iq5oc82y.fsf@kopusha.home.net> <4C14215D.9090304@soupacific.com><20100613003635.GA60012@icarus.home.lan><20100613074921.GB1320@garage.freebsd.pl><4C149A5C.3070401@soupacific.com><20100613102401.GE1320@garage.freebsd.pl><86eigavzsg.fsf@kopusha.home.net><20100614095044.GH1721@garage.freebsd.pl><868w6hwt2w.fsf@kopusha.home.net><20100614153746.GN1721@garage.freebsd.pl><86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com><86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com><861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com><86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com><86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com> <86pqz3iw33.fsf@kopusha.home.net> In-Reply-To: <86pqz3iw33.fsf@kopusha.home.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 05:05:45 -0000

Hi ! > > I think you could make a configuration so when hostB (secondary) switches to > master it changes advskew to the value lower then on hostA. >In this way after > hostA reboot it will have higher advskew and will be forced to do as a > secondary. Then after the nodes are synchronized you can switch to the initial > state restoring initial value of advskew on hostB. >

No! Once split-brain mode has appeared, nothing I change (putting CARP and HAST into BACKUP/secondary mode, the advskew value, etc.) helps: there is no way to get out of the split-brain complaint, ever! Is there anybody having the same problem? Only me?
Thanks Hiroshi From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 06:27:41 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A5CB31065673; Mon, 5 Jul 2010 06:27:41 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 6DC0A8FC15; Mon, 5 Jul 2010 06:27:41 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id A6ADD6CA33; Mon, 5 Jul 2010 06:19:39 +0000 (UTC) Message-ID: <4C317B51.5000409@soupacific.com> Date: Mon, 05 Jul 2010 15:27:29 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1 MIME-Version: 1.0 To: Mikolaj Golub References: <4C139F9C.2090305@soupacific.com><86iq5oc82y.fsf@kopusha.home.net> <4C14215D.9090304@soupacific.com><20100613003635.GA60012@icarus.home.lan><20100613074921.GB1320@garage.freebsd.pl><4C149A5C.3070401@soupacific.com><20100613102401.GE1320@garage.freebsd.pl><86eigavzsg.fsf@kopusha.home.net><20100614095044.GH1721@garage.freebsd.pl><868w6hwt2w.fsf@kopusha.home.net><20100614153746.GN1721@garage.freebsd.pl><86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com><86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com><861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com><86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com><86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com> <86pqz3iw33.fsf@kopusha.home.net> In-Reply-To: <86pqz3iw33.fsf@kopusha.home.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 06:27:41 -0000 Hi if i do without carp and my script, things works fine! So I will set up other smaller hast device box to check again. 500G takes too long to synchronize both boxes. I will get my new result tomorrow. thanks Hiroshi > > I think you could make a configuration so when hostB (secondary) switches to > master it changes advskew to the value lower then on hostA. In this way after > hostA reboot it will have higher advskew and will be forced to do as a > secondary. Then after the nodes are synchronized you can switch to the initial > state restoring initial value of advskew on hostB. 
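Spelled out, the advskew juggling described in the quoted paragraph might look like this on hostB, using the old 8.x carp(4) interface syntax; the vhid, password, addresses and skew values are purely illustrative, and zfshast is the resource name used earlier in this thread:

# initial setup (hostA preferred):
#   hostA: ifconfig carp0 create vhid 1 pass secret advskew 50  192.168.100.50/24
#   hostB: ifconfig carp0 create vhid 1 pass secret advskew 100 192.168.100.50/24
# when hostB has to take over, make it "stickier" than hostA before hostA returns:
ifconfig carp0 advskew 10
hastctl role primary zfshast
# after hostA is back as secondary and 'hastctl status' shows no dirty bytes,
# restore the original preference on hostB:
ifconfig carp0 advskew 100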
> From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 08:39:31 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48D4C106566B for ; Mon, 5 Jul 2010 08:39:31 +0000 (UTC) (envelope-from marty.rosenberg@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id EC7808FC1E for ; Mon, 5 Jul 2010 08:39:30 +0000 (UTC) Received: by vws6 with SMTP id 6so5818772vws.13 for ; Mon, 05 Jul 2010 01:39:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type; bh=+aMtG5xmIDiGgcm7j7bBqqkHVTQtNHken+dXFNBv40U=; b=nbtuMUiuMLmLDTXaH2hGASFpZGhgBPCiPJRD0sW1vrHx62nefyZrakVTOAiBJZJS7i Cc5ayC1dw9MVldWDXMcHW69tuWbEISa7QFD5VsnNaCXpniKC084PTFuovLnl09WYCfK7 cpBQCRc0JtIPe1K/cVurZgQ93cHzOf6ralY68= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=cRv4dKT41bs+vqQxCU6JLRks4YFrq+dM2HSKiHJVDsakORXAVSD08fdnvrghb9qT39 PDo5OSzCA+TQ7lEb8EHYrvQzNgJbTj2noJtWrFQYPnNoHywoma8tJxJd11BbbslheMn7 iyfEDrDUHHCVBF0XNwIsPHNf2fdvfbJlYXR5U= MIME-Version: 1.0 Received: by 10.220.75.148 with SMTP id y20mr1321045vcj.144.1278317748911; Mon, 05 Jul 2010 01:15:48 -0700 (PDT) Received: by 10.220.66.222 with HTTP; Mon, 5 Jul 2010 01:15:48 -0700 (PDT) In-Reply-To: <29067121.post@talk.nabble.com> References: <29065362.post@talk.nabble.com> <20100704005336.00003234@unknown> <29067121.post@talk.nabble.com> Date: Mon, 5 Jul 2010 01:15:48 -0700 Message-ID: From: Marty Rosenberg To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: zfs on 4k sector disks X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 08:39:31 -0000 I've tried running zfs on a maximum of 22 disks in various configurations with raidz and raidz2, and 4-7 devices in each vdev. In general, the performance has been pretty abysmal. Using gnop to emulate a disk with 4k sectors, the performance usually starts out in a reasonable place (60-100MB/s, about the write speed of an individual disk), but after adding 30 or so gigabytes of data, the write speed drops dramatically. down to 1-3 MB/s in most instances. On Sat, Jul 3, 2010 at 7:44 PM, Bucarr wrote: > > > > On Sat, 3 Jul 2010 11:28:03 -0700 (PDT) > Bucarr wrote: > > > I was wondering that as well. So I set up a new FreeBSD8.0 box with > > a UFS2 disk as my boot disk and two WD 2TB EARS drives ("Advanced > > Format") for a zfs data array. I made no attempt to alter the WD > > drives in any way and ignored the 512b vs 4096b issues. 'zpool > > create tank raidz da0 da1' set up the array in a few seconds and I've > > copied to/from with no troubles and no apparent performance issues > > that I care about. If I had more of the WD drives lying around, I'd > > add those to the array and see what gives then. > > I suspect the performance may be acceptable, but probably not great. > I thought I didn't have any problems with RAIDZ's variable stripe size > on my EARS drives either until I found some writes would suddenly take > ages to finish, causing applications to hang. 
> > -- > Bruce Cran > > > I wonder if my 2 drive experiment in RAIDZ didn't simply mirror the drives > so that striping wasn't invoked at all? > -- > View this message in context: > http://old.nabble.com/zfs-on-4k-sector-disks-tp27664932p29067121.html > Sent from the freebsd-fs mailing list archive at Nabble.com. > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 11:06:53 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E0A05106566B for ; Mon, 5 Jul 2010 11:06:53 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id CE8B28FC1D for ; Mon, 5 Jul 2010 11:06:53 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o65B6rC5079184 for ; Mon, 5 Jul 2010 11:06:53 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o65B6rLq079182 for freebsd-fs@FreeBSD.org; Mon, 5 Jul 2010 11:06:53 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 5 Jul 2010 11:06:53 GMT Message-Id: <201007051106.o65B6rLq079182@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 11:06:54 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148204 fs [nfs] UDP NFS causes overload o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147790 fs [zfs] zfs set acl(mode|inherit) fails on existing zfs o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/147292 fs [nfs] [patch] readahead missing in nfs client options o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server o kern/146375 fs [nfs] [patch] Typos in macro variables names in sys/fs o kern/145778 fs [zfs] [panic] panic in zfs_fuid_map_id (known issue fi s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat s kern/145424 fs [zfs] [patch] move source closer to v15 o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an o kern/145309 fs [disklabel]: Editing disk label invalidates the whole o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c o kern/144458 fs [nfs] [patch] nfsd fails as a kld p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o kern/143345 fs [ext2fs] [patch] extfs minor header cleanups to better o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142924 fs [ext2fs] [patch] Small cleanup for the inode struct in o kern/142914 fs [zfs] ZFS performance degradation over time o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142401 fs [ntfs] [patch] Minor updates to NTFS from NetBSD o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs o bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/139363 fs [nfs] diskless root nfs mount from non FreeBSD server o kern/138790 fs [zfs] ZFS ceases caching when mem demand is high o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb f kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [panic] panic: ffs_truncate: read-only filesystem o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS p kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs 
[smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o bin/86765 fs [patch] bsdlabel(8) assigning wrong fs type. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/53137 fs [ffs] [panic] background fscking causing ffs_valloc pa o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/33464 fs [ufs] soft update inconsistencies after system crash o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 180 problems total. From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 11:24:21 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 570261065679; Mon, 5 Jul 2010 11:24:21 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id 1FDC28FC1E; Mon, 5 Jul 2010 11:24:20 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id A7E136CC7B; Mon, 5 Jul 2010 11:16:19 +0000 (UTC) Message-ID: <4C31C0E3.8080200@soupacific.com> Date: Mon, 05 Jul 2010 20:24:19 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1 MIME-Version: 1.0 To: Mikolaj Golub References: <4C139F9C.2090305@soupacific.com><86iq5oc82y.fsf@kopusha.home.net> <4C14215D.9090304@soupacific.com><20100613003635.GA60012@icarus.home.lan><20100613074921.GB1320@garage.freebsd.pl><4C149A5C.3070401@soupacific.com><20100613102401.GE1320@garage.freebsd.pl><86eigavzsg.fsf@kopusha.home.net><20100614095044.GH1721@garage.freebsd.pl><868w6hwt2w.fsf@kopusha.home.net><20100614153746.GN1721@garage.freebsd.pl><86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com><86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com><861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com><86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com><86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com> <86pqz3iw33.fsf@kopusha.home.net> In-Reply-To: <86pqz3iw33.fsf@kopusha.home.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 11:24:21 -0000 Hi ! I checked without ifstate, without CARP ! ServerA #hastctl create zfshast #hastd #hastctl role primary zfshast ServerB #hastctl create zfshast #hastd #hastctl role secondary zfshast check synch on ServerA after nodirty bytes #zpool create hasthome /dev/hast/zfshast then disconnect ethernet. ServerB #hastctl role primary zfshast #zpool import -f hasthome Then reboot ServerA and connect ethernet. zpool export -f hasthome hastd hastctl role seconday zfshast Then split-brain detected appear. I made hast device realy small and checked couple of times and same result. Thanks Hiroshi From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 12:36:17 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87B0E106564A; Mon, 5 Jul 2010 12:36:17 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 9AF8F8FC0C; Mon, 5 Jul 2010 12:36:16 +0000 (UTC) Received: by wyb34 with SMTP id 34so3600772wyb.13 for ; Mon, 05 Jul 2010 05:36:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject :organization:references:date:in-reply-to:message-id:user-agent :mime-version:content-type; bh=T8D4v1ga1TelHHbTvlGIwD0e0tW/zpCxTQvyztGRKUE=; b=H8LgX4+y0p2VkwNYyQyz46KqVNTvjiH7ZqMYTxZu9ackEgBRkv8CrnBxzYxzhQLrma VkGjLlgUtoMFBaj+SyZIgFBU86xb8BunR95yjEUKPTqh4vuljf5LIQ8/R3qiuHkjBl++ GlcRzhwRnJZcSPQMh/xDmQ0cOtn9kqo9/MDJk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:organization:references:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=wOAmFAqJfeljOoM5PbRKBXDaTnZ7Qas3BmWm55nIZYiHyaxwetFDo2y1UWvBtjzZ+A 3n+NH6dCfS+j/x/D1lRgRILU3KM1Z8QDTtDjCmZkMaHPKK0gucsSaFENu6UNfex/TTZm iwLCxiWwN7F2hFSFDaFI1RrXSqft11wzPYSB4= Received: by 10.227.129.136 with SMTP id o8mr3384099wbs.21.1278333368995; Mon, 05 Jul 2010 05:36:08 -0700 (PDT) Received: from localhost (ua1.etadirect.net [91.198.140.16]) by mx.google.com with ESMTPS id i25sm31922492wbi.10.2010.07.05.05.36.05 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 05 Jul 2010 05:36:06 -0700 (PDT) From: Mikolaj Golub To: "hiroshi\@soupacific.com" Organization: TOA Ukraine References: <4C139F9C.2090305@soupacific.com> <20100613003635.GA60012@icarus.home.lan> <20100613074921.GB1320@garage.freebsd.pl> <4C149A5C.3070401@soupacific.com> <20100613102401.GE1320@garage.freebsd.pl> <86eigavzsg.fsf@kopusha.home.net> <20100614095044.GH1721@garage.freebsd.pl> <868w6hwt2w.fsf@kopusha.home.net> <20100614153746.GN1721@garage.freebsd.pl> <86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com> <86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com> <861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com> <86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com> <86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com> <86pqz3iw33.fsf@kopusha.home.net> <4C31C0E3.8080200@soupacific.com> Date: Mon, 05 Jul 2010 15:36:02 +0300 In-Reply-To: <4C31C0E3.8080200@soupacific.com> (hiroshi@soupacific.com's message of "Mon, 05 Jul 2010 20:24:19 +0900") Message-ID: <86ocemyukt.fsf@zhuzha.ua1> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 
Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 12:36:17 -0000 On Mon, 05 Jul 2010 20:24:19 +0900 hiroshi@soupacific.com wrote: h> Hi ! h> I checked without ifstate, without CARP ! h> ServerA h> #hastctl create zfshast h> #hastd h> #hastctl role primary zfshast h> ServerB h> #hastctl create zfshast h> #hastd h> #hastctl role secondary zfshast h> check synch on ServerA h> after nodirty bytes h> #zpool create hasthome /dev/hast/zfshast h> then h> disconnect ethernet. h> ServerB h> #hastctl role primary zfshast h> #zpool import -f hasthome h> Then reboot ServerA and connect ethernet. h> zpool export -f hasthome This command on this stage looks strange. It is supposed you don't have hastd started yet (you start it on the next step) and there is no hast device so zpool export should return "no such pool". Is it so? h> hastd h> hastctl role seconday zfshast h> Then split-brain detected appear. h> I made hast device realy small and checked couple of times and same result. I think I had such scenario many times when did some testing (but without disconnecting ethernet) and did not notice problems. Anyway I will try to reproduce this tonight. h> Thanks h> Hiroshi -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 13:11:13 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2BE5C1065673; Mon, 5 Jul 2010 13:11:13 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id B9FBA8FC17; Mon, 5 Jul 2010 13:11:12 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id 1949F6CD2B; Mon, 5 Jul 2010 13:03:11 +0000 (UTC) Message-ID: <4C31D9EE.9030303@soupacific.com> Date: Mon, 05 Jul 2010 22:11:10 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1 MIME-Version: 1.0 To: Mikolaj Golub References: <4C139F9C.2090305@soupacific.com><20100613003635.GA60012@icarus.home.lan><20100613074921.GB1320@garage.freebsd.pl><4C149A5C.3070401@soupacific.com><20100613102401.GE1320@garage.freebsd.pl><86eigavzsg.fsf@kopusha.home.net><20100614095044.GH1721@garage.freebsd.pl><868w6hwt2w.fsf@kopusha.home.net><20100614153746.GN1721@garage.freebsd.pl><86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com><86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com><861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com><86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com><86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com><86pqz3iw33.fsf@kopusha.home.net> <4C31C0E3.8080200@soupacific.com> <86ocemyukt.fsf@zhuzha.ua1> In-Reply-To: <86ocemyukt.fsf@zhuzha.ua1> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 13:11:13 
-0000 Thanks Mikolaj ! > I think I had such scenario many times when did some testing (but without > disconnecting ethernet) and did not notice problems. Anyway I will try to > reproduce this tonight. I'm bit conservative to run real boxes to internet and also I hate worst things happened all together such as failed server to resucue! Hope you can find some solution ! Sincerely thanks ! Hiroshi From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 13:32:34 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 329101065670 for ; Mon, 5 Jul 2010 13:32:34 +0000 (UTC) (envelope-from nekoexmachina@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 413378FC1C for ; Mon, 5 Jul 2010 13:32:30 +0000 (UTC) Received: by fxm13 with SMTP id 13so3971158fxm.13 for ; Mon, 05 Jul 2010 06:32:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=Wz2CkPGGLDSjfpUyZyT2QJY+kEswEqdrwClb4hV2aNk=; b=SLbM+aL7fiMIb0htVG830gcN+zZfhwJcRcrtYSXYQGGztTZjLZWlalMK/gpgBLfj7U 5LV7UAV0yLk4harRiwTJ6QpIoGcjl90+Y9KfammRZrz3VS3b7xfkqZtbAYQWCslaWhFE XfTnpW3SI/7m/NBOUN64Zo8KSHaGJpwvda3uo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=Ep72dychk5ITEePn8oTCYgxZi3HY348L1dHHCoj4mZsZUC8Qa8dIF4Y5oBE6Ae/32x N6VF3q9IiWPFX8diSGZSkrQ+GoriqB9cY7PVsEGHa04TO7WW/JSJ+Kv7LYNYktwUj/j4 pHp27507/jL79+ryb0rbTIyaNFM22LN/uHh9w= MIME-Version: 1.0 Received: by 10.103.49.12 with SMTP id b12mr285615muk.33.1278336259878; Mon, 05 Jul 2010 06:24:19 -0700 (PDT) Received: by 10.103.175.15 with HTTP; Mon, 5 Jul 2010 06:24:19 -0700 (PDT) Date: Mon, 5 Jul 2010 13:24:19 +0000 Message-ID: From: Mikle Krutov To: freebsd-fs Content-Type: text/plain; charset=UTF-8 Subject: More stable and reliable NTFS driver for read-only access? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 13:32:34 -0000 Which ntfs driver is more reliable, stable and fast for read-only access, ntfs-3g or kernel one? 
Never ever had a contact with ntfs on fbsd, and friend of mine asks to backup his data to my machine (from ntfs hdd, formated in vista, if it makes any difference) -- with best regards, Krutov Mikle From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 13:37:20 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B17B11065670 for ; Mon, 5 Jul 2010 13:37:20 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta14.emeryville.ca.mail.comcast.net (qmta14.emeryville.ca.mail.comcast.net [76.96.27.212]) by mx1.freebsd.org (Postfix) with ESMTP id 99A878FC13 for ; Mon, 5 Jul 2010 13:37:20 +0000 (UTC) Received: from omta07.emeryville.ca.mail.comcast.net ([76.96.30.59]) by qmta14.emeryville.ca.mail.comcast.net with comcast id eDU91e0021GXsucAEDdLar; Mon, 05 Jul 2010 13:37:20 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta07.emeryville.ca.mail.comcast.net with comcast id eDdK1e0083S48mS8UDdKLs; Mon, 05 Jul 2010 13:37:19 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 28F359B425; Mon, 5 Jul 2010 06:37:19 -0700 (PDT) Date: Mon, 5 Jul 2010 06:37:19 -0700 From: Jeremy Chadwick To: Mikle Krutov Message-ID: <20100705133719.GA24489@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs Subject: Re: More stable and reliable NTFS driver for read-only access? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 13:37:20 -0000 On Mon, Jul 05, 2010 at 01:24:19PM +0000, Mikle Krutov wrote: > Which ntfs driver is more reliable, stable > and fast for read-only access, ntfs-3g or > kernel one? > Never ever had a contact with ntfs on > fbsd, and friend of mine asks to backup > his data to my machine (from ntfs hdd, > formated in vista, if it makes any > difference) I remember there being a thread last year sometime about the possibility of NTFS support removed from the kernel due to its age, but I forget what the outcome was. You're probably better off using something from ports, possibly with fuse. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 13:43:56 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FA15106564A for ; Mon, 5 Jul 2010 13:43:56 +0000 (UTC) (envelope-from nekoexmachina@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 259FC8FC16 for ; Mon, 5 Jul 2010 13:43:55 +0000 (UTC) Received: by wwi18 with SMTP id 18so137354wwi.31 for ; Mon, 05 Jul 2010 06:43:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=dCB6csJ6emvnnqaZX7qk1lQedhMYqWZ7mUVndpaJ3ws=; b=YP8O3woTgZG9vJ33/lAd+iEkvLhmBEdwCQv9OW2GojC6Yfq6577jSgmQimQ+b67goE n20zwLvUlrRG9wtUSnFdniaAUI/zHbwbOccjts2oCHZ/MrOPJ582Qvfr8ynwtSexUrB6 d6dAVWxJ3euaqTgVjk5TD+DaZZ/95vfFbtM1M= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=H7V1EE/11pQVvR43ZllOpXDNafFNLK3qpseMfI3AgjGXlSbcs2+fiUhrCC+EfWLmPv jr9nsnDWtoqeUIMcE2b96mEGd2VnSwk8DdIm36Fw/XnkKjKUKnwJn/eciipsgOZo3GCs CztArpFfL+6zqaXSg9tNaNhE8jyvtTxQUwM9Y= MIME-Version: 1.0 Received: by 10.103.95.17 with SMTP id x17mr315108mul.52.1278337429411; Mon, 05 Jul 2010 06:43:49 -0700 (PDT) Received: by 10.103.175.15 with HTTP; Mon, 5 Jul 2010 06:43:49 -0700 (PDT) In-Reply-To: <20100705133719.GA24489@icarus.home.lan> References: <20100705133719.GA24489@icarus.home.lan> Date: Mon, 5 Jul 2010 13:43:49 +0000 Message-ID: From: Mikle Krutov To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs Subject: Re: More stable and reliable NTFS driver for read-only access? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 13:43:56 -0000 Thank you for infos! Also, i tried to compile ntfs driver just ten minutes ago and it not even compile for me on 8.1-PRERELEASE with bunch of errors. 2010/7/5, Jeremy Chadwick : > On Mon, Jul 05, 2010 at 01:24:19PM +0000, Mikle Krutov wrote: >> Which ntfs driver is more reliable, stable >> and fast for read-only access, ntfs-3g or >> kernel one? >> Never ever had a contact with ntfs on >> fbsd, and friend of mine asks to backup >> his data to my machine (from ntfs hdd, >> formated in vista, if it makes any >> difference) > > I remember there being a thread last year sometime about the possibility > of NTFS support removed from the kernel due to its age, but I forget > what the outcome was. > > You're probably better off using something from ports, possibly with > fuse. > > -- > | Jeremy Chadwick jdc@parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. 
PGP: 4BD6C0CB | > > -- with best regards, Krutov Mikle From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 14:02:31 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E81B6106566B for ; Mon, 5 Jul 2010 14:02:31 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 764258FC12 for ; Mon, 5 Jul 2010 14:02:31 +0000 (UTC) Received: by wyb34 with SMTP id 34so3641433wyb.13 for ; Mon, 05 Jul 2010 07:02:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject :organization:references:date:in-reply-to:message-id:user-agent :mime-version:content-type; bh=itBnmAMBkgdE1PExfZQHoU6Qy4HSeUIB9+ngc/YdfyI=; b=bferIclfVzPSbgIwoLzgL+z/Dj/TmQW4/LQmKFtwHDSa31lVAXYjQreP9P4U7bhYQR 6h88fBsUd71OCE/PfMmrNlW8q4O8gCB04oSO9YRm90iG6nf5ccXblRA3NmHSwZ+pnm/b V6dKVl6tnT2mdWvkkWj3cXs5dyzgLUYEQxJAE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:organization:references:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=leICtnIzgYqFKyqSonUlG8IfAxzUH1VLMeejUPYvOQbqHYXAnPuQqEdZlSdda0Xw4z zeza84l0zcumruIfS6+CAuqESVKihybQi9J3NKKUBCJF9UQV47OUu0VTA60UHPRZkkhb xmK9h/L+faO6mqI9Cq4IQ3oTkbiQfIPf5dNGw= Received: by 10.227.156.138 with SMTP id x10mr3491719wbw.58.1278338544791; Mon, 05 Jul 2010 07:02:24 -0700 (PDT) Received: from localhost (ua1.etadirect.net [91.198.140.16]) by mx.google.com with ESMTPS id a1sm32442525wbb.14.2010.07.05.07.02.22 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 05 Jul 2010 07:02:23 -0700 (PDT) From: Mikolaj Golub To: Mikle Krutov Organization: TOA Ukraine References: <20100705133719.GA24489@icarus.home.lan> Date: Mon, 05 Jul 2010 17:02:20 +0300 In-Reply-To: (Mikle Krutov's message of "Mon, 5 Jul 2010 13:43:49 +0000") Message-ID: <86k4payqkz.fsf@zhuzha.ua1> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs Subject: Re: More stable and reliable NTFS driver for read-only access? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 14:02:32 -0000 On Mon, 5 Jul 2010 13:43:49 +0000 Mikle Krutov wrote: MK> Thank you for infos! MK> Also, i tried to compile ntfs driver just MK> ten minutes ago and it not even compile MK> for me on 8.1-PRERELEASE with bunch MK> of errors. What driver do you mean? 
[root@zhuzha /tmp/test]# mount -t ntfs /dev/md2 /mnt/ntfs [root@zhuzha /tmp/test]# kldstat |grep ntfs 24 1 0xc9268000 b000 ntfs.ko [root@zhuzha /tmp/test]# uname -v FreeBSD 8.1-PRERELEASE #21: Wed Jun 2 17:15:41 EEST 2010 root@zhuzha.ua1:/usr/obj/usr/src/sys/DEBUG -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 14:34:40 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1178A106564A for ; Mon, 5 Jul 2010 14:34:40 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 544868FC0C for ; Mon, 5 Jul 2010 14:34:37 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 0459545CDC; Mon, 5 Jul 2010 16:34:35 +0200 (CEST) Received: from localhost (pdawidek.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 17CC5456B1; Mon, 5 Jul 2010 16:34:29 +0200 (CEST) Date: Mon, 5 Jul 2010 16:34:10 +0200 From: Pawel Jakub Dawidek To: "hiroshi@soupacific.com" Message-ID: <20100705143410.GA1782@garage.freebsd.pl> References: <4C2F3E14.1080601@soupacific.com> <86pqz3iw33.fsf@kopusha.home.net> <4C31C0E3.8080200@soupacific.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="liOOAslEiF7prFVr" Content-Disposition: inline In-Reply-To: <4C31C0E3.8080200@soupacific.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT amd64 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00, TO_ADDRESS_EQ_REAL autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 14:34:40 -0000 --liOOAslEiF7prFVr Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Jul 05, 2010 at 08:24:19PM +0900, hiroshi@soupacific.com wrote: > Hi ! >=20 > I checked without ifstate, without CARP ! >=20 > ServerA > #hastctl create zfshast > #hastd > #hastctl role primary zfshast >=20 > ServerB > #hastctl create zfshast > #hastd > #hastctl role secondary zfshast >=20 > check synch on ServerA > after nodirty bytes >=20 > #zpool create hasthome /dev/hast/zfshast >=20 > then > disconnect ethernet. Split-brain happens when two nodes think they are masters. This is exactly what happens when you disconnect ethernet. In other words you asked for split-brain and you got it. The rule is that at any given time there should at most one master. If you have two masters at some point you will cause split-brain. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
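A minimal sketch of a failback sequence that respects that rule, reusing the resource and pool names from this thread (zfshast, hasthome); the exact point at which the roles are swapped back is an assumption, not something prescribed in the thread:

  # ServerA, once it is back up: rejoin as secondary, leave the pool alone
  hastd
  hastctl role secondary zfshast   # resyncs from ServerB, which stays primary

  # Swap roles only after the dirty byte count has reached zero, and always
  # demote the current primary before promoting the other node:
  #   ServerB:  zpool export hasthome && hastctl role secondary zfshast
  #   ServerA:  hastctl role primary zfshast && zpool import -f hasthome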
--liOOAslEiF7prFVr Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAkwx7WEACgkQForvXbEpPzREIACfail4pOzs0kqryrlG6KmL0Flh WFQAoNSo+iOfSOOXjVJxgDL7jm0mmq+Q =t1Gt -----END PGP SIGNATURE----- --liOOAslEiF7prFVr-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 14:46:57 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D032D106566C; Mon, 5 Jul 2010 14:46:57 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 089F18FC13; Mon, 5 Jul 2010 14:46:56 +0000 (UTC) Received: by wwi18 with SMTP id 18so166117wwi.31 for ; Mon, 05 Jul 2010 07:46:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:subject :organization:references:date:in-reply-to:message-id:user-agent :mime-version:content-type; bh=yzOi8O6qY5O/ulMdum3+hHYe3JAzObLhSGnXEJkGmxU=; b=ocns/K0QyseT0aT8OxFHm4Cmsnsk6uhDZ0xz7Zp5A0nzV0eYxFxLWZ8Cjq70Obro+f ZAg8EPQlEd/gEGLRxuU63+Y+InuNiwOQhwJAckC/vOy5RvxVE+odZRdtlakWP75gmzXC 1ZVSqbsW+h/DYHw5jZiE3QWjW7yE5TV3VYEwU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:organization:references:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=QjFGNYenizfnMazv3u197j6i0JtOJCGIzv4XHQLDqLxANxhD/DdeLE8u2L64F1iUSC vBWCuY9YWtx21yXnVdgNnU1Jsc1D/nESJ5pF6H1unTGU7n3THxmtN3wChG1hZEQTcIZA yzqBQyza4pdWa98G3B8I2q5iRZJEh6xRxr5gM= Received: by 10.227.135.78 with SMTP id m14mr890167wbt.47.1278341209551; Mon, 05 Jul 2010 07:46:49 -0700 (PDT) Received: from localhost (ua1.etadirect.net [91.198.140.16]) by mx.google.com with ESMTPS id e31sm32700238wbe.11.2010.07.05.07.46.47 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 05 Jul 2010 07:46:48 -0700 (PDT) From: Mikolaj Golub To: Pawel Jakub Dawidek Organization: TOA Ukraine References: <4C2F3E14.1080601@soupacific.com> <86pqz3iw33.fsf@kopusha.home.net> <4C31C0E3.8080200@soupacific.com> <20100705143410.GA1782@garage.freebsd.pl> Date: Mon, 05 Jul 2010 17:46:45 +0300 In-Reply-To: <20100705143410.GA1782@garage.freebsd.pl> (Pawel Jakub Dawidek's message of "Mon, 5 Jul 2010 16:34:10 +0200") Message-ID: <86fwzyyoiy.fsf@zhuzha.ua1> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@freebsd.org Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 14:46:57 -0000 On Mon, 5 Jul 2010 16:34:10 +0200 Pawel Jakub Dawidek wrote: PJD> On Mon, Jul 05, 2010 at 08:24:19PM +0900, hiroshi@soupacific.com wrote: >> Hi ! >> >> I checked without ifstate, without CARP ! >> >> ServerA >> #hastctl create zfshast >> #hastd >> #hastctl role primary zfshast >> >> ServerB >> #hastctl create zfshast >> #hastd >> #hastctl role secondary zfshast >> >> check synch on ServerA >> after nodirty bytes >> >> #zpool create hasthome /dev/hast/zfshast >> >> then >> disconnect ethernet. PJD> Split-brain happens when two nodes think they are masters. This is PJD> exactly what happens when you disconnect ethernet. In other words you PJD> asked for split-brain and you got it. Ah! 
When reading hiroshi@'s description I missed that ServerB was set as primary before ServerA was rebooted (thus it was still primary). So there is no need in reproducing this :-). -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 19:06:04 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64FBB1065674 for ; Mon, 5 Jul 2010 19:06:04 +0000 (UTC) (envelope-from kpielorz_lst@tdx.co.uk) Received: from mail.tdx.com (mail.tdx.com [62.13.128.18]) by mx1.freebsd.org (Postfix) with ESMTP id E23D68FC1A for ; Mon, 5 Jul 2010 19:06:03 +0000 (UTC) Received: from Octa64 (octa64.tdx.co.uk [62.13.130.232]) (authenticated bits=0) by mail.tdx.com (8.14.3/8.14.3/Kp) with ESMTP id o65J62c2007632 for ; Mon, 5 Jul 2010 20:06:02 +0100 (BST) Date: Mon, 05 Jul 2010 20:06:04 +0100 From: Karl Pielorz To: freebsd-fs Message-ID: X-Mailer: Mulberry/4.0.8 (Win32) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Subject: 7.3-S amd64 - ZFS replace/attach hangs - related to 'guid mismatch' / GEOM? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 19:06:04 -0000 Hi, A previously working system (amd64, 10Gb of RAM, two dual core Opteron 285's - stock GENERIC kernel) - where I've done 'zpool attach' and 'zpool replace's before (admittedly under 7.2-S) hangs when doing either of those now. If I run: host# zpool attach vol ad34 ad40 ZFS debugging shows: " vdev_geom_attach:112[1]: Attaching to ad40. vdev_geom_attach:153[1]: Created consumer for ad40. vdev_geom_read_guid:334[1]: guid for ad40 is 13247785578180267154 vdev_geom_detach:173[1]: Closing access to ad40. vdev_geom_detach:177[1]: Destroyed consumer to ad40. vdev_geom_open_by_path:472[1]: guid mismatch for provider /dev/ad40: 835553262974889329 != 13247785578180267154. vdev_geom_open_by_guid:430[1]: Searching by guid [835553262974889329]. " And that's it. 'ps axl' shows the zpool process as: " 0 2250 2004 0 -8 0 14460 2044 g_wait D+ p0 0:00.01 zpool attach vol ad34 ad40 " So it appears to be hung in 'g_wait'. I re-ran the replace, but with GEOM and ZFS debug enabled - the rather large output is below. I'm concerned about "guid mismatch for provider /dev/ad40: 835553262974889329 != 13247785578180267154." - and then the fact the GEOM seems to start to enumerate all the disk devices it can, and something hangs while it's looking at "zvol/vol/scanned@1237495449"? 'zvol/vol/scanned@1237495449' is a snapshot of a zfs volume (not FS), which is encrypted using GELI (but not currently geli attached, nor mounted). Any advice? 
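One aside on the guid mismatch (not from the thread itself): it usually just means ad40 still carries a ZFS vdev label from its earlier life as a member/spare of this pool, and the kernel re-tastes that label on every open. A sketch of the usual manual cleanup on pools of this vintage, which predate zpool labelclear, follows; it destroys anything on ad40, should only be run with the device otherwise idle, and is not claimed to cure the g_wait hang itself:

  # ZFS keeps four 256 KiB vdev labels: two in the first 512 KiB of the
  # device and two in the last 512 KiB, so zero the first and last MiB.
  sectors=$(diskinfo /dev/ad40 | awk '{print $4}')   # media size in sectors
  dd if=/dev/zero of=/dev/ad40 bs=512 count=2048
  dd if=/dev/zero of=/dev/ad40 bs=512 seek=$(( sectors - 2048 )) count=2048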
Thanks, -Karl Zpool status output: host# zpool status pool: vol state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM vol ONLINE 0 0 0 mirror ONLINE 0 0 0 ad28 ONLINE 0 0 0 ad12 ONLINE 0 0 0 mirror ONLINE 0 0 0 ad14 ONLINE 0 0 0 ad30 ONLINE 0 0 0 mirror ONLINE 0 0 0 ad16 ONLINE 0 0 0 ad32 ONLINE 0 0 0 mirror ONLINE 0 0 0 ad18 ONLINE 0 0 0 ad34 ONLINE 0 0 0 mirror ONLINE 0 0 0 ad20 ONLINE 0 0 0 ad36 ONLINE 0 0 0 mirror ONLINE 0 0 0 ad22 ONLINE 0 0 0 ad38 ONLINE 0 0 0 spares ad42 AVAIL (ad40 was also previously a spare - but I did a 'zpool remove vol ad40' to free it up) Attempting the attach again, but with GEOM and ZFS debug enabled: host# zpool attach vol ad34 ad40 Jul 5 19:42:50 host kernel: g_dev_open(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_dev_close(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 19:42:50 host kernel: g_dev_open(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_dev_close(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 19:42:50 host kernel: g_dev_open(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_dev_close(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 19:42:50 host kernel: g_dev_open(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_dev_close(ad40, 1, 8192, 0xffffff000e655ae0) Jul 5 19:42:50 host kernel: g_access(0xffffff0004b20280(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 19:42:50 host kernel: vdev_geom_open_by_path:461[1]: Found provider by name /dev/ad40. Jul 5 19:42:50 host kernel: vdev_geom_attach:112[1]: Attaching to ad40. 
Jul 5 19:42:50 host kernel: g_access(0xffffff00351c1480(ad40), 1, 1, 1) Jul 5 19:42:50 host kernel: open delta:[r1w1e1] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 1, 1) Jul 5 19:42:50 host kernel: g_post_event_x(0xffffffff802557e0, 0xffffff0004ae2500, 2, 0) Jul 5 19:42:50 host kernel: ref 0xffffff0004ae2500 Jul 5 19:42:50 host kernel: vdev_geom_attach:153[1]: Created consumer for ad40. Jul 5 19:42:50 host kernel: vdev_geom_read_guid:334[1]: guid for ad40 is 13247785578180267154 Jul 5 19:42:50 host kernel: vdev_geom_detach:173[1]: Closing access to ad40. Jul 5 19:42:50 host kernel: g_access(0xffffff00351c1480(ad40), -1, 0, -1) Jul 5 19:42:50 host kernel: open delta:[r-1w0e-1] old:[r1w1e1] provider:[r1w1e1] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, -1) Jul 5 19:42:50 host kernel: vdev_geom_detach:177[1]: Destroyed consumer to ad40. Jul 5 19:42:50 host kernel: g_access(0xffffff00351c1480(ad40), 0, -1, 0) Jul 5 19:42:50 host kernel: open delta:[r0w-1e0] old:[r0w1e0] provider:[r0w1e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 0, -1, 0) Jul 5 19:42:50 host kernel: g_post_event_x(0xffffffff80255580, 0xffffff0004ae2500, 2, 0) Jul 5 19:42:50 host kernel: ref 0xffffff0004ae2500 Jul 5 19:42:50 host kernel: g_detach(0xffffff00351c1480) Jul 5 19:42:50 host kernel: g_destroy_consumer(0xffffff00351c1480) Jul 5 19:42:50 host kernel: vdev_geom_open_by_path:472[1]: guid mismatch for provider /dev/ad40: 6262509414735727538 != 13247785578180267154. Jul 5 19:42:50 host kernel: vgd_epva_rgte_otma_sotpee(nP_AbRyT_,gaudi4d0:)43 Jul 5 19:42:50 host kernel: 0[1]: Seagr_cahcicnegs sb(y0 xgfufifdf f[f0602365215c009340104(7a3d54702)7,5 381],. 0, Jul 5 19:42:50 host kernel: g_0p)os Jul 5 19:42:50 host kernel: to_peevne ndte_lxt(a0:x[frf1fwf0fef0f]f 8o0l8da:8[4re00w, 0e00x]f fpfrfofvfi0de0r0:1[arc09w700e00,] 2,0 xf2f6f2f1f4f40)0 Jul 5 19:42:50 host kernel: 04ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_access(0xffffff00351c0300(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 19:42:50 host kernel: g_wither_geom(0xffffff000eda9800(ad40)) Jul 5 19:42:50 host kernel: bsd_taste(BSD,ad40) Jul 5 19:42:50 host kernel: g_access(0xffffff003517b880(ad40), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_access(0xffffff003517b880(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 19:42:50 host kernel: g_slice_spoiled(0xffffff003517b880/ad40) Jul 5 19:42:50 host kernel: g_wither_geom(0xffffff003527c100(ad40)) Jul 5 19:42:50 host kernel: g_label_taste(LABEL, ad40) Jul 5 19:42:50 host kernel: g_access(0xffffff003517aa00(ad40), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_access(0xffffff003517aa00(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 
19:42:50 host kernel: g_detach(0xffffff003517aa00) Jul 5 19:42:50 host kernel: g_destroy_consumer(0xffffff003517aa00) Jul 5 19:42:50 host kernel: g_destroy_geom(0xffffff000ee56500(label:taste)) Jul 5 19:42:50 host kernel: mbr_taste(MBR,ad40) Jul 5 19:42:50 host kernel: g_access(0xffffff000ef8c580(ad40), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, 1, 0, 0) Jul 5 19:42:50 host kernel: g_access(0xffffff000ef8c580(ad40), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004ae2500(ad40) Jul 5 19:42:50 host kernel: g_disk_access(ad40, -1, 0, 0) Jul 5 19:42:50 host kernel: g_slice_spoiled(0xffffff000ef8c580/ad40) Jul 5 19:42:50 host kernel: g_wither_geom(0xffffff0035909b00(ad40)) Jul 5 19:42:50 host kernel: g_mbrext_taste(MBREXT,ad40) Jul 5 19:42:50 host kernel: g_eli_taste(ELI, ad40) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/secure@1243935776), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff000e832000(zvol/vol2/zfs_backups/secure@1243935776) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/secure@1243935776), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff000e832000(zvol/vol2/zfs_backups/secure@1243935776) Jul 5 19:42:50 host kernel: g_detach(0xffffff0035015380) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/secure), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff0004acc100(zvol/vol2/zfs_backups/secure) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/secure), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff0004acc100(zvol/vol2/zfs_backups/secure) Jul 5 19:42:50 host kernel: g_detach(0xffffff0035015380) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/scanned@1267226353), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff000e1fde00(zvol/vol2/zfs_backups/scanned@1267226353) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/scanned@1267226353), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff000e1fde00(zvol/vol2/zfs_backups/scanned@1267226353) Jul 5 19:42:50 host kernel: g_detach(0xffffff0035015380) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/scanned), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff000e1fd000(zvol/vol2/zfs_backups/scanned) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/scanned), -1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff000e1fd000(zvol/vol2/zfs_backups/scanned) Jul 5 19:42:50 host kernel: g_detach(0xffffff0035015380) Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol/scanned@1237495449), 1, 0, 0) Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff000e60b300(zvol/vol/scanned@1237495449) [hangs here] From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 19:27:01 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 854831065670 
for ; Mon, 5 Jul 2010 19:27:01 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 431CC8FC17 for ; Mon, 5 Jul 2010 19:27:01 +0000 (UTC) Received: by iwn35 with SMTP id 35so4194706iwn.13 for ; Mon, 05 Jul 2010 12:27:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type; bh=XNwxRL/gWfNOQOev3e8AhgfrPUlHyOReOy7zNucyXuc=; b=XLMliJ8bQXqONXBDVs2QrdV7gpRYyAXGoRsXAUZ02aOhcS3CR5eJx0nPKWDrFJqS98 OU1lM726ildCWuhvZpkg6zQAV3YC3ZpdvGAaTCwZ6hAW8anP/xBcXsIS/Muj18FITXS8 x1nOY6aehhfCI4Er/A/1r6rnDg8TbQ9HuQUA4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=UFQmKonnJWFHcBiJ322Dgp+Wz6OD0ehtnnX+0e1fX8rACkqRXGgn9E/QdrxjADrZHt qPgUxIL5CwDa12hyrvaSUoi6mMmB1uFmmhOWRAVkn4L+7pverTkyK5/ozIb/adqRGM21 dky0FD8QWw7gHeg5LLWKAiIYZDBaj+/jP6IQ4= MIME-Version: 1.0 Received: by 10.231.146.129 with SMTP id h1mr3434181ibv.181.1278358020738; Mon, 05 Jul 2010 12:27:00 -0700 (PDT) Received: by 10.231.37.11 with HTTP; Mon, 5 Jul 2010 12:27:00 -0700 (PDT) In-Reply-To: <4C31681C.5070406@soupacific.com> References: <4C139F9C.2090305@soupacific.com> <86iq5oc82y.fsf@kopusha.home.net> <4C14215D.9090304@soupacific.com> <20100613003635.GA60012@icarus.home.lan> <20100613074921.GB1320@garage.freebsd.pl> <4C149A5C.3070401@soupacific.com> <20100613102401.GE1320@garage.freebsd.pl> <86eigavzsg.fsf@kopusha.home.net> <20100614095044.GH1721@garage.freebsd.pl> <868w6hwt2w.fsf@kopusha.home.net> <20100614153746.GN1721@garage.freebsd.pl> <86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com> <86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com> <861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com> <86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com> <86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com> <86pqz3iw33.fsf@kopusha.home.net> <4C31681C.5070406@soupacific.com> Date: Mon, 5 Jul 2010 12:27:00 -0700 Message-ID: From: Freddie Cash To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 19:27:01 -0000 On Sun, Jul 4, 2010 at 10:05 PM, hiroshi@soupacific.com wrote: >> I think you could make a configuration so when hostB (secondary) switches >> to master it changes advskew to the value lower then on hostA. > >> In this way after hostA reboot it will have higher advskew and will be forced to do as a >> secondary. Then after the nodes are synchronized you can switch to the >> initial state restoring initial value of advskew on hostB. >> > NO ! Once split-brain mode appeared, CARP and HAST as BACKUP mode by changed > whatever, advskew value etc. No way to get out slip-brain complain forever! > > Is there anybody having same problem ? Only me? Once you are in a split-brain situation, you have to take manual steps to repair. Set one side as master. Then run "hastctl create" on the other box, to reset all the hast metadata on the devices, and initiate a new sync from the master. 
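Roughly, with the resource and pool names used earlier in this thread (zfshast, hasthome), and assuming hastd is not serving the resource on the node being reset when "hastctl create" is run:

  # node that keeps its data:
  hastctl role primary zfshast
  zpool import -f hasthome          # if the pool is not already imported here

  # node whose local changes are discarded:
  zpool export hasthome             # only if the pool was imported here
  hastctl create zfshast            # re-initialises the local HAST metadata
  hastd
  hastctl role secondary zfshast    # a full resync from the master follows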
Ideally, any automated scripts would handle all the possible error conditions and checks, and prevent the systems from getting into the split-brain situation in the first place. :) (Yeah, a lot easier said than done.) -- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 21:23:05 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 46440106564A for ; Mon, 5 Jul 2010 21:23:05 +0000 (UTC) (envelope-from spork@bway.net) Received: from xena.bway.net (xena.bway.net [216.220.96.26]) by mx1.freebsd.org (Postfix) with ESMTP id DDEEF8FC17 for ; Mon, 5 Jul 2010 21:23:04 +0000 (UTC) Received: (qmail 31918 invoked by uid 0); 5 Jul 2010 21:23:04 -0000 Received: from unknown (HELO ?10.3.2.41?) (spork@96.57.144.66) by smtp.bway.net with (DHE-RSA-AES256-SHA encrypted) SMTP; 5 Jul 2010 21:23:04 -0000 Date: Mon, 5 Jul 2010 17:23:03 -0400 (EDT) From: Charles Sprickman X-X-Sender: spork@hotlap.local To: freebsd-fs@freebsd.org Message-ID: User-Agent: Alpine 2.00 (OSX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII Subject: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 21:23:05 -0000 Howdy, I've posted previously about this, but I'm going to give it one more shot before I start reformatting and/or upgrading things. I have a largish filesystem (1.3TB) that holds a few jails, the main one being a mail server. Running 7.2/amd64 on a Dell 2970 with the mfi raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for testing to no effect) The symptoms are as follows: Various applications will log messages about "bad file descriptors" (imap, rsync backup script, quota counter): du: ./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=2824:2,S: Bad file descriptor The kernel also starts logging messages like this to the console: g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error = 5 g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error = 5 g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 Note that the offsets look a bit... suspicious, especially those negative ones. Usually within a day or two of those "g_vfs_done()" messages showing up the box will panic shortly after the daily run. Things are hosed up enough that it is unable to save a dump. The panic always looks like this: panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: mangled entry cpuid = 0 Uptime: 70d22h56m48s Physical memory: 6130 MB Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 588 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 284 ** DUMP FAILED (ERROR 16) ** panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: mangled entry cpuid = 2 Uptime: 13d22h30m21s Physical memory: 6130 MB Dumping 816 MB: 801 785 769 753 737 721 705 689 ** DUMP FAILED (ERROR 16) ** Automatic reboot in 15 seconds - press a key on the console to abort Rebooting... The fs, specifically "/spool" (which is where the errors always originate), will be pretty trashed and require a manual fsck. 
The first pass finds/fixes errors, but does not mark the fs clean. It can take anywhere from 2-4 passes to get a clean fs. The box then runs fine for a few weeks or a few months until the "g_vfs_done" errors start popping up, then it's a repeat. Are there any *known* issues with either the fs or possibly the mfi driver in 7.2? My plan was to do something like this: -shut down services and copy all of /spool off to the backups server -newfs /spool -copy everything back Then if it continues, repeat the above with a 7.3 upgrade before running newfs. If it still continues, then just go nuts and see what 8.0 or 8.1 does. But I'd really like to avoid that. Any tips? Thanks, Charles From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 21:35:11 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 613501065670 for ; Mon, 5 Jul 2010 21:35:11 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id EEBC78FC15 for ; Mon, 5 Jul 2010 21:35:10 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o65LZ2ZG080082 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 6 Jul 2010 00:35:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o65LZ23j092452; Tue, 6 Jul 2010 00:35:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o65LZ2s9092451; Tue, 6 Jul 2010 00:35:02 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 6 Jul 2010 00:35:02 +0300 From: Kostik Belousov To: Charles Sprickman Message-ID: <20100705213502.GX13238@deviant.kiev.zoral.com.ua> References: Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="mnRbTohuKv9GovM+" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-2.1 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50, DNS_FROM_OPENWHOIS, NUMERIC_HTTP_ADDR, URI_HEX autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org Subject: Re: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 21:35:11 -0000 --mnRbTohuKv9GovM+ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Jul 05, 2010 at 05:23:03PM -0400, Charles Sprickman wrote: > Howdy, >=20 > I've posted previously about this, but I'm going to give it one more shot= =20 > before I start reformatting and/or upgrading things. >=20 > I have a largish filesystem (1.3TB) that holds a few jails, the main one= =20 > being a mail server. 
Running 7.2/amd64 on a Dell 2970 with the mfi=20 > raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for testing to=20 > no effect) >=20 > The symptoms are as follows: >=20 > Various applications will log messages about "bad file descriptors" (imap= ,=20 > rsync backup script, quota counter): >=20 > du: > ./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=3D2824:2,S: > Bad file descriptor >=20 > The kernel also starts logging messages like this to the console: >=20 > g_vfs_done():mfid0s1e[READ(offset=3D2456998070156636160, length=3D16384)]= error=20 > =3D 5 > g_vfs_done():mfid0s1e[READ(offset=3D-7347040593908226048, length=3D16384)= ]error=20 > =3D 5 > g_vfs_done():mfid0s1e[READ(offset=3D2456998070156636160, length=3D16384)]= error=20 > =3D 5 > g_vfs_done():mfid0s1e[READ(offset=3D-7347040593908226048, length=3D16384)= ]error=20 > =3D 5 > g_vfs_done():mfid0s1e[READ(offset=3D2456998070156636160, length=3D16384)]= error=20 > =3D 5 >=20 > Note that the offsets look a bit... suspicious, especially those negative= =20 > ones. >=20 > Usually within a day or two of those "g_vfs_done()" messages showing up= =20 > the box will panic shortly after the daily run. Things are hosed up=20 > enough that it is unable to save a dump. The panic always looks like=20 > this: >=20 > panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: mangle= d=20 > entry > cpuid =3D 0 > Uptime: 70d22h56m48s > Physical memory: 6130 MB > Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 588= =20 > 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300= =20 > 284 > ** DUMP FAILED (ERROR 16) ** >=20 > panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: mangled= =20 > entry > cpuid =3D 2 > Uptime: 13d22h30m21s > Physical memory: 6130 MB > Dumping 816 MB: 801 785 769 753 737 721 705 689 > ** DUMP FAILED (ERROR 16) ** > Automatic reboot in 15 seconds - press a key on the console to abort > Rebooting... >=20 > The fs, specifically "/spool" (which is where the errors always=20 > originate), will be pretty trashed and require a manual fsck. The first= =20 > pass finds/fixes errors, but does not mark the fs clean. It can take=20 > anywhere from 2-4 passes to get a clean fs. >=20 > The box then runs fine for a few weeks or a few months until the=20 > "g_vfs_done" errors start popping up, then it's a repeat. >=20 > Are there any *known* issues with either the fs or possibly the mfi drive= r=20 > in 7.2? >=20 > My plan was to do something like this: >=20 > -shut down services and copy all of /spool off to the backups server > -newfs /spool > -copy everything back >=20 > Then if it continues, repeat the above with a 7.3 upgrade before running= =20 > newfs. >=20 > If it still continues, then just go nuts and see what 8.0 or 8.1 does.=20 > But I'd really like to avoid that. >=20 > Any tips? Show "df -i" output for the the affected filesystem. 
--mnRbTohuKv9GovM+ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAkwyUAYACgkQC3+MBN1Mb4iinwCfc7TNNQJTl08QixNmSwrQJKLp YrEAnim9o+sJ5J7nGlBk8FWN0z64GLdv =hqg1 -----END PGP SIGNATURE----- --mnRbTohuKv9GovM+-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 21:37:31 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EA28C106566B for ; Mon, 5 Jul 2010 21:37:30 +0000 (UTC) (envelope-from spork@bway.net) Received: from xena.bway.net (xena.bway.net [216.220.96.26]) by mx1.freebsd.org (Postfix) with ESMTP id 8A5858FC17 for ; Mon, 5 Jul 2010 21:37:30 +0000 (UTC) Received: (qmail 46811 invoked by uid 0); 5 Jul 2010 21:37:29 -0000 Received: from unknown (HELO ?10.3.2.41?) (spork@96.57.144.66) by smtp.bway.net with (DHE-RSA-AES256-SHA encrypted) SMTP; 5 Jul 2010 21:37:29 -0000 Date: Mon, 5 Jul 2010 17:37:29 -0400 (EDT) From: Charles Sprickman X-X-Sender: spork@hotlap.local To: Kostik Belousov In-Reply-To: <20100705213502.GX13238@deviant.kiev.zoral.com.ua> Message-ID: References: <20100705213502.GX13238@deviant.kiev.zoral.com.ua> User-Agent: Alpine 2.00 (OSX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 21:37:31 -0000 On Tue, 6 Jul 2010, Kostik Belousov wrote: > On Mon, Jul 05, 2010 at 05:23:03PM -0400, Charles Sprickman wrote: >> Howdy, >> >> I've posted previously about this, but I'm going to give it one more shot >> before I start reformatting and/or upgrading things. >> >> I have a largish filesystem (1.3TB) that holds a few jails, the main one >> being a mail server. Running 7.2/amd64 on a Dell 2970 with the mfi >> raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for testing to >> no effect) >> >> The symptoms are as follows: >> >> Various applications will log messages about "bad file descriptors" (imap, >> rsync backup script, quota counter): >> >> du: >> ./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=2824:2,S: >> Bad file descriptor >> >> The kernel also starts logging messages like this to the console: >> >> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >> = 5 >> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error >> = 5 >> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >> = 5 >> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error >> = 5 >> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >> = 5 >> >> Note that the offsets look a bit... suspicious, especially those negative >> ones. >> >> Usually within a day or two of those "g_vfs_done()" messages showing up >> the box will panic shortly after the daily run. Things are hosed up >> enough that it is unable to save a dump. 
The panic always looks like >> this: >> >> panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: mangled >> entry >> cpuid = 0 >> Uptime: 70d22h56m48s >> Physical memory: 6130 MB >> Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 588 >> 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 >> 284 >> ** DUMP FAILED (ERROR 16) ** >> >> panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: mangled >> entry >> cpuid = 2 >> Uptime: 13d22h30m21s >> Physical memory: 6130 MB >> Dumping 816 MB: 801 785 769 753 737 721 705 689 >> ** DUMP FAILED (ERROR 16) ** >> Automatic reboot in 15 seconds - press a key on the console to abort >> Rebooting... >> >> The fs, specifically "/spool" (which is where the errors always >> originate), will be pretty trashed and require a manual fsck. The first >> pass finds/fixes errors, but does not mark the fs clean. It can take >> anywhere from 2-4 passes to get a clean fs. >> >> The box then runs fine for a few weeks or a few months until the >> "g_vfs_done" errors start popping up, then it's a repeat. >> >> Are there any *known* issues with either the fs or possibly the mfi driver >> in 7.2? >> >> My plan was to do something like this: >> >> -shut down services and copy all of /spool off to the backups server >> -newfs /spool >> -copy everything back >> >> Then if it continues, repeat the above with a 7.3 upgrade before running >> newfs. >> >> If it still continues, then just go nuts and see what 8.0 or 8.1 does. >> But I'd really like to avoid that. >> >> Any tips? > > Show "df -i" output for the the affected filesystem. Here you go: [spork@bigmail ~]$ df -i /spool Filesystem 1K-blocks Used Avail Capacity iused ifree %iused Mounted on /dev/mfid0s1g 1359086872 70105344 1180254580 6% 4691134 171006784 3% /spool Thanks, Charles From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 21:56:48 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BE487106566B for ; Mon, 5 Jul 2010 21:56:48 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 576BB8FC0A for ; Mon, 5 Jul 2010 21:56:45 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o65LuDhV081499 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 6 Jul 2010 00:56:13 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o65LuDxv094852; Tue, 6 Jul 2010 00:56:13 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o65LuCiA094851; Tue, 6 Jul 2010 00:56:12 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 6 Jul 2010 00:56:12 +0300 From: Kostik Belousov To: Charles Sprickman Message-ID: <20100705215612.GY13238@deviant.kiev.zoral.com.ua> References: <20100705213502.GX13238@deviant.kiev.zoral.com.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="VkR3OQEnNfOLYsi4" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: 
clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-2.1 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50, DNS_FROM_OPENWHOIS, NUMERIC_HTTP_ADDR, URI_HEX autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org Subject: Re: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 21:56:48 -0000 --VkR3OQEnNfOLYsi4 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Jul 05, 2010 at 05:37:29PM -0400, Charles Sprickman wrote: > On Tue, 6 Jul 2010, Kostik Belousov wrote: >=20 > >On Mon, Jul 05, 2010 at 05:23:03PM -0400, Charles Sprickman wrote: > >>Howdy, > >> > >>I've posted previously about this, but I'm going to give it one more sh= ot > >>before I start reformatting and/or upgrading things. > >> > >>I have a largish filesystem (1.3TB) that holds a few jails, the main one > >>being a mail server. Running 7.2/amd64 on a Dell 2970 with the mfi > >>raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for testing to > >>no effect) > >> > >>The symptoms are as follows: > >> > >>Various applications will log messages about "bad file descriptors" (im= ap, > >>rsync backup script, quota counter): > >> > >>du: > >>./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=3D2824:2,S: > >>Bad file descriptor > >> > >>The kernel also starts logging messages like this to the console: > >> > >>g_vfs_done():mfid0s1e[READ(offset=3D2456998070156636160, length=3D16384= )]error > >>=3D 5 > >>g_vfs_done():mfid0s1e[READ(offset=3D-7347040593908226048,=20 > >>length=3D16384)]error > >>=3D 5 > >>g_vfs_done():mfid0s1e[READ(offset=3D2456998070156636160, length=3D16384= )]error > >>=3D 5 > >>g_vfs_done():mfid0s1e[READ(offset=3D-7347040593908226048,=20 > >>length=3D16384)]error > >>=3D 5 > >>g_vfs_done():mfid0s1e[READ(offset=3D2456998070156636160, length=3D16384= )]error > >>=3D 5 > >> > >>Note that the offsets look a bit... suspicious, especially those negati= ve > >>ones. > >> > >>Usually within a day or two of those "g_vfs_done()" messages showing up > >>the box will panic shortly after the daily run. Things are hosed up > >>enough that it is unable to save a dump. The panic always looks like > >>this: > >> > >>panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: mang= led > >>entry > >>cpuid =3D 0 > >>Uptime: 70d22h56m48s > >>Physical memory: 6130 MB > >>Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 588 > >>572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 > >>284 > >>** DUMP FAILED (ERROR 16) ** > >> > >>panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: mangled > >>entry > >>cpuid =3D 2 > >>Uptime: 13d22h30m21s > >>Physical memory: 6130 MB > >>Dumping 816 MB: 801 785 769 753 737 721 705 689 > >>** DUMP FAILED (ERROR 16) ** > >>Automatic reboot in 15 seconds - press a key on the console to abort > >>Rebooting... > >> > >>The fs, specifically "/spool" (which is where the errors always > >>originate), will be pretty trashed and require a manual fsck. The first > >>pass finds/fixes errors, but does not mark the fs clean. It can take > >>anywhere from 2-4 passes to get a clean fs. 
> >> > >>The box then runs fine for a few weeks or a few months until the > >>"g_vfs_done" errors start popping up, then it's a repeat. > >> > >>Are there any *known* issues with either the fs or possibly the mfi dri= ver > >>in 7.2? > >> > >>My plan was to do something like this: > >> > >>-shut down services and copy all of /spool off to the backups server > >>-newfs /spool > >>-copy everything back > >> > >>Then if it continues, repeat the above with a 7.3 upgrade before running > >>newfs. > >> > >>If it still continues, then just go nuts and see what 8.0 or 8.1 does. > >>But I'd really like to avoid that. > >> > >>Any tips? > > > >Show "df -i" output for the the affected filesystem. >=20 > Here you go: >=20 > [spork@bigmail ~]$ df -i /spool > Filesystem 1K-blocks Used Avail Capacity iused ifree=20 > %iused Mounted on > /dev/mfid0s1g 1359086872 70105344 1180254580 6% 4691134 171006784=20 > 3% /spool I really expected to see the count of inodes on the fs to be bigger then 2G. It is not, but it is greater then 1G. Just to make sure: you do not get any messages from mfi(4) about disk errors ? You could try to format the partition with less inodes, see -i switch for the newfs. Make it less then 1G, and try your load again. The bug with handling volume with >2G inodes was fixed on RELENG_7 after 7.3 was released. Your simptoms are very similar to what happen when the bug is hit. --VkR3OQEnNfOLYsi4 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAkwyVPwACgkQC3+MBN1Mb4ikUQCfVAxXmTgOafR2Sd7zHnxOVv5q sooAoLaDkKGEXq25REpgWLOacFoZ0zMn =EKli -----END PGP SIGNATURE----- --VkR3OQEnNfOLYsi4-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 22:08:57 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 334DB106564A for ; Mon, 5 Jul 2010 22:08:57 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta11.emeryville.ca.mail.comcast.net (qmta11.emeryville.ca.mail.comcast.net [76.96.27.211]) by mx1.freebsd.org (Postfix) with ESMTP id 1A9AA8FC17 for ; Mon, 5 Jul 2010 22:08:56 +0000 (UTC) Received: from omta20.emeryville.ca.mail.comcast.net ([76.96.30.87]) by qmta11.emeryville.ca.mail.comcast.net with comcast id eLxl1e0051smiN4ABN8wdP; Mon, 05 Jul 2010 22:08:56 +0000 Received: from koitsu.dyndns.org ([98.248.46.159]) by omta20.emeryville.ca.mail.comcast.net with comcast id eN8v1e00A3S48mS8gN8vZ7; Mon, 05 Jul 2010 22:08:56 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6803A9B425; Mon, 5 Jul 2010 15:08:55 -0700 (PDT) Date: Mon, 5 Jul 2010 15:08:55 -0700 From: Jeremy Chadwick To: Charles Sprickman Message-ID: <20100705220855.GA35860@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org Subject: Re: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 22:08:57 -0000 On Mon, Jul 05, 2010 at 05:23:03PM -0400, Charles Sprickman wrote: > Howdy, > > I've posted previously about this, but I'm going to give it one more > shot before I start reformatting and/or upgrading things. 
> > I have a largish filesystem (1.3TB) that holds a few jails, the main > one being a mail server. Running 7.2/amd64 on a Dell 2970 with the > mfi raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for > testing to no effect) > > The symptoms are as follows: > > Various applications will log messages about "bad file descriptors" > (imap, rsync backup script, quota counter): > > du: > ./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=2824:2,S: > Bad file descriptor > > The kernel also starts logging messages like this to the console: > > g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 > g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error = 5 > g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 > g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error = 5 > g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 > > Note that the offsets look a bit... suspicious, especially those > negative ones. > > Usually within a day or two of those "g_vfs_done()" messages showing > up the box will panic shortly after the daily run. Things are hosed > up enough that it is unable to save a dump. The panic always looks > like this: > > panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: > mangled entry > cpuid = 0 > Uptime: 70d22h56m48s > Physical memory: 6130 MB > Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 > 588 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 > 316 300 284 > ** DUMP FAILED (ERROR 16) ** > > panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: > mangled entry > cpuid = 2 > Uptime: 13d22h30m21s > Physical memory: 6130 MB > Dumping 816 MB: 801 785 769 753 737 721 705 689 > ** DUMP FAILED (ERROR 16) ** > Automatic reboot in 15 seconds - press a key on the console to abort > Rebooting... > > The fs, specifically "/spool" (which is where the errors always > originate), will be pretty trashed and require a manual fsck. The > first pass finds/fixes errors, but does not mark the fs clean. It > can take anywhere from 2-4 passes to get a clean fs. > > The box then runs fine for a few weeks or a few months until the > "g_vfs_done" errors start popping up, then it's a repeat. > > Are there any *known* issues with either the fs or possibly the mfi > driver in 7.2? http://lists.freebsd.org/pipermail/freebsd-hardware/2010-May/006350.html A reply in the thread indicates "the hardware runs great", so everyone's situation is different. That's also for 8.0-RELEASE. There's also a good possibility you have a disk that has problems ("bit rot" syndrome or bad cache), and I imagine it would manifest itself in this manner (filesystem corruption). I would probably start by removing mfi(4) from the picture if at all possible. If the disks act reliably on some on-board or alternate brand controller, then I think you've ruled out which piece is flaky. You can also use the opportunity to get SMART stats from the disks (smartctl -a) and provide it here for review. It's too bad the filesystem isn't ZFS (either mirror or raidz), as this sort of thing could be detected easily and auto-corrected, plus you could narrow it down to a single device/drive. Finally, you might also try memtest86 just for fun, to see if there's any RAM issues. I'm doubting this is the problem though, as you'd likely see many other problems (programs crashing, etc.), but better to be safe than sorry. 
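Going back to the SMART suggestion above: on this kind of setup the drives sit behind the mfi(4) controller, but smartctl can usually still reach them through CAM pass-through devices. This is only a rough sketch, assuming the mfip(4) module and smartmontools are available on that box; the pass device numbers are placeholders.

# expose the physical drives behind mfi(4) as CAM pass devices
kldload mfip
camcontrol devlist
# then query each drive directly (repeat for every pass device listed)
smartctl -a /dev/pass0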
A colleague of mine recently went through the bad RAM ordeal -- during the 4th pass one of the DRAM modules exhibited a single bit error. Fun times. Sorry for going down the "it's the hardware!!!" route, but sometimes that's the case. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 23:04:23 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E13601065672 for ; Mon, 5 Jul 2010 23:04:23 +0000 (UTC) (envelope-from hiroshi@soupacific.com) Received: from mail.soupacific.com (mail.soupacific.com [211.19.53.201]) by mx1.freebsd.org (Postfix) with ESMTP id A8A728FC0A for ; Mon, 5 Jul 2010 23:04:23 +0000 (UTC) Received: from [127.0.0.1] (unknown [192.168.1.239]) by mail.soupacific.com (Postfix) with ESMTP id AF4636CFC8 for ; Mon, 5 Jul 2010 22:56:22 +0000 (UTC) Message-ID: <4C3264F5.1090700@soupacific.com> Date: Tue, 06 Jul 2010 08:04:21 +0900 From: "hiroshi@soupacific.com" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4C139F9C.2090305@soupacific.com> <86iq5oc82y.fsf@kopusha.home.net><4C14215D.9090304@soupacific.com><20100613003635.GA60012@icarus.home.lan><20100613074921.GB1320@garage.freebsd.pl><4C149A5C.3070401@soupacific.com><20100613102401.GE1320@garage.freebsd.pl><86eigavzsg.fsf@kopusha.home.net><20100614095044.GH1721@garage.freebsd.pl><868w6hwt2w.fsf@kopusha.home.net><20100614153746.GN1721@garage.freebsd.pl><86zkyxvc4v.fsf@kopusha.home.net> <4C2C43D5.1080907@soupacific.com><86mxubndrp.fsf@kopusha.home.net> <4C2D7615.5070606@soupacific.com><861vbm1hpr.fsf@zhuzha.ua1> <4C2D9C62.4050105@soupacific.com><86wrtez14z.fsf@zhuzha.ua1> <4C2DC801.5080108@soupacific.com><86iq4xx9fy.fsf@kopusha.home.net> <4C2F3E14.1080601@soupacific.com><86pqz3iw33.fsf@kopusha.home.net> <4C31681C.5070406@soupacific.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: HAST and CARP X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 23:04:24 -0000 HI ! > Once you are in a split-brain situation, you have to take manual steps > to repair. > > Set one side as master. > > Then run "hastctl create" on the other box, to reset all the hast > metadata on the devices, and initiate a new sync from the master. > That's I understand. hstctl create xxx and hastctl -f role secondary xxx are almost same manner. > Ideally, any automated scripts would handle all the possible error > conditions and checks, and prevent the systems from getting into the > split-brain situation in the first place. :) (Yeah, a lot easier > said than done.) Other side can only know MASTER was dead, but MASTER once dead, how can MASTER know he is already dead ? Conclusion is once split-brain happen, DO hastctl create xxx. 
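Spelled out as commands, that recovery is roughly the following, assuming a HAST resource named "shared" (the name is a placeholder). The create step wipes that node's HAST metadata, so it belongs on the box whose copy of the data is being thrown away:

# on the node whose data you keep
hastctl role primary shared
# on the node being reset: drop its metadata and rejoin as secondary,
# which starts a full resynchronization from the primary
hastctl role init shared
hastctl create shared
hastctl role secondary shared
hastctl status shared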
Thanks Hiroshi From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 23:20:07 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 38226106566B for ; Mon, 5 Jul 2010 23:20:07 +0000 (UTC) (envelope-from spork@bway.net) Received: from xena.bway.net (xena.bway.net [216.220.96.26]) by mx1.freebsd.org (Postfix) with ESMTP id DCB958FC0C for ; Mon, 5 Jul 2010 23:20:06 +0000 (UTC) Received: (qmail 57965 invoked by uid 0); 5 Jul 2010 23:20:05 -0000 Received: from unknown (HELO ?10.3.2.41?) (spork@96.57.144.66) by smtp.bway.net with (DHE-RSA-AES256-SHA encrypted) SMTP; 5 Jul 2010 23:20:05 -0000 Date: Mon, 5 Jul 2010 19:20:04 -0400 (EDT) From: Charles Sprickman X-X-Sender: spork@hotlap.local To: Kostik Belousov In-Reply-To: <20100705215612.GY13238@deviant.kiev.zoral.com.ua> Message-ID: References: <20100705213502.GX13238@deviant.kiev.zoral.com.ua> <20100705215612.GY13238@deviant.kiev.zoral.com.ua> User-Agent: Alpine 2.00 (OSX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 23:20:07 -0000 On Tue, 6 Jul 2010, Kostik Belousov wrote: > On Mon, Jul 05, 2010 at 05:37:29PM -0400, Charles Sprickman wrote: >> On Tue, 6 Jul 2010, Kostik Belousov wrote: >> >>> On Mon, Jul 05, 2010 at 05:23:03PM -0400, Charles Sprickman wrote: >>>> Howdy, >>>> >>>> I've posted previously about this, but I'm going to give it one more shot >>>> before I start reformatting and/or upgrading things. >>>> >>>> I have a largish filesystem (1.3TB) that holds a few jails, the main one >>>> being a mail server. Running 7.2/amd64 on a Dell 2970 with the mfi >>>> raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for testing to >>>> no effect) >>>> >>>> The symptoms are as follows: >>>> >>>> Various applications will log messages about "bad file descriptors" (imap, >>>> rsync backup script, quota counter): >>>> >>>> du: >>>> ./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=2824:2,S: >>>> Bad file descriptor >>>> >>>> The kernel also starts logging messages like this to the console: >>>> >>>> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, >>>> length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, >>>> length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >>>> = 5 >>>> >>>> Note that the offsets look a bit... suspicious, especially those negative >>>> ones. >>>> >>>> Usually within a day or two of those "g_vfs_done()" messages showing up >>>> the box will panic shortly after the daily run. Things are hosed up >>>> enough that it is unable to save a dump. 
The panic always looks like >>>> this: >>>> >>>> panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: mangled >>>> entry >>>> cpuid = 0 >>>> Uptime: 70d22h56m48s >>>> Physical memory: 6130 MB >>>> Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 588 >>>> 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 >>>> 284 >>>> ** DUMP FAILED (ERROR 16) ** >>>> >>>> panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: mangled >>>> entry >>>> cpuid = 2 >>>> Uptime: 13d22h30m21s >>>> Physical memory: 6130 MB >>>> Dumping 816 MB: 801 785 769 753 737 721 705 689 >>>> ** DUMP FAILED (ERROR 16) ** >>>> Automatic reboot in 15 seconds - press a key on the console to abort >>>> Rebooting... >>>> >>>> The fs, specifically "/spool" (which is where the errors always >>>> originate), will be pretty trashed and require a manual fsck. The first >>>> pass finds/fixes errors, but does not mark the fs clean. It can take >>>> anywhere from 2-4 passes to get a clean fs. >>>> >>>> The box then runs fine for a few weeks or a few months until the >>>> "g_vfs_done" errors start popping up, then it's a repeat. >>>> >>>> Are there any *known* issues with either the fs or possibly the mfi driver >>>> in 7.2? >>>> >>>> My plan was to do something like this: >>>> >>>> -shut down services and copy all of /spool off to the backups server >>>> -newfs /spool >>>> -copy everything back >>>> >>>> Then if it continues, repeat the above with a 7.3 upgrade before running >>>> newfs. >>>> >>>> If it still continues, then just go nuts and see what 8.0 or 8.1 does. >>>> But I'd really like to avoid that. >>>> >>>> Any tips? >>> >>> Show "df -i" output for the the affected filesystem. >> >> Here you go: >> >> [spork@bigmail ~]$ df -i /spool >> Filesystem 1K-blocks Used Avail Capacity iused ifree >> %iused Mounted on >> /dev/mfid0s1g 1359086872 70105344 1180254580 6% 4691134 171006784 >> 3% /spool > > I really expected to see the count of inodes on the fs to be bigger > then 2G. It is not, but it is greater then 1G. FWIW, I see the latest g_vfs_done errors are on /usr/local, which is only 7GB... Looking through the logs on the console server, this is the first time this has happened outside of /spool. > Just to make sure: you do not get any messages from mfi(4) about disk > errors ? Not a peep. It only complains during periodic when megacli reads the drie status for the daily output; I was told this is normal/cosmetic: mfi0: Copy out failed > You could try to format the partition with less inodes, see -i switch > for the newfs. Make it less then 1G, and try your load again. > > The bug with handling volume with >2G inodes was fixed on RELENG_7 > after 7.3 was released. Your simptoms are very similar to what happen > when the bug is hit. Do you have any further info on this? Was it discussed on this list? Thanks, Charles From owner-freebsd-fs@FreeBSD.ORG Mon Jul 5 23:34:06 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B62801065672 for ; Mon, 5 Jul 2010 23:34:06 +0000 (UTC) (envelope-from spork@bway.net) Received: from xena.bway.net (xena.bway.net [216.220.96.26]) by mx1.freebsd.org (Postfix) with ESMTP id 7F33A8FC18 for ; Mon, 5 Jul 2010 23:34:06 +0000 (UTC) Received: (qmail 68906 invoked by uid 0); 5 Jul 2010 23:34:05 -0000 Received: from unknown (HELO ?10.3.2.41?) 
(spork@96.57.144.66) by smtp.bway.net with (DHE-RSA-AES256-SHA encrypted) SMTP; 5 Jul 2010 23:34:05 -0000 Date: Mon, 5 Jul 2010 19:34:04 -0400 (EDT) From: Charles Sprickman X-X-Sender: spork@hotlap.local To: Jeremy Chadwick In-Reply-To: <20100705220855.GA35860@icarus.home.lan> Message-ID: References: <20100705220855.GA35860@icarus.home.lan> User-Agent: Alpine 2.00 (OSX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Jul 2010 23:34:06 -0000 On Mon, 5 Jul 2010, Jeremy Chadwick wrote: > On Mon, Jul 05, 2010 at 05:23:03PM -0400, Charles Sprickman wrote: >> Howdy, >> >> I've posted previously about this, but I'm going to give it one more >> shot before I start reformatting and/or upgrading things. >> >> I have a largish filesystem (1.3TB) that holds a few jails, the main >> one being a mail server. Running 7.2/amd64 on a Dell 2970 with the >> mfi raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for >> testing to no effect) >> >> The symptoms are as follows: >> >> Various applications will log messages about "bad file descriptors" >> (imap, rsync backup script, quota counter): >> >> du: >> ./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=2824:2,S: >> Bad file descriptor >> >> The kernel also starts logging messages like this to the console: >> >> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 >> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error = 5 >> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 >> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, length=16384)]error = 5 >> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error = 5 >> >> Note that the offsets look a bit... suspicious, especially those >> negative ones. >> >> Usually within a day or two of those "g_vfs_done()" messages showing >> up the box will panic shortly after the daily run. Things are hosed >> up enough that it is unable to save a dump. The panic always looks >> like this: >> >> panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: >> mangled entry >> cpuid = 0 >> Uptime: 70d22h56m48s >> Physical memory: 6130 MB >> Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 >> 588 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 >> 316 300 284 >> ** DUMP FAILED (ERROR 16) ** >> >> panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: >> mangled entry >> cpuid = 2 >> Uptime: 13d22h30m21s >> Physical memory: 6130 MB >> Dumping 816 MB: 801 785 769 753 737 721 705 689 >> ** DUMP FAILED (ERROR 16) ** >> Automatic reboot in 15 seconds - press a key on the console to abort >> Rebooting... >> >> The fs, specifically "/spool" (which is where the errors always >> originate), will be pretty trashed and require a manual fsck. The >> first pass finds/fixes errors, but does not mark the fs clean. It >> can take anywhere from 2-4 passes to get a clean fs. >> >> The box then runs fine for a few weeks or a few months until the >> "g_vfs_done" errors start popping up, then it's a repeat. >> >> Are there any *known* issues with either the fs or possibly the mfi >> driver in 7.2? 
> > http://lists.freebsd.org/pipermail/freebsd-hardware/2010-May/006350.html > > A reply in the thread indicates "the hardware runs great", so everyone's > situation is different. That's also for 8.0-RELEASE. I've got a Perc6/i in this, that guy has an H700. They are both LSI cards, but I'm not sure how different they are. > There's also a good possibility you have a disk that has problems ("bit > rot" syndrome or bad cache), and I imagine it would manifest itself in > this manner (filesystem corruption). I am running RAID 10 on here - 4 drives, one hot spare. megacli does offer a "consistency check" operation for the array: http://trac.biostr.washington.edu/trac/wiki/MegaRaid I assume that if it's just one bad drive, this would catch it, correct? If the controller sees a bit flipped when comparing one half of a mirror to the other, it should complain. > I would probably start by removing mfi(4) from the picture if at all > possible. If the disks act reliably on some on-board or alternate brand > controller, then I think you've ruled out which piece is flaky. You can > also use the opportunity to get SMART stats from the disks (smartctl -a) > and provide it here for review. Not possible to remove the raid controller - these people don't believe in spare hardware (at least not enough to run a mail server). I'm stuck with this setup. I can dig around and see if mfi is one of the controllers that smartctl can talk "through". If not, megacli may be able to dump SMART data as well. > It's too bad the filesystem isn't ZFS (either mirror or raidz), as this > sort of thing could be detected easily and auto-corrected, plus you > could narrow it down to a single device/drive. Frankly, I've had less problems with zfs boxes of late than ufs2 boxes. I have a few colleagues that have upgraded fairly heavily loaded boxes from 4.11 to 6.x and beyond and hardware that was trouble-free just went down the crapper - lots of panics in the softdep code. This is all anecdotal, but my trust level is not real high w/ufs2 at the moment. I'm also not sure I can really believe fsck when it claims the fs is clean... For all I know the fs was damaged on the first crash and fsck has never truly repaired it. > Finally, you might also try memtest86 just for fun, to see if there's > any RAM issues. I'm doubting this is the problem though, as you'd > likely see many other problems (programs crashing, etc.), but better to > be safe than sorry. A colleague of mine recently went through the bad > RAM ordeal -- during the 4th pass one of the DRAM modules exhibited a > single bit error. Fun times. I can certainly try that, but it is hard to schedule hours of downtime for something like that. FWIW, it did pass a few days of memtest running when I got the hardware... > Sorry for going down the "it's the hardware!!!" route, but sometimes > that's the case. Certainly possible, but I have to focus on the software first. At the very least there may be something in 7.3 (or 7-STABLE?) that improves things. Thanks, Charles > -- > | Jeremy Chadwick jdc@parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. 
PGP: 4BD6C0CB | > > From owner-freebsd-fs@FreeBSD.ORG Tue Jul 6 18:50:03 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0793D1065672 for ; Tue, 6 Jul 2010 18:50:03 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D22808FC1E for ; Tue, 6 Jul 2010 18:50:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o66Io2rX012822 for ; Tue, 6 Jul 2010 18:50:02 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o66Io2NH012821; Tue, 6 Jul 2010 18:50:02 GMT (envelope-from gnats) Date: Tue, 6 Jul 2010 18:50:02 GMT Message-Id: <201007061850.o66Io2NH012821@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: "Grant Peel" Cc: Subject: Re: kern/146502: [nfs] FreeBSD 8 NFS Client Connection to Server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Grant Peel List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Jul 2010 18:50:03 -0000 The following reply was made to PR kern/146502; it has been noted by GNATS. From: "Grant Peel" To: , "Grant Peel" Cc: Subject: Re: kern/146502: [nfs] FreeBSD 8 NFS Client Connection to Server Date: Tue, 6 Jul 2010 14:13:26 -0400 I have recently installed FreeBSD 8 p3 onto the NFS machine. Surprisingly enough, the older FreeBSD 6.x machines can connect to it, but other client machines running FreeBSD 8 p3 can't. I have followed the handbook NFS docs to the letter, and still not connecting.
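Two quick checks from the failing 8.x client usually show whether the problem sits at the RPC layer or in the export list. A sketch only; the server name is a placeholder:

# is the server's rpcbind/mountd/nfsd visible at all?
rpcinfo -p nfsserver
# what is the server actually exporting, and to whom?
showmount -e nfsserver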
-Grant From owner-freebsd-fs@FreeBSD.ORG Tue Jul 6 23:09:53 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EC4EB1065672 for ; Tue, 6 Jul 2010 23:09:53 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 9E9B88FC19 for ; Tue, 6 Jul 2010 23:09:53 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEAGpUM0yDaFvJ/2dsb2JhbACgA3HAIoUkBA X-IronPort-AV: E=Sophos;i="4.53,548,1272859200"; d="scan'208";a="83398694" Received: from ganges.cs.uoguelph.ca ([131.104.91.201]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 06 Jul 2010 19:09:50 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by ganges.cs.uoguelph.ca (Postfix) with ESMTP id A888DFB80DB; Tue, 6 Jul 2010 19:09:52 -0400 (EDT) X-Virus-Scanned: amavisd-new at ganges.cs.uoguelph.ca Received: from ganges.cs.uoguelph.ca ([127.0.0.1]) by localhost (ganges.cs.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CRaI6fP5w4Y6; Tue, 6 Jul 2010 19:09:51 -0400 (EDT) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by ganges.cs.uoguelph.ca (Postfix) with ESMTP id CF2EFFB80CA; Tue, 6 Jul 2010 19:09:51 -0400 (EDT) Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id o66NR5U23617; Tue, 6 Jul 2010 19:27:05 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Tue, 6 Jul 2010 19:27:05 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Grant Peel In-Reply-To: <201007061850.o66Io2NH012821@freefall.freebsd.org> Message-ID: References: <201007061850.o66Io2NH012821@freefall.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org Subject: Re: kern/146502: [nfs] FreeBSD 8 NFS Client Connection to Server X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Jul 2010 23:09:54 -0000 On Tue, 6 Jul 2010, Grant Peel wrote: > The following reply was made to PR kern/146502; it has been noted by GNATS. > > From: "Grant Peel" > To: , > "Grant Peel" > Cc: > Subject: Re: kern/146502: [nfs] FreeBSD 8 NFS Client Connection to Server > Date: Tue, 6 Jul 2010 14:13:26 -0400 > > I have recently installed FreeBSD 8 p3 onto the NFS machine. > > Supprisingly enough, the older FreeBSD 6.x machines can connect to it, but > other client machines running FreeBSD 8 p3 can't > Sometime post FreeBSD6 (can't remember exactly when), the default for mounts changed from UDP to TCP. To get a UDP mount for a newer FreeBSD system, you need to explicitly add the "udp" option on the mount. (If your NFS server handles TCP mounts fine, I have no idea why this doesn't work for newer systems, but I've never used a Live Filesystem Fixit console.) 
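In practice that just means adding the option to the mount command or the fstab entry; a hedged example with a placeholder server and path:

# force a UDP mount from a newer client
mount -t nfs -o udp nfsserver:/export /mnt
# or the equivalent /etc/fstab line
# nfsserver:/export  /mnt  nfs  rw,udp  0  0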
rick From owner-freebsd-fs@FreeBSD.ORG Wed Jul 7 03:56:08 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3FDCE106566C for ; Wed, 7 Jul 2010 03:56:08 +0000 (UTC) (envelope-from spork@bway.net) Received: from xena.bway.net (xena.bway.net [216.220.96.26]) by mx1.freebsd.org (Postfix) with ESMTP id D28848FC0A for ; Wed, 7 Jul 2010 03:56:07 +0000 (UTC) Received: (qmail 23854 invoked by uid 0); 7 Jul 2010 03:20:03 -0000 Received: from unknown (HELO ?10.3.2.41?) (spork@96.57.144.66) by smtp.bway.net with (DHE-RSA-AES256-SHA encrypted) SMTP; 7 Jul 2010 03:20:03 -0000 Date: Tue, 6 Jul 2010 23:16:00 -0400 (EDT) From: Charles Sprickman X-X-Sender: spork@hotlap.local To: Kostik Belousov In-Reply-To: <20100705215612.GY13238@deviant.kiev.zoral.com.ua> Message-ID: References: <20100705213502.GX13238@deviant.kiev.zoral.com.ua> <20100705215612.GY13238@deviant.kiev.zoral.com.ua> User-Agent: Alpine 2.00 (OSX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: 7.2 - ufs2 corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2010 03:56:08 -0000 On Tue, 6 Jul 2010, Kostik Belousov wrote: > On Mon, Jul 05, 2010 at 05:37:29PM -0400, Charles Sprickman wrote: >> On Tue, 6 Jul 2010, Kostik Belousov wrote: >> >>> On Mon, Jul 05, 2010 at 05:23:03PM -0400, Charles Sprickman wrote: >>>> Howdy, >>>> >>>> I've posted previously about this, but I'm going to give it one more shot >>>> before I start reformatting and/or upgrading things. >>>> >>>> I have a largish filesystem (1.3TB) that holds a few jails, the main one >>>> being a mail server. Running 7.2/amd64 on a Dell 2970 with the mfi >>>> raid card, 6GB RAM, UFS2 (SU was enabled, I disabled it for testing to >>>> no effect) >>>> >>>> The symptoms are as follows: >>>> >>>> Various applications will log messages about "bad file descriptors" (imap, >>>> rsync backup script, quota counter): >>>> >>>> du: >>>> ./cur/1271801961.M21831P98582V0000005BI08E85975_0.foo.net,S=2824:2,S: >>>> Bad file descriptor >>>> >>>> The kernel also starts logging messages like this to the console: >>>> >>>> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, >>>> length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=-7347040593908226048, >>>> length=16384)]error >>>> = 5 >>>> g_vfs_done():mfid0s1e[READ(offset=2456998070156636160, length=16384)]error >>>> = 5 >>>> >>>> Note that the offsets look a bit... suspicious, especially those negative >>>> ones. >>>> >>>> Usually within a day or two of those "g_vfs_done()" messages showing up >>>> the box will panic shortly after the daily run. Things are hosed up >>>> enough that it is unable to save a dump. 
The panic always looks like >>>> this: >>>> >>>> panic: ufs_dirbad: /spool: bad dir ino 151699770 at offset 163920: mangled >>>> entry >>>> cpuid = 0 >>>> Uptime: 70d22h56m48s >>>> Physical memory: 6130 MB >>>> Dumping 811 MB: 796 780 764 748 732 716 700 684 668 652 636 620 604 588 >>>> 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 >>>> 284 >>>> ** DUMP FAILED (ERROR 16) ** >>>> >>>> panic: ufs_dirbad: /spool: bad dir ino 150073505 at offset 150: mangled >>>> entry >>>> cpuid = 2 >>>> Uptime: 13d22h30m21s >>>> Physical memory: 6130 MB >>>> Dumping 816 MB: 801 785 769 753 737 721 705 689 >>>> ** DUMP FAILED (ERROR 16) ** >>>> Automatic reboot in 15 seconds - press a key on the console to abort >>>> Rebooting... >>>> >>>> The fs, specifically "/spool" (which is where the errors always >>>> originate), will be pretty trashed and require a manual fsck. The first >>>> pass finds/fixes errors, but does not mark the fs clean. It can take >>>> anywhere from 2-4 passes to get a clean fs. >>>> >>>> The box then runs fine for a few weeks or a few months until the >>>> "g_vfs_done" errors start popping up, then it's a repeat. >>>> >>>> Are there any *known* issues with either the fs or possibly the mfi driver >>>> in 7.2? >>>> >>>> My plan was to do something like this: >>>> >>>> -shut down services and copy all of /spool off to the backups server >>>> -newfs /spool >>>> -copy everything back >>>> >>>> Then if it continues, repeat the above with a 7.3 upgrade before running >>>> newfs. >>>> >>>> If it still continues, then just go nuts and see what 8.0 or 8.1 does. >>>> But I'd really like to avoid that. >>>> >>>> Any tips? >>> >>> Show "df -i" output for the the affected filesystem. >> >> Here you go: >> >> [spork@bigmail ~]$ df -i /spool >> Filesystem 1K-blocks Used Avail Capacity iused ifree >> %iused Mounted on >> /dev/mfid0s1g 1359086872 70105344 1180254580 6% 4691134 171006784 >> 3% /spool > > I really expected to see the count of inodes on the fs to be bigger > then 2G. It is not, but it is greater then 1G. > > Just to make sure: you do not get any messages from mfi(4) about disk > errors ? > > You could try to format the partition with less inodes, see -i switch > for the newfs. Make it less then 1G, and try your load again. > > The bug with handling volume with >2G inodes was fixed on RELENG_7 > after 7.3 was released. Your simptoms are very similar to what happen > when the bug is hit. I have a bit more info... all the memory in this box is in fact ECC, which from what I gather is supposed to deal with errors with extra chips that store parity info, correct? Also I ran the megacli consistency check and it came back clean - so I think that sort of rules out "bit rot". At this point I don't see any harm in upgrading, but I'm not sure whether I should be looking to 7.3 or 7-STABLE - any pointers? Perhaps this is helpful, or not... It crashed again tonight, but it was able to dump out a core. Here's the backtrace - I can't make too much sense of it, but it again involves ffs/ufs/vnodes: # kgdb kernel.debug /var/crash/vmcore.3 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... 
Unread portion of the kernel message buffer: = 12 panic: page fault cpuid = 2 Uptime: 15d19h6m52s Physical memory: 6130 MB Dumping 796 MB: 781 765 749 733 717 701 685 669 653 637 621 605 589 573 557 541 525 509 493 477 461 445 429 413 397 381 365 349 333 317 301 285 269 253 237 221 205 189 173 157 141 125 109 93 77 61 45 29 13 Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/nullfs.ko Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /boot/kernel/fdescfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/fdescfs.ko #0 doadump () at pcpu.h:195 195 __asm __volatile("movq %%gs:0,%0" : "=r" (td)); (kgdb) bt #0 doadump () at pcpu.h:195 #1 0x0000000000000004 in ?? () #2 0xffffffff8034c799 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418 #3 0xffffffff8034cba2 in panic (fmt=0x104
) at /usr/src/sys/kern/kern_shutdown.c:574 #4 0xffffffff80574823 in trap_fatal (frame=0xffffff009811f000, eva=Variable "eva" is not available.) at /usr/src/sys/amd64/amd64/trap.c:757 #5 0xffffffff80574bf5 in trap_pfault (frame=0xffffffff2943b500, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:673 #6 0xffffffff80575534 in trap (frame=0xffffffff2943b500) at /usr/src/sys/amd64/amd64/trap.c:444 #7 0xffffffff8055969e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:209 #8 0xffffffff8050382e in ffs_realloccg (ip=0xffffff0186d69508, lbprev=0, bprev=6288224785898156086, bpref=601582184, osize=0, nsize=4096, flags=33619968, cred=0xffffff00b234d400, bpp=0xffffffff2943b800) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1349 #9 0xffffffff80506e8e in ffs_balloc_ufs2 (vp=0xffffff00852957e0, startoffset=Variable "startoffset" is not available. ) at /usr/src/sys/ufs/ffs/ffs_balloc.c:692 #10 0xffffffff805223e5 in ffs_write (ap=0xffffffff2943ba10) at /usr/src/sys/ufs/ffs/ffs_vnops.c:724 #11 0xffffffff805a0645 in VOP_WRITE_APV (vop=0xffffffff80793d20, ---Type to continue, or q to quit--- a=0xffffffff2943ba10) at vnode_if.c:691 #12 0xffffffff803dd731 in vn_write (fp=0xffffff00548d1a80, uio=0xffffffff2943bb00, active_cred=Variable "active_cred" is not available.) at vnode_if.h:373 #13 0xffffffff80388768 in dofilewrite (td=0xffffff009811f000, fd=6, fp=0xffffff00548d1a80, auio=dwarf2_read_address: Corrupted DWARF expression.) at file.h:257 #14 0xffffffff80388a6e in kern_writev (td=0xffffff009811f000, fd=6, auio=0xffffffff2943bb00) at /usr/src/sys/kern/sys_generic.c:402 #15 0xffffffff80388aec in write (td=0x1000, uap=0x12d4b9f50) at /usr/src/sys/kern/sys_generic.c:318 #16 0xffffffff80596a66 in ia32_syscall (frame=0xffffffff2943bc80) at /usr/src/sys/amd64/ia32/ia32_syscall.c:182 #17 0xffffffff80559ad0 in Xint0x80_syscall () at ia32_exception.S:65 #18 0x0000000028241928 in ?? () Thanks all, Charles From owner-freebsd-fs@FreeBSD.ORG Wed Jul 7 17:30:06 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8A5CD106564A for ; Wed, 7 Jul 2010 17:30:06 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smtp-out0.tiscali.nl (smtp-out0.tiscali.nl [195.241.79.175]) by mx1.freebsd.org (Postfix) with ESMTP id 441218FC1D for ; Wed, 7 Jul 2010 17:30:06 +0000 (UTC) Received: from [212.123.145.58] (helo=sjakie.klop.ws) by smtp-out0.tiscali.nl with esmtp (Exim) (envelope-from ) id 1OWYRh-0003Ca-8f; Wed, 07 Jul 2010 19:30:05 +0200 Received: from 212-123-145-58.ip.telfort.nl (localhost [127.0.0.1]) by sjakie.klop.ws (Postfix) with ESMTP id AE7FDC859; Wed, 7 Jul 2010 19:30:01 +0200 (CEST) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: "Mikle Krutov" , freebsd-fs References: Date: Wed, 07 Jul 2010 19:30:01 +0200 MIME-Version: 1.0 From: "Ronald Klop" Message-ID: In-Reply-To: User-Agent: Opera Mail/10.10 (FreeBSD) Content-Transfer-Encoding: quoted-printable Cc: Subject: Re: More stable and reliable NTFS driver for read-only access? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2010 17:30:06 -0000 On Mon, 05 Jul 2010 15:24:19 +0200, Mikle Krutov =20 wrote: > Which ntfs driver is more reliable, stable > and fast for read-only access, ntfs-3g or > kernel one? 
> Never ever had a contact with ntfs on > fbsd, and friend of mine asks to backup > his data to my machine (from ntfs hdd, > formated in vista, if it makes any > difference) > I used the kernel one with FreeBSD 7 to backup a NTFS partition. I =20 wouldn't know why it wouldn't work in FreeBSD 8. Why don't you just try it? Ronald. From owner-freebsd-fs@FreeBSD.ORG Wed Jul 7 19:34:36 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E75BB1065672 for ; Wed, 7 Jul 2010 19:34:36 +0000 (UTC) (envelope-from dak.col@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id A1DF38FC1B for ; Wed, 7 Jul 2010 19:34:36 +0000 (UTC) Received: by gyd8 with SMTP id 8so2002gyd.13 for ; Wed, 07 Jul 2010 12:34:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type; bh=qmOU/OGDL1IhnGaq8CRwXNty8brovRq8mbDKfBAI4IY=; b=mjN++rOz99hhtUqTJDj27Xfg3+bCWvingzzOikwMESpgkoENCeDhu3vLItef4K2UYh MrORHuilMwBC7mDFSdEB7Vlne8ZfC/46LFCJ+nNFhnoVBGspXHxC8liPCNS9WS3RkuDV 1PHqe/NZouPyqMCkYI/s3e/jNQwHeGyeDABig= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=uVbip0/JYL/zLurw6HarTu9Ww5em/fl3GyDboiwBTMDq7D7W7OkjdwUuQnmWLm53Y2 ErYG0mfAKnvWDQ2Z1bSjlUQAR6RIwyRrSKJOJwXumggBN5eXtTb+1yNVVbFZ5foIKshn 2/Cxmr1TcmYIuDJFDM+ndTAL3sSm+UzFWAxXY= MIME-Version: 1.0 Received: by 10.90.99.8 with SMTP id w8mr578559agb.22.1278529448578; Wed, 07 Jul 2010 12:04:08 -0700 (PDT) Received: by 10.90.28.10 with HTTP; Wed, 7 Jul 2010 12:04:08 -0700 (PDT) In-Reply-To: References: Date: Wed, 7 Jul 2010 14:04:08 -0500 Message-ID: From: Diego Arias To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2010 19:34:37 -0000 Hi i posted this to questions but gets no answer, so i repost it here because i think this is important. I have a VM Running FreeBSD 8 as a small router/proxy/fetchmail/openvpn. The host system is a VMWARE ESX 4 Update 2 running on 2 HP DL460G1 Blade Systems. Unfortunately the Blade Enclosure (a C7000 From HP) start having malfunction so i have to power it off with blades still on. After the power loss the VMWARE came up but FreeBSD ask por FSCK on single user mode, so i run it fsck -y on all partitions. After FSCK freebsd wont came up with error of getty not found. i restart it in single use mode and mount /usr but no luck, all the data was gone and only got a lost+found directory with crazy files on it. I have restored the machine from a backup with minimum data loss only the fetchmail stuff but i want that some help me to know what happen if there is a bug or is there any way to recover the data. the other machines (Mostly Windows 2003/2008/2008R2) came up without problems. 
All the data is stored on an EMC Clarion SAN Partitions: %cat /etc/fstab # Device Mountpoint FStype Options Dump Pass# /dev/ad0s1b none swap sw 0 0 /dev/ad0s1a / ufs rw 1 1 /dev/ad0s1e /tmp ufs rw 2 2 /dev/ad0s1f /usr ufs rw 2 2 /dev/ad0s1d /var ufs rw 2 2 /dev/acd0 /cdrom cd9660 ro,noauto 0 0 Thanks, i will provide anything info you need. Diego Arias -- mmm, interesante..... -- mmm, interesante..... From owner-freebsd-fs@FreeBSD.ORG Wed Jul 7 20:46:04 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E86F21065673 for ; Wed, 7 Jul 2010 20:46:04 +0000 (UTC) (envelope-from julian@elischer.org) Received: from out-0.mx.aerioconnect.net (out-0-33.mx.aerioconnect.net [216.240.47.93]) by mx1.freebsd.org (Postfix) with ESMTP id C36D38FC15 for ; Wed, 7 Jul 2010 20:46:04 +0000 (UTC) Received: from idiom.com (postfix@mx0.idiom.com [216.240.32.160]) by out-0.mx.aerioconnect.net (8.13.8/8.13.8) with ESMTP id o67KPGdf029182; Wed, 7 Jul 2010 13:25:17 -0700 X-Client-Authorized: MaGic Cook1e X-Client-Authorized: MaGic Cook1e Received: from julian-mac.elischer.org (h-67-100-89-137.snfccasy.static.covad.net [67.100.89.137]) by idiom.com (Postfix) with ESMTP id 524732D6018; Wed, 7 Jul 2010 13:25:16 -0700 (PDT) Message-ID: <4C34E2C9.3060302@elischer.org> Date: Wed, 07 Jul 2010 13:25:45 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.1.10) Gecko/20100512 Thunderbird/3.0.5 MIME-Version: 1.0 To: Diego Arias References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 216.240.47.51 Cc: freebsd-fs@FreeBSD.org Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2010 20:46:05 -0000 On 7/7/10 12:04 PM, Diego Arias wrote: > Hi i posted this to questions but gets no answer, so i repost it here > because i think this is important. > > I have a VM Running FreeBSD 8 as a small router/proxy/fetchmail/openvpn. > The host system is a VMWARE ESX 4 Update 2 running on 2 HP DL460G1 Blade > Systems. Unfortunately the Blade Enclosure (a C7000 From HP) start having > malfunction so i have to power it off with blades still on. After the power > loss the VMWARE came up but FreeBSD ask por FSCK on single user mode, so i > run it fsck -y on all partitions. After FSCK freebsd wont came up with error > of getty not found. i restart it in single use mode and mount /usr but no > luck, all the data was gone and only got a lost+found directory with crazy > files on it. > > I have restored the machine from a backup with minimum data loss only the > fetchmail stuff but i want that some help me to know what happen if there is > a bug or is there any way to recover the data. > > the other machines (Mostly Windows 2003/2008/2008R2) came up without > problems. > > All the data is stored on an EMC Clarion SAN > > Partitions: > > %cat /etc/fstab > # Device Mountpoint FStype Options Dump Pass# > /dev/ad0s1b none swap sw 0 0 > /dev/ad0s1a / ufs rw 1 1 > /dev/ad0s1e /tmp ufs rw 2 2 > /dev/ad0s1f /usr ufs rw 2 2 > /dev/ad0s1d /var ufs rw 2 2 > /dev/acd0 /cdrom cd9660 ro,noauto 0 0 > > > Thanks, I will provide anything info you need. 
unfortunately shutting down almost any system in an 'unexpected' manner can produce problems like this. The new file systems are better in this regard but virtualization adds a whole new layer of pain to the problem. Not only does the guest have to save everything, but the host has to sync everything as well. > > > > Diego Arias > From owner-freebsd-fs@FreeBSD.ORG Wed Jul 7 21:42:52 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4E32B1065672; Wed, 7 Jul 2010 21:42:52 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 201218FC08; Wed, 7 Jul 2010 21:42:52 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 9267E46B94; Wed, 7 Jul 2010 17:42:51 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id D79188A04E; Wed, 7 Jul 2010 17:42:49 -0400 (EDT) From: John Baldwin To: Nathaniel W Filardo Date: Wed, 7 Jul 2010 16:42:28 -0400 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100217; KDE/4.4.5; amd64; ; ) References: <20100609212747.GF21929@gradx.cs.jhu.edu> <20100703085516.GH21929@gradx.cs.jhu.edu> In-Reply-To: <20100703085516.GH21929@gradx.cs.jhu.edu> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201007071642.28847.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Wed, 07 Jul 2010 17:42:49 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: alc@freebsd.org, freebsd-fs@freebsd.org Subject: Re: [sparc64] [ZFS] panic: mutex vnode interlock not owned X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2010 21:42:52 -0000 On Saturday, July 03, 2010 4:55:16 am Nathaniel W Filardo wrote: > (hello freebsd-fs@; I'm cc:ing you since the latest part of my story > involves a ZFS-related panic and I hear you're the right place to go with > those. It began attempting to debug a VM locking panic and has moved a > little...) 
> > On Thu, Jun 10, 2010 at 12:23:24PM -0500, Alan Cox wrote: > > On Thu, Jun 10, 2010 at 7:16 AM, John Baldwin wrote: > > > > > On Wednesday 09 June 2010 5:27:47 pm Nathaniel W Filardo wrote: > > > > Attempting to boot on (2-way SMP; SUN Fire V240) sparc64 a 9.0-CURRENT > > > > kernel built on Jun 9 at 14:41, and fully csup'd before building (I don't > > > > have the SVN revision number, sorry) yields, surprisingly late in the > > > boot > > > > process, this panic: > > > > > > > > panic: mutex vm object not owned at /systank/src/sys/vm/vm_object.c:1692 > > > > cpuid = 0 > > > > KDB: stack backtrace: > > > > panic() at panic+0x1c8 > > > > _mtx_assert() at _mtx_assert+0xb0 > > > > vm_object_collapse() at vm_object_collapse+0x28 > > > > vm_object_deallocate() at vm_object_deallocate+0x538 > > > > _vm_map_unlock() at _vm_map_unlock+0x64 > > > > vm_map_remove() at vm_map_remove+0x64 > > > > vmspace_exit() at vmspace_exit+0x100 > > > > exit1() at exit1+0x788 > > > > sys_exit() at sys_exit+0x10 > > > > syscallenter() at syscallenter+0x268 > > > > syscall() at syscall+0x74 > > > > -- syscall (1, FreeBSD ELF64, sys_exit) %o7=0x11980c -- > > > > userland() at 0x406fe8c8 > > > > user trace: trap %o7=0x11980c > > > > pc 0x406fe8c8, sp 0x7fdffff7611 > > > > done > > > > Uptime: 4m7s > > > > > > > > The system was, at the time, attempting to bring up its jails. > > > > > > > > Anything else that would be helpful to know? > > > > > > Can you get a crashdump? If so, it would be good to pull up gdb and check > > > the > > > value sof 'object' and 'robject' in the vm_object_deallocate() frame. > > > > > > > > That would be useful. None of the locking changes of the last few weeks > > have altered the vm object locking, so this assertion failure and stack > > trace come as something of a surprise. > > > > Alan > > Well, I thought that no longer delegating ZFS (with "zfs jail") to the jail > whose startup was causing the above panic might solve the problem and indeed > the system made it slightly further. A few minutes after reaching the > login: prompt, though, it produced > > panic: mutex vnode interlock not owned at /systank/src/sys/kern/kern_mutex.c:223 > cpuid = 0 > KDB: stack backtrace: > panic() at panic+0x1c8 > _mtx_assert() at _mtx_assert+0xb0 > _mtx_unlock_flags() at _mtx_unlock_flags+0x144 > vnlru_free() at vnlru_free+0x500 > getnewvnode() at getnewvnode+0x7c > zfs_znode_cache_constructor() at zfs_znode_cache_constructor+0x4c > zfs_znode_alloc() at zfs_znode_alloc+0x34 > zfs_zget() at zfs_zget+0x2b8 > zfs_dirent_lock() at zfs_dirent_lock+0x508 > zfs_dirlook() at zfs_dirlook+0x50 > zfs_lookup() at zfs_lookup+0x1bc > zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6c > VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x108 > vfs_cache_lookup() at vfs_cache_lookup+0xfc > VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x110 > lookup() at lookup+0x7d0 > namei() at namei+0x69c > kern_statat_vnhook() at kern_statat_vnhook+0x48 > kern_statat() at kern_statat+0x1c > kern_lstat() at kern_lstat+0x18 > lstat() at lstat+0x14 > syscallenter() at syscallenter+0x27c > syscall() at syscall+0x74 > -- syscall (190, FreeBSD ELF64, lstat) %o7=0x12b830 -- > ... > > which at least is consistent with my hunch that the original panic had > something to do with ZFS. The system is as of svn 209653 (git c65b199...) > with http://people.freebsd.org/~marius/sparc64_pin_ipis.diff applied. 
The > old kernel has uname > FreeBSD hydra.priv.oc.ietfng.org 9.0-CURRENT FreeBSD 9.0-CURRENT #20: Sun > Apr 4 20:31:58 EDT 2010 > root@hydra.priv.oc.ietfng.org:/systank/obj/systank/src/sys/NWFKERN sparc64 > which is probably too old to be of use to anybody, but just in case, there > it is. I don't suspect the machine of having bad hardware since this old > kernel runs apparently fine on it and zpool scrubs haven't found anything > yet. > > I can't easily get a crash dump on the system (if somebody could tell me how > to get one from a ddb(4) prompt, I could try that, but otherwise the system > just ceases to do anything after panic; I have swap and dump set, so I'm not > sure what's not happening there...). > > Anything more I should do? I really think you might have some sort of hardware issue as all of your reported panics have been weird "can't happen" cases. -- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Thu Jul 8 05:35:41 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED405106566B for ; Thu, 8 Jul 2010 05:35:41 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id AE79E8FC14 for ; Thu, 8 Jul 2010 05:35:41 +0000 (UTC) Received: by email.octopus.com.au (Postfix, from userid 1002) id AC4FD5CB92F; Thu, 8 Jul 2010 15:27:47 +1000 (EST) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on spamkiller X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=unavailable version=3.3.1 Received: from [10.1.50.144] (ppp121-44-74-103.lns20.syd6.internode.on.net [121.44.74.103]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id A4D415CB8CD; Thu, 8 Jul 2010 15:27:46 +1000 (EST) Message-ID: <4C3563A8.7060301@modulus.org> Date: Thu, 08 Jul 2010 15:35:36 +1000 From: Andrew Snow User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: Diego Arias , freebsd-fs@freebsd.org References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2010 05:35:42 -0000 This should never happen! I hardly know where to start... The possibilities I can think of are: 1. A bug in UFS2 filesystem handling code (it has to be considered) 2. the blade suffered from undetected memory or CPU corruption 3. A misconfiguration somewhere somehow disabled synchronous disk device writes. Possibly in freebsd (did you mount it async?), possibly in the SAN (doubtful unless you powered if off at the same time as the blades), possibly in vmware (i dont know of any options in esx that let you do something as silly as this). 4. You were using VMFS thin provisioning and the volume ran out of space 5. You were using VMFS extents and one or more LUNs vanished during the host crashing Obviously all of these possibilities seem very unlikely.. but it would take more precise knowledge of your setup to narrow it down. 
In the scheme of things it seems a bit premature to blame FreeBSD but bugs do happen. - Andrew From owner-freebsd-fs@FreeBSD.ORG Thu Jul 8 07:45:38 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 50AD9106566C for ; Thu, 8 Jul 2010 07:45:38 +0000 (UTC) (envelope-from numisemis@yahoo.com) Received: from web112402.mail.gq1.yahoo.com (web112402.mail.gq1.yahoo.com [98.137.26.77]) by mx1.freebsd.org (Postfix) with SMTP id 207D78FC1A for ; Thu, 8 Jul 2010 07:45:37 +0000 (UTC) Received: (qmail 93385 invoked by uid 60001); 8 Jul 2010 07:45:37 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1278575137; bh=Tp8KCIV+ln7CLOk933Qi4HQBQLYIMrExBQU50uwu/Xs=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=daU+yYFyJ7wPUlsPrJhQvZHBTmOmPvetCCNpb1D80/KiuKzUXHlEiUYoTosrvucgYbf0iCMmlkoUeZ1vqo8WL54BjZLI9LWoEWal+j3Fu17PbWTgTmgVRbOrH0CCBtJnq1H3jVVHdtR9e45e865jfLZvYamCGlonrX/OkJAwRBs= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=f/vpJubvvzvh58VWL3EqfqxczzfcniIAyb4FoOZbGWW5YlFun9JBh55IuNuIxP9w3Z75H7s0vzp55Fr6jVGoti8sl4/Eg+zF/+hxcphKNBjPDlbbtO82RyEOhP8Ao3ADOWB0vGDM/1U7JQvL8nYPSeS/5gkE2TvGaSS7tYmKHGM=; Message-ID: <688583.92527.qm@web112402.mail.gq1.yahoo.com> X-YMail-OSG: mUDhCaEVM1n039S2WMu77Y8FSxVoHMIjg8w5ZOci2HSKwlV to6fxdPQCtt0PhQhqRFcm.1R2qKOgdXHstM3bT4Okc_FAyAOnM4z7V6kea0l K0PUConFJz5rOQMqsImWkWlMK9Lt2vwgt_8XEQMUJaADPqVZuDKodvI8n2aR 7sMtCZvJu0wL3tt6sI.F3f4XriEmcL.pl9_0zuDeAQQMO3YfXmOPGfZ773Ut GD8low4lLxBFV4Nzi1vIAWibamos7O.CPFlqDL.UuWq4O_emACRtXLbHVfjg oUIRapJMZDxA2CzUcnPwpEgow8weXSsTU.DludXXe1vZ0bDmsdFyqq7J_7N7 xgz_cAKmct4CUIYIHaZWL9_08xkDhiw-- Received: from [213.147.110.159] by web112402.mail.gq1.yahoo.com via HTTP; Thu, 08 Jul 2010 00:45:37 PDT X-Mailer: YahooMailRC/420.4 YahooMailWebService/0.8.104.274457 References: <4C3563A8.7060301@modulus.org> Date: Thu, 8 Jul 2010 00:45:37 -0700 (PDT) From: Simun Mikecin To: Andrew Snow In-Reply-To: <4C3563A8.7060301@modulus.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@freebsd.org Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2010 07:45:38 -0000 ----- Original Message ---- > From: Andrew Snow > To: Diego Arias ; freebsd-fs@freebsd.org > Sent: Thu, July 8, 2010 7:35:36 AM > Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash > > This should never happen! I hardly know where to start... > > The possibilities I can think of are: > > 1. A bug in UFS2 filesystem handling code (it has to be considered) > 2. the blade suffered from undetected memory or CPU corruption > 3. A misconfiguration somewhere somehow disabled synchronous disk device > writes. Possibly in freebsd (did you mount it async?), possibly in the SAN >(doubtful unless you powered if off at the same time as the blades), possibly >in vmware (i dont know of any options in esx that let you do something as silly >as this). > 4. You were using VMFS thin provisioning and the volume ran out of space > 5. 
You were using VMFS extents and one or more LUNs vanished during the host >crashing > > Obviously all of these possibilities seem very unlikely.. but it would take >more precise knowledge of your setup to narrow it down. In the scheme of >things it seems a bit premature to blame FreeBSD but bugs do happen. > AFAIK virtual environments ignore disk sync requests by default. For example, in VirtualBox they are ignored by default, by you could enable it if you want (with a performance penalty). Haven't used VMWare, so not 100% sure about it, maybe someone more knowledgable with VMWare knows what it's defaults are. Described fsck errors are the same if you use a lying ATA drive (disk that reports that it has written data, but it has not) with UFS2+softupdates. Solution for a lying ATA drive is to use a filesystem that uses disk write cache flushing, like UFS2+gjournal or ZFS. I suppose UFS journaling would be ok, too, but haven't used it myself, so cannot comment on that. If VMWare does not honor disk write cache flushing then described solutions would not work on it. From owner-freebsd-fs@FreeBSD.ORG Thu Jul 8 07:59:52 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 95F72106566C for ; Thu, 8 Jul 2010 07:59:52 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id 585248FC13 for ; Thu, 8 Jul 2010 07:59:52 +0000 (UTC) Received: by email.octopus.com.au (Postfix, from userid 1002) id 30AAF5CB955; Thu, 8 Jul 2010 17:51:59 +1000 (EST) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on spamkiller X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.3.1 Received: from [10.1.50.144] (ppp121-44-74-103.lns20.syd6.internode.on.net [121.44.74.103]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id A27BB5CB8E7; Thu, 8 Jul 2010 17:51:58 +1000 (EST) Message-ID: <4C358574.2040009@modulus.org> Date: Thu, 08 Jul 2010 17:59:48 +1000 From: Andrew Snow User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: Simun Mikecin , freebsd-fs@freebsd.org References: <4C3563A8.7060301@modulus.org> <688583.92527.qm@web112402.mail.gq1.yahoo.com> In-Reply-To: <688583.92527.qm@web112402.mail.gq1.yahoo.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2010 07:59:52 -0000 On 08/07/10 17:45, Simun Mikecin wrote: > AFAIK virtual environments ignore disk sync requests by default. For example, in > VirtualBox they are ignored by default, by you could enable it if you want (with > a performance penalty). Haven't used VMWare, so not 100% sure about it, maybe > someone more knowledgable with VMWare knows what it's defaults are. VMware vSphere/ESX/ESXi makes all writes synchronous. (It is terribly slow on RAID cards that lack battery-backed cache!) 
- Andrew From owner-freebsd-fs@FreeBSD.ORG Thu Jul 8 11:18:36 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2959C106564A for ; Thu, 8 Jul 2010 11:18:36 +0000 (UTC) (envelope-from numisemis@yahoo.com) Received: from web112417.mail.gq1.yahoo.com (web112417.mail.gq1.yahoo.com [98.137.26.185]) by mx1.freebsd.org (Postfix) with SMTP id EB57D8FC08 for ; Thu, 8 Jul 2010 11:18:35 +0000 (UTC) Received: (qmail 41786 invoked by uid 60001); 8 Jul 2010 11:18:35 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1278587915; bh=RKC4JG0srXXA41TTMUlOdJ7xy+QqYDyMGkz91pPENEY=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=OJjGzt5JDhHCGKDx8fzWSaSLSs55QPO6W6C9lAGxU6iOInA3CQ/nhWCT8OgPt/ERwc6je4AldRtwZqkNn+FlRlG/iz/fKsrodMCuGiNPAhApK+GD5Ype6aZ+sm74GOvlJ5XH0tBWvMouUUi6cn76z+dFNwUAm2Q/9NyWYhvVM4E= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=dJJRdUVrYBHXzQDf0T5QoNICT1szoEU0bL3q4nxUYmL7ewaSH4VGJs4P+Log1vHyOiSc8G9kVnkph30WG3h+q5S2doOoXCUZ/QPKUdU3bi3FwtosTx+xlfG+OE5nzxROFNPW2hTIry+onjJMDmPkr1hwtt0P8Jy1Rj/5s34DwzY=; Message-ID: <394586.41761.qm@web112417.mail.gq1.yahoo.com> X-YMail-OSG: nlSlYtEVM1nhLMic9gQ3CLRpLDCW1gR4122c8GCi3nm5mVg yAoCPoZiLyXx63Ecs8fi0NACDW51tS7Oqics1n5L871MSWepAyIe5cSLOJjF b9f2Dwnw91IOxVy8T6NKKL8USlL78xHUPb93QJoZtJlu.W_fKWkB78iIAGlP nnfQI8S_HXaQSIseVBi1NVNttyQIeV7ZP1VGjR8CcSzU.qxc.CHtzOYunfu9 hyf5E_o9gAcQ7J9AlrrCxk0YT6fz56IHLIfI6ULmEoQpNB1dZB4ke6kRcUBC h40cJuhjdv36MUqRymxaDn.0k0.c.K151LlcpzhLnIb.hh1Nl9Egi6WI6lyU TxzqgORWdogeSjp7mfXDlumCSvhmG_w-- Received: from [213.147.110.159] by web112417.mail.gq1.yahoo.com via HTTP; Thu, 08 Jul 2010 04:18:35 PDT X-Mailer: YahooMailRC/420.4 YahooMailWebService/0.8.104.274457 References: <4C3563A8.7060301@modulus.org> <688583.92527.qm@web112402.mail.gq1.yahoo.com> <4C358574.2040009@modulus.org> Date: Thu, 8 Jul 2010 04:18:35 -0700 (PDT) From: Simun Mikecin To: Andrew Snow In-Reply-To: <4C358574.2040009@modulus.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@freebsd.org Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2010 11:18:36 -0000 ----- Original Message ---- > From: Andrew Snow > To: Simun Mikecin ; freebsd-fs@freebsd.org > Sent: Thu, July 8, 2010 9:59:48 AM > Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash > > On 08/07/10 17:45, Simun Mikecin wrote: > > AFAIK virtual environments ignore disk sync requests by default. For >example, in > > VirtualBox they are ignored by default, by you could enable it if you want >(with > > a performance penalty). Haven't used VMWare, so not 100% sure about it, >maybe > > someone more knowledgable with VMWare knows what it's defaults are. > > VMware vSphere/ESX/ESXi makes all writes synchronous. > > (It is terribly slow on RAID cards that lack battery-backed cache!) Thanx for the info. But, there is still a chance that the host environment on which (original poster's) VMware is running uses a lying ATA drive or some other storage that behaves the same as a lying ATA drive. 
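A minimal sketch of the UFS2+gjournal setup mentioned above, assuming a spare provider that can be dedicated to the journaled file system (the device name ad1 and the mount point /mnt are placeholders only; see gjournal(8) and newfs(8)):

  gjournal load                          # load geom_journal if it is not compiled into the kernel
  gjournal label /dev/ad1                # creates the journaled provider /dev/ad1.journal
  newfs -J /dev/ad1.journal              # build a UFS2 file system with the gjournal flag set
  mount -o async /dev/ad1.journal /mnt   # async is fine here; the journal orders the writes

The point being made above is that gjournal and ZFS issue explicit cache-flush requests at transaction boundaries, whereas softupdates only orders its writes and trusts the drive's completion reports.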
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 8 13:59:40 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66051106564A for ; Thu, 8 Jul 2010 13:59:40 +0000 (UTC) (envelope-from dak.col@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 1C2948FC17 for ; Thu, 8 Jul 2010 13:59:39 +0000 (UTC) Received: by gyd8 with SMTP id 8so463566gyd.13 for ; Thu, 08 Jul 2010 06:59:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=a9RJyDuOQI3KVJco7qs5f1TX6tsrFXP/B5cXl/FYAqY=; b=Av50pSFo0KBabpklG35TL31A1lHgEh+ChnMgcgrMGgIOnQk2T35/LzD6zG2Sec6Rmk 8HjpMdb11n60wQHCFWDB5RKRToL0xrmmv5IePSRpEn0ULuKujZjO2kEp62dfPjRo30Ct oaqhlUk2YiGlMiGl5bmDg5lyAvAQIfS/hmIU4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=YcDfaloDdXOyPlb/rztIFFNwBokpj+suEe3BgLmxHhMFxelu+T8tn2cWnm/HTNL/sg FQ9PUkjeQI2Dbo+xCLhXWHAHUhiSCOepBxCJky0Sb23rkVbNkOm+OipHrb9sRuqSqMxu c+MtYTN8bY2nqGjM3C0CBkSQecte+W0NIU1FU= MIME-Version: 1.0 Received: by 10.90.53.3 with SMTP id b3mr3363070aga.121.1278597573573; Thu, 08 Jul 2010 06:59:33 -0700 (PDT) Received: by 10.90.28.10 with HTTP; Thu, 8 Jul 2010 06:59:33 -0700 (PDT) In-Reply-To: <394586.41761.qm@web112417.mail.gq1.yahoo.com> References: <4C3563A8.7060301@modulus.org> <688583.92527.qm@web112402.mail.gq1.yahoo.com> <4C358574.2040009@modulus.org> <394586.41761.qm@web112417.mail.gq1.yahoo.com> Date: Thu, 8 Jul 2010 08:59:33 -0500 Message-ID: From: Diego Arias To: Simun Mikecin Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2010 13:59:40 -0000 On Thu, Jul 8, 2010 at 6:18 AM, Simun Mikecin wrote: > > > > > ----- Original Message ---- > > From: Andrew Snow > > To: Simun Mikecin ; freebsd-fs@freebsd.org > > Sent: Thu, July 8, 2010 9:59:48 AM > > Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash > > > > On 08/07/10 17:45, Simun Mikecin wrote: > > > AFAIK virtual environments ignore disk sync requests by default. For > >example, in > > > VirtualBox they are ignored by default, by you could enable it if you > want > >(with > > > a performance penalty). Haven't used VMWare, so not 100% sure about > it, > >maybe > > > someone more knowledgable with VMWare knows what it's defaults are. > > > > VMware vSphere/ESX/ESXi makes all writes synchronous. > > > > (It is terribly slow on RAID cards that lack battery-backed cache!) > > > Thanx for the info. > But, there is still a chance that the host environment on which (original > poster's) VMware is running uses a lying ATA drive or some other storage > that > behaves the same as a lying ATA drive. > > > > > Hi there: - I discard a CPU/memory corruption because 8 more VM were running there and start without problem (Windows, not Unix ,Linux or BSD) - Actually i dont know if it is FreeBSD Fault but its curious and dangerous. 
- Its stock system so, no Async - Any info you need that might be usefull just ask, im here to help and can give you all the info you need - The FSTAB is in the first mail and this report mount /dev/ad0s1a on / (ufs, local) devfs on /dev (devfs, local, multilabel) /dev/ad0s1e on /tmp (ufs, local, soft-updates) /dev/ad0s1f on /usr (ufs, local, soft-updates) /dev/ad0s1d on /var (ufs, local, soft-updates) - the machine its actually running from its backup, i have not touched the FSTAB or mount stuff since installation - the machine was installed on VirtualBox then migrated to VMWARE with the convert utility from Virtualbox - SAN Hard drives are all Fibre Channel On Raid 5, fully redundant path, the SAN Switches on the chassis (Enclosure) didn't die when the enclosure start malfunctioning, other SAN applications works without trouble after restart on the same chasis as Virtual machines on the same LUN. the Blades Hard drives are SAS 10000RPM Diego Arias -- mmm, interesante..... From owner-freebsd-fs@FreeBSD.ORG Fri Jul 9 01:42:37 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D01D106566B for ; Fri, 9 Jul 2010 01:42:37 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id DE7098FC13 for ; Fri, 9 Jul 2010 01:42:36 +0000 (UTC) Received: by email.octopus.com.au (Postfix, from userid 1002) id 08ABA5CB95F; Fri, 9 Jul 2010 11:34:42 +1000 (EST) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on spamkiller X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=unavailable version=3.3.1 Received: from [10.1.50.144] (ppp121-44-74-103.lns20.syd6.internode.on.net [121.44.74.103]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id DD27B5CB955; Fri, 9 Jul 2010 11:34:40 +1000 (EST) Message-ID: <4C367E87.7050505@modulus.org> Date: Fri, 09 Jul 2010 11:42:31 +1000 From: Andrew Snow User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: Diego Arias , freebsd-fs@freebsd.org References: <4C3563A8.7060301@modulus.org> <688583.92527.qm@web112402.mail.gq1.yahoo.com> <4C358574.2040009@modulus.org> <394586.41761.qm@web112417.mail.gq1.yahoo.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jul 2010 01:42:37 -0000 On 08/07/10 23:59, Diego Arias wrote: > /dev/ad0s1a on / (ufs, local) > > - the machine was installed on VirtualBox then migrated to VMWARE with > the convert utility from Virtualbox Ahh, I think this is the problem. When converting from VBox, it uses an ATA disk, instead of VMWare's default of SCSI guest disks. This means FreeBSD enables the ATA write cache by default, which VMware honors and might be prone to lose data on power outage. I suspect you should either set hw.ata.wc=0 in loader.conf, or switch to SCSI gues disk type. 
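For reference, a minimal sketch of that loader.conf route together with the standard tools for checking what the guest actually sees (the device names are examples; hw.ata.wc is the ata(4) write-cache tunable):

  atacontrol list                              # ad0-style disks are ata(4); their write cache defaults to on
  camcontrol devlist                           # da0-style disks go through CAM, e.g. VMware's SCSI (mpt) controller
  sysctl hw.ata.wc                             # 1 means the ATA write cache is currently enabled
  echo 'hw.ata.wc="0"' >> /boot/loader.conf    # takes effect on the next boot

hw.ata.wc only affects ata(4) disks, so moving the VM to a SCSI guest disk (which attaches as da0) sidesteps the tunable entirely.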
But it's still possible there were CPU/RAM problems and you were just lucky that the other guest disks didn't get corrupt as they may not have been writing to directory metadata at the time of crash. - Andrew 
From owner-freebsd-fs@FreeBSD.ORG Fri Jul 9 06:15:58 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 39E761065670 for ; Fri, 9 Jul 2010 06:15:58 +0000 (UTC) (envelope-from mashtizadeh@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 037BB8FC16 for ; Fri, 9 Jul 2010 06:15:57 +0000 (UTC) Received: by iwn35 with SMTP id 35so2225828iwn.13 for ; Thu, 08 Jul 2010 23:15:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type:content-transfer-encoding; bh=+edksPmzjI9L6zYePC8IaJqfVIabezMFU9uFFkcccnY=; b=Zb064rOs0xH/oR9KMr2eiu7d/owfTgC6D0IoX2nHesK8VTfeO4NDg5sa3xfVKIWDkk vFIg18PEpvz29CjpWwhbBEBebjM5/JOEpXOMnNE+aSUFRH3BxIRx9Jy1dgn2KDk4QRge s2oq5m9650IDNeLhGAGpIgmKjlzQLKuwVjTVw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; b=mf/bfK+xkJZt/XVe2y04e/SKA1XDWS9oeCJzCmgVk8ZI6bHkbMklZz0gILmZVMGajH sQKPoRk7yB4idHaRWVELy4d+JtgzaY1tzaUj6QyG2cFrVo359eJotqgZLSBUPTC6rnUF 9MSM2N/MbG4UwC6PoJBuI5edB2qHySQJoHW3s= MIME-Version: 1.0 Received: by 10.231.37.199 with SMTP id y7mr9484738ibd.180.1278656157364; Thu, 08 Jul 2010 23:15:57 -0700 (PDT) Received: by 10.231.156.19 with HTTP; Thu, 8 Jul 2010 23:15:57 -0700 (PDT) Date: Thu, 8 Jul 2010 23:15:57 -0700 Message-ID: From: Ali Mashtizadeh To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: base64 Subject: ufs_db: UFS debugging and exploration tool X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jul 2010 06:15:58 -0000 
Hi Folks,

I just wanted to point folks to a tool I wrote to walk over a UFS file-system and examine some of its structures. It might be useful for UFS development, analyzing corruptions, and more. I was curious to gauge the interest in this tool; if there is, I'll continue to extend it, add support for additional structures, and support modification of the file-system.

Mercurial repository available at https://bitbucket.org/mashtizadeh/ufs_db

~ List of currently supported commands ~
help       Print this message
echo       Echo a string to stdout
exec       Execute a script
opendev    Open a device
closedev   Close a device
bread      Read a block in as hex
sb         Select superblock
cg         Select a cylinder group
ino        Get an Inode
read       Read from the current file
list       Show a directory listing

~ Example output on a newly created file system ~
ufs_db> opendev /dev/md0
ufs_db> ino 2
Inode 2 (block 0)
di_mode         16877        Mode bits
di_nlink        3            Link count
di_uid          0            File owner
di_gid          0            File group
di_blksize      0            Inode blocksize
di_size         512          File byte count
di_blocks       4            Blocks actually held
di_atime        1278655907   Last access time
di_mtime        1278655907   Last modified time
di_ctime        1278655907   Last inode change time
di_birthtime    1278655907   Inode creation time
di_mtimensec    0            Last modified time (ns)
di_atimensec    0            Last access time (ns)
di_ctimensec    0            Last inode change time (ns)
di_birthnsec    0            Inode creation time (ns)
di_kernflags    0            Kernel flags
di_flags        0            Status flags (chflags)
di_extsize      0            External attributes block size
di_extb         [0] 0        External attributes blocks
di_extb         [1] 0        External attributes blocks
di_db           [0] 584      Direct disk blocks
di_db           [1] 0        Direct disk blocks
di_db           [2] 0        Direct disk blocks
di_db           [3] 0        Direct disk blocks
di_db           [4] 0        Direct disk blocks
di_db           [5] 0        Direct disk blocks
di_db           [6] 0        Direct disk blocks
di_db           [7] 0        Direct disk blocks
di_db           [8] 0        Direct disk blocks
di_db           [9] 0        Direct disk blocks
di_db           [10] 0       Direct disk blocks
di_db           [11] 0       Direct disk blocks
di_ib           [0] 0        Indirect disk blocks
di_ib           [1] 0        Indirect disk blocks
di_ib           [2] 0        Indirect disk blocks
di_modrev       0            i_modrev for NFSv4
ufs_db> list
0 2336
Filename            Inode    Type
.                   2        4
..                  2        4
.snap               3        4

Thanks,
--
Ali Mashtizadeh
علی مشتی زاده

From owner-freebsd-fs@FreeBSD.ORG Fri Jul 9 14:11:58 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 97544106564A for ; Fri, 9 Jul 2010 14:11:58 +0000 (UTC) (envelope-from dak.col@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id 4ED458FC12 for ; Fri, 9 Jul 2010 14:11:57 +0000 (UTC) Received: by gxk24 with SMTP id 24so1485033gxk.13 for ; Fri, 09 Jul 2010 07:11:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=AOptDAla8PHyz+D6ySvfqy41i0sXpSe91gwi/SRCMmM=; b=vkYX1HsTWw+HDDbc+aIBN1fSU6bGEL4R5ixqh101qXCeecHCy2kULoCElXE7M+WMP/ 8fnfeYl9iSI4m8Ya6epRmRtyFvc9hPbbXkYGABoiw1lPH06uvkK/sV1nJHw2nC0oMdbA RC3bOJRKnyRuWwws7q/LD+ZlvKcN6QgjfrE+Y= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=nCfW5fJ9jfcPHGbUL7dvCiXRIUowE7u0Qnqx9i3KCugX/1GJuV2+hdY0DNLkuBc4E9 ttcw30DJAPL9D0hl/BUQtrW67mMHwd/WGbrm0na23YOeyxMxI+52H2Nm6UTqJLltagZj vHRrQBP5XVTdqIOdRrIKeSqZn33b3H3ub81Ss= MIME-Version: 1.0 Received: by 10.90.120.7 with SMTP id s7mr3579211agc.57.1278684693372; Fri, 09 Jul 2010 07:11:33 -0700 (PDT) Received: by 10.90.28.10 with HTTP; Fri, 9 Jul 2010 07:11:33 -0700 (PDT) In-Reply-To: <4C367E87.7050505@modulus.org> References: <4C3563A8.7060301@modulus.org> <688583.92527.qm@web112402.mail.gq1.yahoo.com> <4C358574.2040009@modulus.org> <394586.41761.qm@web112417.mail.gq1.yahoo.com> <4C367E87.7050505@modulus.org> Date: Fri, 9 Jul 2010 09:11:33 -0500 Message-ID: From: Diego Arias To: Andrew Snow Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: Freebsd 8 Release /usr Die After host VMWARE Crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jul 2010 14:11:58 -0000 On Thu, Jul 8, 2010 at 8:42 PM, Andrew Snow wrote: > On 08/07/10 23:59, Diego Arias wrote: >> /dev/ad0s1a on / (ufs, local) >> >> - the machine was installed on VirtualBox then migrated to VMWARE with >> the convert utility from Virtualbox >> > > Ahh, I think this is the problem. When converting from VBox, it uses an > ATA disk, instead of VMWare's default of SCSI guest disks. > > This means FreeBSD enables the ATA write cache by default, which VMware > honors and might be prone to lose data on power outage. > > I suspect you should either set hw.ata.wc=0 in loader.conf, or switch to > SCSI gues disk type.
> > But its still possible there was CPU/RAM problems and you were just lucky > that the other guest disks didnt get corrupt as they may not have been > writing to directory metadata at the time of crash. > > > - Andrew > Well you are right, i have ATA type devices (ad0). so im going to change to SAS -- mmm, interesante..... From owner-freebsd-fs@FreeBSD.ORG Sat Jul 10 05:10:09 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 30B371065676 for ; Sat, 10 Jul 2010 05:10:09 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8D0658FC21 for ; Sat, 10 Jul 2010 05:10:08 +0000 (UTC) Received: by iwn35 with SMTP id 35so3645991iwn.13 for ; Fri, 09 Jul 2010 22:10:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=ZN1KeyaeIlnfKbYlpE8Ag2FkVQ9h0JePl7H6RIegxC0=; b=glsjlXFj/SOtzLImdtpR+b4CJyUfxUcIVXTAyG815iZxwpKTwMOmn0jBGdAH+7eyFD 4hA6Ve3QZ2vuyF/W9RJXwq1XCkbQHk81Tr9I5nJDstg/J43xqytlz7GDyiSUQIJF5gy5 v/uYC2s3qnVKaHIEWjxppvBHOKPDeD3iBsl7A= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=Nya2ShRPHj/rxoHQMF7Ti9+m1cUmdKyfdRfgDtH007fOOivlhk7tymW2e34QaXP+jK hV0j8j1qeC6CXE/XXtJC6HwEo3MkYuqcjLl7nVbmj9wDvxV3CW3KMi5Zz6B40PXM/a+S hg9aZwu+ZG8oQO0d8BZW19sF3PgYt8UkCD6ys= MIME-Version: 1.0 Received: by 10.231.14.194 with SMTP id h2mr11049308iba.67.1278738607500; Fri, 09 Jul 2010 22:10:07 -0700 (PDT) Received: by 10.231.191.134 with HTTP; Fri, 9 Jul 2010 22:10:07 -0700 (PDT) Date: Sat, 10 Jul 2010 01:10:07 -0400 Message-ID: From: Rich To: freebsd-fs Content-Type: text/plain; charset=ISO-8859-1 Subject: "zpool import" hangs forever on r209755 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 10 Jul 2010 05:10:09 -0000 Hey world, I had a system I filed a bug about, wherein the system hung all "zfs/zpool [command] queries some time after booting, followed by the machine itself becoming entirely unresponsive. I've upgraded it to a newer kernel, with CTF+DDB+DTRACE configured in it, and a proper rev number, and I nuked /boot/zfs/zpool.cache after I found the machine was hanging forever after it loaded zfs.ko. Now, the machine boots, but zpool status returns "no pools available" [good] and zpool import hangs forever with no output [which is bad]. I presume I can ask dtrace/ddb/etc for useful information regarding this, but is this expected? Did some crazy commit make it in that was immediately reverted after I fetched source? Thanks, - Rich
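Not an answer, but a minimal sketch of the state worth collecting while the import is wedged, before dropping into DDB (standard base-system tools, run from a second terminal; they only read process state):

  ps -lww -p `pgrep zpool`      # the MWCHAN column shows what the process is sleeping on
  procstat -kk `pgrep zpool`    # kernel stack of every thread in the stuck zpool process

Pressing Ctrl-T in the terminal where "zpool import" is hanging prints the same wait-channel information via SIGINFO.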