From owner-freebsd-fs@FreeBSD.ORG Mon Dec 24 11:06:57 2007 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 85EA316A417 for ; Mon, 24 Dec 2007 11:06:57 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6A4FB13C447 for ; Mon, 24 Dec 2007 11:06:57 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id lBOB6vYn031920 for ; Mon, 24 Dec 2007 11:06:57 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id lBOB6uFP031916 for freebsd-fs@FreeBSD.org; Mon, 24 Dec 2007 11:06:56 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 24 Dec 2007 11:06:56 GMT Message-Id: <200712241106.lBOB6uFP031916@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Dec 2007 11:06:57 -0000

Current FreeBSD problem reports

Critical problems

Serious problems

S Tracker      Resp. Description
--------------------------------------------------------------------------------
o kern/112658  fs    [smbfs] [patch] smbfs and caching problems (resolves b
o kern/114676  fs    [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o kern/114856  fs    [ntfs] [patch] Bug in NTFS allows bogus file modes.
o kern/116170  fs    Kernel panic when mounting /tmp
o kern/118322  fs    [panic] Sometimes (seldom), "panic:page fault" happens

5 problems total.

Non-critical problems

S Tracker      Resp. Description
--------------------------------------------------------------------------------
o kern/114847  fs    [ntfs] [patch] dirmask support for NTFS ala MSDOSFS
o bin/118249   fs    mv(1): moving a directory changes its mtime

2 problems total.

From owner-freebsd-fs@FreeBSD.ORG Tue Dec 25 00:44:34 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87B5516A419 for ; Tue, 25 Dec 2007 00:44:34 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from smtp.infidyne.com (ds9.infidyne.com [88.80.6.206]) by mx1.freebsd.org (Postfix) with ESMTP id 45C1F13C465 for ; Tue, 25 Dec 2007 00:44:34 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from c-8216e555.03-51-73746f3.cust.bredbandsbolaget.se (c-8216e555.03-51-73746f3.cust.bredbandsbolaget.se [85.229.22.130]) by smtp.infidyne.com (Postfix) with ESMTP id D4F4877E7D; Tue, 25 Dec 2007 01:44:32 +0100 (CET) From: Peter Schuller To: freebsd-fs@freebsd.org Date: Tue, 25 Dec 2007 02:44:23 +0100 User-Agent: KMail/1.9.7 References: In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1743473.odSBQiRclx"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200712250244.32695.peter.schuller@infidyne.com> Cc: Subject: Re: How safe is ZFS to use for a home user?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Dec 2007 00:44:34 -0000 --nextPart1743473.odSBQiRclx Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

> Has anybody ever lost data due to any bugs or anything like that? I'm
> going to use this ZFS as my primary storage medium (and poor man's
> backup solution), so I would be devastated if I lost my entire array
> due to a bug or other issue (aside from losing two hard drives in a
> three hard drive RAID-Z array).

I haven't. Haven't really seen anyone say they have either, in a way that was due to a ZFS bug.

I'm using it for several machines (both private and in production). Two of them are doing raidz2 with 5 and 6 disks respectively, another two are doing three-way mirroring.

Based on past experience and behavior in various edge cases (port outages, crashes causing rebuilds, etc), I feel safer with ZFS than without, even if the implementation is not as mature as UFS. However, the fact that I "feel safe" is of course not very objective nor useful ;)

Just with these select few machines, I have already had snapshots save me at least once and checksumming "sort of" saved me once. And knowing that 'zpool scrub' really tests your integrity properly is *so* re-assuring I can't even begin to describe it.

That said, no raid/storage solution is ever going to be perfect. Insert standard rant about keeping backups here.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller '
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org

--nextPart1743473.odSBQiRclx Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part.
-----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQBHcGCADNor2+l1i30RAlKAAKDKFR2qp0t5ttfpsn673ZwVTvOCBACg7Xi6 SUMJZEpTIUJZufAdzfREPB4= =mpNn -----END PGP SIGNATURE----- --nextPart1743473.odSBQiRclx-- From owner-freebsd-fs@FreeBSD.ORG Tue Dec 25 03:45:55 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 968C316A417; Tue, 25 Dec 2007 03:45:55 +0000 (UTC) (envelope-from bsd@fluffles.net) Received: from mail.fluffles.net (fluffles.net [80.69.95.190]) by mx1.freebsd.org (Postfix) with ESMTP id 604D313C458; Tue, 25 Dec 2007 03:45:55 +0000 (UTC) (envelope-from bsd@fluffles.net) Received: from [10.0.0.18] (82-169-78-205.dsl.ip.tiscali.nl [82.169.78.205]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: info@fluffles.net) by mail.fluffles.net (Postfix) with ESMTP id ECCC5B2A1F1; Tue, 25 Dec 2007 04:25:30 +0100 (CET) Message-ID: <47707966.4030309@fluffles.net> Date: Tue, 25 Dec 2007 04:30:46 +0100 From: "fluffles.net" User-Agent: Thunderbird 2.0.0.6 (X11/20071022) MIME-Version: 1.0 To: Ivan Voras References: <475D7866.1070803@hangwithme.com> <475D7D60.4040701@fuckner.net> <20071212003235.G54053@3jane.math.ualberta.ca> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-hardware@freebsd.org Subject: Re: large disk > 8 TB X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Dec 2007 03:45:55 -0000 Ivan Voras wrote: > Barkley Vowk wrote: > > >> It looks like he created a 32bit disk label. He needs to use either the >> raw device, or gpt partitions I think. >> >> Ie. 
/dev/mdid1 or /dev/mdid1p1 instead of /dev/mdid1s1
>>
>
> You're right :)
> I didn't think of checking that - a wrong assumption at my part.
>

If you are using partitions on a RAID device, you have to make sure you don't end up with a stripe misalignment. With misalignment, a request that would otherwise fit within one stripe block can straddle two, so you end up requiring 2 I/O requests where 1 I/O request would suffice. Naturally this decreases I/O performance (fewer IOPS).

To avoid a misalignment you have two options:

- don't use partitions; use the raw device, as Barkley said
- use partitions (GPT or normal) and create one large partition that starts at offset 1 MiB (not MB!), i.e. 1024*1024 bytes. Note that you probably need to convert this to sectors (512 bytes each), which gives 2048 sectors.

If you do option 2 right, the partition starts precisely at the start of a new stripe block, so there is no misalignment. You can use offsets like 64 KiB and 128 KiB, but I prefer 1 MiB since that works with all power-of-two stripe sizes up to 1 MiB (and a 1 MiB stripe size is rarely used).

Merry Christmas to all.
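For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes 512-byte sectors as mentioned above; the helper names (`offset_in_sectors`, `is_aligned`) are made up for illustration and are not part of any FreeBSD tool. The sector-63 start in the last line is the traditional MBR first-partition offset, included only as an example of a misaligned start.

```python
SECTOR_SIZE = 512          # bytes per sector (assumption, as in the message)
KIB = 1024
MIB = 1024 * KIB


def offset_in_sectors(offset_bytes, sector_size=SECTOR_SIZE):
    """Convert a byte offset into a whole number of sectors."""
    if offset_bytes % sector_size:
        raise ValueError("offset is not a whole number of sectors")
    return offset_bytes // sector_size


def is_aligned(offset_bytes, stripe_bytes):
    """True if a partition starting at offset_bytes begins on a stripe boundary."""
    return offset_bytes % stripe_bytes == 0


if __name__ == "__main__":
    start = 1 * MIB
    print("1 MiB start =", offset_in_sectors(start), "sectors")

    # Power-of-two stripe sizes up to 1 MiB all divide a 1 MiB offset.
    for stripe_kib in (16, 32, 64, 128, 256, 512, 1024):
        print(f"{stripe_kib:5d} KiB stripe: aligned =",
              is_aligned(start, stripe_kib * KIB))

    # Illustrative counter-example: a partition starting at sector 63.
    print("sector 63 vs 64 KiB stripe: aligned =",
          is_aligned(63 * SECTOR_SIZE, 64 * KIB))
```

The same check applies to any other candidate offset: it is safe for a given array only if the offset in bytes is a multiple of the stripe size.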
:) Regards, Veronica From owner-freebsd-fs@FreeBSD.ORG Wed Dec 26 13:47:02 2007 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7AC5A16A417; Wed, 26 Dec 2007 13:47:02 +0000 (UTC) (envelope-from kris@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 60C8313C455; Wed, 26 Dec 2007 13:47:02 +0000 (UTC) (envelope-from kris@FreeBSD.org) Received: from freefall.freebsd.org (kris@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id lBQDl2WY041782; Wed, 26 Dec 2007 13:47:02 GMT (envelope-from kris@freefall.freebsd.org) Received: (from kris@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id lBQDl2o2041778; Wed, 26 Dec 2007 13:47:02 GMT (envelope-from kris) Date: Wed, 26 Dec 2007 13:47:02 GMT Message-Id: <200712261347.lBQDl2o2041778@freefall.freebsd.org> To: kris@FreeBSD.org, freebsd-fs@FreeBSD.org, scottl@FreeBSD.org From: kris@FreeBSD.org Cc: Subject: Re: kern/118322: [panic] Sometimes (seldom), "panic:page fault" happens after KDE automount occur when I insert CD/DVD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Dec 2007 13:47:02 -0000 Synopsis: [panic] Sometimes (seldom), "panic:page fault" happens after KDE automount occur when I insert CD/DVD Responsible-Changed-From-To: freebsd-fs->scottl Responsible-Changed-By: kris Responsible-Changed-When: Wed Dec 26 13:46:48 UTC 2007 Responsible-Changed-Why: Scott is interested in UDF http://www.freebsd.org/cgi/query-pr.cgi?pr=118322 From owner-freebsd-fs@FreeBSD.ORG Wed Dec 26 18:52:05 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org 
(Postfix) with ESMTP id 7DCFA16A41B; Wed, 26 Dec 2007 18:52:05 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com [217.20.163.9]) by mx1.freebsd.org (Postfix) with ESMTP id 395FB13C45D; Wed, 26 Dec 2007 18:52:05 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from localhost (localhost [127.0.0.1]) by falcon.cybervisiontech.com (Postfix) with ESMTP id 7005174400D; Wed, 26 Dec 2007 20:28:14 +0200 (EET) X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com Received: from falcon.cybervisiontech.com ([127.0.0.1]) by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Ix1nbKY6fNC9; Wed, 26 Dec 2007 20:28:14 +0200 (EET) Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by falcon.cybervisiontech.com (Postfix) with ESMTP id DC074744001; Wed, 26 Dec 2007 20:28:13 +0200 (EET) Message-ID: <47729D3C.8050301@icyb.net.ua> Date: Wed, 26 Dec 2007 20:28:12 +0200 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.9 (X11/20071116) MIME-Version: 1.0 To: bug-followup@FreeBSD.org, andrew@dobrohot.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: kern/118322: [panic] Sometimes (seldom), "panic:page fault" happens after KDE automount occur when I insert CD/DVD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Dec 2007 18:52:05 -0000 http://www.freebsd.org/cgi/query-pr.cgi?pr=118322 This panic looks like dereferencing a NULL pointer to a structure: > fault virtual address = 0x2c 44 is exactly an offset of 'perm' field in file_entry structure and fentry is a field of 'struct file_entry *' type in udf_node structure. 
From the code it seems that the fentry field cannot be NULL during the "normal" life-cycle of a udf_node. Memory allocation is properly checked for errors. The only suspicious place is udf_reclaim(), where the memory is freed. It seems that some race condition could have allowed access to that udf (v)node while it was being reclaimed.

Comparing udf_reclaim (and cd9660_reclaim for that matter) with ufs_reclaim, I see that the latter has the following code:

	/*
	 * Lock the clearing of v_data so ffs_lock() can inspect it
	 * prior to obtaining the lock.
	 */
	VI_LOCK(vp);
	vp->v_data = 0;
	VI_UNLOCK(vp);

The important difference is that the UFS code takes the lock and frees the actual data after setting the v_data pointer to NULL, whereas UDF and CD9660 do not take any locks and free the data before resetting v_data.

I am no filesystem expert, but I suspect that the above might be important in the mpsafe vfs world. But maybe this is just a red herring.

P.S. The author of the quoted ufs code, Jeff Roberson, is bcc-ed.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Wed Dec 26 22:20:49 2007 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BD16F16A418 for ; Wed, 26 Dec 2007 22:20:49 +0000 (UTC) (envelope-from bp@barryp.org) Received: from eden.barryp.org (host-42-60-230-24.midco.net [24.230.60.42]) by mx1.freebsd.org (Postfix) with ESMTP id 99CED13C447 for ; Wed, 26 Dec 2007 22:20:49 +0000 (UTC) (envelope-from bp@barryp.org) Received: from geo.med.und.nodak.edu ([134.129.166.11]) by eden.barryp.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.67 (FreeBSD)) (envelope-from ) id 1J7dxc-000377-BM; Wed, 26 Dec 2007 15:38:44 -0600 Message-ID: <4772C9EE.8090407@barryp.org> Date: Wed, 26 Dec 2007 15:38:54 -0600 From: Barry Pederson User-Agent: Thunderbird 2.0.0.9 (X11/20071031) MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <474546F5.2000007@fsn.hu> <20071123183420.GA12811@garage.freebsd.pl>
In-Reply-To: <20071123183420.GA12811@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org Subject: Re: ZFS and FAULTED devices (corrupted data), can't make the pool ONLINE again X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Dec 2007 22:20:49 -0000 Pawel Jakub Dawidek wrote: > On Thu, Nov 22, 2007 at 10:08:05AM +0100, Attila Nagy wrote: >> Hello, >> >> FreeBSD RELENG_7, x86, a terrible disk array, called Promise RM-8000 >> with 8 disks on an ahc. >> The pool is a RAIDZ2. >> Tomorrow the array went crazy (its firmware is a total crap), so I had >> to reboot both the machine and the disk array. >> > > You should use: > > # zpool replace people da3 da3 > > but to do it, you need this patch, which was not yet MFCed: > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/contrib/opensolaris/uts/common/fs/zfs/vdev.c.diff?r1=1.3;r2=1.4 I had a drive in a raidz2 pool fail, and wasn't able to replace it until rebuilding the kernel (7.0beta3) with the above patch. I'm just mentioning this as a worksforme kind of thing. I'm rebuilding today with RELENG_7_0 and saw that the patch still applied cleanly, so I'm assuming it's still necessary. I hope it or something similar gets merged in. Or at least maybe the ZFS wiki could have a list of recommended patches? 
Barry From owner-freebsd-fs@FreeBSD.ORG Wed Dec 26 22:41:36 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 775E816A417 for ; Wed, 26 Dec 2007 22:41:36 +0000 (UTC) (envelope-from bp@barryp.org) Received: from eden.barryp.org (host-42-60-230-24.midco.net [24.230.60.42]) by mx1.freebsd.org (Postfix) with ESMTP id 5370D13C467 for ; Wed, 26 Dec 2007 22:41:36 +0000 (UTC) (envelope-from bp@barryp.org) Received: from geo.med.und.nodak.edu ([134.129.166.11]) by eden.barryp.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.67 (FreeBSD)) (envelope-from ) id 1J7ewR-0003PG-Oh; Wed, 26 Dec 2007 16:41:35 -0600 Message-ID: <4772D8AA.6080905@barryp.org> Date: Wed, 26 Dec 2007 16:41:46 -0600 From: Barry Pederson User-Agent: Thunderbird 2.0.0.9 (X11/20071031) MIME-Version: 1.0 To: Sverre Svenningsen References: <474546F5.2000007@fsn.hu> <20071123183420.GA12811@garage.freebsd.pl> <4772C9EE.8090407@barryp.org> <28DDC893-1432-4F0B-95D4-AA6049CD5FB2@online.no> In-Reply-To: <28DDC893-1432-4F0B-95D4-AA6049CD5FB2@online.no> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and FAULTED devices (corrupted data), can't make the pool ONLINE again X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Dec 2007 22:41:36 -0000 Sverre Svenningsen wrote: > Doesn't it work even when doing a "zfs offline people da3" first? I > installed a 7.0-beta in a Parallels VM just to torture test the raidz > recreation (since my real hardware is running linux+evms right now) and > i got the error that the device was in use, until i issued the offline > command FIRST and then told it to replace the offlined disk with the > same disk. 
>
> This should probably be emphasized in the zfs crash course documentation :)
>
> -Sverre

IIRC, at the time, no - it didn't work to do "zpool offline". It was a while ago though, but I'm fairly certain I tried "offline", "detach", and "replace", and even export/import.

I could be wrong, and it could be different now though *shrug*.

Barry

From owner-freebsd-fs@FreeBSD.ORG Wed Dec 26 23:15:27 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 49FF416A417 for ; Wed, 26 Dec 2007 23:15:27 +0000 (UTC) (envelope-from ss.alert@online.no) Received: from mail44.e.nsc.no (mail44.e.nsc.no [193.213.115.44]) by mx1.freebsd.org (Postfix) with ESMTP id 7B55B13C478 for ; Wed, 26 Dec 2007 23:15:26 +0000 (UTC) (envelope-from ss.alert@online.no) Received: from basilisk (ti0034a340-0801.bb.online.no [88.90.3.33]) by mail44.nsc.no (8.13.8/8.13.5) with ESMTP id lBQMUgaD024165 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT); Wed, 26 Dec 2007 23:30:43 +0100 (MET) Message-Id: <28DDC893-1432-4F0B-95D4-AA6049CD5FB2@online.no> From: Sverre Svenningsen To: Barry Pederson In-Reply-To: <4772C9EE.8090407@barryp.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v915) Date: Wed, 26 Dec 2007 23:30:42 +0100 References: <474546F5.2000007@fsn.hu> <20071123183420.GA12811@garage.freebsd.pl> <4772C9EE.8090407@barryp.org> X-Mailer: Apple Mail (2.915) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and FAULTED devices (corrupted data), can't make the pool ONLINE again X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Dec 2007 23:15:27 -0000

On Dec 26, 2007, at 22:38 , Barry Pederson wrote:
> Pawel Jakub Dawidek wrote:
>> On Thu, Nov
22, 2007 at 10:08:05AM +0100, Attila Nagy wrote:
>>> Hello,
>>>
>>> FreeBSD RELENG_7, x86, a terrible disk array, called Promise
>>> RM-8000 with 8 disks on an ahc.
>>> The pool is a RAIDZ2.
>>> Tomorrow the array went crazy (its firmware is a total crap), so I
>>> had to reboot both the machine and the disk array.
>>>
>> You should use:
>> # zpool replace people da3 da3
>> but to do it, you need this patch, which was not yet MFCed:
>> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/contrib/opensolaris/uts/common/fs/zfs/vdev.c.diff?r1=1.3;r2=1.4
>
> I had a drive in a raidz2 pool fail, and wasn't able to replace it
> until rebuilding the kernel (7.0beta3) with the above patch. I'm
> just mentioning this as a worksforme kind of thing.
>
> I'm rebuilding today with RELENG_7_0 and saw that the patch still
> applied cleanly, so I'm assuming it's still necessary. I hope it or
> something similar gets merged in. Or at least maybe the ZFS wiki
> could have a list of recommended patches?
>
> Barry
>

Doesn't it work even when doing a "zpool offline people da3" first? I installed a 7.0-beta in a Parallels VM just to torture test the raidz recreation (since my real hardware is running linux+evms right now) and I got the error that the device was in use, until I issued the offline command FIRST and then told it to replace the offlined disk with the same disk.
This should probably be emphasized in the zfs crash course documentation :) -Sverre From owner-freebsd-fs@FreeBSD.ORG Thu Dec 27 01:13:37 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D16A16A419 for ; Thu, 27 Dec 2007 01:13:37 +0000 (UTC) (envelope-from srajag00@yahoo.com) Received: from web90511.mail.mud.yahoo.com (web90511.mail.mud.yahoo.com [216.252.100.178]) by mx1.freebsd.org (Postfix) with SMTP id 5686E13C442 for ; Thu, 27 Dec 2007 01:13:37 +0000 (UTC) (envelope-from srajag00@yahoo.com) Received: (qmail 46919 invoked by uid 60001); 27 Dec 2007 00:46:48 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=bJAI+X4KQeuemTITqXaEkzo5hSsaKTFlB75ZQaad8Yqw7ppRn/NRJEBVImyXGJTXga70/97/XoeXnh8yKNw9GikcPZyuz7CcZP0oLRAcPJc4U1KoSHZd0esrGmbx2r1RKC7vwPAEJV1/SRSp1FXQf6q+QWTAA2xFlL/+RF8Dwpw=; X-YMail-OSG: _O5P0k4VM1nleRjld4IsY6lXrdI4TTcIFzrK_s0.espdei_23QEOT3KTR85eMTs6RFa2jZ03woJDgGlJd3MhgcoOiOxJVrZkqBN87YozHWylk_Cc2KQ- Received: from [66.129.224.36] by web90511.mail.mud.yahoo.com via HTTP; Wed, 26 Dec 2007 16:46:48 PST Date: Wed, 26 Dec 2007 16:46:48 -0800 (PST) From: Raja Sivaramakrishnan To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <701759.46468.qm@web90511.mail.mud.yahoo.com> Subject: namei lookup vnode locking X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Dec 2007 01:13:37 -0000 Hello, I encountered an issue with FreeBSD 6.1 and would appreciate some feedback on this. 
The problem happens when a perl script running on some client system does a telnet into the FreeBSD box, exits from the login shell and immediately exits the perl script too. After the script exits, there is a deadlock on the FreeBSD box that prevents new processes (such as ps, top etc.) from starting. Upon investigation, this seems to be caused by the following sequence of events on the FreeBSD system.

1) The login process exits. The exit call in the kernel closes all file descriptors. One of these is the fd for /dev/ttyp0, used for the telnet session. login locks the vnode for /dev/ttyp0 and waits for 5 minutes in order for the tty to drain (ttywait() call).

2) The tty is supposed to be drained by telnetd. However, telnetd sees the network connection go down when the perl script exits. As a result, it jumps to cleanup code, where it tries to do chmod on /dev/ttyp0. The chmod syscall attempts to lock /dev/ttyp0, but fails as the lock is held by login, which puts the telnetd process to sleep. However, telnetd holds the lock on the vnode for /dev. It appears that the lock was acquired when doing the namei lookup for /dev/ttyp0. The current state is that there is output in the tty that has to be read by telnetd, but it can't because it is sleeping for the /dev/ttyp0 lock. telnetd is holding the /dev vnode lock while sleeping.

3) As a result, any process that needs the /dev vnode lock is put to sleep for 5 minutes (ttywait waits for a default of 5 minutes). Even if a process wants to open an unrelated device file, /dev/foo, it is not able to do so because the /dev lock is held by telnetd.

A few questions:

1) Does namei lookup need to acquire an exclusive lock on intermediate vnodes when looking up a pathname, i.e. if telnetd is trying to look up /dev/ttyp0, does it need to get an exclusive lock on /dev? Can it be a shared lock that will allow at least other readers to make progress?

2) Besides relaxing the locking above, any other thoughts on how to fix this?
Reducing the tty timeout from the close routine is another option, but that only limits the duration of the deadlock.

Thanks,

Raja

From owner-freebsd-fs@FreeBSD.ORG Thu Dec 27 13:15:28 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B23FD16A417 for ; Thu, 27 Dec 2007 13:15:28 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from relay02.kiev.sovam.com (relay02.kiev.sovam.com [62.64.120.197]) by mx1.freebsd.org (Postfix) with ESMTP id 4695A13C467 for ; Thu, 27 Dec 2007 13:15:28 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from [212.82.216.226] (helo=deviant.kiev.zoral.com.ua) by relay02.kiev.sovam.com with esmtps (TLSv1:AES256-SHA:256) (Exim 4.67) (envelope-from ) id 1J7sa6-000Jlz-J2 for freebsd-fs@freebsd.org; Thu, 27 Dec 2007 15:15:27 +0200 Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.2/8.14.2) with ESMTP id lBRDFNj2026151; Thu, 27 Dec 2007 15:15:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.2/8.14.2/Submit) id lBRDFM9A026150; Thu, 27 Dec 2007 15:15:22 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 27 Dec 2007 15:15:21 +0200 From: Kostik Belousov To: Raja Sivaramakrishnan Message-ID: <20071227131521.GO57756@deviant.kiev.zoral.com.ua> References: <701759.46468.qm@web90511.mail.mud.yahoo.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="5tiY7shzwSGI9p6W" Content-Disposition: inline In-Reply-To: <701759.46468.qm@web90511.mail.mud.yahoo.com> User-Agent:
Mutt/1.4.2.3i X-Scanner-Signature: 6bd56f30920205a35a58b455a86ead09 X-DrWeb-checked: yes X-SpamTest-Envelope-From: kostikbel@gmail.com X-SpamTest-Group-ID: 00000000 X-SpamTest-Info: Profiles 1966 [Dec 27 2007] X-SpamTest-Info: helo_type=3 X-SpamTest-Info: {SMTP from is not routable} X-SpamTest-Info: {received from trusted relay: not dialup} X-SpamTest-Method: none X-SpamTest-Method: Local Lists X-SpamTest-Rate: 19 X-SpamTest-Status: Not detected X-SpamTest-Status-Extended: not_detected X-SpamTest-Version: SMTP-Filter Version 3.0.0 [0255], KAS30/Release Cc: freebsd-fs@freebsd.org Subject: Re: namei lookup vnode locking X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Dec 2007 13:15:28 -0000 --5tiY7shzwSGI9p6W Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Wed, Dec 26, 2007 at 04:46:48PM -0800, Raja Sivaramakrishnan wrote:
> Hello,
> I encountered an issue with FreeBSD 6.1 and would
> appreciate some feedback on this. The problem happens

First, there were significant locking fixes for devfs after 6.1; in particular, check whether 6.2 or, even better, the latest 6.3 RC shows the errant behaviour. It may all be fixed already.

For proper reporting of the deadlock, see the kernel debugging chapter of the developer handbook, in particular the deadlock section.

I have a doubt regarding your analysis, as far as I was able to understand it. The ttydrain() ioctl does not hold the dev vnode lock while calling the driver. Anyway, do what I recommended above.

> when a perl script running on some client system does
> a telnet into the FreeBSD box, exits from the login
> shell and immediately exits the perl script too. After
> the script exits, there is a deadlock on the FreeBSD
> box that prevents new processes (such as ps, top etc.)
> from starting.
Upon investigation, this seems to be
> caused due to the following sequence of events on the
> FreeBSD system.
>
> 1) login process exits. exit call in the kernel closes
> all file descriptors. One of these is the fd for
> /dev/ttyp0, used for the telnet session. login locks
> the vnode for /dev/ttyp0 and waits for 5 minutes in
> order for the tty to drain (ttywait() call).
>
> 2) The tty is supposed to be drained by telnetd.
> However, telnetd sees the network connection go
> down when the perl script exits. As a result, it
> jumps to cleanup code, where it tries to do chmod
> on /dev/ttyp0. chmod syscall attempts to lock
> /dev/ttyp0, but fails as the lock is held by login,
> which puts telnetd process to sleep. However,
> telnetd holds the lock on the vnode for /dev.
> It appears that the lock was acquired when doing the
> namei lookup for /dev/ttyp0. The current state is
> that there is output in the tty that has to be
> read by telnetd, but it can't because it is sleeping
> for the /dev/ttyp0 lock. telnetd is holding the
> /dev vnode lock while sleeping.
>
> 3) As a result, any process that needs the /dev
> vnode lock is put to sleep for 5 minutes (ttywait
> waits for a default of 5 minutes). Even if a
> process wants to open an unrelated device file,
> /dev/foo, it is not able to do so because the /dev
> lock is held by telnetd.
>
> Few questions:
>
> 1) Does namei lookup need to acquire an exclusive
> lock on intermediate vnodes when looking up a pathname
> i.e. if telnetd is trying to lookup /dev/ttyp0, does
> it need to get an exclusive lock on /dev? Can it
> be a shared lock that will allow at least other
> readers
> to make progress?
>
> 2) Besides relaxing the locking above, any other
> thoughts on how to fix this? Reducing the tty timeout
> from the close routine is another option, but that
> only limits the duration of the deadlock.
>
> Thanks,
>
> Raja

--5tiY7shzwSGI9p6W Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) iD8DBQFHc6VpC3+MBN1Mb4gRAhrBAJsFhdoxgeovIz5GlGNnUq+FrEXJUwCgpzpX 5pLJcQy/LtDqoVIk1Ce/9Jo= =NQA7 -----END PGP SIGNATURE----- --5tiY7shzwSGI9p6W--

From owner-freebsd-fs@FreeBSD.ORG Thu Dec 27 17:17:39 2007 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0AF6A16A41A for ; Thu, 27 Dec 2007 17:17:39 +0000 (UTC) (envelope-from srajag00@yahoo.com) Received: from web90514.mail.mud.yahoo.com (web90514.mail.mud.yahoo.com [216.252.100.181]) by mx1.freebsd.org (Postfix) with SMTP id C9E4113C46E for ; Thu, 27 Dec 2007 17:17:38 +0000 (UTC) (envelope-from srajag00@yahoo.com) Received: (qmail 3586 invoked by uid 60001); 27 Dec 2007 17:17:38 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=cPHaRWNA6UX7l3ibum4TFfYSvAzo2q57bhpDtrT6q/d0yUGCxqTqczmf6ocXDikOzvzS+GMSKg0TtahoI72y66QuO1NGsAwO94wGsJG+TePCbN14gETF/Kku/hiNYV2dWIvykASdxAnyJg3zmShx8TsQ2XkxFzXWqREXr+lZwBQ=; X-YMail-OSG: dp9PSFsVM1lX.qxF0fqmgtUpi7Wc9p4lKmx0f460R1zrPyOOYGXl4eg6l2GrfNG34J9pnavV6HvG2st.oSMCZsMPV7sI2LdnwwYFyeJLPJmAwhr6FV4- Received: from [71.139.47.161] by web90514.mail.mud.yahoo.com via HTTP; Thu, 27 Dec 2007 09:17:38 PST Date: Thu, 27 Dec 2007 09:17:38 -0800 (PST) From: Raja
Sivaramakrishnan To: Kostik Belousov In-Reply-To: <20071227131521.GO57756@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Message-ID: <153149.2801.qm@web90514.mail.mud.yahoo.com> Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: namei lookup vnode locking X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Dec 2007 17:17:39 -0000 Thanks for the response - I'll take a look at the handbook. Regarding ttywait, it was not called through the ttydrain ioctl. ttywait was called through fdfree from exit1() when the login process was exiting. I believe this is called with the vnode lock held. - Raja First, there were the significant locking fixes for the devfs after 6.1, in particular, check that 6.2, or, even better 6.3-latest RC shows the errant behaviour. It may be all fixed already. For proper reporting of the deadlock, see kernel debug chapter of the developer handbook, in particular, deadlock section. I have a doubt regarding you analysis as far as I was able to understand it. ttydrain() ioctl does not hold the dev vnode lock while calling the driver. Anyway, do what I recommended above. --------------------------------- Never miss a thing. Make Yahoo your homepage. 
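
The sequence reported in this thread can be pictured as a wait-for graph. The following is only a user-space sketch of that picture, not kernel code: the process names, the `waits_for` table, and `is_stuck()` are all hypothetical, and the edges simply encode the report as written (login waits for telnetd to drain the tty, telnetd waits for the ttyp0 vnode lock held by login, everyone else waits for the /dev vnode lock held by telnetd). Whether the real kernel behaves this way on a given release is exactly what the replies question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a process either runs (NONE) or waits for another one. */
#define NONE (-1)
enum { LOGIN, TELNETD, OTHER, NPROC };

/* waits_for[p] is the process that must make progress before p can continue. */
static const int waits_for[NPROC] = {
	[LOGIN]   = TELNETD,	/* holds ttyp0 vnode lock, waits for the tty to drain */
	[TELNETD] = LOGIN,	/* holds /dev vnode lock, waits for the ttyp0 lock */
	[OTHER]   = TELNETD,	/* open of an unrelated /dev/foo waits for /dev */
};

/* Same processes once login's bounded wait has expired and it proceeds. */
static const int after_timeout[NPROC] = {
	[LOGIN]   = NONE,
	[TELNETD] = LOGIN,
	[OTHER]   = TELNETD,
};

/*
 * Return 1 if p cannot run without an outside event (such as a timeout),
 * i.e. following wait-for edges from p never reaches a running process.
 * With nproc processes, more than nproc hops implies a cycle.
 */
static int
is_stuck(const int *wf, int nproc, int p)
{
	int steps;

	for (steps = 0; steps <= nproc; steps++) {
		if (p == NONE)
			return (0);
		assert(p >= 0 && p < nproc);
		p = wf[p];
	}
	return (1);
}
```

In this model the unrelated process is stuck only because it queues behind a two-process cycle; once the timed wait in login ends, the chain unwinds. That matches the report that the stall lasts about five minutes rather than forever.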
From owner-freebsd-fs@FreeBSD.ORG Thu Dec 27 19:43:06 2007
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA66F16A418 for ; Thu, 27 Dec 2007 19:43:06 +0000 (UTC) (envelope-from johan@stromnet.se)
Received: from core.stromnet.se (core.stromnet.se [83.218.84.131]) by mx1.freebsd.org (Postfix) with ESMTP id 4710313C442 for ; Thu, 27 Dec 2007 19:43:05 +0000 (UTC) (envelope-from johan@stromnet.se)
Received: from localhost (unknown [83.218.84.135]) by core.stromnet.se (Postfix) with ESMTP id 59FB5D46F37 for ; Thu, 27 Dec 2007 20:26:18 +0100 (CET)
X-Virus-Scanned: amavisd-new at stromnet.se
Received: from core.stromnet.se ([83.218.84.131]) by localhost (core.stromnet.se [83.218.84.135]) (amavisd-new, port 10024) with ESMTP id BMcjXaLS8SH7 for ; Thu, 27 Dec 2007 20:26:13 +0100 (CET)
Received: from [172.28.1.102] (90-224-172-102-no129.tbcn.telia.com [90.224.172.102]) by core.stromnet.se (Postfix) with ESMTP id D619AD46405 for ; Thu, 27 Dec 2007 20:26:13 +0100 (CET)
Mime-Version: 1.0 (Apple Message framework v753)
Content-Transfer-Encoding: quoted-printable
Message-Id: <5A6CFB06-4175-452F-BFC9-323C2023D2F6@stromnet.se>
Content-Type: text/plain; charset=ISO-8859-1; delsp=yes; format=flowed
To: freebsd-fs@freebsd.org
From: =?ISO-8859-1?Q?Johan_Str=F6m?=
Date: Thu, 27 Dec 2007 20:25:34 +0100
X-Mailer: Apple Mail (2.753)
Subject: ZFS replace/expand problem
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 27 Dec 2007 19:43:06 -0000

Hello list

First of all, I want to thank everybody involved in writing and
porting ZFS to FreeBSD, it's working (except for this problem) great
for me!

Now to my problem. To summarize it, I want to replace two mirrored
disks with bigger ones. Replace works well but the vdev doesn't expand
until I do export/import. Details follow:

I currently have the following setup:

back-1 /$ zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            ad14s1d  ONLINE       0     0     0
            ad16s1d  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            ad8      ONLINE       0     0     0
            ad10s2   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            ad12     ONLINE       0     0     0
            ad10s1   ONLINE       0     0     0

The ad8/ad10/ad12 setup is kind of stupid, I know.. ad8 is a 80Gb and
ad10 is a 120Gb, and a10 200Gb.. But now I want to replace those two
mirrors with 4x 300GB (or rather 2x300 and 2x320). So my plan was to
do something like:

zpool replace tank ad8 ad18
zpool replace tank ad10s2 ad20

where ad18 and ad20 are the two 300Gbs.. Then the same thing for ad12
and ad10s1.. But before I did that I wanted to make sure that it
would actually expand as I've read, so I tried this first..

On ad18/ad20 I had ad*s1a, a 500MB partition, and ad*s1g, a ~280Gb
partition. So I created a testtank with first ad*s1a:

back-1 /$ zpool create testtank mirror /dev/ad18s1a /dev/ad20s1a
back-1 /$ zpool list
NAME       SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank       878G   812G  65.1G  92%  ONLINE  -
testtank   492M   111K   492M   0%  ONLINE  -
back-1 /$ zpool status
..
  pool: testtank
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        testtank     ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            ad18s1a  ONLINE       0     0     0
            ad20s1a  ONLINE       0     0     0

errors: No known data errors

back-1 /storage$ zpool replace testtank ad18s1a ad18s1g

status now shows

          mirror       ONLINE       0     0     0
            replacing  ONLINE       0     0     0
              ad18s1a  ONLINE       0     0     0
              ad18s1g  ONLINE       0     0     0
            ad20s1a    ONLINE       0     0     0

when that was done (and only ad18s1g was showing) I did

back-1 /storage$ zpool replace testtank ad20s1a ad20s1g

and then same replacing output as above (but for ad20)

Okey, so now when this is done.. it should have expanded one would
think, right?

back-1 /storage$ zpool list
NAME       SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
..
testtank   492M   218K   492M   0%  ONLINE  -

Nope.. Waited a while, nothing happened.. Some googling gave me that
export/import could be done:

back-1 /storage$ zpool export testtank
back-1 /storage$ zpool import testtank
back-1 /storage$ zpool list
NAME       SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
..
testtank   289G   132K   289G   0%  ONLINE  -

Yey! Okey so it expands, but only after export/import.. Haven't really
found much docs about this but according to ppl in #opensolaris this
should not be necessary.

Not a big deal in this test case, but doing it for my real tank will
require me to take the system down on an external boot medium (CD or
something) I guess, and then do zpool export/import there, and then
boot back up..

Any guidelines how to do this? Will doing import/export from a CD
(rescue shell I guess) work as I expect? Or what would be the
smartest way (the actual downtime isn't such a big deal as long as it
is quick and works).

Thanks!
--
Johan Ström
Stromnet
johan@stromnet.se
http://www.stromnet.se/

From owner-freebsd-fs@FreeBSD.ORG Thu Dec 27 22:57:52 2007
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7782016A41A for ; Thu, 27 Dec 2007 22:57:52 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from mail03.syd.optusnet.com.au (mail03.syd.optusnet.com.au [211.29.132.184]) by mx1.freebsd.org (Postfix) with ESMTP id 0B60013C459 for ; Thu, 27 Dec 2007 22:57:51 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from c211-30-219-213.carlnfd3.nsw.optusnet.com.au (c211-30-219-213.carlnfd3.nsw.optusnet.com.au [211.30.219.213]) by mail03.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id lBRMvmOi004501 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 28 Dec 2007 09:57:49 +1100
Date: Fri, 28 Dec 2007 09:57:48 +1100 (EST)
From: Bruce Evans
X-X-Sender: bde@delplex.bde.org
To: Raja Sivaramakrishnan
In-Reply-To: <153149.2801.qm@web90514.mail.mud.yahoo.com>
Message-ID: <20071228092149.T17606@delplex.bde.org>
References: <153149.2801.qm@web90514.mail.mud.yahoo.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@FreeBSD.org
Subject: Re: namei lookup vnode locking
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 27 Dec 2007 22:57:52 -0000

On Thu, 27 Dec 2007, Raja Sivaramakrishnan wrote:

> Thanks for the response - I'll take a look at the handbook.
> Regarding ttywait, it was not called through the ttydrain ioctl.
> ttywait was called through fdfree from exit1() when the login
> process was exiting. I believe this is called with the vnode
> lock held.

Calling device close with the vnode lock held was a large bug. It
was one of the bugs fixed in 6.2 (devfs_vnops.c 1.114.2.12 2006/10/30
by kib MFC 1.136 by kib). It was broken at a higher level by locking
the vnode in vn_close() starting before 6.0 (vfs_vnops.c 1.224
2005/03/13).

Bruce

From owner-freebsd-fs@FreeBSD.ORG Fri Dec 28 18:56:02 2007
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 316DC16A417; Fri, 28 Dec 2007 18:56:02 +0000 (UTC) (envelope-from kostikbel@gmail.com)
Received: from relay02.kiev.sovam.com (relay02.kiev.sovam.com [62.64.120.197]) by mx1.freebsd.org (Postfix) with ESMTP id A068113C459; Fri, 28 Dec 2007 18:56:01 +0000 (UTC) (envelope-from kostikbel@gmail.com)
Received: from [212.82.216.226] (helo=deviant.kiev.zoral.com.ua) by relay02.kiev.sovam.com with esmtps (TLSv1:AES256-SHA:256) (Exim 4.67) (envelope-from ) id 1J8KN8-000NkD-6n; Fri, 28 Dec 2007 20:56:00 +0200
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.2/8.14.2) with ESMTP id lBSItrpU041287; Fri, 28 Dec 2007 20:55:53 +0200 (EET) (envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.2/8.14.2/Submit) id lBSItrvp041286; Fri, 28 Dec 2007 20:55:53 +0200 (EET) (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f
Date: Fri, 28 Dec 2007 20:55:53 +0200
From: Kostik Belousov
To: Andriy Gapon
Message-ID: <20071228185553.GW57756@deviant.kiev.zoral.com.ua>
References: <47729D3C.8050301@icyb.net.ua>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="i1KFSYFbl/HTybMx"
Content-Disposition: inline
In-Reply-To: <47729D3C.8050301@icyb.net.ua>
User-Agent: Mutt/1.4.2.3i
X-Scanner-Signature: 8a658898c353158d5b622bcbe36a9e12
X-DrWeb-checked: yes
X-SpamTest-Envelope-From: kostikbel@gmail.com
X-SpamTest-Group-ID: 00000000
X-SpamTest-Info: Profiles 1973 [Dec 28 2007]
X-SpamTest-Info: helo_type=3
X-SpamTest-Info: {received from trusted relay: not dialup}
X-SpamTest-Method: none
X-SpamTest-Method: Local Lists
X-SpamTest-Rate: 0
X-SpamTest-Status: Not detected
X-SpamTest-Status-Extended: not_detected
X-SpamTest-Version: SMTP-Filter Version 3.0.0 [0255], KAS30/Release
Cc: freebsd-fs@freebsd.org, andrew@dobrohot.org, bug-followup@freebsd.org
Subject: Re: kern/118322: [panic] Sometimes (seldom), "panic:page fault" happens after KDE automount occur when I insert CD/DVD
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 28 Dec 2007 18:56:02 -0000

--i1KFSYFbl/HTybMx
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Dec 26, 2007 at 08:28:12PM +0200, Andriy Gapon wrote:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=118322
>
> This panic looks like dereferencing a NULL pointer to a structure:
> > fault virtual address = 0x2c
> 44 is exactly an offset of 'perm' field in file_entry structure and
> fentry is a field of 'struct file_entry *' type in udf_node structure.
>
> From the code it seems that fentry field can not be NULL during "normal"
> life-cycle of udf_node. Memory allocation is properly checked for errors.

Yes, allocations are checked, but look at the series of if()s after
the partially constructed vnode is put onto the hash. If any of
those if()s fails, the vnode is simply vput()ed. This leaves the vnode
allocated and on the hash etc., while unode->fentry is NULL. From there,
the vnode can be found by namei, which I believe causes the panic.

The difference between the UFS and UDF code there is the ufs_inactive()
routine that is defined for UFS, and that reclaims the vnode when it is
in a half-baked state.

Please try the patch below (only compile-tested).

Note: it seems that the system should print something before the panic
(see the printf()s before the vput() in the code).

diff --git a/sys/fs/udf/udf_vfsops.c b/sys/fs/udf/udf_vfsops.c
index d08226b..373ee4d 100644
--- a/sys/fs/udf/udf_vfsops.c
+++ b/sys/fs/udf/udf_vfsops.c
@@ -630,6 +630,7 @@ udf_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp)
 	devvp = udfmp->im_devvp;
 	if ((error = RDSECTOR(devvp, sector, udfmp->bsize, &bp)) != 0) {
 		printf("Cannot read sector %d\n", sector);
+		vgone(vp);
 		vput(vp);
 		brelse(bp);
 		*vpp = NULL;
@@ -639,6 +640,7 @@ udf_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp)
 	fe = (struct file_entry *)bp->b_data;
 	if (udf_checktag(&fe->tag, TAGID_FENTRY)) {
 		printf("Invalid file entry!\n");
+		vgone(vp);
 		vput(vp);
 		brelse(bp);
 		*vpp = NULL;
@@ -649,6 +651,7 @@ udf_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp)
 	    M_NOWAIT | M_ZERO);
 	if (unode->fentry == NULL) {
 		printf("Cannot allocate file entry block\n");
+		vgone(vp);
 		vput(vp);
 		brelse(bp);
 		*vpp = NULL;

--i1KFSYFbl/HTybMx
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (FreeBSD)

iD8DBQFHdUa4C3+MBN1Mb4gRAjhxAKCMfNkz755UcajtcsdTxEPFfSd5WACfbrGi
WIw9PQ8fvva2pDoVTwC4dZE=
=zPak
-----END PGP SIGNATURE-----

--i1KFSYFbl/HTybMx--
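
The failure mode described in the message above (an object published on a hash before it is fully constructed, then left there when construction fails) can be reproduced in a few lines of user-space C. This is a sketch under stated assumptions, not the UDF code: `struct node`, `table`, `hash_lookup()`, `hash_insert()`, `hash_remove()` and `node_get()` are hypothetical stand-ins, `read_ok` simulates whether reading the file entry succeeds, and `reclaim_on_error` plays the role that the added `vgone()` call is meant to play in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; only the field names echo the discussion. */
struct file_entry { int perm; };

struct node {
	unsigned long ino;
	struct file_entry *fentry;	/* NULL until construction completes */
	struct node *next;		/* hash chain link */
};

#define NBUCKETS 8
static struct node *table[NBUCKETS];

static struct node *
hash_lookup(unsigned long ino)
{
	struct node *n = table[ino % NBUCKETS];

	while (n != NULL && n->ino != ino)
		n = n->next;
	return (n);
}

static void
hash_insert(struct node *n)
{
	assert(n != NULL);
	n->next = table[n->ino % NBUCKETS];
	table[n->ino % NBUCKETS] = n;
}

static void
hash_remove(struct node *n)
{
	struct node **pp = &table[n->ino % NBUCKETS];

	while (*pp != NULL && *pp != n)
		pp = &(*pp)->next;
	if (*pp == n)
		*pp = n->next;
}

/* Look up or construct a node; the node is on the hash before it is complete. */
static struct node *
node_get(unsigned long ino, int read_ok, int reclaim_on_error)
{
	struct node *n = hash_lookup(ino);

	if (n != NULL)
		return (n);	/* may be a half-built node leaked earlier */
	n = calloc(1, sizeof(*n));
	if (n == NULL)
		return (NULL);
	n->ino = ino;
	hash_insert(n);		/* visible to other lookups from here on */
	if (read_ok)
		n->fentry = calloc(1, sizeof(*n->fentry));
	if (n->fentry == NULL) {
		if (reclaim_on_error) {	/* patched behaviour: unpublish */
			hash_remove(n);
			free(n);
		}
		return (NULL);	/* unpatched: node stays on the hash */
	}
	n->fentry->perm = 0644;
	return (n);
}
```

Without the reclaim step, a failed `node_get(1, 0, 0)` leaves an entry behind, and a later `node_get(1, 1, 0)` returns it with `fentry` still NULL; dereferencing `fentry->perm` on that result is the user-space analogue of the reported fault. With `reclaim_on_error` set, the failed node is no longer reachable through the hash.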