From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 05:12:41 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 94AD7D62 for ; Sun, 24 Feb 2013 05:12:41 +0000 (UTC) (envelope-from freebsd@deman.com) Received: from plato.corp.nas.com (plato.corp.nas.com [66.114.32.138]) by mx1.freebsd.org (Postfix) with ESMTP id 365761543 for ; Sun, 24 Feb 2013 05:12:40 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by plato.corp.nas.com (Postfix) with ESMTP id 46CBA1314141D for ; Sat, 23 Feb 2013 21:02:37 -0800 (PST) X-Virus-Scanned: amavisd-new at corp.nas.com Received: from plato.corp.nas.com ([127.0.0.1]) by localhost (plato.corp.nas.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NHhXPEVRRwKh for ; Sat, 23 Feb 2013 21:02:37 -0800 (PST) Received: from [192.168.2.160] (mono-sis1.s.bli.openaccess.org [66.114.32.149]) by plato.corp.nas.com (Postfix) with ESMTPSA id DBF4A13141412 for ; Sat, 23 Feb 2013 21:02:36 -0800 (PST) Content-Type: text/plain; charset=iso-8859-1 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD From: Michael DeMan In-Reply-To: Date: Sat, 23 Feb 2013 21:02:36 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> To: FreeBSD Filesystems X-Mailer: Apple Mail (2.1499) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Feb 2013 05:12:41 -0000 I have not a heard a word on this topic in a while and still think it is = a good idea. How can we move forward? How can I help? Would it be useful to have a = sharable space somewhere to discuss things so when a best practices = document that is available for all instead of the secret few - is = reliable? I am willing to put some effort in. Thanks, - Mike On Jan 22, 2013, at 5:27 PM, Michael DeMan wrote: > I think this would be awesome. Googling around it is extremely = difficult to know what to do and which practices are current or = obsolete, etc. > = From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 10:55:04 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 517FC1EC for ; Sun, 24 Feb 2013 10:55:04 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from tur.go2.pl (tur.go2.pl [193.17.41.50]) by mx1.freebsd.org (Postfix) with ESMTP id D7DBF1C3A for ; Sun, 24 Feb 2013 10:55:03 +0000 (UTC) Received: from moh1-ve1.go2.pl (moh1-ve1.go2.pl [193.17.41.131]) by tur.go2.pl (Postfix) with ESMTP id B6D7615A06AC for ; Sun, 24 Feb 2013 11:54:55 +0100 (CET) Received: from moh1-ve1.go2.pl (unknown [10.0.0.131]) by moh1-ve1.go2.pl (Postfix) with ESMTP id 9979891D58F for ; Sun, 24 Feb 2013 11:54:35 +0100 (CET) Received: from unknown (unknown [10.0.0.108]) by moh1-ve1.go2.pl (Postfix) with SMTP for ; Sun, 24 Feb 2013 11:54:35 +0100 (CET) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id GGKQxE; Sun, 24 Feb 2013 11:54:35 +0100 Message-ID: <5129F16A.6020505@o2.pl> Date: Sun, 24 Feb 2013 11:54:34 +0100 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130201 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-fs@freebsd.org, "Ronald Klop" Subject: Re: Some filesystem thoughts References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 1, 37 X-O2-SPF: neutral X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Feb 2013 10:55:04 -0000 "Ronald Klop" wrote: > Creative ideas. > Part of what you want is in fusefs (mounting of files to edit their > content). Mhm. Could you give some link or details in another form? And part is implemented in e.g. KDE (integrated support for > various file types in fulltext search and tagging of files/metadata, etc.). Well, I view it as not much different from implementing a TC / MC / VIM plugin. Anybody can benefit, but they have to implement the right API (And there are several programs that use TC plugins). It's interesting as a way of getting some of these benefits though. > The chances of having all these complex libraries integrated in the > FreeBSD OS are close to zero I presume. But I am not in a position to > decide about that. Frankly, I haven't expected anything different. My thoughts did jump to implementation issues and I see them numerous, but I think the idea itself is not sufficiently mature, so I decided to skip them in the first post. > I think you can't expect the OS to serve everybody's detailed wishes. I don't expect it. I just wanted to discuss an idea that seemed to have a potential. > The OS serves files and user programs know what to do with them. Unfortunately, far too often programs don't know it. Files are often not simple and a single program is unable to deal with them. The only way to deal with such cases ATM that I see is to manually remove layers obfuscating the meaningful sources. In some way, it resembles piping data through multiple programs, except that pipes transport bytes, not files and therefore the transformation has to be performed step by step. -- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 21:45:13 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C5781D57 for ; Sun, 24 Feb 2013 21:45:13 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.78]) by mx1.freebsd.org (Postfix) with ESMTP id 600BD79D for ; Sun, 24 Feb 2013 21:45:13 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1U9jNX-0006Y3-7r; Sun, 24 Feb 2013 22:45:04 +0100 Received: from h253044.upc-h.chello.nl ([62.194.253.44] helo=pinky) by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1U9jNX-00012H-6y; Sun, 24 Feb 2013 22:45:03 +0100 Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes To: freebsd-fs@freebsd.org, =?utf-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= Subject: Re: Some filesystem thoughts References: <5129F16A.6020505@o2.pl> Date: Sun, 24 Feb 2013 22:45:03 +0100 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Ronald Klop" Message-ID: In-Reply-To: <5129F16A.6020505@o2.pl> User-Agent: Opera Mail/12.14 (Win32) X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: / X-Spam-Score: 0.8 X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.1 X-Scan-Signature: ba572e8a3bde05b4b19613c12a9e49fc X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Feb 2013 21:45:13 -0000 On Sun, 24 Feb 2013 11:54:34 +0100, Radio młodych bandytów wrote: > "Ronald Klop" wrote: >> Creative ideas. >> Part of what you want is in fusefs (mounting of files to edit their >> content). > Mhm. Could you give some link or details in another form? Just google for 'fusefs'. Filesystems based on FUSE: http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems > And part is implemented in e.g. KDE (integrated support for >> various file types in fulltext search and tagging of files/metadata, >> etc.). > Well, I view it as not much different from implementing a TC / MC / VIM > plugin. Anybody can benefit, but they have to implement the right API > (And there are several programs that use TC plugins). > It's interesting as a way of getting some of these benefits though. > >> The chances of having all these complex libraries integrated in the >> FreeBSD OS are close to zero I presume. But I am not in a position to >> decide about that. > Frankly, I haven't expected anything different. My thoughts did jump to > implementation issues and I see them numerous, but I think the idea > itself is not sufficiently mature, so I decided to skip them in the > first post. > >> I think you can't expect the OS to serve everybody's detailed wishes. > I don't expect it. I just wanted to discuss an idea that seemed to have > a potential. >> The OS serves files and user programs know what to do with them. > Unfortunately, far too often programs don't know it. Files are often not > simple and a single program is unable to deal with them. The only way to > deal with such cases ATM that I see is to manually remove layers > obfuscating the meaningful sources. In some way, it resembles piping > data through multiple programs, except that pipes transport bytes, not > files and therefore the transformation has to be performed step by step. Well. It is probably me, but I don't really get what you're trying to say here. Regards, Ronald. From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 22:41:07 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 21BDC88F; Sun, 24 Feb 2013 22:41:07 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id D53DD991; Sun, 24 Feb 2013 22:41:06 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1OMf6On003798; Sun, 24 Feb 2013 22:41:06 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1OMf60k003794; Sun, 24 Feb 2013 22:41:06 GMT (envelope-from linimon) Date: Sun, 24 Feb 2013 22:41:06 GMT Message-Id: <201302242241.r1OMf60k003794@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-fs@FreeBSD.org, rmacklem@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/165923: [nfs] Writing to NFS-backed mmapped files fails if flushed automatically X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Feb 2013 22:41:07 -0000 Synopsis: [nfs] Writing to NFS-backed mmapped files fails if flushed automatically Responsible-Changed-From-To: freebsd-fs->rmacklem Responsible-Changed-By: linimon Responsible-Changed-When: Sun Feb 24 22:40:07 UTC 2013 Responsible-Changed-Why: Over to committer for possible MFC. http://www.freebsd.org/cgi/query-pr.cgi?pr=165923 From owner-freebsd-fs@FreeBSD.ORG Sun Feb 24 23:32:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F3459960 for ; Sun, 24 Feb 2013 23:32:13 +0000 (UTC) (envelope-from rfg@tristatelogic.com) Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com [69.62.255.118]) by mx1.freebsd.org (Postfix) with ESMTP id C1242DDD for ; Sun, 24 Feb 2013 23:32:13 +0000 (UTC) Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1]) by segfault.tristatelogic.com (Postfix) with ESMTP id 8A4AD3B9DD for ; Sun, 24 Feb 2013 15:32:10 -0800 (PST) From: "Ronald F. Guilmette" To: freebsd-fs@freebsd.org Subject: Hard drive device names... Serial Numbers? Date: Sun, 24 Feb 2013 15:32:10 -0800 Message-ID: <2511.1361748730@server1.tristatelogic.com> X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Feb 2013 23:32:14 -0000 Today I am diddling with a system of mine that already... before today... contained three SATA drives. I just now added to this system one old PATA drive I had lying around which I plan to use as a swap drive. The motherboard for the system in question has two of the older PATA ports (supporting up to four devices) and then also four SATA ports. It appears that the BIOS numbers the PATA devices first, and that FreeBSD just follows suit. Thus, for FreeBSD, whatever drive is the (PATA) primary master gets the name ada0, the primary slave, ada1, the secondary master ada2, the secondary slave ada3, and then the SATA ports get names ada4, ada5, etc. So anyway, adding the PATA drive to this system of course rendered everything I had previously had in my /etc/fstab suddenly incorrect. Fortunately, I anticipated this and was prepared to boot FreeBSD Live from a CD, and then go in and edit my /etc/fstab as necessary to adjust everything for the new hard drive numbers. This isn't the first time I've had to go through this process. It is always an annoyance. Up until today I was only dimly aware of the different approach described here: http://www.freebsd.org/doc/handbook/geom-glabel.html but today I was finally motivated to seek out and read the above page, which I have now done. Having now read all about temporary labels, permanment labels, and ufsids, and having noted the obvious drawbacks to each (including but not limited to the fact that these techniques generally only appear to be applicable exclusively to UFS file systems _and_ only recently created ones at that) I thought that I would take a second and ask about the general idea of using built-in hard drive serial numbers as a filesystem-independent and interface-independent way of identifying specific hard drive devices (and/or their sub-parts) e.g. within /etc/fstab. This idea seems so obviously that I am forced to assume that I'm probably far from the first person to have suggested and/or asked about it. So what gives? Why can't we have something like /dev/hdsn/ (hdsn == Hard Drive Serial Number) where a set of device numbers would automagically be created within that directory, all of whose names correspond to the actual hardware serial numbers of all currently attached hard drive type devices? (If just the serial numbers are not seen an being unique enough, I can imagine other unique or semi-unique properties of the drive being concatenated with the serial numbers.) It is easy also to envision obvious extensions to such a scheme. For example, a node named /dev/hdsn/1mgbhxed.s1a might represent the BSD "a" partition of MBR slice number 1 within the drive whose serial number is "1mgbhxed". Anyway, the whole point here is to have a naming convention that would work across both UFS and non-UFS filesystems, and also/even across both recently created UFS file systems which include the recently introduced ufsids as well as older pree-existing UFS filesystems. So, um, has any idea along these lines been disucssed previously? If so, what were the arguments for and against? Regards, rfg From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 06:45:04 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 22FB5DDF for ; Mon, 25 Feb 2013 06:45:04 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta10.emeryville.ca.mail.comcast.net (qmta10.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:17]) by mx1.freebsd.org (Postfix) with ESMTP id ECF2682 for ; Mon, 25 Feb 2013 06:45:03 +0000 (UTC) Received: from omta24.emeryville.ca.mail.comcast.net ([76.96.30.92]) by qmta10.emeryville.ca.mail.comcast.net with comcast id 4Wl31l0021zF43QAAWl3hX; Mon, 25 Feb 2013 06:45:03 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta24.emeryville.ca.mail.comcast.net with comcast id 4Wl21l0041t3BNj8kWl2pg; Mon, 25 Feb 2013 06:45:02 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 1ED0273A1C; Sun, 24 Feb 2013 22:45:02 -0800 (PST) Date: Sun, 24 Feb 2013 22:45:02 -0800 From: Jeremy Chadwick To: rfg@tristatelogic.com Subject: Re: Hard drive device names... Serial Numbers? Message-ID: <20130225064502.GA26208@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1361774703; bh=iuXrbyQRIz8/VduCzoS0fqIcY7AiGRcwkxwryTeDHjY=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=ZAM7ckOmadbRYnwmU1awLvWGNTs6FbSYT81Z0EOdJpvJlnb4YVHfDZepz0vGozY1x Ywen9jDlBbEgY3BFYAFZqzgsP5OQaTD3iq/lVMdQN34Bz3yZ9LOmEABWSP3PXM/rxL eqA3bGNCkQD2eVmCLrraWrNtGtY/C0e6OAPvZefHzpsPQBo7O+BKGQhqZJ/rNPXOqk ub0Mg0AELDoOuGICT+trZQDCc0YCvHvX63bzjrYvqJbrzd7bapxYk5vcUN/2neiPMZ UvDR6eazzOrTZBh7pIPIPudsO0pla8f+vUnOijdrgQSbaarrtmsd+Tmwqkce8phofX 2pDastHBmcR+w== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 06:45:04 -0000 (Please keep me CC'd as I am not subscribed to freebsd-fs@) This topic has been discussed at length before, and recently, particularly between Warren Block and myself. The thread, which you can read time permitting (kind of scattered between two lists, sorry): http://lists.freebsd.org/pipermail/freebsd-fs/2013-January/016237.html http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071900.html The answer -- and I am hard set on this and will not bend, so anyone considering arguing with me on it should just save their breath -- is to use the "wiring down" or "wired down" capability of CAM(4) to ensure you get static device numbers for your disks (across multiple controllers too). You can then add/remove whatever you want and the numbers will remain the same/however you declared them in /boot/loader.conf. How to do that (references): http://lists.freebsd.org/pipermail/freebsd-stable/2013-January/071851.html http://lists.freebsd.org/pipermail/freebsd-fs/2011-March/011036.html http://lists.freebsd.org/pipermail/freebsd-fs/2012-June/014522.html Also see the CAM(4) man page for some details. It becomes a little more tricky depending on what controllers you have. All you have to do is spend some time paying very close attention to the dmesg output and working it out. Some reboots later you'll have it, and you won't have to touch it/change it. It's a one-time deal, and saves you all the pain and idiocy that labels introduce (I explain what those are in the initially-mentioned thread). Footnote: I have tried mailing you 3 separate times in the past about separate subjects and your mail server (server1.tristatelogic.com) intentionally rejects mail (550 5.7.1) from Comcast's SMTP servers. I gave up trying to contact you after repeated attempts. Example: > Reporting-MTA: dns; qmta01.emeryville.ca.mail.comcast.net [76.96.30.16] > Received-From-MTA: dns; omta05.emeryville.ca.mail.comcast.net [76.96.30.43] > Arrival-Date: Fri, 07 Dec 2012 15:07:11 +0000 > Final-recipient: rfc822; rfg@tristatelogic.com > Action: failed > Status: 5.1.1 > Diagnostic-Code: smtp; 550 5.7.1 : Client host rejected: emeryville.ca.mail.comcast.net is BLACKLISTED - Use http://www.tristatelogic.com/contact.html > Last-attempt-Date: Fri, 07 Dec 2012 15:07:13 +0000 If you have a problem with Comcast's mail servers, I can refer you to lots of different people on the Comcast side who can help with that; I'd be happy to talk to you off-list about it (but you'd have to release that blockage to actually see my responses to you, naturally). If this is a side effect of using DNSBLs and you need a DNSWL (whitelist),, you might look into dnswl.org. I stopped using them in 2012 given some changes of theirs which I did not agree with, but those reasons were my own and were of an "administrative annoyance" nature. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 11:03:34 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D4F63E80 for ; Mon, 25 Feb 2013 11:03:34 +0000 (UTC) (envelope-from freebsd-listen@fabiankeil.de) Received: from smtprelay06.ispgateway.de (smtprelay06.ispgateway.de [80.67.31.96]) by mx1.freebsd.org (Postfix) with ESMTP id 60153DD6 for ; Mon, 25 Feb 2013 11:03:34 +0000 (UTC) Received: from [84.44.152.99] (helo=fabiankeil.de) by smtprelay06.ispgateway.de with esmtpsa (SSLv3:AES128-SHA:128) (Exim 4.68) (envelope-from ) id 1U9vqA-0007se-Jm; Mon, 25 Feb 2013 12:03:26 +0100 Date: Mon, 25 Feb 2013 12:02:34 +0100 From: Fabian Keil To: "Ronald F. Guilmette" Subject: Re: Hard drive device names... Serial Numbers? Message-ID: <20130225120234.66dd1b36@fabiankeil.de> In-Reply-To: <2511.1361748730@server1.tristatelogic.com> References: <2511.1361748730@server1.tristatelogic.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/JTNJXq7cwkjlmXwwtAuC3.o"; protocol="application/pgp-signature" X-Df-Sender: Nzc1MDY3 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 11:03:34 -0000 --Sig_/JTNJXq7cwkjlmXwwtAuC3.o Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable "Ronald F. Guilmette" wrote: > So anyway, adding the PATA drive to this system of course rendered > everything I had previously had in my /etc/fstab suddenly incorrect. > Fortunately, I anticipated this and was prepared to boot FreeBSD Live > from a CD, and then go in and edit my /etc/fstab as necessary to > adjust everything for the new hard drive numbers. >=20 > This isn't the first time I've had to go through this process. It is > always an annoyance. >=20 > Up until today I was only dimly aware of the different approach described > here: >=20 > http://www.freebsd.org/doc/handbook/geom-glabel.html >=20 > but today I was finally motivated to seek out and read the above page, > which I have now done. >=20 > Having now read all about temporary labels, permanment labels, and ufsids, > and having noted the obvious drawbacks to each (including but not limited > to the fact that these techniques generally only appear to be applicable > exclusively to UFS file systems _and_ only recently created ones at that) Only ufsids are limited to UFS, temporary and permanent labels are generic. > I thought that I would take a second and ask about the general idea of > using built-in hard drive serial numbers as a filesystem-independent > and interface-independent way of identifying specific hard drive devices > (and/or their sub-parts) e.g. within /etc/fstab. Note that you can already do that manually by putting the serial number in the permanent glabel label. If you are using GPT headers you need to be careful with this, though, for details see gpart(8). =20 > This idea seems so obviously that I am forced to assume that I'm probably > far from the first person to have suggested and/or asked about it. >=20 > So what gives? Why can't we have something like /dev/hdsn/ (hdsn =3D=3D > Hard Drive Serial Number) where a set of device numbers would automagical= ly > be created within that directory, all of whose names correspond to the > actual hardware serial numbers of all currently attached hard drive > type devices? (If just the serial numbers are not seen an being unique > enough, I can imagine other unique or semi-unique properties of the drive > being concatenated with the serial numbers.) I believe the main reasons is that so far nobody cared enough about this to provide patches. It has been suggested before and I don't remember anyone being against it. Dragonfly BSD already supports this and maybe parts of the code could be ported. Fabian --Sig_/JTNJXq7cwkjlmXwwtAuC3.o Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlErRM0ACgkQBYqIVf93VJ1BlACfRJHG0tl5WbexidOqr/KD4edI 6ucAn0nrZZkLp2Flk/LnCJ2w+wUznAel =oA+O -----END PGP SIGNATURE----- --Sig_/JTNJXq7cwkjlmXwwtAuC3.o-- From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 11:06:46 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A0400112 for ; Mon, 25 Feb 2013 11:06:46 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 93722E6A for ; Mon, 25 Feb 2013 11:06:46 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1PB6kuD066574 for ; Mon, 25 Feb 2013 11:06:46 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1PB6kwL066572 for freebsd-fs@FreeBSD.org; Mon, 25 Feb 2013 11:06:46 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 25 Feb 2013 11:06:46 GMT Message-Id: <201302251106.r1PB6kwL066572@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 11:06:46 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o bin/176253 fs zpool(8): zfs pool indentation is misleading/wrong o kern/176141 fs [zfs] sharesmb=on makes errors for sharenfs, and still o kern/175950 fs [zfs] Possible deadlock in zfs after long uptime o kern/175897 fs [zfs] operations on readonly zpool hang o kern/175179 fs [zfs] ZFS may attach wrong device on move o kern/175071 fs [ufs] [panic] softdep_deallocate_dependencies: unrecov o kern/174372 fs [zfs] Pagefault appears to be related to ZFS o kern/174315 fs [zfs] chflags uchg not supported o kern/174310 fs [zfs] root point mounting broken on CURRENT with multi o kern/174279 fs [ufs] UFS2-SU+J journal and filesystem corruption o kern/174060 fs [ext2fs] Ext2FS system crashes (buffer overflow?) o kern/173830 fs [zfs] Brain-dead simple change to ZFS error descriptio o kern/173718 fs [zfs] phantom directory in zraid2 pool f kern/173657 fs [nfs] strange UID map with nfsuserd o kern/173363 fs [zfs] [panic] Panic on 'zpool replace' on readonly poo o kern/173136 fs [unionfs] mounting above the NFS read-only share panic o kern/172348 fs [unionfs] umount -f of filesystem in use with readonly o kern/172334 fs [unionfs] unionfs permits recursive union mounts; caus o kern/171626 fs [tmpfs] tmpfs should be noisier when the requested siz o kern/171415 fs [zfs] zfs recv fails with "cannot receive incremental o kern/170945 fs [gpt] disk layout not portable between direct connect o bin/170778 fs [zfs] [panic] FreeBSD panics randomly o kern/170680 fs [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA o kern/170497 fs [xfs][panic] kernel will panic whenever I ls a mounted o kern/169945 fs [zfs] [panic] Kernel panic while importing zpool (afte o kern/169480 fs [zfs] ZFS stalls on heavy I/O o kern/169398 fs [zfs] Can't remove file with permanent error o kern/169339 fs panic while " : > /etc/123" o kern/169319 fs [zfs] zfs resilver can't complete o kern/168947 fs [nfs] [zfs] .zfs/snapshot directory is messed up when o kern/168942 fs [nfs] [hang] nfsd hangs after being restarted (not -HU o kern/168158 fs [zfs] incorrect parsing of sharenfs options in zfs (fs o kern/167979 fs [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste o kern/167977 fs [smbfs] mount_smbfs results are differ when utf-8 or U o kern/167688 fs [fusefs] Incorrect signal handling with direct_io o kern/167685 fs [zfs] ZFS on USB drive prevents shutdown / reboot o kern/167612 fs [portalfs] The portal file system gets stuck inside po o kern/167272 fs [zfs] ZFS Disks reordering causes ZFS to pick the wron o kern/167260 fs [msdosfs] msdosfs disk was mounted the second time whe o kern/167109 fs [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene o kern/167105 fs [nfs] mount_nfs can not handle source exports wiht mor o kern/167067 fs [zfs] [panic] ZFS panics the server o kern/167065 fs [zfs] boot fails when a spare is the boot disk o kern/167048 fs [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF o kern/166912 fs [ufs] [panic] Panic after converting Softupdates to jo o kern/166851 fs [zfs] [hang] Copying directory from the mounted UFS di o kern/166477 fs [nfs] NFS data corruption. o kern/165950 fs [ffs] SU+J and fsck problem o kern/165521 fs [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31 o kern/165392 fs Multiple mkdir/rmdir fails with errno 31 o kern/165087 fs [unionfs] lock violation in unionfs o kern/164472 fs [ufs] fsck -B panics on particular data inconsistency o kern/164370 fs [zfs] zfs destroy for snapshot fails on i386 and sparc o kern/164261 fs [nullfs] [patch] fix panic with NFS served from NULLFS o kern/164256 fs [zfs] device entry for volume is not created after zfs o kern/164184 fs [ufs] [panic] Kernel panic with ufs_makeinode o kern/163801 fs [md] [request] allow mfsBSD legacy installed in 'swap' o kern/163770 fs [zfs] [hang] LOR between zfs&syncer + vnlru leading to o kern/163501 fs [nfs] NFS exporting a dir and a subdir in that dir to o kern/162944 fs [coda] Coda file system module looks broken in 9.0 o kern/162860 fs [zfs] Cannot share ZFS filesystem to hosts with a hyph o kern/162751 fs [zfs] [panic] kernel panics during file operations o kern/162591 fs [nullfs] cross-filesystem nullfs does not work as expe o kern/162519 fs [zfs] "zpool import" relies on buggy realpath() behavi o kern/162362 fs [snapshots] [panic] ufs with snapshot(s) panics when g o kern/161968 fs [zfs] [hang] renaming snapshot with -r including a zvo o kern/161864 fs [ufs] removing journaling from UFS partition fails on o bin/161807 fs [patch] add option for explicitly specifying metadata o kern/161579 fs [smbfs] FreeBSD sometimes panics when an smb share is o kern/161533 fs [zfs] [panic] zfs receive panic: system ioctl returnin o kern/161438 fs [zfs] [panic] recursed on non-recursive spa_namespace_ o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou o kern/161280 fs [zfs] Stack overflow in gptzfsboot o kern/161205 fs [nfs] [pfsync] [regression] [build] Bug report freebsd o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic o kern/160860 fs [ufs] Random UFS root filesystem corruption with SU+J o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists o kern/160591 fs [zfs] Fail to boot on zfs root with degraded raidz2 [r o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha o kern/159930 fs [ufs] [panic] kernel core o kern/159402 fs [zfs][loader] symlinks cause I/O errors o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by- o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs() o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option o kern/159077 fs [zfs] Can't cd .. with latest zfs version o kern/159048 fs [smbfs] smb mount corrupts large files o kern/159045 fs [zfs] [hang] ZFS scrub freezes system o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs amd(8) ICMP storm and unkillable process. o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs p kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o bin/153142 fs [zfs] ls -l outputs `ls: ./.zfs: Operation not support o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an f bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis p kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118318 fs [nfs] NFS server hangs under special circumstances o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 299 problems total. From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 15:15:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5B60D1F6 for ; Mon, 25 Feb 2013 15:15:15 +0000 (UTC) (envelope-from olivier@gid0.org) Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id ECAA2287 for ; Mon, 25 Feb 2013 15:15:14 +0000 (UTC) Received: by mail-ee0-f54.google.com with SMTP id c41so1469516eek.13 for ; Mon, 25 Feb 2013 07:15:13 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type:content-transfer-encoding :x-gm-message-state; bh=axMB73Iym45D+8oJaODrh1sE9cKF1PuBN96Hi3mrEpA=; b=na1GewtnAFYeDAXW7phuCGGKhhodoikuBzgahFWrDmB3gon6yYgZh9W82w0FwX/4y1 jCBNBDDbNtWXH7yoH0dROQsseS7U2jgiM0GvsBlEUM+1ZedaAfTPsjlWs3jgP1NhEd0T ri5XOr7hrajWNmd5Da10WG/Bwhi4pARmZNljipdlSWQcZy8Le9sgrjLvWyT5uO6GSQZa WyhL7sVmNpQwSOzFIcoM4FGk68jse4CQu24USQ+wCWP3YicyyWiQoUMGO/8W8vgAN0NX T8BmsPaKYDyBYgkLOhzGSEuDU/sZ1OKHhBBIf1jTZ649O5pqvKMN+FKZTXKjTUufW7AL WgfQ== MIME-Version: 1.0 X-Received: by 10.14.219.129 with SMTP id m1mr39716245eep.16.1361805313804; Mon, 25 Feb 2013 07:15:13 -0800 (PST) Received: by 10.14.179.65 with HTTP; Mon, 25 Feb 2013 07:15:13 -0800 (PST) In-Reply-To: References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> Date: Mon, 25 Feb 2013 16:15:13 +0100 Message-ID: Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD From: Olivier Smedts To: Kevin Day Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Gm-Message-State: ALoCoQlaNU+QTv0D9Je4dsQAw4v739eiKs64TXM4Pv6g9NdnB+UwByAmC8GhLCPmm/7Asy9HWF7Y Cc: FreeBSD Filesystems , Scott Long , wblock@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 15:15:15 -0000 Hi, 2013/1/22 Kevin Day : > I run ftpmirror.your.org, which is a 72 x 3TB drive ZFS server. It's a ve= ry busy server. It currently houses the only off-site backup of all of the = Wikimedia projects(121TB), a full FreeBSD FTP mirror(1T), a full CentOS mir= ror, all of FreeBSD-Archive(1.5TB), FreeBSD-CVS, etc. It's usually running= between 100 and 1500mbps of ethernet traffic in/out of it. There are usual= ly around 15 FTP connections, 20-50 HTTP connections, 10 rsync connections = and 1 or 2 CVS connections. > > The only changes we've made that are ZFS specific are atime=3Doff and syn= c=3Ddisabled. Nothing we do uses atimes so disabling that cuts down on a to= n of unnecessary writes. Disabling sync is okay here too - we're just mirro= ring stuff that's available elsewhere, so there's no threat of data loss. O= ther than some TCP tuning in sysctl.conf, this is running a totally stock k= ernel with no special settings. If your workload is mostly made of reads (you're a mirror after all, you should only write when you're syncing with upstream servers) why use sync=3Ddisabled ? It shouldn't make a big difference for such a workload. > I've looked at using an SSD for meta-data only caching, but it appears th= at we've got far more than 256GB of metadata here that's being accessed reg= ularly (nearly every file is being stat'ed when rsync runs) so I'm guessing= it's not going to be incredibly effective unless I buy a seriously large S= SD. > > If you have any specific questions I'm happy to answer though. > > -- Kevin > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" --=20 Olivier Smedts _ ASCII ribbon campaign ( ) e-mail: olivier@gid0.org - against HTML email & vCards X www: http://www.gid0.org - against proprietary attachments / \ "Il y a seulement 10 sortes de gens dans le monde : ceux qui comprennent le binaire, et ceux qui ne le comprennent pas." From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 17:00:11 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DF07B204; Mon, 25 Feb 2013 17:00:11 +0000 (UTC) (envelope-from toasty@dragondata.com) Received: from mail.your.org (mail.your.org [IPv6:2001:4978:1:2::cc09:3717]) by mx1.freebsd.org (Postfix) with ESMTP id BE357934; Mon, 25 Feb 2013 17:00:11 +0000 (UTC) Received: from mail.your.org (chi02.mail.your.org [204.9.55.23]) by mail.your.org (Postfix) with ESMTP id 5D6A6F06C72; Mon, 25 Feb 2013 17:00:11 +0000 (UTC) Received: from vpn132.rw1.your.org (vpn132.rw1.your.org [204.9.51.132]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mail.your.org (Postfix) with ESMTPSA id 23EFFF06C6E; Mon, 25 Feb 2013 17:00:11 +0000 (UTC) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: Improving ZFS performance for large directories From: Kevin Day In-Reply-To: <5124AC69.6010709@FreeBSD.org> Date: Mon, 25 Feb 2013 11:00:10 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <237DCD81-5CAB-466B-8BF4-543D195FA545@dragondata.com> References: <19DB8F4A-6788-44F6-9A2C-E01DEA01BED9@dragondata.com> <20130201192416.GA76461@server.rulingia.com> <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com> <5124AC69.6010709@FreeBSD.org> To: Andriy Gapon X-Mailer: Apple Mail (2.1499) Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 17:00:11 -0000 On Feb 20, 2013, at 4:58 AM, Andriy Gapon wrote: > on 19/02/2013 22:10 Kevin Day said the following: >> Timing doing an "ls" in large directories 20 times, the first is the = slowest, > then all subsequent listings are roughly the same. There doesn't = appear to be any > gain after 20 repetitions >=20 > I think that the above could be related to the below >=20 >> vfs.zfs.arc_meta_limit 16398159872 >> vfs.zfs.arc_meta_used 16398120264 >=20 Doing some more testing=85 After a fresh reboot, without the SSD cache, an ls(1) in a large = directory is pretty fast. After we've been running for an hour or so, = the speed gets progressively worse. I can kill all other activity on the = system, and it's still bad. I reboot, and it's back to normal.=20 On an idle system, I watched gstat(8), during the ls(1) the drives are = basically at 100% busy while it's running, reading far more data than = I'd think necessary to read a directory. top(1) is showing that the = "zfskern" kernel process is burning a lot of CPU during that time too. = Is there a possibility there's a bug/sub-optimal access pattern we're = hitting when the arc_meta_limit is hit? Something akin to if something = that was just read doesn't get put into the arc_meta cache, it's having = to re-read the same data many times just to iterate through the = directory? I've been hesitating to increase the arc size because we've only got = 64GB of memory here and I can't add any further. The processes running = on the system themselves need a fair chunk of ram, so I'm trying to = figure out how we can either upgrade this motherboard to something newer = or reduce our memory size. I've got a feeling I'm going to need to do = this, but since this is a non-commercial project it's kinda hard to = spend that much money on it. :) -- Kevin From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 17:34:05 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4CD5A9D1 for ; Mon, 25 Feb 2013 17:34:05 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.3.230]) by mx1.freebsd.org (Postfix) with ESMTP id BA86DA6C for ; Mon, 25 Feb 2013 17:34:04 +0000 (UTC) Received: from dcave.digsys.bg (dcave.digsys.bg [192.92.129.5]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.5/8.14.5) with ESMTP id r1PHXsbF029070 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Mon, 25 Feb 2013 19:33:54 +0200 (EET) (envelope-from daniel@digsys.bg) Message-ID: <512BA082.3070605@digsys.bg> Date: Mon, 25 Feb 2013 19:33:54 +0200 From: Daniel Kalchev User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:10.0.12) Gecko/20130125 Thunderbird/10.0.12 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Improving ZFS performance for large directories References: <19DB8F4A-6788-44F6-9A2C-E01DEA01BED9@dragondata.com> <20130201192416.GA76461@server.rulingia.com> <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com> <5124AC69.6010709@FreeBSD.org> <237DCD81-5CAB-466B-8BF4-543D195FA545@dragondata.com> In-Reply-To: <237DCD81-5CAB-466B-8BF4-543D195FA545@dragondata.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 17:34:05 -0000 On 25.02.13 19:00, Kevin Day wrote: > I've been hesitating to increase the arc size because we've only got 64GB of memory here and I can't add any further. The processes running on the system themselves need a fair chunk of ram, so I'm trying to figure out how we can either upgrade this motherboard to something newer or reduce our memory size. I've got a feeling I'm going to need to do this, but since this is a non-commercial project it's kinda hard to spend that much money on it. :) Just make vfs.zfs.arc_meta_limit as big as arc_max. This is safe. By default it is 25% of arc_max I believe. In your case, you are better caching more metadata than file data anyway. Daniel From owner-freebsd-fs@FreeBSD.ORG Mon Feb 25 23:38:39 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A940D2DD for ; Mon, 25 Feb 2013 23:38:39 +0000 (UTC) (envelope-from rfg@tristatelogic.com) Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com [69.62.255.118]) by mx1.freebsd.org (Postfix) with ESMTP id 73474EDB for ; Mon, 25 Feb 2013 23:38:34 +0000 (UTC) Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1]) by segfault.tristatelogic.com (Postfix) with ESMTP id CEA633B59D; Mon, 25 Feb 2013 15:38:17 -0800 (PST) From: "Ronald F. Guilmette" To: freebsd-fs@freebsd.org Subject: Hard drive device names... Serial Numbers? Date: Mon, 25 Feb 2013 15:38:17 -0800 Message-ID: <10096.1361835497@server1.tristatelogic.com> Cc: Jeremy Chadwick X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Feb 2013 23:38:39 -0000 Firstly, I want to apologize to Jeremy Chadwick, and also to anyone else who, either recently or in the past, has been kind enough to try contact me (to send me something non-spamish) and who has been tripped up by my ham-fisted local spam filtering. Believe me, it isn't personal, and I regret that you had trouble reaching me. I could go on at great length about my personal philosophy regarding spam, and spam filtering, but this is neither the time nor the place. Still, I feel compelled to say just a few words on this topic now. For now I'll just briefly say that unlike 99.999% of all Internet users, I personally have never come around to the belief that the existance of spam is inevitable. Rather, I believe that 100% of it is due to either incompetence or greed on the part of the folks responsible for overseeing the machines at the IP addresses from which it emanates, with incompetence being responsible for the majority of it. In the case of spam coming out of the likes of Comcast and other "public access" Internet Service Providers that are neither knowingly or willingly supporting spam however, I am of the belief... not widely shared... that they could put a stop to virtually 100% of the spam coming out of their networks simply by requiring all local senders to authenticate and by limiting per-day outbound mail flow on a per-account basis to something modest (e.g. 100 messages) except in special cases (and by special request on the part of the user in question). But very few ISPs do this, because they are, by and large, too cheap/greedy to be willing to spend the money to implement and support any such simple and effective system for curtailing their own outbound spam flow. Comcast is no exception to this general rule. As a result, I have previously locally blocked all Comcast sub-domains that have spammed me in the past, specifically: mn.comcast.net ny.comcast.net in.comcast.net fl.comcast.net ma.comcast.net co.comcast.net de.comcast.net ga.comcast.net va.comcast.net mi.comcast.net tx.comcast.net wa.comcast.net ut.comcast.net pa.comcast.net nj.comcast.net md.comcast.net sc.comcast.net ms.comcast.net or.comcast.net tn.comcast.net al.comcast.net ct.comcast.net la.comcast.net dc.comcast.net ar.comcast.net nh.comcast.net ks.comcast.net wv.comcast.net nm.comcast.net vt.comcast.net But I had made a special exception for ca.comcast.net, because I needed to do so in order to be able to receive e-mail from one California-resident relative. Unfortunately, it appears that Comcast recently snafued my special California exception by changing their DNS naming scheme so that now, mail coming out of their California mail servers arrives from nodes within the ca.mail.comcast.net subdomain, rather than nodes within the mail.ca.comcast.net domain, as previously. Predictably, and shortly thereafter, I got spammed from a node within the ca.mail.comcast.net subdomain, thus causing that domain to end up in the local blacklist. (And that in turn caused e-mail from both Jeremy and my California relative to start bouncing.) I thank Jeremy for his ernest offer to put me in touch with "people on the Comcast side" and I do accept that offer, but I feel sure that despite any amount of haranguing and/or cajoling I might subject any such contacts to, Comcast corporate, like so many other ISPs, has long ago made the irrevocable decision NOT to do what it takes to stop their network from leaking massive amounts of spam on a daily basis, because to do otherwise might subtract some paltry number of bucks from the corporate bottom line. "Our problems are manmade, therefore, they can be solved by man." -- John Fitzgerald Kennedy My apologies to all for the lengthy off-topic digression. Jeremy, I've readjusted my local blacklists now, so you should be able to e-mail me direct. Back on topic... I've tried to plow through the references Jeremy gave regading the "wired down" capability of CAM(4). I think that I may sort-of understand it. It does appear to be _a_ solution. I'm not yet 100% persuaded that it is the _best_ solution, and the idea of using serial numbers (or WWN numbers) is still appealing, at least to me. But I'm not going to advocate for that, mostly because I don't feel that I fully understand this "wired down" stuff yet. I need to look into that more before I say anything else. At least I come away with the the satisfaction of knowing that (a) I am indeed not the first person to have either thought of or suggested using drive serial numbers and also (b) that this idea _has_ already been well and truly discussed, apparently by and among better minds than mine. Regards, rfg P.S. I confess that I've only skimmed the material on the "wired down" capability of CAM(4). Perhaps the answer to this question is burried in there someplace, but I'd just like to ask: What are the rules, if any regarding what I can rename a given controller channel to within the /boot/loader.conf file? Could I rename one to, for example /dev/foobar707 if I wanted to? If not, then what are the rules? From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 05:59:26 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8385D3F4 for ; Tue, 26 Feb 2013 05:59:26 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-ve0-f181.google.com (mail-ve0-f181.google.com [209.85.128.181]) by mx1.freebsd.org (Postfix) with ESMTP id 48DBCEA2 for ; Tue, 26 Feb 2013 05:59:26 +0000 (UTC) Received: by mail-ve0-f181.google.com with SMTP id d10so2970638vea.40 for ; Mon, 25 Feb 2013 21:59:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=gnaumDMx2vkbvEwuoJwziDUcrG5di7rBiUFs0x1Xy3g=; b=vn7/Zmqu4A8+xhkl+KwqyrvKFHiiqI9To2LmX9xCExZ/NHqLvuQ3lKw2w5qHZXk5Hl HqVqXcMUCVnVjVBQ31DaOmS5uUc6cqqu0A5nIp0E0I9Dj8s/AwuLvyS6/T32PeNbm6l5 QGK5i8r5TYeaWTYMwITsnQOdofwHjIs2oSW5AVmRpfXvbLoBfGDRPAl9T4LT+zZ6gSds eF5ehD5saoXiYd7S6w4jmX3BY7YOWl2qOmB7hiyXGBJQQAGc9sIac6jF8j5Rsuo7+geR AsE8Odzr61HW3NPu4Y1BJOB+Ejcl4+QIZ+2XUqpha1FV6a4GZD6OJqv2goOe0a/Rtmrm 9dzw== MIME-Version: 1.0 X-Received: by 10.220.222.8 with SMTP id ie8mr11099897vcb.27.1361858365523; Mon, 25 Feb 2013 21:59:25 -0800 (PST) Received: by 10.220.232.6 with HTTP; Mon, 25 Feb 2013 21:59:25 -0800 (PST) In-Reply-To: <20130123111852.GM30633@server.rulingia.com> References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> <20130123111852.GM30633@server.rulingia.com> Date: Tue, 26 Feb 2013 00:59:25 -0500 Message-ID: Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD From: Zaphod Beeblebrox To: Peter Jeremy Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 05:59:26 -0000 On Wed, Jan 23, 2013 at 6:18 AM, Peter Jeremy wrote: > On 2013-Jan-22 17:27:13 -0800, Michael DeMan wrote: > > >#2. Ensure a little extra space is left on the drive since if the whole > drive is used, a replacement may be a tiny bit smaller and will not work. > > As someone else has mentioned, recent ZFS allows some slop here. But > I still think it's worthwhile carving out some space to allow for a > marginally smaller replacement disk. > I'm somewhat interested in this point. Not that we should miss a few meg on a multi-terrabyte disk, but in my recent experience, all the drive manufacturers seem to "agree" on the number of sectors for a certain size of disk. I'm just not sure we need to leave for the allowance of a smaller disk. larger (than required) disks already work anyways. From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 09:14:45 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D223B366 for ; Tue, 26 Feb 2013 09:14:45 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.78]) by mx1.freebsd.org (Postfix) with ESMTP id 958F57C8 for ; Tue, 26 Feb 2013 09:14:45 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1UAGcU-00033k-5Y; Tue, 26 Feb 2013 10:14:43 +0100 Received: from [81.21.138.17] (helo=ronaldradial.versatec.local) by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1UAGcU-0004ok-6f; Tue, 26 Feb 2013 10:14:42 +0100 Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org, "Ronald F. Guilmette" Subject: Re: Hard drive device names... Serial Numbers? References: <10096.1361835497@server1.tristatelogic.com> Date: Tue, 26 Feb 2013 10:14:42 +0100 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Ronald Klop" Message-ID: In-Reply-To: <10096.1361835497@server1.tristatelogic.com> User-Agent: Opera Mail/12.14 (Win32) X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: / X-Spam-Score: 0.8 X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.1 X-Scan-Signature: e462de357cb394d64966911c06262bc8 Cc: Jeremy Chadwick X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 09:14:45 -0000 On Tue, 26 Feb 2013 00:38:17 +0100, Ronald F. Guilmette wrote: > [snip some talk about spam from comcast.net] > > I've tried to plow through the references Jeremy gave regading the "wired > down" capability of CAM(4). I think that I may sort-of understand it. > It does appear to be _a_ solution. I'm not yet 100% persuaded that it > is the _best_ solution, and the idea of using serial numbers (or WWN > numbers) is still appealing, at least to me. But I'm not going to > advocate for that, mostly because I don't feel that I fully understand > this "wired down" stuff yet. I need to look into that more before I > say anything else. > > At least I come away with the the satisfaction of knowing that (a) I am > indeed not the first person to have either thought of or suggested using > drive serial numbers and also (b) that this idea _has_ already been well > and truly discussed, apparently by and among better minds than mine. > > > Regards, > rfg > > > P.S. I confess that I've only skimmed the material on the "wired down" > capability of CAM(4). Perhaps the answer to this question is burried in > there someplace, but I'd just like to ask: What are the rules, if any > regarding what I can rename a given controller channel to within the > /boot/loader.conf file? Could I rename one to, for example > /dev/foobar707 > if I wanted to? If not, then what are the rules? This cam wiring can be very good for complex setups with dedicated sysadmins, but for a lot of FreeBSD users mounting on serial number makes administrating their servers really easy. I would think both ways can exist together. Ronald, From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 20:42:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 821) id 8F1FF7C3; Tue, 26 Feb 2013 20:42:58 +0000 (UTC) Date: Tue, 26 Feb 2013 20:42:58 +0000 From: John To: Zaphod Beeblebrox Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD Message-ID: <20130226204258.GA62875@FreeBSD.org> References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> <20130123111852.GM30633@server.rulingia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 20:42:58 -0000 ----- Zaphod Beeblebrox's Original Message ----- > On Wed, Jan 23, 2013 at 6:18 AM, Peter Jeremy wrote: > > > On 2013-Jan-22 17:27:13 -0800, Michael DeMan wrote: > > > >#2. Ensure a little extra space is left on the drive since if the whole > > drive is used, a replacement may be a tiny bit smaller and will not work. > > > > As someone else has mentioned, recent ZFS allows some slop here. But > > I still think it's worthwhile carving out some space to allow for a > > marginally smaller replacement disk. > > I'm somewhat interested in this point. Not that we should miss a few meg > on a multi-terrabyte disk, but in my recent experience, all the drive > manufacturers seem to "agree" on the number of sectors for a certain size > of disk. I'm just not sure we need to leave for the allowance of a smaller > disk. larger (than required) disks already work anyways. >From the zpool manpage: disk A block device, typically located under /dev. ZFS can use indi- vidual slices or partitions, though the recommended mode of oper- ation is to use whole disks. A disk can be specified by a full path to the device or the geom(4) provider name. When given a whole disk, ZFS automatically labels the disk, if necessary. ... For pools to be portable, you must give the zpool command whole disks, not just slices, so that ZFS can label the disks with portable EFI labels. Otherwise, disk drivers on platforms of different endian- ness will not recognize the disks. And of course, if you look through the source, you'll see where ZFS makes a distinction between slices & whole disks. I have not debugged through it recently to see how much of it is currently in use. If you use dual-channel SAS drives with geom multipath, you need to be clear whether your meta-data on disk from the different geoms collide... Regardless of how the best practices is put together, make sure folks are aware of the limitations/caveats of the different choices. YMMV Cheers! John From owner-freebsd-fs@FreeBSD.ORG Tue Feb 26 21:39:30 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D38F358E for ; Tue, 26 Feb 2013 21:39:30 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from tur.go2.pl (tur.go2.pl [193.17.41.50]) by mx1.freebsd.org (Postfix) with ESMTP id 6195B226 for ; Tue, 26 Feb 2013 21:39:30 +0000 (UTC) Received: from moh1-ve2.go2.pl (moh1-ve2.go2.pl [193.17.41.132]) by tur.go2.pl (Postfix) with ESMTP id BF29415A080F for ; Tue, 26 Feb 2013 22:39:22 +0100 (CET) Received: from moh1-ve2.go2.pl (unknown [10.0.0.132]) by moh1-ve2.go2.pl (Postfix) with ESMTP id 27DD7104401F for ; Tue, 26 Feb 2013 22:38:53 +0100 (CET) Received: from unknown (unknown [10.0.0.74]) by moh1-ve2.go2.pl (Postfix) with SMTP for ; Tue, 26 Feb 2013 22:38:53 +0100 (CET) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id ESXQdj; Tue, 26 Feb 2013 22:38:52 +0100 Message-ID: <512D2B6C.4010009@o2.pl> Date: Tue, 26 Feb 2013 22:38:52 +0100 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130201 Thunderbird/17.0.2 MIME-Version: 1.0 To: Ronald Klop Subject: Re: Some filesystem thoughts References: <5129F16A.6020505@o2.pl> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-O2-Trust: 1, 38 X-O2-SPF: neutral Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2013 21:39:30 -0000 On 24/02/2013 22:45, Ronald Klop wrote: > On Sun, 24 Feb 2013 11:54:34 +0100, Radio młodych bandytów > wrote: > >> "Ronald Klop" wrote: >>> Creative ideas. >>> Part of what you want is in fusefs (mounting of files to edit their >>> content). >> Mhm. Could you give some link or details in another form? > > Just google for 'fusefs'. > Filesystems based on FUSE: > http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems OK, so you meant zipfs-like things. I thought about file being its own mountpoint. >> And part is implemented in e.g. KDE (integrated support for >>> various file types in fulltext search and tagging of files/metadata, >>> etc.). >> Well, I view it as not much different from implementing a TC / MC / >> VIM plugin. Anybody can benefit, but they have to implement the right >> API (And there are several programs that use TC plugins). >> It's interesting as a way of getting some of these benefits though. >> >>> The chances of having all these complex libraries integrated in the >>> FreeBSD OS are close to zero I presume. But I am not in a position to >>> decide about that. >> Frankly, I haven't expected anything different. My thoughts did jump >> to implementation issues and I see them numerous, but I think the idea >> itself is not sufficiently mature, so I decided to skip them in the >> first post. >> >>> I think you can't expect the OS to serve everybody's detailed wishes. >> I don't expect it. I just wanted to discuss an idea that seemed to >> have a potential. >>> The OS serves files and user programs know what to do with them. >> Unfortunately, far too often programs don't know it. Files are often >> not simple and a single program is unable to deal with them. The only >> way to deal with such cases ATM that I see is to manually remove >> layers obfuscating the meaningful sources. In some way, it resembles >> piping data through multiple programs, except that pipes transport >> bytes, not files and therefore the transformation has to be performed >> step by step. > > Well. It is probably me, but I don't really get what you're trying to > say here. We have grepmail mboxgrep pdfgrep zgrep They exist solely because grep doesn't know how to deal with some kinds of data. Adding tools doesn't scale as one needs number_of_formats * number_of_tools for full coverage. Moving it to another layer would reduce it to number_of_formats + number_of_tools. The approach is not only redundant, but also insufficient because such tools don't let you grep f.e. pdfs in gzip email-attachments despite having all necessary parts in place. When we look at the data flow that's necessary to perform such task, it's unmbox (extracts a list of emails) V unmail (separates individual emails to text and attachments) V ungzip (unzips gzipped attachments) V unpdf (extracts texts) V grep (greps) If mailboxes contained at most 1 email, emails at most 1 attachment, this could be performed as a pipe job. That's why I say it's similar to piping, yet impossible to implement this way. -- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 13:50:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 356BA6C3; Wed, 27 Feb 2013 13:50:56 +0000 (UTC) (envelope-from borjam@sarenet.es) Received: from proxypop03b.sare.net (proxypop03b.sare.net [194.30.0.251]) by mx1.freebsd.org (Postfix) with ESMTP id ED4551B2; Wed, 27 Feb 2013 13:50:55 +0000 (UTC) Received: from [172.16.1.163] (izaro.sarenet.es [192.148.167.11]) by proxypop03.sare.net (Postfix) with ESMTPSA id E26899DD406; Wed, 27 Feb 2013 14:44:15 +0100 (CET) Subject: Re: RFC: Suggesting ZFS "best practices" in FreeBSD Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: Borja Marcos In-Reply-To: <20130226204258.GA62875@FreeBSD.org> Date: Wed, 27 Feb 2013 14:44:10 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> <20130123111852.GM30633@server.rulingia.com> <20130226204258.GA62875@FreeBSD.org> To: John X-Mailer: Apple Mail (2.1085) Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 13:50:56 -0000 On Feb 26, 2013, at 9:42 PM, John wrote: > And of course, if you look through the source, you'll see where ZFS > makes a distinction between slices & whole disks. I have not debugged > through it recently to see how much of it is currently in use. >=20 > If you use dual-channel SAS drives with geom multipath, you need to = be > clear whether your meta-data on disk from the different geoms = collide...=20 >=20 > Regardless of how the best practices is put together, make sure > folks are aware of the limitations/caveats of the different choices. Exactly. Anyway, as far as I know, both FreeBSD and Solaris should be = able to work with GPT slices instead of whole disks.=20 In the past at least (and, despite the lore one can read here and there) = Solaris refused to use the disks cache if the vdevs were made of slices instead of whole disks, but maybe it has changed in = the past. As far as I know, however, FreeBSD doesn't show that behavior. Borja. From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 17:01:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 61B3750E for ; Wed, 27 Feb 2013 17:01:57 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) by mx1.freebsd.org (Postfix) with ESMTP id 19869ED2 for ; Wed, 27 Feb 2013 17:01:57 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1UAkOO-00021R-EL for freebsd-fs@freebsd.org; Wed, 27 Feb 2013 18:02:08 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 27 Feb 2013 18:02:08 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 27 Feb 2013 18:02:08 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Subject: Re: Some filesystem thoughts Date: Wed, 27 Feb 2013 18:01:27 +0100 Lines: 70 Message-ID: References: <51252372.1040001@o2.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigF51AC19A28A4F755E7D2BDE5" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:14.0) Gecko/20120812 Thunderbird/14.0 In-Reply-To: <51252372.1040001@o2.pl> X-Enigmail-Version: 1.4.3 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 17:01:57 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigF51AC19A28A4F755E7D2BDE5 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 20/02/2013 20:26, Radio m=C5=82odych bandyt=C3=B3w wrote: > The way I see it is not to treat files as streams of bytes. That's not > what they are, files have meanings and there are tools that bring them > out. A picture is a stored emotion. OK, there are no tools for that yet= =2E > But it is also an array of pixels. And a container with exif data. And > may be a container with an encrypted archive. And, a stream of bytes to= o. > They have multiple facets. > I think that it would be useful to somehow expose them to applications.= > Wouldn't it be useful to be able to grep through pdfs in your email > attachments? I think the problem is presentation - offering just the "grep" function is waste of effort since those using GUIs will generally not use grep. What you're talking about is something like google tried to do with android (and, probably, failed): a unified search interface across all applications and their data. Actually, modern smartphones & tablets are slowly moving into the direction that there are no "files" and no "filesystems" on your device, but rather jost your "data" and "apps" which both are managed by the system (and possibly reside in a "cloud"). It may be that the "hierarhical filesystem" idea has just not so useful or efficient any more (but OTOH, I don't see it going away any time soon). > Mass-edit music tags with sed? Manually edit with your favourite text > editor instead of the sucky one-liner provided by your favourite music > player? > How about video players being able to play videos by reading them in > decoded form directly from the filesystem instead of having to integrat= e > a significant number of complex libraries to provide sufficient format > coverage? All those things already exist (or will exist soon) in modern GUI desktop environments, and especially on handheld-enabled OSes. The way they are achieved is to introduce a Grand Unified Interface (or several of them, as it happens), which severly abstract the low-level libraries, even to the point where the (GUI) application doesn't know it's dealing with actual files or something completely different. If you're more concerned about the technical aspects, then learning to write filesystems in FUSE would be a good starting point for you. --------------enigF51AC19A28A4F755E7D2BDE5 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlEuO+cACgkQ/QjVBj3/HSy6HgCfSt+PSRDWzubuIY4WdOyG1C+z VNcAniNDeHoT2gIk3w66cItjOh71Lg4f =xdTa -----END PGP SIGNATURE----- --------------enigF51AC19A28A4F755E7D2BDE5-- From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 19:30:01 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EA5E6E79 for ; Wed, 27 Feb 2013 19:30:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id DB791999 for ; Wed, 27 Feb 2013 19:30:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1RJU1Mk060466 for ; Wed, 27 Feb 2013 19:30:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1RJU15a060465; Wed, 27 Feb 2013 19:30:01 GMT (envelope-from gnats) Date: Wed, 27 Feb 2013 19:30:01 GMT Message-Id: <201302271930.r1RJU15a060465@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: dfilter@FreeBSD.ORG (dfilter service) Subject: Re: kern/175897: commit references a PR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: dfilter service List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 19:30:02 -0000 The following reply was made to PR kern/175897; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/175897: commit references a PR Date: Wed, 27 Feb 2013 19:22:54 +0000 (UTC) Author: mm Date: Wed Feb 27 19:22:27 2013 New Revision: 247407 URL: http://svnweb.freebsd.org/changeset/base/247407 Log: MFC r246631,246651,246666,246675,246678,246688: Merge various ZFS bugfixes MFC r246631: Import vendor bugfixes Illumos ZFS issues: 3422 zpool create/syseventd race yield non-importable pool 3425 first write to a new zvol can fail with EFBIG MFC r246651: Import minor type change in refcount.h header from vendor (illumos). MFC r246666: Import vendor ZFS bugfix fixing a problem in arc_read(). Illumos ZFS issues: 3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt) MFC r246675: Add tunable to allow block allocation on degraded vdevs. Illumos ZFS issues: 3507 Tunable to allow block allocation even on degraded vdevs MFC r246678: Import vendor bugfixes regarding SA rounding, header size and layout. This was already partially fixed by avg. Illumos ZFS issues: 3512 rounding discrepancy in sa_find_sizes() 3513 mismatch between SA header size and layout MFC r246688 [1]: Merge zfs_ioctl.c code that should have been merged together with ZFS v28. Fixes several problems if working with read-only pools. Changed code originaly introduced in onnv-gate 13061:bda0decf867b Contains changes up to illumos-gate 13700:4bc0783f6064 PR: kern/175897 [1] Suggested by: avg [1] Modified: stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Directory Properties: stable/8/cddl/contrib/opensolaris/ (props changed) stable/8/cddl/contrib/opensolaris/lib/libzfs/ (props changed) stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:22:27 2013 (r247407) @@ -983,7 +983,7 @@ visit_indirect(spa_t *spa, const dnode_p arc_buf_t *buf; uint64_t fill = 0; - err = arc_read_nolock(NULL, spa, bp, arc_getbuf_func, &buf, + err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -2001,9 +2001,8 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t * bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0); } -/* ARGSUSED */ static int -zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { zdb_cb_t *zcb = arg; @@ -2410,7 +2409,7 @@ typedef struct zdb_ddt_entry { /* ARGSUSED */ static int zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { avl_tree_t *t = arg; avl_index_t where; Modified: stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:22:27 2013 (r247407) @@ -526,13 +526,12 @@ get_configs(libzfs_handle_t *hdl, pool_l * version * pool guid * name - * pool txg (if available) * comment (if available) * pool state * hostid (if available) * hostname (if available) */ - uint64_t state, version, pool_txg; + uint64_t state, version; char *comment = NULL; version = fnvlist_lookup_uint64(tmp, @@ -548,11 +547,6 @@ get_configs(libzfs_handle_t *hdl, pool_l fnvlist_add_string(config, ZPOOL_CONFIG_POOL_NAME, name); - if (nvlist_lookup_uint64(tmp, - ZPOOL_CONFIG_POOL_TXG, &pool_txg) == 0) - fnvlist_add_uint64(config, - ZPOOL_CONFIG_POOL_TXG, pool_txg); - if (nvlist_lookup_string(tmp, ZPOOL_CONFIG_COMMENT, &comment) == 0) fnvlist_add_string(config, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:22:27 2013 (r247407) @@ -940,7 +940,6 @@ buf_cons(void *vbuf, void *unused, int k bzero(buf, sizeof (arc_buf_t)); mutex_init(&buf->b_evict_lock, NULL, MUTEX_DEFAULT, NULL); - rw_init(&buf->b_data_lock, NULL, RW_DEFAULT, NULL); arc_space_consume(sizeof (arc_buf_t), ARC_SPACE_HDRS); return (0); @@ -970,7 +969,6 @@ buf_dest(void *vbuf, void *unused) arc_buf_t *buf = vbuf; mutex_destroy(&buf->b_evict_lock); - rw_destroy(&buf->b_data_lock); arc_space_return(sizeof (arc_buf_t), ARC_SPACE_HDRS); } @@ -2968,42 +2966,11 @@ arc_read_done(zio_t *zio) * * arc_read_done() will invoke all the requested "done" functions * for readers of this block. - * - * Normal callers should use arc_read and pass the arc buffer and offset - * for the bp. But if you know you don't need locking, you can use - * arc_read_nolock. */ int -arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - int err; - - if (pbuf == NULL) { - /* - * XXX This happens from traverse callback funcs, for - * the objset_phys_t block. - */ - return (arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb)); - } - - ASSERT(!refcount_is_zero(&pbuf->b_hdr->b_refcnt)); - ASSERT3U((char *)bp - (char *)pbuf->b_data, <, pbuf->b_hdr->b_size); - rw_enter(&pbuf->b_data_lock, RW_READER); - - err = arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb); - rw_exit(&pbuf->b_data_lock); - - return (err); -} - -int -arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) +arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, + void *private, int priority, int zio_flags, uint32_t *arc_flags, + const zbookmark_t *zb) { arc_buf_hdr_t *hdr; arc_buf_t *buf; @@ -3482,19 +3449,6 @@ arc_release(arc_buf_t *buf, void *tag) } } -/* - * Release this buffer. If it does not match the provided BP, fill it - * with that block's contents. - */ -/* ARGSUSED */ -int -arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb) -{ - arc_release(buf, tag); - return (0); -} - int arc_released(arc_buf_t *buf) { Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:22:27 2013 (r247407) @@ -135,7 +135,7 @@ bptree_add(objset_t *os, uint64_t obj, b /* ARGSUSED */ static int -bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { int err; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:22:27 2013 (r247407) @@ -513,7 +513,6 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t spa_t *spa; zbookmark_t zb; uint32_t aflags = ARC_NOWAIT; - arc_buf_t *pbuf; DB_DNODE_ENTER(db); dn = DB_DNODE(db); @@ -575,14 +574,8 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t db->db.db_object, db->db_level, db->db_blkid); dbuf_add_ref(db, NULL); - /* ZIO_FLAG_CANFAIL callers have to check the parent zio's error */ - if (db->db_parent) - pbuf = db->db_parent->db_buf; - else - pbuf = db->db_objset->os_phys_buf; - - (void) dsl_read(zio, spa, db->db_blkptr, pbuf, + (void) arc_read(zio, spa, db->db_blkptr, dbuf_read_done, db, ZIO_PRIORITY_SYNC_READ, (*flags & DB_RF_CANFAIL) ? ZIO_FLAG_CANFAIL : ZIO_FLAG_MUSTSUCCEED, &aflags, &zb); @@ -982,7 +975,6 @@ void dbuf_release_bp(dmu_buf_impl_t *db) { objset_t *os; - zbookmark_t zb; DB_GET_OBJSET(&os, db); ASSERT(dsl_pool_sync_context(dmu_objset_pool(os))); @@ -990,13 +982,7 @@ dbuf_release_bp(dmu_buf_impl_t *db) list_link_active(&os->os_dsl_dataset->ds_synced_link)); ASSERT(db->db_parent == NULL || arc_released(db->db_parent->db_buf)); - zb.zb_objset = os->os_dsl_dataset ? - os->os_dsl_dataset->ds_object : 0; - zb.zb_object = db->db.db_object; - zb.zb_level = db->db_level; - zb.zb_blkid = db->db_blkid; - (void) arc_release_bp(db->db_buf, db, - db->db_blkptr, os->os_spa, &zb); + (void) arc_release(db->db_buf, db); } dbuf_dirty_record_t * @@ -1831,7 +1817,6 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki if (bp && !BP_IS_HOLE(bp)) { int priority = dn->dn_type == DMU_OT_DDT_ZAP ? ZIO_PRIORITY_DDT_PREFETCH : ZIO_PRIORITY_ASYNC_READ; - arc_buf_t *pbuf; dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; zbookmark_t zb; @@ -1839,13 +1824,8 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki SET_BOOKMARK(&zb, ds ? ds->ds_object : DMU_META_OBJSET, dn->dn_object, 0, blkid); - if (db) - pbuf = db->db_buf; - else - pbuf = dn->dn_objset->os_phys_buf; - - (void) dsl_read(NULL, dn->dn_objset->os_spa, - bp, pbuf, NULL, NULL, priority, + (void) arc_read(NULL, dn->dn_objset->os_spa, + bp, NULL, NULL, priority, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, &zb); } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:22:27 2013 (r247407) @@ -128,7 +128,7 @@ report_dnode(struct diffarg *da, uint64_ /* ARGSUSED */ static int -diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct diffarg *da = arg; @@ -155,9 +155,9 @@ diff_cb(spa_t *spa, zilog_t *zilog, cons int blksz = BP_GET_LSIZE(bp); int i; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:22:27 2013 (r247407) @@ -276,12 +276,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dat aflags |= ARC_L2CACHE; dprintf_bp(os->os_rootbp, "reading %s", ""); - /* - * XXX when bprewrite scrub can change the bp, - * and this is called from dmu_objset_open_ds_os, the bp - * could change, and we'll need a lock. - */ - err = dsl_read_nolock(NULL, spa, os->os_rootbp, + err = arc_read(NULL, spa, os->os_rootbp, arc_getbuf_func, &os->os_phys_buf, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL, &aflags, &zb); if (err) { @@ -1124,8 +1119,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio SET_BOOKMARK(&zb, os->os_dsl_dataset ? os->os_dsl_dataset->ds_object : DMU_META_OBJSET, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - VERIFY3U(0, ==, arc_release_bp(os->os_phys_buf, &os->os_phys_buf, - os->os_rootbp, os->os_spa, &zb)); + arc_release(os->os_phys_buf, &os->os_phys_buf); dmu_write_policy(os, NULL, 0, 0, &zp); @@ -1764,7 +1758,7 @@ dmu_objset_prefetch(const char *name, vo SET_BOOKMARK(&zb, ds->ds_object, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) dsl_read_nolock(NULL, dsl_dataset_get_spa(ds), + (void) arc_read(NULL, dsl_dataset_get_spa(ds), &ds->ds_phys->ds_bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:22:27 2013 (r247407) @@ -317,7 +317,7 @@ dump_dnode(dmu_sendarg_t *dsp, uint64_t /* ARGSUSED */ static int -backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { dmu_sendarg_t *dsp = arg; @@ -346,9 +346,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co uint32_t aflags = ARC_WAIT; arc_buf_t *abuf; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; @@ -365,9 +365,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (arc_read_nolock(NULL, spa, bp, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); err = dump_spill(dsp, zb->zb_object, blksz, abuf->b_data); @@ -377,9 +377,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) { + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) { if (zfs_send_corrupt_data) { /* Send a block filled with 0x"zfs badd bloc" */ abuf = arc_buf_alloc(spa, blksz, &abuf, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:22:27 2013 (r247407) @@ -62,9 +62,9 @@ typedef struct traverse_data { } traverse_data_t; static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static int traverse_zil_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg) @@ -81,7 +81,7 @@ traverse_zil_block(zilog_t *zilog, blkpt SET_BOOKMARK(&zb, td->td_objset, ZB_ZIL_OBJECT, ZB_ZIL_LEVEL, bp->blk_cksum.zc_word[ZIL_ZC_SEQ]); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, td->td_arg); + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); return (0); } @@ -105,7 +105,7 @@ traverse_zil_record(zilog_t *zilog, lr_t SET_BOOKMARK(&zb, td->td_objset, lr->lr_foid, ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp)); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); } return (0); @@ -182,7 +182,7 @@ traverse_pause(traverse_data_t *td, cons static void traverse_prefetch_metadata(traverse_data_t *td, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { uint32_t flags = ARC_NOWAIT | ARC_PREFETCH; @@ -200,14 +200,13 @@ traverse_prefetch_metadata(traverse_data if (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE) return; - (void) arc_read(NULL, td->td_spa, bp, - pbuf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &flags, zb); + (void) arc_read(NULL, td->td_spa, bp, NULL, NULL, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); } static int traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { zbookmark_t czb; int err = 0, lasterr = 0; @@ -228,8 +227,7 @@ traverse_visitbp(traverse_data_t *td, co } if (BP_IS_HOLE(bp)) { - err = td->td_func(td->td_spa, NULL, NULL, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, NULL, zb, dnp, td->td_arg); return (err); } @@ -249,7 +247,7 @@ traverse_visitbp(traverse_data_t *td, co } if (td->td_flags & TRAVERSE_PRE) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == TRAVERSE_VISIT_NO_CHILDREN) return (0); @@ -265,8 +263,7 @@ traverse_visitbp(traverse_data_t *td, co blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -276,7 +273,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - traverse_prefetch_metadata(td, buf, &cbp[i], &czb); + traverse_prefetch_metadata(td, &cbp[i], &czb); } /* recursively visitbp() blocks below this */ @@ -284,7 +281,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - err = traverse_visitbp(td, dnp, buf, &cbp[i], &czb); + err = traverse_visitbp(td, dnp, &cbp[i], &czb); if (err) { if (!hard) break; @@ -296,21 +293,20 @@ traverse_visitbp(traverse_data_t *td, co int i; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); dnp = buf->b_data; for (i = 0; i < epb; i++) { - prefetch_dnode_metadata(td, &dnp[i], buf, zb->zb_objset, + prefetch_dnode_metadata(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); } /* recursively visitbp() blocks below this */ for (i = 0; i < epb; i++) { - err = traverse_dnode(td, &dnp[i], buf, zb->zb_objset, + err = traverse_dnode(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); if (err) { if (!hard) @@ -323,24 +319,23 @@ traverse_visitbp(traverse_data_t *td, co objset_phys_t *osp; dnode_phys_t *dnp; - err = dsl_read_nolock(NULL, td->td_spa, bp, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); osp = buf->b_data; dnp = &osp->os_meta_dnode; - prefetch_dnode_metadata(td, dnp, buf, zb->zb_objset, + prefetch_dnode_metadata(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (arc_buf_size(buf) >= sizeof (objset_phys_t)) { prefetch_dnode_metadata(td, &osp->os_userused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); prefetch_dnode_metadata(td, &osp->os_groupused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); } - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (err && hard) { lasterr = err; @@ -348,7 +343,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_userused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_USERUSED_OBJECT); } if (err && hard) { @@ -357,7 +352,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_groupused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_GROUPUSED_OBJECT); } } @@ -367,8 +362,7 @@ traverse_visitbp(traverse_data_t *td, co post: if (err == 0 && lasterr == 0 && (td->td_flags & TRAVERSE_POST)) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == ERESTART) pause = B_TRUE; } @@ -384,25 +378,25 @@ post: static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j; zbookmark_t czb; for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - traverse_prefetch_metadata(td, buf, &dnp->dn_blkptr[j], &czb); + traverse_prefetch_metadata(td, &dnp->dn_blkptr[j], &czb); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - traverse_prefetch_metadata(td, buf, &dnp->dn_spill, &czb); + traverse_prefetch_metadata(td, &dnp->dn_spill, &czb); } } static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j, err = 0, lasterr = 0; zbookmark_t czb; @@ -410,7 +404,7 @@ traverse_dnode(traverse_data_t *td, cons for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_blkptr[j], &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_blkptr[j], &czb); if (err) { if (!hard) break; @@ -420,7 +414,7 @@ traverse_dnode(traverse_data_t *td, cons if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_spill, &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_spill, &czb); if (err) { if (!hard) return (err); @@ -433,8 +427,7 @@ traverse_dnode(traverse_data_t *td, cons /* ARGSUSED */ static int traverse_prefetcher(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, - void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { prefetch_data_t *pfd = arg; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; @@ -455,10 +448,8 @@ traverse_prefetcher(spa_t *spa, zilog_t cv_broadcast(&pfd->pd_cv); mutex_exit(&pfd->pd_mtx); - (void) dsl_read(NULL, spa, bp, pbuf, NULL, NULL, - ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, - &aflags, zb); + (void) arc_read(NULL, spa, bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, zb); return (0); } @@ -476,7 +467,7 @@ traverse_prefetch_thread(void *arg) SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) traverse_visitbp(&td, NULL, NULL, td.td_rootbp, &czb); + (void) traverse_visitbp(&td, NULL, td.td_rootbp, &czb); mutex_enter(&td_main->td_pfd->pd_mtx); td_main->td_pfd->pd_exited = B_TRUE; @@ -540,7 +531,7 @@ traverse_impl(spa_t *spa, dsl_dataset_t SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - err = traverse_visitbp(&td, NULL, NULL, rootbp, &czb); + err = traverse_visitbp(&td, NULL, rootbp, &czb); mutex_enter(&pd.pd_mtx); pd.pd_cancel = B_TRUE; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:22:27 2013 (r247407) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include @@ -284,6 +284,7 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u delta = P2NPHASE(off, dn->dn_datablksz); } + min_ibs = max_ibs = dn->dn_indblkshift; if (dn->dn_maxblkid > 0) { /* * The blocksize can't change, @@ -291,13 +292,6 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u */ ASSERT(dn->dn_datablkshift != 0); min_bs = max_bs = dn->dn_datablkshift; - min_ibs = max_ibs = dn->dn_indblkshift; - } else if (dn->dn_indblkshift > max_ibs) { - /* - * This ensures that if we reduce DN_MAX_INDBLKSHIFT, - * the code will still work correctly on older pools. - */ - min_ibs = max_ibs = dn->dn_indblkshift; } /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1308,7 +1308,7 @@ struct killarg { /* ARGSUSED */ static int -kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct killarg *ka = arg; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:22:27 2013 (r247407) @@ -396,24 +396,6 @@ dsl_free_sync(zio_t *pio, dsl_pool_t *dp zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, pio->io_flags)); } -int -dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read(pio, spa, bpp, pbuf, done, private, - priority, zio_flags, arc_flags, zb)); -} - -int -dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read_nolock(pio, spa, bpp, done, private, - priority, zio_flags, arc_flags, zb)); -} - static uint64_t dsl_scan_ds_maxtxg(dsl_dataset_t *ds) { @@ -584,12 +566,8 @@ dsl_scan_prefetch(dsl_scan_t *scn, arc_b SET_BOOKMARK(&czb, objset, object, BP_GET_LEVEL(bp), blkid); - /* - * XXX need to make sure all of these arc_read() prefetches are - * done before setting xlateall (similar to dsl_read()) - */ (void) arc_read(scn->scn_zio_root, scn->scn_dp->dp_spa, bp, - buf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD, &flags, &czb); } @@ -647,8 +625,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -670,8 +647,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da } else if (BP_GET_TYPE(bp) == DMU_OT_USERGROUP_USED) { uint32_t flags = ARC_WAIT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -683,8 +659,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da int i, j; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -706,8 +681,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da uint32_t flags = ARC_WAIT; objset_phys_t *osp; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:22:27 2013 (r247407) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #include @@ -97,6 +98,15 @@ int metaslab_prefetch_limit = SPA_DVAS_P int metaslab_smo_bonus_pct = 150; /* + * Should we be willing to write data to degraded vdevs? + */ +boolean_t zfs_write_to_degraded = B_FALSE; +SYSCTL_INT(_vfs_zfs, OID_AUTO, write_to_degraded, CTLFLAG_RW, + &zfs_write_to_degraded, 0, + "Allow writing data to degraded vdevs"); +TUNABLE_INT("vfs.zfs.write_to_degraded", &zfs_write_to_degraded); + +/* * ========================================================================== * Metaslab classes * ========================================================================== @@ -1383,10 +1393,13 @@ top: /* * Avoid writing single-copy data to a failing vdev + * unless the user instructs us that it is okay. */ if ((vd->vdev_stat.vs_write_errors > 0 || vd->vdev_state < VDEV_STATE_HEALTHY) && - d == 0 && dshift == 3) { + d == 0 && dshift == 3 && + !(zfs_write_to_degraded && vd->vdev_state == + VDEV_STATE_DEGRADED)) { all_zero = B_FALSE; goto next; } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:22:27 2013 (r247407) @@ -553,6 +553,7 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ { int var_size = 0; int i; + int j = -1; int full_space; int hdrsize; boolean_t done = B_FALSE; @@ -574,11 +575,13 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ sizeof (sa_hdr_phys_t); full_space = (buftype == SA_BONUS) ? DN_MAX_BONUSLEN : db->db_size; + ASSERT(IS_P2ALIGNED(full_space, 8)); for (i = 0; i != attr_count; i++) { boolean_t is_var_sz; - *total += P2ROUNDUP(attr_desc[i].sa_length, 8); + *total = P2ROUNDUP(*total, 8); + *total += attr_desc[i].sa_length; if (done) goto next; @@ -590,7 +593,14 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ if (is_var_sz && var_size > 1) { if (P2ROUNDUP(hdrsize + sizeof (uint16_t), 8) + *total < full_space) { + /* + * Account for header space used by array of + * optional sizes of variable-length attributes. + * Record the index in case this increase needs + * to be reversed due to spill-over. + */ hdrsize += sizeof (uint16_t); + j = i; } else { done = B_TRUE; *index = i; @@ -619,6 +629,14 @@ next: *will_spill = B_TRUE; } + /* + * j holds the index of the last variable-sized attribute for + * which hdrsize was increased. Reverse the increase if that + * attribute will be relocated to the spill block. + */ + if (*will_spill && j == *index) + hdrsize -= sizeof (uint16_t); + hdrsize = P2ROUNDUP(hdrsize, 8); return (hdrsize); } @@ -709,6 +727,8 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu for (i = 0, len_idx = 0, hash = -1ULL; i != attr_count; i++) { uint16_t length; + ASSERT(IS_P2ALIGNED(data_start, 8)); + ASSERT(IS_P2ALIGNED(buf_space, 8)); attrs[i] = attr_desc[i].sa_attr; length = SA_REGISTERED_LEN(sa, attrs[i]); if (length == 0) @@ -717,6 +737,7 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu VERIFY(length == attr_desc[i].sa_length); if (buf_space < length) { /* switch to spill buffer */ + VERIFY(spilling); VERIFY(bonustype == DMU_OT_SA); if (buftype == SA_BONUS && !sa->sa_force_spill) { sa_find_layout(hdl->sa_os, hash, attrs_start, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1764,7 +1764,7 @@ spa_load_verify_done(zio_t *zio) /*ARGSUSED*/ static int spa_load_verify_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { if (bp != NULL) { zio_t *rio = arg; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:22:27 2013 (r247407) @@ -49,7 +49,6 @@ struct arc_buf { arc_buf_hdr_t *b_hdr; arc_buf_t *b_next; kmutex_t b_evict_lock; - krwlock_t b_data_lock; void *b_data; arc_evict_func_t *b_efunc; void *b_private; @@ -93,8 +92,6 @@ void arc_buf_add_ref(arc_buf_t *buf, voi int arc_buf_remove_ref(arc_buf_t *buf, void *tag); int arc_buf_size(arc_buf_t *buf); void arc_release(arc_buf_t *buf, void *tag); -int arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb); int arc_released(arc_buf_t *buf); int arc_has_callback(arc_buf_t *buf); void arc_buf_freeze(arc_buf_t *buf); @@ -103,10 +100,7 @@ void arc_buf_thaw(arc_buf_t *buf); int arc_referenced(arc_buf_t *buf); #endif -int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, +int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, void *priv, int priority, int flags, uint32_t *arc_flags, const zbookmark_t *zb); zio_t *arc_write(zio_t *pio, spa_t *spa, uint64_t txg, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:22:27 2013 (r247407) @@ -40,8 +40,7 @@ struct zilog; struct arc_buf; typedef int (blkptr_cb_t)(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - struct arc_buf *pbuf, const zbookmark_t *zb, const struct dnode_phys *dnp, - void *arg); + const zbookmark_t *zb, const struct dnode_phys *dnp, void *arg); #define TRAVERSE_PRE (1<<0) #define TRAVERSE_POST (1<<1) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:22:27 2013 (r247407) @@ -134,12 +134,6 @@ void dsl_pool_willuse_space(dsl_pool_t * void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); void dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); -int dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); void dsl_pool_create_origin(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_clones(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_dir_clones(dsl_pool_t *dp, dmu_tx_t *tx); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:22:27 2013 (r247407) @@ -20,6 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012 by Delphix. All rights reserved. */ #ifndef _SYS_REFCOUNT_H @@ -54,8 +55,8 @@ typedef struct refcount { kmutex_t rc_mtx; list_t rc_list; list_t rc_removed; - int64_t rc_count; - int64_t rc_removed_count; + uint64_t rc_count; + uint64_t rc_removed_count; } refcount_t; /* Note: refcount_t must be initialized with refcount_create() */ Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1328,7 +1328,8 @@ vdev_validate(vdev_t *vd, boolean_t stri if (vd->vdev_ops->vdev_op_leaf && vdev_readable(vd)) { uint64_t aux_guid = 0; nvlist_t *nvl; - uint64_t txg = strict ? spa->spa_config_txg : -1ULL; + uint64_t txg = spa_last_synced_txg(spa) != 0 ? + spa_last_synced_txg(spa) : -1ULL; if ((label = vdev_label_read_config(vd, txg)) == NULL) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, @@ -1512,7 +1513,7 @@ vdev_reopen(vdev_t *vd) !l2arc_vdev_present(vd)) l2arc_add_vdev(spa, vd); } else { - (void) vdev_validate(vd, spa_last_synced_txg(spa)); + (void) vdev_validate(vd, B_TRUE); } /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:22:27 2013 (r247407) @@ -106,12 +106,18 @@ typedef enum { DATASET_NAME } zfs_ioc_namecheck_t; +typedef enum { + POOL_CHECK_NONE = 1 << 0, + POOL_CHECK_SUSPENDED = 1 << 1, + POOL_CHECK_READONLY = 1 << 2 +} zfs_ioc_poolcheck_t; + typedef struct zfs_ioc_vec { zfs_ioc_func_t *zvec_func; zfs_secpolicy_func_t *zvec_secpolicy; zfs_ioc_namecheck_t zvec_namecheck; boolean_t zvec_his_log; - boolean_t zvec_pool_check; + zfs_ioc_poolcheck_t zvec_pool_check; } zfs_ioc_vec_t; /* This array is indexed by zfs_userquota_prop_t */ @@ -5033,138 +5039,155 @@ zfs_ioc_unjail(zfs_cmd_t *zc) static zfs_ioc_vec_t zfs_ioc_vec[] = { { zfs_ioc_pool_create, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_destroy, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_import, zfs_secpolicy_config, POOL_NAME, B_TRUE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_export, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Wed Feb 27 19:30:02 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D3596E7A for ; Wed, 27 Feb 2013 19:30:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id C488899A for ; Wed, 27 Feb 2013 19:30:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r1RJU2n3060472 for ; Wed, 27 Feb 2013 19:30:02 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r1RJU2bx060471; Wed, 27 Feb 2013 19:30:02 GMT (envelope-from gnats) Date: Wed, 27 Feb 2013 19:30:02 GMT Message-Id: <201302271930.r1RJU2bx060471@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: dfilter@FreeBSD.ORG (dfilter service) Subject: Re: kern/175897: commit references a PR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: dfilter service List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Feb 2013 19:30:02 -0000 The following reply was made to PR kern/175897; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/175897: commit references a PR Date: Wed, 27 Feb 2013 19:21:12 +0000 (UTC) Author: mm Date: Wed Feb 27 19:20:50 2013 New Revision: 247406 URL: http://svnweb.freebsd.org/changeset/base/247406 Log: MFC r246631,246651,246666,246675,246678,246688: Merge various ZFS bugfixes MFC r246631: Import vendor bugfixes Illumos ZFS issues: 3422 zpool create/syseventd race yield non-importable pool 3425 first write to a new zvol can fail with EFBIG MFC r246651: Import minor type change in refcount.h header from vendor (illumos). MFC r246666: Import vendor ZFS bugfix fixing a problem in arc_read(). Illumos ZFS issues: 3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt) MFC r246675: Add tunable to allow block allocation on degraded vdevs. Illumos ZFS issues: 3507 Tunable to allow block allocation even on degraded vdevs MFC r246678: Import vendor bugfixes regarding SA rounding, header size and layout. This was already partially fixed by avg. Illumos ZFS issues: 3512 rounding discrepancy in sa_find_sizes() 3513 mismatch between SA header size and layout MFC r246688 [1]: Merge zfs_ioctl.c code that should have been merged together with ZFS v28. Fixes several problems if working with read-only pools. Changed code originaly introduced in onnv-gate 13061:bda0decf867b Contains changes up to illumos-gate 13700:4bc0783f6064 PR: kern/175897 [1] Suggested by: avg [1] Modified: stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Directory Properties: stable/9/cddl/contrib/opensolaris/ (props changed) stable/9/cddl/contrib/opensolaris/lib/libzfs/ (props changed) stable/9/sys/ (props changed) stable/9/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:20:50 2013 (r247406) @@ -983,7 +983,7 @@ visit_indirect(spa_t *spa, const dnode_p arc_buf_t *buf; uint64_t fill = 0; - err = arc_read_nolock(NULL, spa, bp, arc_getbuf_func, &buf, + err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -2001,9 +2001,8 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t * bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0); } -/* ARGSUSED */ static int -zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { zdb_cb_t *zcb = arg; @@ -2410,7 +2409,7 @@ typedef struct zdb_ddt_entry { /* ARGSUSED */ static int zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { avl_tree_t *t = arg; avl_index_t where; Modified: stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c ============================================================================== --- stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:20:50 2013 (r247406) @@ -526,13 +526,12 @@ get_configs(libzfs_handle_t *hdl, pool_l * version * pool guid * name - * pool txg (if available) * comment (if available) * pool state * hostid (if available) * hostname (if available) */ - uint64_t state, version, pool_txg; + uint64_t state, version; char *comment = NULL; version = fnvlist_lookup_uint64(tmp, @@ -548,11 +547,6 @@ get_configs(libzfs_handle_t *hdl, pool_l fnvlist_add_string(config, ZPOOL_CONFIG_POOL_NAME, name); - if (nvlist_lookup_uint64(tmp, - ZPOOL_CONFIG_POOL_TXG, &pool_txg) == 0) - fnvlist_add_uint64(config, - ZPOOL_CONFIG_POOL_TXG, pool_txg); - if (nvlist_lookup_string(tmp, ZPOOL_CONFIG_COMMENT, &comment) == 0) fnvlist_add_string(config, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:20:50 2013 (r247406) @@ -940,7 +940,6 @@ buf_cons(void *vbuf, void *unused, int k bzero(buf, sizeof (arc_buf_t)); mutex_init(&buf->b_evict_lock, NULL, MUTEX_DEFAULT, NULL); - rw_init(&buf->b_data_lock, NULL, RW_DEFAULT, NULL); arc_space_consume(sizeof (arc_buf_t), ARC_SPACE_HDRS); return (0); @@ -970,7 +969,6 @@ buf_dest(void *vbuf, void *unused) arc_buf_t *buf = vbuf; mutex_destroy(&buf->b_evict_lock); - rw_destroy(&buf->b_data_lock); arc_space_return(sizeof (arc_buf_t), ARC_SPACE_HDRS); } @@ -2968,42 +2966,11 @@ arc_read_done(zio_t *zio) * * arc_read_done() will invoke all the requested "done" functions * for readers of this block. - * - * Normal callers should use arc_read and pass the arc buffer and offset - * for the bp. But if you know you don't need locking, you can use - * arc_read_nolock. */ int -arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - int err; - - if (pbuf == NULL) { - /* - * XXX This happens from traverse callback funcs, for - * the objset_phys_t block. - */ - return (arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb)); - } - - ASSERT(!refcount_is_zero(&pbuf->b_hdr->b_refcnt)); - ASSERT3U((char *)bp - (char *)pbuf->b_data, <, pbuf->b_hdr->b_size); - rw_enter(&pbuf->b_data_lock, RW_READER); - - err = arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb); - rw_exit(&pbuf->b_data_lock); - - return (err); -} - -int -arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) +arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, + void *private, int priority, int zio_flags, uint32_t *arc_flags, + const zbookmark_t *zb) { arc_buf_hdr_t *hdr; arc_buf_t *buf; @@ -3482,19 +3449,6 @@ arc_release(arc_buf_t *buf, void *tag) } } -/* - * Release this buffer. If it does not match the provided BP, fill it - * with that block's contents. - */ -/* ARGSUSED */ -int -arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb) -{ - arc_release(buf, tag); - return (0); -} - int arc_released(arc_buf_t *buf) { Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:20:50 2013 (r247406) @@ -135,7 +135,7 @@ bptree_add(objset_t *os, uint64_t obj, b /* ARGSUSED */ static int -bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { int err; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:20:50 2013 (r247406) @@ -513,7 +513,6 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t spa_t *spa; zbookmark_t zb; uint32_t aflags = ARC_NOWAIT; - arc_buf_t *pbuf; DB_DNODE_ENTER(db); dn = DB_DNODE(db); @@ -575,14 +574,8 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t db->db.db_object, db->db_level, db->db_blkid); dbuf_add_ref(db, NULL); - /* ZIO_FLAG_CANFAIL callers have to check the parent zio's error */ - if (db->db_parent) - pbuf = db->db_parent->db_buf; - else - pbuf = db->db_objset->os_phys_buf; - - (void) dsl_read(zio, spa, db->db_blkptr, pbuf, + (void) arc_read(zio, spa, db->db_blkptr, dbuf_read_done, db, ZIO_PRIORITY_SYNC_READ, (*flags & DB_RF_CANFAIL) ? ZIO_FLAG_CANFAIL : ZIO_FLAG_MUSTSUCCEED, &aflags, &zb); @@ -982,7 +975,6 @@ void dbuf_release_bp(dmu_buf_impl_t *db) { objset_t *os; - zbookmark_t zb; DB_GET_OBJSET(&os, db); ASSERT(dsl_pool_sync_context(dmu_objset_pool(os))); @@ -990,13 +982,7 @@ dbuf_release_bp(dmu_buf_impl_t *db) list_link_active(&os->os_dsl_dataset->ds_synced_link)); ASSERT(db->db_parent == NULL || arc_released(db->db_parent->db_buf)); - zb.zb_objset = os->os_dsl_dataset ? - os->os_dsl_dataset->ds_object : 0; - zb.zb_object = db->db.db_object; - zb.zb_level = db->db_level; - zb.zb_blkid = db->db_blkid; - (void) arc_release_bp(db->db_buf, db, - db->db_blkptr, os->os_spa, &zb); + (void) arc_release(db->db_buf, db); } dbuf_dirty_record_t * @@ -1831,7 +1817,6 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki if (bp && !BP_IS_HOLE(bp)) { int priority = dn->dn_type == DMU_OT_DDT_ZAP ? ZIO_PRIORITY_DDT_PREFETCH : ZIO_PRIORITY_ASYNC_READ; - arc_buf_t *pbuf; dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; zbookmark_t zb; @@ -1839,13 +1824,8 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki SET_BOOKMARK(&zb, ds ? ds->ds_object : DMU_META_OBJSET, dn->dn_object, 0, blkid); - if (db) - pbuf = db->db_buf; - else - pbuf = dn->dn_objset->os_phys_buf; - - (void) dsl_read(NULL, dn->dn_objset->os_spa, - bp, pbuf, NULL, NULL, priority, + (void) arc_read(NULL, dn->dn_objset->os_spa, + bp, NULL, NULL, priority, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, &zb); } Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:20:50 2013 (r247406) @@ -128,7 +128,7 @@ report_dnode(struct diffarg *da, uint64_ /* ARGSUSED */ static int -diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct diffarg *da = arg; @@ -155,9 +155,9 @@ diff_cb(spa_t *spa, zilog_t *zilog, cons int blksz = BP_GET_LSIZE(bp); int i; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:20:50 2013 (r247406) @@ -276,12 +276,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dat aflags |= ARC_L2CACHE; dprintf_bp(os->os_rootbp, "reading %s", ""); - /* - * XXX when bprewrite scrub can change the bp, - * and this is called from dmu_objset_open_ds_os, the bp - * could change, and we'll need a lock. - */ - err = dsl_read_nolock(NULL, spa, os->os_rootbp, + err = arc_read(NULL, spa, os->os_rootbp, arc_getbuf_func, &os->os_phys_buf, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL, &aflags, &zb); if (err) { @@ -1124,8 +1119,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio SET_BOOKMARK(&zb, os->os_dsl_dataset ? os->os_dsl_dataset->ds_object : DMU_META_OBJSET, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - VERIFY3U(0, ==, arc_release_bp(os->os_phys_buf, &os->os_phys_buf, - os->os_rootbp, os->os_spa, &zb)); + arc_release(os->os_phys_buf, &os->os_phys_buf); dmu_write_policy(os, NULL, 0, 0, &zp); @@ -1764,7 +1758,7 @@ dmu_objset_prefetch(const char *name, vo SET_BOOKMARK(&zb, ds->ds_object, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) dsl_read_nolock(NULL, dsl_dataset_get_spa(ds), + (void) arc_read(NULL, dsl_dataset_get_spa(ds), &ds->ds_phys->ds_bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:20:50 2013 (r247406) @@ -317,7 +317,7 @@ dump_dnode(dmu_sendarg_t *dsp, uint64_t /* ARGSUSED */ static int -backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { dmu_sendarg_t *dsp = arg; @@ -346,9 +346,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co uint32_t aflags = ARC_WAIT; arc_buf_t *abuf; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; @@ -365,9 +365,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (arc_read_nolock(NULL, spa, bp, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); err = dump_spill(dsp, zb->zb_object, blksz, abuf->b_data); @@ -377,9 +377,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) { + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) { if (zfs_send_corrupt_data) { /* Send a block filled with 0x"zfs badd bloc" */ abuf = arc_buf_alloc(spa, blksz, &abuf, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:20:50 2013 (r247406) @@ -62,9 +62,9 @@ typedef struct traverse_data { } traverse_data_t; static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static int traverse_zil_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg) @@ -81,7 +81,7 @@ traverse_zil_block(zilog_t *zilog, blkpt SET_BOOKMARK(&zb, td->td_objset, ZB_ZIL_OBJECT, ZB_ZIL_LEVEL, bp->blk_cksum.zc_word[ZIL_ZC_SEQ]); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, td->td_arg); + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); return (0); } @@ -105,7 +105,7 @@ traverse_zil_record(zilog_t *zilog, lr_t SET_BOOKMARK(&zb, td->td_objset, lr->lr_foid, ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp)); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); } return (0); @@ -182,7 +182,7 @@ traverse_pause(traverse_data_t *td, cons static void traverse_prefetch_metadata(traverse_data_t *td, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { uint32_t flags = ARC_NOWAIT | ARC_PREFETCH; @@ -200,14 +200,13 @@ traverse_prefetch_metadata(traverse_data if (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE) return; - (void) arc_read(NULL, td->td_spa, bp, - pbuf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &flags, zb); + (void) arc_read(NULL, td->td_spa, bp, NULL, NULL, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); } static int traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { zbookmark_t czb; int err = 0, lasterr = 0; @@ -228,8 +227,7 @@ traverse_visitbp(traverse_data_t *td, co } if (BP_IS_HOLE(bp)) { - err = td->td_func(td->td_spa, NULL, NULL, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, NULL, zb, dnp, td->td_arg); return (err); } @@ -249,7 +247,7 @@ traverse_visitbp(traverse_data_t *td, co } if (td->td_flags & TRAVERSE_PRE) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == TRAVERSE_VISIT_NO_CHILDREN) return (0); @@ -265,8 +263,7 @@ traverse_visitbp(traverse_data_t *td, co blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -276,7 +273,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - traverse_prefetch_metadata(td, buf, &cbp[i], &czb); + traverse_prefetch_metadata(td, &cbp[i], &czb); } /* recursively visitbp() blocks below this */ @@ -284,7 +281,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - err = traverse_visitbp(td, dnp, buf, &cbp[i], &czb); + err = traverse_visitbp(td, dnp, &cbp[i], &czb); if (err) { if (!hard) break; @@ -296,21 +293,20 @@ traverse_visitbp(traverse_data_t *td, co int i; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); dnp = buf->b_data; for (i = 0; i < epb; i++) { - prefetch_dnode_metadata(td, &dnp[i], buf, zb->zb_objset, + prefetch_dnode_metadata(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); } /* recursively visitbp() blocks below this */ for (i = 0; i < epb; i++) { - err = traverse_dnode(td, &dnp[i], buf, zb->zb_objset, + err = traverse_dnode(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); if (err) { if (!hard) @@ -323,24 +319,23 @@ traverse_visitbp(traverse_data_t *td, co objset_phys_t *osp; dnode_phys_t *dnp; - err = dsl_read_nolock(NULL, td->td_spa, bp, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); osp = buf->b_data; dnp = &osp->os_meta_dnode; - prefetch_dnode_metadata(td, dnp, buf, zb->zb_objset, + prefetch_dnode_metadata(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (arc_buf_size(buf) >= sizeof (objset_phys_t)) { prefetch_dnode_metadata(td, &osp->os_userused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); prefetch_dnode_metadata(td, &osp->os_groupused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); } - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (err && hard) { lasterr = err; @@ -348,7 +343,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_userused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_USERUSED_OBJECT); } if (err && hard) { @@ -357,7 +352,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_groupused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_GROUPUSED_OBJECT); } } @@ -367,8 +362,7 @@ traverse_visitbp(traverse_data_t *td, co post: if (err == 0 && lasterr == 0 && (td->td_flags & TRAVERSE_POST)) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == ERESTART) pause = B_TRUE; } @@ -384,25 +378,25 @@ post: static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j; zbookmark_t czb; for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - traverse_prefetch_metadata(td, buf, &dnp->dn_blkptr[j], &czb); + traverse_prefetch_metadata(td, &dnp->dn_blkptr[j], &czb); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - traverse_prefetch_metadata(td, buf, &dnp->dn_spill, &czb); + traverse_prefetch_metadata(td, &dnp->dn_spill, &czb); } } static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j, err = 0, lasterr = 0; zbookmark_t czb; @@ -410,7 +404,7 @@ traverse_dnode(traverse_data_t *td, cons for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_blkptr[j], &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_blkptr[j], &czb); if (err) { if (!hard) break; @@ -420,7 +414,7 @@ traverse_dnode(traverse_data_t *td, cons if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_spill, &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_spill, &czb); if (err) { if (!hard) return (err); @@ -433,8 +427,7 @@ traverse_dnode(traverse_data_t *td, cons /* ARGSUSED */ static int traverse_prefetcher(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, - void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { prefetch_data_t *pfd = arg; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; @@ -455,10 +448,8 @@ traverse_prefetcher(spa_t *spa, zilog_t cv_broadcast(&pfd->pd_cv); mutex_exit(&pfd->pd_mtx); - (void) dsl_read(NULL, spa, bp, pbuf, NULL, NULL, - ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, - &aflags, zb); + (void) arc_read(NULL, spa, bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, zb); return (0); } @@ -476,7 +467,7 @@ traverse_prefetch_thread(void *arg) SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) traverse_visitbp(&td, NULL, NULL, td.td_rootbp, &czb); + (void) traverse_visitbp(&td, NULL, td.td_rootbp, &czb); mutex_enter(&td_main->td_pfd->pd_mtx); td_main->td_pfd->pd_exited = B_TRUE; @@ -540,7 +531,7 @@ traverse_impl(spa_t *spa, dsl_dataset_t SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - err = traverse_visitbp(&td, NULL, NULL, rootbp, &czb); + err = traverse_visitbp(&td, NULL, rootbp, &czb); mutex_enter(&pd.pd_mtx); pd.pd_cancel = B_TRUE; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:20:50 2013 (r247406) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include @@ -284,6 +284,7 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u delta = P2NPHASE(off, dn->dn_datablksz); } + min_ibs = max_ibs = dn->dn_indblkshift; if (dn->dn_maxblkid > 0) { /* * The blocksize can't change, @@ -291,13 +292,6 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u */ ASSERT(dn->dn_datablkshift != 0); min_bs = max_bs = dn->dn_datablkshift; - min_ibs = max_ibs = dn->dn_indblkshift; - } else if (dn->dn_indblkshift > max_ibs) { - /* - * This ensures that if we reduce DN_MAX_INDBLKSHIFT, - * the code will still work correctly on older pools. - */ - min_ibs = max_ibs = dn->dn_indblkshift; } /* Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1308,7 +1308,7 @@ struct killarg { /* ARGSUSED */ static int -kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct killarg *ka = arg; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:20:50 2013 (r247406) @@ -396,24 +396,6 @@ dsl_free_sync(zio_t *pio, dsl_pool_t *dp zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, pio->io_flags)); } -int -dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read(pio, spa, bpp, pbuf, done, private, - priority, zio_flags, arc_flags, zb)); -} - -int -dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read_nolock(pio, spa, bpp, done, private, - priority, zio_flags, arc_flags, zb)); -} - static uint64_t dsl_scan_ds_maxtxg(dsl_dataset_t *ds) { @@ -584,12 +566,8 @@ dsl_scan_prefetch(dsl_scan_t *scn, arc_b SET_BOOKMARK(&czb, objset, object, BP_GET_LEVEL(bp), blkid); - /* - * XXX need to make sure all of these arc_read() prefetches are - * done before setting xlateall (similar to dsl_read()) - */ (void) arc_read(scn->scn_zio_root, scn->scn_dp->dp_spa, bp, - buf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD, &flags, &czb); } @@ -647,8 +625,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -670,8 +647,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da } else if (BP_GET_TYPE(bp) == DMU_OT_USERGROUP_USED) { uint32_t flags = ARC_WAIT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -683,8 +659,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da int i, j; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -706,8 +681,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da uint32_t flags = ARC_WAIT; objset_phys_t *osp; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:20:50 2013 (r247406) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #include @@ -97,6 +98,15 @@ int metaslab_prefetch_limit = SPA_DVAS_P int metaslab_smo_bonus_pct = 150; /* + * Should we be willing to write data to degraded vdevs? + */ +boolean_t zfs_write_to_degraded = B_FALSE; +SYSCTL_INT(_vfs_zfs, OID_AUTO, write_to_degraded, CTLFLAG_RW, + &zfs_write_to_degraded, 0, + "Allow writing data to degraded vdevs"); +TUNABLE_INT("vfs.zfs.write_to_degraded", &zfs_write_to_degraded); + +/* * ========================================================================== * Metaslab classes * ========================================================================== @@ -1383,10 +1393,13 @@ top: /* * Avoid writing single-copy data to a failing vdev + * unless the user instructs us that it is okay. */ if ((vd->vdev_stat.vs_write_errors > 0 || vd->vdev_state < VDEV_STATE_HEALTHY) && - d == 0 && dshift == 3) { + d == 0 && dshift == 3 && + !(zfs_write_to_degraded && vd->vdev_state == + VDEV_STATE_DEGRADED)) { all_zero = B_FALSE; goto next; } Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:20:50 2013 (r247406) @@ -553,6 +553,7 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ { int var_size = 0; int i; + int j = -1; int full_space; int hdrsize; boolean_t done = B_FALSE; @@ -574,11 +575,13 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ sizeof (sa_hdr_phys_t); full_space = (buftype == SA_BONUS) ? DN_MAX_BONUSLEN : db->db_size; + ASSERT(IS_P2ALIGNED(full_space, 8)); for (i = 0; i != attr_count; i++) { boolean_t is_var_sz; - *total += P2ROUNDUP(attr_desc[i].sa_length, 8); + *total = P2ROUNDUP(*total, 8); + *total += attr_desc[i].sa_length; if (done) goto next; @@ -590,7 +593,14 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ if (is_var_sz && var_size > 1) { if (P2ROUNDUP(hdrsize + sizeof (uint16_t), 8) + *total < full_space) { + /* + * Account for header space used by array of + * optional sizes of variable-length attributes. + * Record the index in case this increase needs + * to be reversed due to spill-over. + */ hdrsize += sizeof (uint16_t); + j = i; } else { done = B_TRUE; *index = i; @@ -619,6 +629,14 @@ next: *will_spill = B_TRUE; } + /* + * j holds the index of the last variable-sized attribute for + * which hdrsize was increased. Reverse the increase if that + * attribute will be relocated to the spill block. + */ + if (*will_spill && j == *index) + hdrsize -= sizeof (uint16_t); + hdrsize = P2ROUNDUP(hdrsize, 8); return (hdrsize); } @@ -709,6 +727,8 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu for (i = 0, len_idx = 0, hash = -1ULL; i != attr_count; i++) { uint16_t length; + ASSERT(IS_P2ALIGNED(data_start, 8)); + ASSERT(IS_P2ALIGNED(buf_space, 8)); attrs[i] = attr_desc[i].sa_attr; length = SA_REGISTERED_LEN(sa, attrs[i]); if (length == 0) @@ -717,6 +737,7 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu VERIFY(length == attr_desc[i].sa_length); if (buf_space < length) { /* switch to spill buffer */ + VERIFY(spilling); VERIFY(bonustype == DMU_OT_SA); if (buftype == SA_BONUS && !sa->sa_force_spill) { sa_find_layout(hdl->sa_os, hash, attrs_start, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1764,7 +1764,7 @@ spa_load_verify_done(zio_t *zio) /*ARGSUSED*/ static int spa_load_verify_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { if (bp != NULL) { zio_t *rio = arg; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:20:50 2013 (r247406) @@ -49,7 +49,6 @@ struct arc_buf { arc_buf_hdr_t *b_hdr; arc_buf_t *b_next; kmutex_t b_evict_lock; - krwlock_t b_data_lock; void *b_data; arc_evict_func_t *b_efunc; void *b_private; @@ -93,8 +92,6 @@ void arc_buf_add_ref(arc_buf_t *buf, voi int arc_buf_remove_ref(arc_buf_t *buf, void *tag); int arc_buf_size(arc_buf_t *buf); void arc_release(arc_buf_t *buf, void *tag); -int arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb); int arc_released(arc_buf_t *buf); int arc_has_callback(arc_buf_t *buf); void arc_buf_freeze(arc_buf_t *buf); @@ -103,10 +100,7 @@ void arc_buf_thaw(arc_buf_t *buf); int arc_referenced(arc_buf_t *buf); #endif -int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, +int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, void *priv, int priority, int flags, uint32_t *arc_flags, const zbookmark_t *zb); zio_t *arc_write(zio_t *pio, spa_t *spa, uint64_t txg, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:20:50 2013 (r247406) @@ -40,8 +40,7 @@ struct zilog; struct arc_buf; typedef int (blkptr_cb_t)(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - struct arc_buf *pbuf, const zbookmark_t *zb, const struct dnode_phys *dnp, - void *arg); + const zbookmark_t *zb, const struct dnode_phys *dnp, void *arg); #define TRAVERSE_PRE (1<<0) #define TRAVERSE_POST (1<<1) Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:20:50 2013 (r247406) @@ -134,12 +134,6 @@ void dsl_pool_willuse_space(dsl_pool_t * void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); void dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); -int dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); void dsl_pool_create_origin(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_clones(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_dir_clones(dsl_pool_t *dp, dmu_tx_t *tx); Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:20:50 2013 (r247406) @@ -20,6 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012 by Delphix. All rights reserved. */ #ifndef _SYS_REFCOUNT_H @@ -54,8 +55,8 @@ typedef struct refcount { kmutex_t rc_mtx; list_t rc_list; list_t rc_removed; - int64_t rc_count; - int64_t rc_removed_count; + uint64_t rc_count; + uint64_t rc_removed_count; } refcount_t; /* Note: refcount_t must be initialized with refcount_create() */ Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1328,7 +1328,8 @@ vdev_validate(vdev_t *vd, boolean_t stri if (vd->vdev_ops->vdev_op_leaf && vdev_readable(vd)) { uint64_t aux_guid = 0; nvlist_t *nvl; - uint64_t txg = strict ? spa->spa_config_txg : -1ULL; + uint64_t txg = spa_last_synced_txg(spa) != 0 ? + spa_last_synced_txg(spa) : -1ULL; if ((label = vdev_label_read_config(vd, txg)) == NULL) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, @@ -1512,7 +1513,7 @@ vdev_reopen(vdev_t *vd) !l2arc_vdev_present(vd)) l2arc_add_vdev(spa, vd); } else { - (void) vdev_validate(vd, spa_last_synced_txg(spa)); + (void) vdev_validate(vd, B_TRUE); } /* Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:20:50 2013 (r247406) @@ -106,12 +106,18 @@ typedef enum { DATASET_NAME } zfs_ioc_namecheck_t; +typedef enum { + POOL_CHECK_NONE = 1 << 0, + POOL_CHECK_SUSPENDED = 1 << 1, + POOL_CHECK_READONLY = 1 << 2 +} zfs_ioc_poolcheck_t; + typedef struct zfs_ioc_vec { zfs_ioc_func_t *zvec_func; zfs_secpolicy_func_t *zvec_secpolicy; zfs_ioc_namecheck_t zvec_namecheck; boolean_t zvec_his_log; - boolean_t zvec_pool_check; + zfs_ioc_poolcheck_t zvec_pool_check; } zfs_ioc_vec_t; /* This array is indexed by zfs_userquota_prop_t */ @@ -5033,138 +5039,155 @@ zfs_ioc_unjail(zfs_cmd_t *zc) static zfs_ioc_vec_t zfs_ioc_vec[] = { { zfs_ioc_pool_create, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_destroy, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_import, zfs_secpolicy_config, POOL_NAME, B_TRUE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_export, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 02:59:29 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 58506410; Thu, 28 Feb 2013 02:59:29 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id E84E73A4; Thu, 28 Feb 2013 02:59:28 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEACTHLlGDaFvO/2dsb2JhbABFhk+4e4JlgRNzgiYjVkQZAgRVBogmrweSZ45gGRsHgi2BEwOIaoY8hxuJY4cHgyaCCQ X-IronPort-AV: E=Sophos;i="4.84,752,1355115600"; d="scan'208";a="18641552" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 27 Feb 2013 21:59:22 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 08FFCB3F0D; Wed, 27 Feb 2013 21:59:22 -0500 (EST) Date: Wed, 27 Feb 2013 21:59:22 -0500 (EST) From: Rick Macklem To: FreeBSD Filesystems Message-ID: <707174204.3391839.1362020362019.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <860349954.3391816.1362020304865.JavaMail.root@erie.cs.uoguelph.ca> Subject: should vn_fullpath1() ever return a path with "." in it? MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_3391838_1284162422.1362020362017" X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Sergey Kandaurov , Kostik Belousov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 02:59:29 -0000 ------=_Part_3391838_1284162422.1362020362017 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Hi, Sergey Kandaurov reported a problem where getcwd() returns a path with "/./" imbedded in it for an NFSv4 mount. This is caused by a mount point crossing on the server when at the server's root because vn_fullpath1() uses VV_ROOT to spot mount point crossings. The current workaround is to use the sysctls: debug.disablegetcwd=1 debug.disablefullpath=1 However, it would be nice to fix this when vn_fullpath1() is being used. A simple fix is to have vn_fullpath1() fail when it finds "." as a directory match in the path. When vn_fullpath1() fails, the syscalls fail and that allows the libc algorithm to be used (which works for this case because it doesn't depend on VV_ROOT being set, etc). So, I am wondering if a patch (I have attached one) that makes vn_fullpath1() fail when it matches "." will break anything else? (I don't think so, since the code checks for VV_ROOT in the loop above the check for a match of ".", but I am not sure?) Thanks for any input w.r.t. this, rick ------=_Part_3391838_1284162422.1362020362017 Content-Type: text/x-patch; name=getcwd.patch Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename=getcwd.patch LS0tIGtlcm4vdmZzX2NhY2hlLmMuc2F2CTIwMTMtMDItMjcgMjA6NDQ6NDIuMDAwMDAwMDAwIC0w NTAwCisrKyBrZXJuL3Zmc19jYWNoZS5jCTIwMTMtMDItMjcgMjE6MTA6MzkuMDAwMDAwMDAwIC0w NTAwCkBAIC0xMzMzLDYgKzEzMzMsMjAgQEAgdm5fZnVsbHBhdGgxKHN0cnVjdCB0aHJlYWQgKnRk LCBzdHJ1Y3QgdgogCQkJICAgIHN0YXJ0dnAsIE5VTEwsIDAsIDApOwogCQkJYnJlYWs7CiAJCX0K KwkJaWYgKGJ1ZltidWZsZW5dID09ICcuJyAmJiAoYnVmW2J1ZmxlbiArIDFdID09ICdcMCcgfHwK KwkJICAgIGJ1ZltidWZsZW4gKyAxXSA9PSAnLycpKSB7CisJCQkvKgorCQkJICogRmFpbCBpZiBp dCBtYXRjaGVkICIuIi4gVGhpcyBzaG91bGQgb25seSBoYXBwZW4KKwkJCSAqIGZvciBORlN2NCBt b3VudHMgdGhhdCBjcm9zcyBzZXJ2ZXIgbW91bnQgcG9pbnRzLgorCQkJICovCisJCQlDQUNIRV9S VU5MT0NLKCk7CisJCQl2cmVsZSh2cCk7CisJCQludW1mdWxscGF0aGZhaWwxKys7CisJCQllcnJv ciA9IEVOT0VOVDsKKwkJCVNEVF9QUk9CRSh2ZnMsIG5hbWVjYWNoZSwgZnVsbHBhdGgsIHJldHVy biwKKwkJCSAgICBlcnJvciwgdnAsIE5VTEwsIDAsIDApOworCQkJYnJlYWs7CisJCX0KIAkJYnVm Wy0tYnVmbGVuXSA9ICcvJzsKIAkJc2xhc2hfcHJlZml4ZWQgPSAxOwogCX0K ------=_Part_3391838_1284162422.1362020362017-- From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 06:51:19 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4E32F731 for ; Thu, 28 Feb 2013 06:51:19 +0000 (UTC) (envelope-from it.helpdesk@mab.ae) Received: from mail.mab.ae (mail2.mab.ae [94.56.15.51]) by mx1.freebsd.org (Postfix) with ESMTP id 75AF8E50 for ; Thu, 28 Feb 2013 06:51:17 +0000 (UTC) Received: from DXBHUB02.MAB.PRD (Not Verified[172.16.5.126]) by mail.mab.ae with MailMarshal (v6, 8, 4, 9558) id ; Thu, 28 Feb 2013 10:35:19 +0400 Received: from DXBMBX01.MAB.PRD ([fe80::e0ca:10ea:97a8:27d0]) by dxbhub02 ([172.16.5.124]) with mapi id 14.01.0355.002; Thu, 28 Feb 2013 10:35:19 +0400 From: IT Helpdesk To: "freebsd-fs@freebsd.org" Subject: Re: Policy Breaches, Malformed, Spam Type - Zero Day, Routing, Encryption, Undetermined, Spam, Malformed Mime, Spam Type - Phish, Spam Type - Pornographic, Suspect Summary Digest: 1 Messages Thread-Topic: Policy Breaches, Malformed, Spam Type - Zero Day, Routing, Encryption, Undetermined, Spam, Malformed Mime, Spam Type - Phish, Spam Type - Pornographic, Suspect Summary Digest: 1 Messages Thread-Index: Ac4VfcC0L3C5MuOiTf6fGI5idyf1Bw== Date: Thu, 28 Feb 2013 06:35:18 +0000 Message-ID: <238EE51378AEA748BD52DD92AF2450F234CD665E@DXBMBX01.MAB.PRD> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [172.16.6.147] MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 06:51:19 -0000 #########################################################################= ############ Disclaimer: This message is for the named person's use only. It may cont= ain confidential, proprietary or legally privileged information. No conf= identiality or privilege is waived or lost by any incorrect transmission.= =20 If you receive this message in error, please immediately delete it and al= l copies of it from your system, destroy any hard copies of it and notify= =20the sender.You must not, directly or indirectly, use, disclose, distri= bute, print, or copy any part of this message if you are not the intended= =20recipient.=20 Email transmission cannot be guaranteed to be secure or error-free as inf= ormation may be intercepted, corrupted, lost, destroyed, arrive late or i= ncomplete, or contain viruses. Therefore, we do not accept any liability,= =20for any error or omission in this email or for any resulting loss or d= amage suffered as a result of email transmission.=20 Any views expressed in this message are those of the individual sender, e= xcept where the message states otherwise and the sender is authorized to = state them to be the views of any such entity. MAB Facilities Management = L.L.C and any of its subsidiaries each reserve the right to monitor all e= -mail communications through its networks.=20 #########################################################################= ############ From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 07:05:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id AC218B0A; Thu, 28 Feb 2013 07:05:23 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 06DC9ED8; Thu, 28 Feb 2013 07:05:22 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.6/8.14.6) with ESMTP id r1S75FXk088416; Thu, 28 Feb 2013 09:05:15 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.0 kib.kiev.ua r1S75FXk088416 Received: (from kostik@localhost) by tom.home (8.14.6/8.14.6/Submit) id r1S75FiS088414; Thu, 28 Feb 2013 09:05:15 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 28 Feb 2013 09:05:15 +0200 From: Konstantin Belousov To: Rick Macklem Subject: Re: should vn_fullpath1() ever return a path with "." in it? Message-ID: <20130228070515.GK2454@kib.kiev.ua> References: <860349954.3391816.1362020304865.JavaMail.root@erie.cs.uoguelph.ca> <707174204.3391839.1362020362019.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="TOcCsfss/f1fJPnO" Content-Disposition: inline In-Reply-To: <707174204.3391839.1362020362019.JavaMail.root@erie.cs.uoguelph.ca> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: FreeBSD Filesystems , Sergey Kandaurov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 07:05:23 -0000 --TOcCsfss/f1fJPnO Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > Hi, >=20 > Sergey Kandaurov reported a problem where getcwd() returns a > path with "/./" imbedded in it for an NFSv4 mount. This is > caused by a mount point crossing on the server when at the > server's root because vn_fullpath1() uses VV_ROOT to spot > mount point crossings. >=20 > The current workaround is to use the sysctls: > debug.disablegetcwd=3D1 > debug.disablefullpath=3D1 >=20 > However, it would be nice to fix this when vn_fullpath1() > is being used. >=20 > A simple fix is to have vn_fullpath1() fail when it finds > "." as a directory match in the path. When vn_fullpath1() > fails, the syscalls fail and that allows the libc algorithm > to be used (which works for this case because it doesn't > depend on VV_ROOT being set, etc). >=20 > So, I am wondering if a patch (I have attached one) that > makes vn_fullpath1() fail when it matches "." will break > anything else? (I don't think so, since the code checks > for VV_ROOT in the loop above the check for a match of > ".", but I am not sure?) >=20 > Thanks for any input w.r.t. this, rick > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > startvp, NULL, 0, 0); > break; > } > + if (buf[buflen] =3D=3D '.' && (buf[buflen + 1] =3D=3D '\0' || > + buf[buflen + 1] =3D=3D '/')) { > + /* > + * Fail if it matched ".". This should only happen > + * for NFSv4 mounts that cross server mount points. > + */ > + CACHE_RUNLOCK(); > + vrele(vp); > + numfullpathfail1++; > + error =3D ENOENT; > + SDT_PROBE(vfs, namecache, fullpath, return, > + error, vp, NULL, 0, 0); > + break; > + } > buf[--buflen] =3D '/'; > slash_prefixed =3D 1; > } I do not quite understand this. Did the dvp (parent) vnode returned by VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? It must be, for the correct operation, but also it should cause the almost infinite loop in the vn_fullpath1(). The loop is not really infinite due to a limited size of the buffer where the infinite amount of "./" is placed. Anyway, I think we should do better than this patch, even if it is legitimate. I think that the better place to check the condition is the default implementation of VOP_VPTOCNP(). Am I right that this is where it broke for you ? diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c index 00d064e..1dd0185 100644 --- a/sys/kern/vfs_default.c +++ b/sys/kern/vfs_default.c @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) error =3D ENOMEM; goto out; } - bcopy(dp->d_name, buf + i, dp->d_namlen); - error =3D 0; + if (dp->d_namlen =3D=3D 1 && dp->d_name[0] =3D=3D '.') { + error =3D ENOENT; + } else { + bcopy(dp->d_name, buf + i, dp->d_namlen); + error =3D 0; + } goto out; } } while (len > 0 || !eofflag); --TOcCsfss/f1fJPnO Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJRLwGqAAoJEJDCuSvBvK1BDWAP/2btss4VP6rNctXFRP+Sg89v HyhDquJdAUqhSCbQgRMTUrUQWf/K/O3RJAZ+/6S062JQH7vYfvGkB7YFnMVk0oml 2Lho0Qie4lMM2zwH/otWpJ3L0FxRed5dG3vB0jmBYXFTizzGiFPx0jgr2X40vuVE n6cdICidbApt4hbuSSBE3V2c1XqpufbOWYp3uKrqdQ/twMdR6nsEOGnMeGCNaqm3 tDv2LNLJIz+6MYwerCeELkNuxQpPZRMCHL54t72WeIbhGcC5aK225txpyw7sJPhG UDqLDGEuwSj5xbwqt9ISEkd2HqumzhRuUhmTX/popF+TDaJP6uQAEVwwS4UxdMCt y+qzn+zO4xlFljHwGGaxf+8abrfZ/31+w3riSOd3HI3MPVbEkHH9Z05LX7bjabeO Xs0L3Yeh1LDL6/adwkpZYdUHcWwNhywzXt0oduVXO9NAhPvlxDRs2o9yjkvJCzob axjcwY6H8QEnHLwlTsGO/lWYQLtSzXr4EmQta2EjjmMMby7J7dg6NdS+d1DfViZc eZCXx8sl3LyPRYXUwZ72522csR974TVirHFWh0dqsUkcVLxLQdQE8Pdkx2UWlyJA KaWmSsf8b6YN3bVmQbQwSsrM97r7o/W06OgscZXwxqZUPNy5WlRl+VfxrjPGspnd JrIyI8FEx0oKz63uan5Z =d3e2 -----END PGP SIGNATURE----- --TOcCsfss/f1fJPnO-- From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 08:06:46 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E9B976C2; Thu, 28 Feb 2013 08:06:46 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id B22961111; Thu, 28 Feb 2013 08:06:46 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id BCF1B4AC57; Thu, 28 Feb 2013 12:06:35 +0400 (MSK) Date: Thu, 28 Feb 2013 12:06:30 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1796551389.20130228120630@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 08:06:47 -0000 Hello, Freebsd-fs. My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. It crashed a several minutes ago (I don't know reason yet) and fsck says "Unexpected SU+J inconsistency" (Inode mode/directory tyme mismatch) and requested full check (which will take more than hour on such FS). All drives are perfectly healthy according to SMART, it is SATA WD20EARS/EARX mix. In my experience, SU/SU+J fsck never completes successful on this FS :( Does SU+J work at all? Here was topic in closed mailing list about it, started as topic about using CURRENT on FreeBSD's cluster, but it was shifted to ZFS discussion without changing "Subject" line after several iterations without any conclusion. Could I do something to help debug this problem? Please, don't give advices like "Convert to ZFS". ZFS is great, but, I think, we should have robust "native" and simple FS too. -- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 08:33:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 45C925F6; Thu, 28 Feb 2013 08:33:32 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 06A9912C2; Thu, 28 Feb 2013 08:33:31 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id C90034AC57; Thu, 28 Feb 2013 12:33:30 +0400 (MSK) Date: Thu, 28 Feb 2013 12:33:25 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1238720635.20130228123325@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: <1796551389.20130228120630@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 08:33:32 -0000 Hello, Lev. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 12:06= :30: LS> Hello, Freebsd-fs. LS> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. LS> It crashed a several minutes ago (I don't know reason yet) and fsck LS> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme LS> mismatch) and requested full check (which will take more than hour on LS> such FS). Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps running.= .. --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 09:07:42 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 29F7FF98; Thu, 28 Feb 2013 09:07:42 +0000 (UTC) (envelope-from yerenkow@gmail.com) Received: from mail-da0-f46.google.com (mail-da0-f46.google.com [209.85.210.46]) by mx1.freebsd.org (Postfix) with ESMTP id C90031611; Thu, 28 Feb 2013 09:07:41 +0000 (UTC) Received: by mail-da0-f46.google.com with SMTP id z8so763294dad.5 for ; Thu, 28 Feb 2013 01:07:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=1aba5cCFpfKeW8SjF9cANZJnEbpLg/20UUidbpyvFEk=; b=GVlFmIuD4i9UcfhuRKz5vYOPfeBo/T/EtW5mddH1YZAlGgWdl8rfmxd88J6YJqooaz F5SZkGffUTE6zRdEV5+c4D0kB4H63QIBLHqnyW1ECsOcAF9xa4DDuGKZCiy2NVfSe7nv 1sCAL12O2O/NV1KPI3JwUYzlCeCqINLW8wNLl1FWjsmv8raKaChAWpN3rF6HooT0HJw7 PoMMCl1aSK0MNSTQXszA0ORvM4OqEizwj3p/iBpAtFvTgE0gFax4Spv6nTgLk4U5hfV1 uDSMKyOtW2MEdxukbjLBMjthqByDlT9T558yCyaFizNyrrpVaruLlbWxDZESwL8Fms0k bDsg== MIME-Version: 1.0 X-Received: by 10.68.134.3 with SMTP id pg3mr8171454pbb.51.1362042455545; Thu, 28 Feb 2013 01:07:35 -0800 (PST) Received: by 10.68.36.69 with HTTP; Thu, 28 Feb 2013 01:07:35 -0800 (PST) In-Reply-To: <1238720635.20130228123325@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> Date: Thu, 28 Feb 2013 11:07:35 +0200 Message-ID: Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! From: Alexander Yerenkow To: lev@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 09:07:42 -0000 How about tell us 9.1-STABLE from which date you run? Do you use any dumps/snapshots in this FS? In past, that could broke things. -- Regards, Alexander Yerenkow From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 09:11:08 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 29071466; Thu, 28 Feb 2013 09:11:08 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id DEE65164F; Thu, 28 Feb 2013 09:11:07 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 5B5614AC57; Thu, 28 Feb 2013 13:11:00 +0400 (MSK) Date: Thu, 28 Feb 2013 13:10:55 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <376897409.20130228131055@serebryakov.spb.ru> To: Alexander Yerenkow Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 09:11:08 -0000 Hello, Alexander. You wrote 28 =F4=E5=E2=F0=E0=EB=FF 2013 =E3., 13:07:35: AY> How about tell us 9.1-STABLE from which date you run? r244957 AY> Do you use any dumps/snapshots in this FS? Nope.=20 --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:13:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 953552D8; Thu, 28 Feb 2013 10:13:31 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 3E6C4192F; Thu, 28 Feb 2013 10:13:30 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 254904AC57; Thu, 28 Feb 2013 14:13:29 +0400 (MSK) Date: Thu, 28 Feb 2013 14:13:23 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1158712592.20130228141323@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: <1238720635.20130228123325@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:13:31 -0000 Hello, Lev. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 12:33= :25: LS>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. LS>> It crashed a several minutes ago (I don't know reason yet) and fsck LS>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme LS>> mismatch) and requested full check (which will take more than hour on LS>> such FS). LS> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps runn= ing... full fsck reconnected about 1000 files, which was written in time of crash. Really, sever crashed when SVN mirror seed was been unpacking on this FS, so there was massive file creation at this time. --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:23:07 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 39F2B655; Thu, 28 Feb 2013 10:23:07 +0000 (UTC) (envelope-from yerenkow@gmail.com) Received: from mail-ve0-f172.google.com (mail-ve0-f172.google.com [209.85.128.172]) by mx1.freebsd.org (Postfix) with ESMTP id CD1001999; Thu, 28 Feb 2013 10:23:06 +0000 (UTC) Received: by mail-ve0-f172.google.com with SMTP id cz11so1625961veb.17 for ; Thu, 28 Feb 2013 02:23:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=U1qrMep/B2zkWGFqSiR44X5W1nJtQfi9GHLLAWZEmeM=; b=rgD5KWyyblHVQlxiBqYY39KXGc7UwVdQwNvA8oXrP8miDJEstNaoy366mErGEpGnk8 kTF6KRZ/ixkwAKjWCNFvEZ9kFsf4YMTJFcxE0g9Kl//85PA/QuEeEYrAVmtahpU2mM72 3Lewu1McNDTGyb9L870rTbhbXoBCwLYz9sQAAZ6tMxGNyDwbc1CcNtIkNJ8/NNDWtond fvAxvKvdQaxZNoafZBBfIkcahPqoLS22CekkZpryrPw3xa3J188v2e08K9DxsMi4xXUN Tgi3tPav9Lu5b6KisdullGbF4lO0cSSYxLgilxtaYvj92QmY7ebW36fIIHvurU2dlf4h 7a1g== MIME-Version: 1.0 X-Received: by 10.52.96.163 with SMTP id dt3mr2042152vdb.11.1362046980062; Thu, 28 Feb 2013 02:23:00 -0800 (PST) Received: by 10.52.228.163 with HTTP; Thu, 28 Feb 2013 02:22:59 -0800 (PST) In-Reply-To: <1158712592.20130228141323@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> Date: Thu, 28 Feb 2013 12:22:59 +0200 Message-ID: Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! From: Alexander Yerenkow To: lev@freebsd.org Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:23:07 -0000 2013/2/28 Lev Serebryakov > Hello, Lev. > You wrote 28 =C6=C5=D7=D2=C1=CC=D1 2013 =C7., 12:33:25: > > LS>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. > LS>> It crashed a several minutes ago (I don't know reason yet) and fsck > LS>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme > LS>> mismatch) and requested full check (which will take more than hour o= n > LS>> such FS). > LS> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps > running... > full fsck reconnected about 1000 files, which was written in time of > crash. > Really, sever crashed when SVN mirror seed was been unpacking on > this FS, so there was massive file creation at this time. > > Could you afford reproducing this? :) Also, would be nice to know how look your setup (CPUs, how much disks, how they connected, is it hw raid, etc). > -- > // Black Lion AKA Lev Serebryakov > > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org= " > --=20 Regards, Alexander Yerenkow From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:31:36 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 925509C8; Thu, 28 Feb 2013 10:31:36 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 3782E1A07; Thu, 28 Feb 2013 10:31:36 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id AF6BF4AC57; Thu, 28 Feb 2013 14:31:34 +0400 (MSK) Date: Thu, 28 Feb 2013 14:31:29 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <583012022.20130228143129@serebryakov.spb.ru> To: Alexander Yerenkow Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:31:36 -0000 Hello, Alexander. You wrote 28 =C6=C5=D7=D2=C1=CC=D1 2013 =C7., 14:22:59: AY> Could you afford reproducing this? :) After half a day of memtest86+ :) I want to be sure, that it is not memory problem first. AY> Also, would be nice to know how look your setup (CPUs, how much disks, = how AY> they connected, is it hw raid, etc). Simple E4500 CPU on Q35-based desktop (ASUS) MoBo, 6GiB memory (under test now!), Samsung 500GiB SATA HDD for system, 5x2Tb WD Green (4xWD20EARS, 1xWD20EARX which replace failed WD20EARS), all disks are connected to 6 SATA ports of chipset (no RAID controller), WD disks are in software RAID5 with geom_raid5 (from ports, but I'm active maintainer of it). Disks are in "Default" configuration: WC and NCQ are enabled. I know, that FS guys could blame geom_raid5, as it could delay real write up to 15 seconds, but it never "lies" about writes (it doesn't mark BIOs complete till they are really sent to disk) and I could not reproduce any problems with it on many hours tests on VMs (and I don't want to experiment a lot on real hardware, as it contains my real data). Maybe, it is subtile interference between raid5 implementation and SU+J, but in such case I want to understand what does raid5 do wrong. --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 10:43:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9000BCD4 for ; Thu, 28 Feb 2013 10:43:57 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from moh3-ve1.go2.pl (moh3-ve1.go2.pl [193.17.41.30]) by mx1.freebsd.org (Postfix) with ESMTP id 1E6F71A79 for ; Thu, 28 Feb 2013 10:43:56 +0000 (UTC) Received: from moh3-ve1.go2.pl (unknown [10.0.0.117]) by moh3-ve1.go2.pl (Postfix) with ESMTP id 0E376A6A02B for ; Thu, 28 Feb 2013 11:43:56 +0100 (CET) Received: from unknown (unknown [10.0.0.42]) by moh3-ve1.go2.pl (Postfix) with SMTP for ; Thu, 28 Feb 2013 11:43:56 +0100 (CET) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id XhYMnM; Thu, 28 Feb 2013 11:43:56 +0100 Message-ID: <512F34E7.40602@o2.pl> Date: Thu, 28 Feb 2013 11:43:51 +0100 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130201 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: freebsd-fs Digest, Vol 506, Issue 4 References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 1, 33 X-O2-SPF: neutral X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 10:43:57 -0000 On 28/02/2013 10:07, freebsd-fs-request@freebsd.org wrote: > Message: 2 > Date: Wed, 27 Feb 2013 18:01:27 +0100 > From: Ivan Voras > To:freebsd-fs@freebsd.org > Subject: Re: Some filesystem thoughts > Message-ID: > Content-Type: text/plain; charset="utf-8" > > On 20/02/2013 20:26, Radio m?odych bandyt?w wrote: > >> >The way I see it is not to treat files as streams of bytes. That's not >> >what they are, files have meanings and there are tools that bring them >> >out. A picture is a stored emotion. OK, there are no tools for that yet. >> >But it is also an array of pixels. And a container with exif data. And >> >may be a container with an encrypted archive. And, a stream of bytes too. >> >They have multiple facets. >> >I think that it would be useful to somehow expose them to applications. >> >Wouldn't it be useful to be able to grep through pdfs in your email >> >attachments? > I think the problem is presentation - offering just the "grep" function > is waste of effort since those using GUIs will generally not use grep. > What you're talking about is something like google tried to do with > android (and, probably, failed): a unified search interface across all > applications and their data. Not really. grep was just an example of a more general thing; tools having access to not directly visible properties of files. Another example: I download a src.7z from some project because I want to see how do they do a particular thing. To browse it with my file manager, I need to mount it or extract it unless my file manager supports .7z files. It does, I can just step in. Phew, I saved my time...but only until I get to a file that I want to view with my text editor - then I have to do either of this things 'cause there's no path that FM can pass to the editor. > > Actually, modern smartphones & tablets are slowly moving into the > direction that there are no "files" and no "filesystems" on your device, > but rather jost your "data" and "apps" which both are managed by the > system (and possibly reside in a "cloud"). It may be that the > "hierarhical filesystem" idea has just not so useful or efficient any > more (but OTOH, I don't see it going away any time soon). I checked "hierarchical filesystem" search term. After a quick look I see that it's a thing to read into somewhat deeper. > >> >Mass-edit music tags with sed? Manually edit with your favourite text >> >editor instead of the sucky one-liner provided by your favourite music >> >player? >> >How about video players being able to play videos by reading them in >> >decoded form directly from the filesystem instead of having to integrate >> >a significant number of complex libraries to provide sufficient format >> >coverage? > All those things already exist (or will exist soon) in modern GUI > desktop environments, and especially on handheld-enabled OSes. The way > they are achieved is to introduce a Grand Unified Interface (or several > of them, as it happens), which severly abstract the low-level libraries, > even to the point where the (GUI) application doesn't know it's dealing > with actual files or something completely different. That's not really the same. I don't know if there's any Turing-complete mass-tagger in Unix and for sure that's not a norm. Advanced text edition features like caps corrections are useful in tagging sometimes. More than once I wanted to run jpegtran and similar tools on artwork embedded in music files...the list could go on. Yet why would media managers be supposed to implement such things? There are text editors and scripting languages designed precisely for such jobs, they just don't have access to the file properties needed. They could have, but why would all tools be supposed to implement dozens of interfaces to handle narrow special cases? We already have a Grand Unified Interface interface that almost all programs implement - a filesystem interface. > > If you're more concerned about the technical aspects, then learning to > write filesystems in FUSE would be a good starting point for you. Thanks, but for now I prefer concepts. I know I can learn technicalities, but I don't see myself implementing such thing any time soon and very possibly - ever. -- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 12:48:36 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 80F7832A; Thu, 28 Feb 2013 12:48:36 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2AD4E214; Thu, 28 Feb 2013 12:48:36 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id C3CFF4AC57; Thu, 28 Feb 2013 16:48:26 +0400 (MSK) Date: Thu, 28 Feb 2013 16:48:21 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1698593972.20130228164821@serebryakov.spb.ru> To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) In-Reply-To: <1158712592.20130228141323@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 12:48:36 -0000 Hello, Lev. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 14:13= :23: LS>>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. LS>>> It crashed a several minutes ago (I don't know reason yet) and fsck LS>>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme LS>>> mismatch) and requested full check (which will take more than hour on LS>>> such FS). LS>> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps run= ning... LS> full fsck reconnected about 1000 files, which was written in time of LS> crash. LS> Really, sever crashed when SVN mirror seed was been unpacking on LS> this FS, so there was massive file creation at this time. Ok, I've checked memory, and now I have booted system with crashlog (!) Here it is (please note, that panic() was called by ffs_valloc): #0 doadump (textdump=3D) at pcpu.h:229 229 pcpu.h: No such file or directory. in pcpu.h (kgdb) #0 doadump (textdump=3D) at pcpu.h:229 #1 0xffffffff80431494 in kern_reboot (howto=3D260) at /usr/src/sys/kern/kern_shutdown.c:448 #2 0xffffffff80431997 in panic (fmt=3D0x1
) at /usr/src/sys/kern/kern_shutdown.c:636 #3 0xffffffff80573d8c in ffs_valloc (pvp=3D0xfffffe0024d68000, mode=3D3320= 4, cred=3D0xfffffe0023d52700, vpp=3D0xffffff81c35586b8) at /usr/src/sys/ufs/ffs/ffs_alloc.c:995 #4 0xffffffff805aa126 in ufs_makeinode (mode=3D33204, dvp=3D0xfffffe0024d6= 8000, vpp=3D0xffffff81c3558a10, cnp=3D0xffffff81c3558a38) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2614 #5 0xffffffff80634391 in VOP_CREATE_APV (vop=3D, a=3D0xffffff81c3558920) at vnode_if.c:252 #6 0xffffffff804d389a in vn_open_cred (ndp=3D0xffffff81c35589d0, flagp=3D0xffffff81c35589cc, cmode=3D, vn_open_flags=3D, cred=3D0xfffffe0023d52700, fp=3D0xfffffe00ae9cf370) at vnode_if.h:109 #7 0xffffffff804cc0d9 in kern_openat (td=3D0xfffffe012d095000, fd=3D-100, path=3D0x801c951e0
, pathseg=3DUIO_USERSPACE, flags=3D2562, mode=3D) at /usr/src/sys/kern/vfs_syscalls.c:1132 #8 0xffffffff805f1400 in amd64_syscall (td=3D0xfffffe012d095000, traced=3D= 0) at subr_syscall.c:135 #9 0xffffffff805dbfc7 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:387 #10 0x000000080177ce5c in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) Full textdump: http://lev.serebryakov.spb.ru/crashes/core-ffs-crash.txt.1 Please note, that FS was loaded by torrent client (40Mbit/s outbound traffic) and unpacking of svnmirror-base-r238500.tar.xz from this FS to itself. So, it was really high multistream load. I'll try to reproduce this on SINGLE disk, without geom_radi5 :) --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 14:57:02 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 74FFFB0E; Thu, 28 Feb 2013 14:57:02 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 1ABF19A1; Thu, 28 Feb 2013 14:57:02 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 858794AC57; Thu, 28 Feb 2013 18:56:53 +0400 (MSK) Date: Thu, 28 Feb 2013 18:56:47 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1502041051.20130228185647@serebryakov.spb.ru> To: Ivan Voras Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 14:57:02 -0000 Hello, Ivan. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 18:19= :38: >> Maybe, it is subtile interference between raid5 implementation and >> SU+J, but in such case I want to understand what does raid5 do >> wrong. IV> You guessed correctly, I was going to blame geom_raid5 :) It is not first time :( But every time such discussion ends without any practical results. One time, Kirk say, that delayed writes are Ok for SU until bottom layer doesn't lie about operation completeness. geom_raid5 could delay writes (in hope that next writes will combine nicely and allow not to do read-calculate-write cycle for read alone), but it never mark BIO complete until it is really completed (layers down to geom_raid5 returns completion). So, every BIO in wait queue is "in flight" from GEOM/VFS point of view. Maybe, it is fatal for journal :( And want I really want to see is "SYNC" flag for BIO and that all journal-related writes will be marked with it. Also all commits originated with fsync() MUST be marked in same way, really. Alexander Motin (ahci driver author) assured me, that he'll add support for such flag in driver to flush drive cache too, if it will be introduced. IMHO, lack of this (or similar) flag is bad idea even without geom_raid5 with its optimistic behavior. There was commit r246876, but I don't understand exactly what it means, as no real FS or driver's code was touched. But I'm writing about this idea for 3rd or 4th time without any results :( And I don't mean, that it should be implemented ASAP by someone, I mean I didn't see any support from FS guys (Kirk and somebody else, I don't remember exactly participants of these old thread, but he was not you) like "go ahead and send your patch". All these threads was very defensive from FS guru side, like "we don't need it, fix hardware, disable caches". IV> Is this a production setup you have? Can you afford to destroy it and IV> re-create it for the purpose of testing, this time with geom_raid3 IV> (which should be synchronous with respect to writes)? Unfortunately, it is production setup and I don't have any spare hardware for second one :( I've posted panic stacktrace -- and it is FFS-related too -- and now preparing setup with only one HDD and same high load to try reproduce it without geom_raid5. But I don't have enough hardware (3 spare HDDs at least!) to reproduce it with geom_raid3 or other copy of geiom_radi5. --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 15:00:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A5BAFCFA; Thu, 28 Feb 2013 15:00:57 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 67FD79EA; Thu, 28 Feb 2013 15:00:57 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 06C084AC58; Thu, 28 Feb 2013 19:00:54 +0400 (MSK) Date: Thu, 28 Feb 2013 19:00:49 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1843530475.20130228190049@serebryakov.spb.ru> To: Ivan Voras Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: <1502041051.20130228185647@serebryakov.spb.ru> References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> <1502041051.20130228185647@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 15:00:57 -0000 Hello, Ivan. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 18:56= :47: LS> There was commit r246876, but I don't understand exactly what it LS> means, as no real FS or driver's code was touched. And, yes, barriers are much stronger than "sync writes", as they should flush all previous writes, even that is not related to journal or metadata and could wait more (simple file data could be fixed on plates out of order without destroying filesystem structure). --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 15:28:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 49CD088A; Thu, 28 Feb 2013 15:28:10 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id DF1EEB6A; Thu, 28 Feb 2013 15:28:09 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqEEADd1L1GDaFvO/2dsb2JhbABFhk+5AYJcgRBzgh8BAQQBIwRSBRYOCgICDRkCWQaIIAavWJIXgSOMKoETNAeCLYETA4hqjVeJY4cHgyaBSz4 X-IronPort-AV: E=Sophos;i="4.84,755,1355115600"; d="scan'208";a="16276035" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 28 Feb 2013 10:28:03 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 15CCAB3F18; Thu, 28 Feb 2013 10:28:03 -0500 (EST) Date: Thu, 28 Feb 2013 10:28:03 -0500 (EST) From: Rick Macklem To: Konstantin Belousov Message-ID: <664298325.3403590.1362065283063.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20130228070515.GK2454@kib.kiev.ua> Subject: Re: should vn_fullpath1() ever return a path with "." in it? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: FreeBSD Filesystems , Sergey Kandaurov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 15:28:10 -0000 Konstantin Belousov wrote: > On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > > Hi, > > > > Sergey Kandaurov reported a problem where getcwd() returns a > > path with "/./" imbedded in it for an NFSv4 mount. This is > > caused by a mount point crossing on the server when at the > > server's root because vn_fullpath1() uses VV_ROOT to spot > > mount point crossings. > > > > The current workaround is to use the sysctls: > > debug.disablegetcwd=1 > > debug.disablefullpath=1 > > > > However, it would be nice to fix this when vn_fullpath1() > > is being used. > > > > A simple fix is to have vn_fullpath1() fail when it finds > > "." as a directory match in the path. When vn_fullpath1() > > fails, the syscalls fail and that allows the libc algorithm > > to be used (which works for this case because it doesn't > > depend on VV_ROOT being set, etc). > > > > So, I am wondering if a patch (I have attached one) that > > makes vn_fullpath1() fail when it matches "." will break > > anything else? (I don't think so, since the code checks > > for VV_ROOT in the loop above the check for a match of > > ".", but I am not sure?) > > > > Thanks for any input w.r.t. this, rick > > > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > > startvp, NULL, 0, 0); > > break; > > } > > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || > > + buf[buflen + 1] == '/')) { > > + /* > > + * Fail if it matched ".". This should only happen > > + * for NFSv4 mounts that cross server mount points. > > + */ > > + CACHE_RUNLOCK(); > > + vrele(vp); > > + numfullpathfail1++; > > + error = ENOENT; > > + SDT_PROBE(vfs, namecache, fullpath, return, > > + error, vp, NULL, 0, 0); > > + break; > > + } > > buf[--buflen] = '/'; > > slash_prefixed = 1; > > } > > I do not quite understand this. Did the dvp (parent) vnode returned by > VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? Well, the vnodes aren't the same, but the fileid (think NFS i-node#) is the value for "." and ".." (2 for a UFS exported fs). The vnodes are based on the file handles and dvp will be for the mount point in the other file system on the server. NFSv4 has 2 attributes for a server mount point directory: fileid - which is the fileid# for the root (2 for UFS) mounted_on_fileid - which is the fileid of the directory in the parent file system The parent file system has a different fsid, which becomes the st_dev and, as such, the userland algorithm in getcwd() works. The case I test is where the server mount point is one directory level below the local mount point in the client. For example: /mnt is the local mount point and /mnt/sub1 is a server mount point (different file system than /mnt). - when vn_fullpath1() gets up to /mnt/sub1 (which doesn't have VV_ROOT set on it), vn_vptocnp_locked() matches "." for the fileno. I think there is code in vn_vptocnp_locked() that avoids a match for ".." or that could match too. - then it does /mnt, which does have VV_ROOT set and it works. > It must be, for the correct operation, but also it should cause the > almost > infinite loop in the vn_fullpath1(). The loop is not really infinite > due > to a limited size of the buffer where the infinite amount of "./" is > placed. > As noted above, I think this loop is avoided by dvp != vp. Within the NFSv4 mount, there can be multiple instances of a fileid (st_ino), but the have different fsids (st_dev) and different vnodes. > Anyway, I think we should do better than this patch, even if it is > legitimate. I think that the better place to check the condition is > the > default implementation of VOP_VPTOCNP(). Am I right that this is where > it broke for you ? > Yep. I wasn't sure what the implications of putting the fix further down were. (I was planning to ask if the patch should go in a lower level function, but forgot to ask;-) I'll test this patch and let you know if it works. Thanks, rick > diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c > index 00d064e..1dd0185 100644 > --- a/sys/kern/vfs_default.c > +++ b/sys/kern/vfs_default.c > @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) > error = ENOMEM; > goto out; > } > - bcopy(dp->d_name, buf + i, dp->d_namlen); > - error = 0; > + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { > + error = ENOENT; > + } else { > + bcopy(dp->d_name, buf + i, dp->d_namlen); > + error = 0; > + } > goto out; > } > } while (len > 0 || !eofflag); From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 17:32:35 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 89D31B93; Thu, 28 Feb 2013 17:32:35 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 44C183FC; Thu, 28 Feb 2013 17:32:35 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 64B434AC57; Thu, 28 Feb 2013 21:32:23 +0400 (MSK) Date: Thu, 28 Feb 2013 21:32:17 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <127827160.20130228213217@serebryakov.spb.ru> To: Ivan Voras Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> <1502041051.20130228185647@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 17:32:35 -0000 Hello, Ivan. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 21:01= :46: >> One time, Kirk say, that delayed writes are Ok for SU until bottom >> layer doesn't lie about operation completeness. geom_raid5 could >> delay writes (in hope that next writes will combine nicely and allow >> not to do read-calculate-write cycle for read alone), but it never >> mark BIO complete until it is really completed (layers down to >> geom_raid5 returns completion). So, every BIO in wait queue is "in >> flight" from GEOM/VFS point of view. Maybe, it is fatal for journal :( IV> It shouldn't be - it could be a bug. I'll try to reproduce it on VM, but it could be hard, as virtual storage have very different (really -- much simpler) characteristics and behavior. >> And want I really want to see is "SYNC" flag for BIO and that all >> journal-related writes will be marked with it. Also all commits >> originated with fsync() MUST be marked in same way, really. Alexander >> Motin (ahci driver author) assured me, that he'll add support for >> such flag in driver to flush drive cache too, if it will be >> introduced. IV> Hmmm, once upon a time I actually tried to add it: IV> http://people.freebsd.org/~ivoras/diffs/fsync_flush.patch I have almost the same patch here :) IV> This is from 2011, and was never really reviewed. Kirk said it was a IV> good idea (meaning the implementation could be wrong, YMMV) :) It will be great to see this idea committed, really! Could I help somehow? IV> I don't know whether it's significant, but ffs_softdep.c contains 6 IV> bawrite() calls (meaning buf async write), in softdep_process_journal(), IV> softdep_journal_freeblocks(), softdep_fsync_mountdev(), sync_cgs(), and IV> flush_deplist(). As far as I understand (I've examined this code when try to understand how to add this BIO_SYNC flag), ASYNC/SYNC here means something different. It is only about does caller want to sleep till operation is completed, and doesn't mean sync or async write... So, I'm not sure, which of these calls should be marked for flushing (or should be marked with new ORDERED/BARRIER flag at least). SU code is complicated enough without journal, and with journal it is much more complicated to simply understand it :( --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Thu Feb 28 23:49:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2D90E50F for ; Thu, 28 Feb 2013 23:49:27 +0000 (UTC) (envelope-from allan@physics.umn.edu) Received: from mail.physics.umn.edu (smtp.spa.umn.edu [128.101.220.4]) by mx1.freebsd.org (Postfix) with ESMTP id 082FB9D2 for ; Thu, 28 Feb 2013 23:49:26 +0000 (UTC) Received: from spa-sysadm-01.spa.umn.edu ([134.84.199.8]) by mail.physics.umn.edu with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.77 (FreeBSD)) (envelope-from ) id 1UBCr5-000Bts-VB for freebsd-fs@freebsd.org; Thu, 28 Feb 2013 17:25:52 -0600 Message-ID: <512FE773.3060903@physics.umn.edu> Date: Thu, 28 Feb 2013 17:25:39 -0600 From: Graham Allan User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on mrmachenry.spa.umn.edu X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50, TW_ZF,T_RP_MATCHES_RCVD autolearn=no version=3.3.2 Subject: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers X-SA-Exim-Version: 4.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2013 23:49:27 -0000 Sorry to come in late on this thread but I've been struggling with thinking about the same issue, from a different perspective. Several months ago we created our first "large" ZFS storage system, using 42 drives plus a few SSDs in one of the oft-used Supermicro 45-drive chassis. It has been working really nicely but has led to some puzzling over the best way to do some things when we build more. We made our pool using geom drive labels. Ever since, I've been wondering if this really gives any advantage - at least for this type of system. If you need to replace a drive, you don't really know which enclosure slot any given da device is, and so our answer has been to dig around using sg3_utils commands wrapped in a bit of perl, to try and correlate the da device to the slot via the drive serial number. At this point, having a geom label just seems like an extra bit of indirection to increase my confusion :-) Although setting the geom label to the drive serial number might be a serious improvement... We're about to add a couple more of these shelves to the system, giving a total of 135 drives (although each shelf would be a separate pool), and given that they will be standard consumer grade drives, some frequency of replacement is a given. Does anyone have any good tips on how to manage a large number of drives in a zfs pool like this? Thanks, Graham -- ------------------------------------------------------------------------- Graham Allan School of Physics and Astronomy - University of Minnesota ------------------------------------------------------------------------- From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 00:58:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 13E9F878; Fri, 1 Mar 2013 00:58:54 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id C0372CF7; Fri, 1 Mar 2013 00:58:53 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAC/8L1GDaFvO/2dsb2JhbABFhk+5CYJcgRRzgh8BAQUjBFIbDgoCAg0ZAlkGiCavMZIhgSOMKoETNAeCLYETA4hqjVeJY4cHgyaBSz4 X-IronPort-AV: E=Sophos;i="4.84,758,1355115600"; d="scan'208";a="16387626" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu.net.uoguelph.ca with ESMTP; 28 Feb 2013 19:58:51 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 76EBDB3F7D; Thu, 28 Feb 2013 19:58:51 -0500 (EST) Date: Thu, 28 Feb 2013 19:58:51 -0500 (EST) From: Rick Macklem To: Konstantin Belousov Message-ID: <1208475167.3432384.1362099531469.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20130228070515.GK2454@kib.kiev.ua> Subject: Re: should vn_fullpath1() ever return a path with "." in it? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: FreeBSD Filesystems , Sergey Kandaurov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 00:58:54 -0000 Kostik Belousov wrote: > On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > > Hi, > > > > Sergey Kandaurov reported a problem where getcwd() returns a > > path with "/./" imbedded in it for an NFSv4 mount. This is > > caused by a mount point crossing on the server when at the > > server's root because vn_fullpath1() uses VV_ROOT to spot > > mount point crossings. > > > > The current workaround is to use the sysctls: > > debug.disablegetcwd=1 > > debug.disablefullpath=1 > > > > However, it would be nice to fix this when vn_fullpath1() > > is being used. > > > > A simple fix is to have vn_fullpath1() fail when it finds > > "." as a directory match in the path. When vn_fullpath1() > > fails, the syscalls fail and that allows the libc algorithm > > to be used (which works for this case because it doesn't > > depend on VV_ROOT being set, etc). > > > > So, I am wondering if a patch (I have attached one) that > > makes vn_fullpath1() fail when it matches "." will break > > anything else? (I don't think so, since the code checks > > for VV_ROOT in the loop above the check for a match of > > ".", but I am not sure?) > > > > Thanks for any input w.r.t. this, rick > > > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > > startvp, NULL, 0, 0); > > break; > > } > > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || > > + buf[buflen + 1] == '/')) { > > + /* > > + * Fail if it matched ".". This should only happen > > + * for NFSv4 mounts that cross server mount points. > > + */ > > + CACHE_RUNLOCK(); > > + vrele(vp); > > + numfullpathfail1++; > > + error = ENOENT; > > + SDT_PROBE(vfs, namecache, fullpath, return, > > + error, vp, NULL, 0, 0); > > + break; > > + } > > buf[--buflen] = '/'; > > slash_prefixed = 1; > > } > > I do not quite understand this. Did the dvp (parent) vnode returned by > VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? > It must be, for the correct operation, but also it should cause the > almost > infinite loop in the vn_fullpath1(). The loop is not really infinite > due > to a limited size of the buffer where the infinite amount of "./" is > placed. > > Anyway, I think we should do better than this patch, even if it is > legitimate. I think that the better place to check the condition is > the > default implementation of VOP_VPTOCNP(). Am I right that this is where > it broke for you ? > > diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c > index 00d064e..1dd0185 100644 > --- a/sys/kern/vfs_default.c > +++ b/sys/kern/vfs_default.c > @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) > error = ENOMEM; > goto out; > } > - bcopy(dp->d_name, buf + i, dp->d_namlen); > - error = 0; > + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { > + error = ENOENT; > + } else { > + bcopy(dp->d_name, buf + i, dp->d_namlen); > + error = 0; > + } > goto out; > } > } while (len > 0 || !eofflag); Yes, this patch fixes the problem too. If you think it is safe to do this, I can commit the patch in mid-April. Maybe Sergey can test it? Thanks yet again, rick From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 03:30:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A954ECDD for ; Fri, 1 Mar 2013 03:30:32 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com [209.85.216.48]) by mx1.freebsd.org (Postfix) with ESMTP id 707212F3 for ; Fri, 1 Mar 2013 03:30:32 +0000 (UTC) Received: by mail-qa0-f48.google.com with SMTP id j8so1904613qah.14 for ; Thu, 28 Feb 2013 19:30:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=bxTE2Siu5yBzT8clYam6Sc3Ucvct8ao/QL0OoX+9Mcc=; b=r7GyQ3GC0+uiSENmxWSiWepwsVkQIttMkBZsU7a9oySlHDkSL+DMttwyX1F2l2xNe1 LTuO0uvbBAjAYvtPwyJakF6xZ4IusShSC+/nr9AKq+KcZiGAb/hmRVAFNf8rWfiUeIEl Az0o1Qdzps+b+lICnJsyW1XlktoY4zASbqmt/lMN2qFfmPVurElhuSiVe1DlX/rbi2Ro Qv5ayY0ZakOjGoi/fjubS2MUuDWofh97Ibn83tauFer2Az91Jkp3ZxAU7WZ/Y1vJyk+D ceNdsYbDJb9RBb38AfMWmAXupHyqjW/Biwf6BffIKhxG5sCFhF8Ka3NluKNPDSyK42T5 4Y4w== MIME-Version: 1.0 X-Received: by 10.49.128.170 with SMTP id np10mr2041434qeb.37.1362108625286; Thu, 28 Feb 2013 19:30:25 -0800 (PST) Received: by 10.49.106.233 with HTTP; Thu, 28 Feb 2013 19:30:25 -0800 (PST) Received: by 10.49.106.233 with HTTP; Thu, 28 Feb 2013 19:30:25 -0800 (PST) In-Reply-To: <512FE773.3060903@physics.umn.edu> References: <512FE773.3060903@physics.umn.edu> Date: Thu, 28 Feb 2013 19:30:25 -0800 Message-ID: Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers From: Freddie Cash To: Graham Allan Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 03:30:32 -0000 You label the drive with something that tells you: - enclosure - column - row IOW, something that definitively tells you where the drive is located, without having to pull the drive to find it. To do so, you have to install 1 drive at a time, and label it at that point. For example, we use the following pattern: encX-A-# Where X tells you which enclosure it's in, A tells you which column it's in (letters start at A increasing to the right), and # tells you the disk in the column, numbered top-down. Whether you label the entire drive using glabel or just a GPT partition is up to you. We use GPT labels. On 2013-02-28 3:49 PM, "Graham Allan" wrote: > Sorry to come in late on this thread but I've been struggling with > thinking about the same issue, from a different perspective. > > Several months ago we created our first "large" ZFS storage system, using > 42 drives plus a few SSDs in one of the oft-used Supermicro 45-drive > chassis. It has been working really nicely but has led to some puzzling > over the best way to do some things when we build more. > > We made our pool using geom drive labels. Ever since, I've been wondering > if this really gives any advantage - at least for this type of system. If > you need to replace a drive, you don't really know which enclosure slot any > given da device is, and so our answer has been to dig around using > sg3_utils commands wrapped in a bit of perl, to try and correlate the da > device to the slot via the drive serial number. > > At this point, having a geom label just seems like an extra bit of > indirection to increase my confusion :-) Although setting the geom label to > the drive serial number might be a serious improvement... > > We're about to add a couple more of these shelves to the system, giving a > total of 135 drives (although each shelf would be a separate pool), and > given that they will be standard consumer grade drives, some frequency of > replacement is a given. > > Does anyone have any good tips on how to manage a large number of drives > in a zfs pool like this? > > Thanks, > > Graham > -- > ------------------------------**------------------------------** > ------------- > Graham Allan > School of Physics and Astronomy - University of Minnesota > ------------------------------**------------------------------** > ------------- > ______________________________**_________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/**mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org > " > From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 05:11:47 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A524AB8A; Fri, 1 Mar 2013 05:11:47 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (gw.catspoiler.org [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 6F80D7FC; Fri, 1 Mar 2013 05:11:47 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id r215BWoU092532; Thu, 28 Feb 2013 21:11:36 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <201303010511.r215BWoU092532@gw.catspoiler.org> Date: Thu, 28 Feb 2013 21:11:32 -0800 (PST) From: Don Lewis Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) To: lev@FreeBSD.org In-Reply-To: <1698593972.20130228164821@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=iso-8859-5 Content-Transfer-Encoding: 8BIT Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 05:11:47 -0000 On 28 Feb, Lev Serebryakov wrote: > Hello, Lev. > You wrote 28 äÕÒàÐÛï 2013 Ó., 14:13:23: > > LS>>> My server runs 9.1-STABLE and have 8Tb UFS2 SU+J FS. > LS>>> It crashed a several minutes ago (I don't know reason yet) and fsck > LS>>> says "Unexpected SU+J inconsistency" (Inode mode/directory tyme > LS>>> mismatch) and requested full check (which will take more than hour on > LS>>> such FS). > LS>> Full fsck found "INTERNAL ERROR: DUPS WITH SOFTUPDATES" and keeps running... > LS> full fsck reconnected about 1000 files, which was written in time of > LS> crash. > LS> Really, sever crashed when SVN mirror seed was been unpacking on > LS> this FS, so there was massive file creation at this time. > Ok, I've checked memory, and now I have booted system with crashlog > (!) > > Here it is (please note, that panic() was called by ffs_valloc): > > #0 doadump (textdump=) at pcpu.h:229 > 229 pcpu.h: No such file or directory. > in pcpu.h > (kgdb) #0 doadump (textdump=) at pcpu.h:229 > #1 0xffffffff80431494 in kern_reboot (howto=260) > at /usr/src/sys/kern/kern_shutdown.c:448 > #2 0xffffffff80431997 in panic (fmt=0x1
) > at /usr/src/sys/kern/kern_shutdown.c:636 > #3 0xffffffff80573d8c in ffs_valloc (pvp=0xfffffe0024d68000, mode=33204, > cred=0xfffffe0023d52700, vpp=0xffffff81c35586b8) > at /usr/src/sys/ufs/ffs/ffs_alloc.c:995 > #4 0xffffffff805aa126 in ufs_makeinode (mode=33204, dvp=0xfffffe0024d68000, > vpp=0xffffff81c3558a10, cnp=0xffffff81c3558a38) > at /usr/src/sys/ufs/ufs/ufs_vnops.c:2614 > #5 0xffffffff80634391 in VOP_CREATE_APV (vop=, > a=0xffffff81c3558920) at vnode_if.c:252 > #6 0xffffffff804d389a in vn_open_cred (ndp=0xffffff81c35589d0, > flagp=0xffffff81c35589cc, cmode=, > vn_open_flags=, cred=0xfffffe0023d52700, > fp=0xfffffe00ae9cf370) at vnode_if.h:109 > #7 0xffffffff804cc0d9 in kern_openat (td=0xfffffe012d095000, fd=-100, > path=0x801c951e0
, > pathseg=UIO_USERSPACE, flags=2562, mode=) > at /usr/src/sys/kern/vfs_syscalls.c:1132 > #8 0xffffffff805f1400 in amd64_syscall (td=0xfffffe012d095000, traced=0) > at subr_syscall.c:135 > #9 0xffffffff805dbfc7 in Xfast_syscall () > at /usr/src/sys/amd64/amd64/exception.S:387 > #10 0x000000080177ce5c in ?? () > Previous frame inner to this frame (corrupt stack?) > (kgdb) > > Full textdump: http://lev.serebryakov.spb.ru/crashes/core-ffs-crash.txt.1 > > Please note, that FS was loaded by torrent client (40Mbit/s outbound > traffic) and unpacking of svnmirror-base-r238500.tar.xz from this FS > to itself. So, it was really high multistream load. > > I'll try to reproduce this on SINGLE disk, without geom_radi5 :) The fact that the filesystem code called panic() indicates that the filesystem was already corrupt by that point. That's a likely reason for fsck complaining about the unexpected SU+J inconsistency. Incorrect write ordering that allowed the filesystem to become inconsistent because some pending writes were lost because of the panic might not be necessary, but this might have allowed an earlier crash where a full fsck was skipped to leave the filesystem in this state. This panic might also be a result of the bug fixed in 246877, but I have my doubts about that. From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 06:22:51 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7AACA312; Fri, 1 Mar 2013 06:22:51 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2501C9DE; Fri, 1 Mar 2013 06:22:51 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id E0E064AC57; Fri, 1 Mar 2013 10:22:43 +0400 (MSK) Date: Fri, 1 Mar 2013 10:22:37 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <352538988.20130301102237@serebryakov.spb.ru> To: Don Lewis Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) In-Reply-To: <201303010511.r215BWoU092532@gw.catspoiler.org> References: <1698593972.20130228164821@serebryakov.spb.ru> <201303010511.r215BWoU092532@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-5 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 06:22:51 -0000 Hello, Don. You wrote 1 =DC=D0=E0=E2=D0 2013 =D3., 9:11:32: DL> The fact that the filesystem code called panic() indicates that the DL> filesystem was already corrupt by that point. That's a likely reason DL> for fsck complaining about the unexpected SU+J inconsistency. DL> Incorrect write ordering that allowed the filesystem to become DL> inconsistent because some pending writes were lost because of the panic DL> might not be necessary, but this might have allowed an earlier crash DL> where a full fsck was skipped to leave the filesystem in this state. As far, as I understand, if this theory is right (file system corruption which left unnoticed by "standard" fsck), it is bug in FFS SU+J too, as it should not be corrupted by reordered writes (if writes is properly reported as completed even if they were reordered). DL> This panic might also be a result of the bug fixed in 246877, but I have DL> my doubts about that. It was not MFCed :( --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 08:03:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 373B8503 for ; Fri, 1 Mar 2013 08:03:01 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-wg0-f45.google.com (mail-wg0-f45.google.com [74.125.82.45]) by mx1.freebsd.org (Postfix) with ESMTP id B8954E67 for ; Fri, 1 Mar 2013 08:03:00 +0000 (UTC) Received: by mail-wg0-f45.google.com with SMTP id dq12so2237675wgb.24 for ; Fri, 01 Mar 2013 00:02:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=WWkYj5OlnVmugdkk5dqssw9/rcfRXLKBJ2ZY9cAlloU=; b=wDDpN4Dmnnqg5LfOP7yEDnAe3E8XGxj6MaII/JsiwZ3/huh+OVjee2xC92jaOG0eSb Nwsix297bM/IuQYEU6yppEFYd2S+7+8YpMIe4RU5O5lBUVQUiCLxCss/EUOnAl1LGY4E v8zUp19IU0bcmlOlMGVbld0WrkY6smEN2o6nVyBC0KbYmB35py6NZwXd9d/k1HN6a2lJ fl7OmXQA8ky+9pjkxAAvllcwGLFjeniPcb4qPUL+7lZfSP05xMiOjtzcxhaEDDqsGW7l suGxv5Sv3uMdxJcckm60D6+nWzOIl/e6lBQUpy71bj5ceDxCBNyr8zGDuawzmOoBT/cP gtSQ== MIME-Version: 1.0 X-Received: by 10.180.79.227 with SMTP id m3mr2184825wix.12.1362124974582; Fri, 01 Mar 2013 00:02:54 -0800 (PST) Sender: pluknet@gmail.com Received: by 10.194.86.167 with HTTP; Fri, 1 Mar 2013 00:02:54 -0800 (PST) In-Reply-To: <1208475167.3432384.1362099531469.JavaMail.root@erie.cs.uoguelph.ca> References: <20130228070515.GK2454@kib.kiev.ua> <1208475167.3432384.1362099531469.JavaMail.root@erie.cs.uoguelph.ca> Date: Fri, 1 Mar 2013 11:02:54 +0300 X-Google-Sender-Auth: NiIxkvWo5PgdXSyMFYs_GHzvosA Message-ID: Subject: Re: should vn_fullpath1() ever return a path with "." in it? From: Sergey Kandaurov To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 08:03:01 -0000 On 1 March 2013 04:58, Rick Macklem wrote: > Kostik Belousov wrote: >> On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: >> > Hi, >> > >> > Sergey Kandaurov reported a problem where getcwd() returns a >> > path with "/./" imbedded in it for an NFSv4 mount. This is >> > caused by a mount point crossing on the server when at the >> > server's root because vn_fullpath1() uses VV_ROOT to spot >> > mount point crossings. >> > >> > The current workaround is to use the sysctls: >> > debug.disablegetcwd=1 >> > debug.disablefullpath=1 >> > >> > However, it would be nice to fix this when vn_fullpath1() >> > is being used. >> > >> > A simple fix is to have vn_fullpath1() fail when it finds >> > "." as a directory match in the path. When vn_fullpath1() >> > fails, the syscalls fail and that allows the libc algorithm >> > to be used (which works for this case because it doesn't >> > depend on VV_ROOT being set, etc). >> > >> > So, I am wondering if a patch (I have attached one) that >> > makes vn_fullpath1() fail when it matches "." will break >> > anything else? (I don't think so, since the code checks >> > for VV_ROOT in the loop above the check for a match of >> > ".", but I am not sure?) >> > >> > Thanks for any input w.r.t. this, rick >> >> > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 >> > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 >> > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v >> > startvp, NULL, 0, 0); >> > break; >> > } >> > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || >> > + buf[buflen + 1] == '/')) { >> > + /* >> > + * Fail if it matched ".". This should only happen >> > + * for NFSv4 mounts that cross server mount points. >> > + */ >> > + CACHE_RUNLOCK(); >> > + vrele(vp); >> > + numfullpathfail1++; >> > + error = ENOENT; >> > + SDT_PROBE(vfs, namecache, fullpath, return, >> > + error, vp, NULL, 0, 0); >> > + break; >> > + } >> > buf[--buflen] = '/'; >> > slash_prefixed = 1; >> > } >> >> I do not quite understand this. Did the dvp (parent) vnode returned by >> VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name ? >> It must be, for the correct operation, but also it should cause the >> almost >> infinite loop in the vn_fullpath1(). The loop is not really infinite >> due >> to a limited size of the buffer where the infinite amount of "./" is >> placed. >> >> Anyway, I think we should do better than this patch, even if it is >> legitimate. I think that the better place to check the condition is >> the >> default implementation of VOP_VPTOCNP(). Am I right that this is where >> it broke for you ? >> >> diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c >> index 00d064e..1dd0185 100644 >> --- a/sys/kern/vfs_default.c >> +++ b/sys/kern/vfs_default.c >> @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) >> error = ENOMEM; >> goto out; >> } >> - bcopy(dp->d_name, buf + i, dp->d_namlen); >> - error = 0; >> + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { >> + error = ENOENT; >> + } else { >> + bcopy(dp->d_name, buf + i, dp->d_namlen); >> + error = 0; >> + } >> goto out; >> } >> } while (len > 0 || !eofflag); > > Yes, this patch fixes the problem too. If you think it is safe to > do this, I can commit the patch in mid-April. Maybe Sergey can > test it? > > Thanks yet again, rick > Hi Rick Sorry but I am no longer able to test NFSv4. -- wbr, pluknet From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 11:26:43 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A9F28330 for ; Fri, 1 Mar 2013 11:26:43 +0000 (UTC) (envelope-from mailinglists.tech@gmail.com) Received: from mail-qa0-f47.google.com (mail-qa0-f47.google.com [209.85.216.47]) by mx1.freebsd.org (Postfix) with ESMTP id 74F4C838 for ; Fri, 1 Mar 2013 11:26:43 +0000 (UTC) Received: by mail-qa0-f47.google.com with SMTP id j8so4940908qah.13 for ; Fri, 01 Mar 2013 03:26:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to :content-type; bh=1/1bIr7tY46lLqU0iBTmNKABSYkFx/QwcpeqTeRnUIU=; b=AHL839mAsZIj9LIrnrAAA4N5s/4fOeFuFU4ZXaDizbC18+EctLm6pMwm3b6a5S1Hiu 4e30j25dh5Pho4VYtcp8TKEk4gsUbk3Vs2GXTN9COuX7BrM67qG2USrqGyKVk4LWOBLT rX0m38fxBSOE8Adxbhh54Eg1MCzS2Zoo4S/zAlSchiMh3BUF5MQYBuR+Gp4ocz9lDz52 vevkb97Pcar47kkIRe85oX8RMJ+NpVQpVYYQkjaDR4YiShk+e3uOZqPLCrkjedaLUj58 2ezFA+IztbCOOZIXamHVyfdMoZo6TagncAMv+0A9XSgya1EYy3FMB5DbWGbdFk60hXIW vNHg== MIME-Version: 1.0 X-Received: by 10.229.203.78 with SMTP id fh14mr3515476qcb.143.1362137197659; Fri, 01 Mar 2013 03:26:37 -0800 (PST) Received: by 10.49.110.70 with HTTP; Fri, 1 Mar 2013 03:26:37 -0800 (PST) Date: Fri, 1 Mar 2013 12:26:37 +0100 Message-ID: Subject: I am to silly to mount a zpool while boot From: tech mailinglists To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 11:26:43 -0000 Hello all, I think that I only can be an idiot to get in such a problem but I am not able to mount a zpool via fstab while boot. I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount at /home). I now tried everything I could find. So my fstab entry looks like this: home /home zfs rw,late 0 0 The real problem is that after a reboot the zpool is no longer imported, I really don't know why I always have to reimport the pool via zpool import -d /dev home. Because of this the filesystem never can be mounted via fstab while boot and I get dropped into a shell where I need to do this always manually. So why the pool always isn't imported after boot and how can I solve this issue? And is the fstab entry correct itself? So would it work when the pool gets imported with it's name befor the fstab entry is parsed? Hope that someone give me a few hints or a solution. Best Regards From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 11:28:20 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C279F497; Fri, 1 Mar 2013 11:28:20 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [46.4.40.135]) by mx1.freebsd.org (Postfix) with ESMTP id 82D62854; Fri, 1 Mar 2013 11:28:20 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 9F5BE4AC59; Fri, 1 Mar 2013 15:28:02 +0400 (MSK) Date: Fri, 1 Mar 2013 15:27:56 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <612776324.20130301152756@serebryakov.spb.ru> To: Ivan Voras , Don Lewis Subject: Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS! In-Reply-To: References: <1796551389.20130228120630@serebryakov.spb.ru> <1238720635.20130228123325@serebryakov.spb.ru> <1158712592.20130228141323@serebryakov.spb.ru> <583012022.20130228143129@serebryakov.spb.ru> <1502041051.20130228185647@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 11:28:20 -0000 Hello, Ivan. You wrote 28 =D1=84=D0=B5=D0=B2=D1=80=D0=B0=D0=BB=D1=8F 2013 =D0=B3., 21:01= :46: >> One time, Kirk say, that delayed writes are Ok for SU until bottom >> layer doesn't lie about operation completeness. geom_raid5 could >> delay writes (in hope that next writes will combine nicely and allow >> not to do read-calculate-write cycle for read alone), but it never >> mark BIO complete until it is really completed (layers down to >> geom_raid5 returns completion). So, every BIO in wait queue is "in >> flight" from GEOM/VFS point of view. Maybe, it is fatal for journal :( IV> It shouldn't be - it could be a bug. I understand, that it proves nothing, but I've tried to repeat "previous crash corrupt FS in journal-undetectable way" theory by killing virtual system when there is massive writing to geom_radi5-based FS (on virtual drives, unfortunately). I've done 15 tries (as it is manual testing, it takes about 1-1.5 hours total), but every time FS was Ok after double-fsck (first with journal and last without one). Of course, there was MASSIVE loss of data, as timeout and size of cache in geom_raid5 was set very high (sometimes FS becomes empty after unpacking 50% of SVN mirror seed, crash and check) but FS was consistent every time! --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 12:03:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8519FA36 for ; Fri, 1 Mar 2013 12:03:18 +0000 (UTC) (envelope-from peter.maloney@brockmann-consult.de) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.17.10]) by mx1.freebsd.org (Postfix) with ESMTP id 065039AC for ; Fri, 1 Mar 2013 12:03:17 +0000 (UTC) Received: from [10.3.0.26] ([141.4.215.32]) by mrelayeu.kundenserver.de (node=mreu3) with ESMTP (Nemesis) id 0MQYci-1UNazE271Y-00Tot0; Fri, 01 Mar 2013 13:03:12 +0100 Message-ID: <513098FF.8030806@brockmann-consult.de> Date: Fri, 01 Mar 2013 13:03:11 +0100 From: Peter Maloney User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: I am to silly to mount a zpool while boot References: In-Reply-To: X-Enigmail-Version: 1.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Provags-ID: V02:K0:0xjhKHGwD7FWeX/IoJiN7QakZcMgAwKNFpFVsLn9faq nvvChqD6Ok0SXu+Cx+ylPlcVKLvUkXzuEJtT8z4gFfYIj09rOn +A99LXg8139dsqhaj72hY2FDuOK7naoDkMb3AYIpg0v2KcmLi+ 9qRUZbsC874NabiSGPihDQ8BC0oTScc8vgIqwtdULQBDoO8F4x AoA0I2sW1MhtsPoXahO5JGi0eXCCI3KXSyPQMuMmOZQvGwRr5U ABZ+RiOD7KDUmVFkSnwRWzWEQ7RhzzelvvxuMHG+bPveygrJIs OZGD6qlRsBmrOCnYvKLJG08tfwWxgay9C89Xh0Pd3/HtHtOgzd 2nTiEKms0z72CUDtdmBN3v5mrg+Q1rvKq6imq3Eu0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 12:03:18 -0000 For the mount, don't use fstab. use: zfs set mountpoint=/home poolname/path/to/dataset And for the import, add zfs_enable="YES" to rc.conf. And I think that's it. (all my FreeBSD systems are pure zfs, so not sure what troubles you would get if you had UFS on root) On 2013-03-01 12:26, tech mailinglists wrote: > Hello all, > > I think that I only can be an idiot to get in such a problem but I am > not able to mount a zpool via fstab while boot. > > I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for > /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount > at /home). > > I now tried everything I could find. So my fstab entry looks like this: > > home /home zfs rw,late 0 0 > > The real problem is that after a reboot the zpool is no longer > imported, I really don't know why I always have to reimport the pool > via zpool import -d /dev home. Because of this the filesystem never > can be mounted via fstab while boot and I get dropped into a shell > where I need to do this always manually. > > So why the pool always isn't imported after boot and how can I solve this issue? > > And is the fstab entry correct itself? So would it work when the pool > gets imported with it's name befor the fstab entry is parsed? > > Hope that someone give me a few hints or a solution. > > Best Regards > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- -------------------------------------------- Peter Maloney Brockmann Consult Max-Planck-Str. 2 21502 Geesthacht Germany Tel: +49 4152 889 300 Fax: +49 4152 889 333 E-mail: peter.maloney@brockmann-consult.de Internet: http://www.brockmann-consult.de -------------------------------------------- From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 12:12:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 114DFB00 for ; Fri, 1 Mar 2013 12:12:10 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-vc0-f173.google.com (mail-vc0-f173.google.com [209.85.220.173]) by mx1.freebsd.org (Postfix) with ESMTP id AA8C09E9 for ; Fri, 1 Mar 2013 12:12:09 +0000 (UTC) Received: by mail-vc0-f173.google.com with SMTP id fy27so1911125vcb.18 for ; Fri, 01 Mar 2013 04:12:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=ZKZJFHMdL08yFQGdOAjRuuCu7YP/ZcMNWY0MhF5phS0=; b=c95mfvf4ci+IyQ1OHScSxunzP3Zg+SGquJJ9o8CBfPC0Ka1xE9mg88dnoP+hCs79k9 Y6JbAvsFoWIxaVbqm6E/25XgWEQKnA56tp0Q667Sel3nsApI1jYwwryZNjKsl1ZWRS5E uq6/ElP56cj7ztSuT/s1bB6qxy13dI4J6ajDt6kaT8k2uPYjiW1ly3oT3SguAP6E64vV ndvdop9NqCvOCDE1gZ90/vOithiZ5+D9a7nQjmrGIwrn41M7stMoajEQiEcSESZwQAN3 1gOVIXWIc/i4/nBml2BdwznCynO+Q0kQXfXbdSKt0oMwVUSaDZ27yAaRqo+Zof94pKuP 2ctA== MIME-Version: 1.0 X-Received: by 10.58.205.179 with SMTP id lh19mr4025462vec.7.1362139923599; Fri, 01 Mar 2013 04:12:03 -0800 (PST) Received: by 10.58.223.170 with HTTP; Fri, 1 Mar 2013 04:12:03 -0800 (PST) In-Reply-To: <513098FF.8030806@brockmann-consult.de> References: <513098FF.8030806@brockmann-consult.de> Date: Fri, 1 Mar 2013 12:12:03 +0000 Message-ID: Subject: Re: I am to silly to mount a zpool while boot From: Tom Evans To: Peter Maloney Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 12:12:10 -0000 On Fri, Mar 1, 2013 at 12:03 PM, Peter Maloney wrote: > For the mount, don't use fstab. use: > > zfs set mountpoint=/home poolname/path/to/dataset > > And for the import, add > > zfs_enable="YES" > > to rc.conf. > > > And I think that's it. (all my FreeBSD systems are pure zfs, so not sure > what troubles you would get if you had UFS on root) > I have UFS root, ZFS for /usr, /var etc, due to BIOS/loader issues when initially trying to get ZFS boot working on this box. This is the total contents of fstab: /dev/gpt/root / ufs rw 1 1 /dev/gpt/swap1 none swap sw 0 0 /dev/gpt/swap2 none swap sw 0 0 The ZFS fs is mounted by the mountpoint property: > $ zfs get mountpoint tank NAME PROPERTY VALUE SOURCE tank mountpoint /tank default ZFS is loaded as usual, by adding zfs_load="YES" to /boot/loader.conf and zfs_enable="YES" to /etc/rc.conf Hope that helps Tom From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 14:57:34 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 34C0627C; Fri, 1 Mar 2013 14:57:34 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id BF5051C5; Fri, 1 Mar 2013 14:57:33 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAILBMFGDaFvO/2dsb2JhbABEhk+7ZIESc4IfAQEFIwRSGw4KAgINGQJZBhOIE65oki6BI4wqgRM0B4ItgRMDiGuNWIljhwiDJoFLPg X-IronPort-AV: E=Sophos;i="4.84,761,1355115600"; d="scan'208";a="18880314" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 01 Mar 2013 09:57:26 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id BCFB9B3F13; Fri, 1 Mar 2013 09:57:26 -0500 (EST) Date: Fri, 1 Mar 2013 09:57:26 -0500 (EST) From: Rick Macklem To: Sergey Kandaurov Message-ID: <298688524.3444408.1362149846756.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: Subject: Re: should vn_fullpath1() ever return a path with "." in it? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 14:57:34 -0000 Sergey Kandaurov wrote: > On 1 March 2013 04:58, Rick Macklem wrote: > > Kostik Belousov wrote: > >> On Wed, Feb 27, 2013 at 09:59:22PM -0500, Rick Macklem wrote: > >> > Hi, > >> > > >> > Sergey Kandaurov reported a problem where getcwd() returns a > >> > path with "/./" imbedded in it for an NFSv4 mount. This is > >> > caused by a mount point crossing on the server when at the > >> > server's root because vn_fullpath1() uses VV_ROOT to spot > >> > mount point crossings. > >> > > >> > The current workaround is to use the sysctls: > >> > debug.disablegetcwd=1 > >> > debug.disablefullpath=1 > >> > > >> > However, it would be nice to fix this when vn_fullpath1() > >> > is being used. > >> > > >> > A simple fix is to have vn_fullpath1() fail when it finds > >> > "." as a directory match in the path. When vn_fullpath1() > >> > fails, the syscalls fail and that allows the libc algorithm > >> > to be used (which works for this case because it doesn't > >> > depend on VV_ROOT being set, etc). > >> > > >> > So, I am wondering if a patch (I have attached one) that > >> > makes vn_fullpath1() fail when it matches "." will break > >> > anything else? (I don't think so, since the code checks > >> > for VV_ROOT in the loop above the check for a match of > >> > ".", but I am not sure?) > >> > > >> > Thanks for any input w.r.t. this, rick > >> > >> > --- kern/vfs_cache.c.sav 2013-02-27 20:44:42.000000000 -0500 > >> > +++ kern/vfs_cache.c 2013-02-27 21:10:39.000000000 -0500 > >> > @@ -1333,6 +1333,20 @@ vn_fullpath1(struct thread *td, struct v > >> > startvp, NULL, 0, 0); > >> > break; > >> > } > >> > + if (buf[buflen] == '.' && (buf[buflen + 1] == '\0' || > >> > + buf[buflen + 1] == '/')) { > >> > + /* > >> > + * Fail if it matched ".". This should only happen > >> > + * for NFSv4 mounts that cross server mount points. > >> > + */ > >> > + CACHE_RUNLOCK(); > >> > + vrele(vp); > >> > + numfullpathfail1++; > >> > + error = ENOENT; > >> > + SDT_PROBE(vfs, namecache, fullpath, return, > >> > + error, vp, NULL, 0, 0); > >> > + break; > >> > + } > >> > buf[--buflen] = '/'; > >> > slash_prefixed = 1; > >> > } > >> > >> I do not quite understand this. Did the dvp (parent) vnode returned > >> by > >> VOP_VPTOCNP() equal to vp (child) vnode in the case of the "." name > >> ? > >> It must be, for the correct operation, but also it should cause the > >> almost > >> infinite loop in the vn_fullpath1(). The loop is not really > >> infinite > >> due > >> to a limited size of the buffer where the infinite amount of "./" > >> is > >> placed. > >> > >> Anyway, I think we should do better than this patch, even if it is > >> legitimate. I think that the better place to check the condition is > >> the > >> default implementation of VOP_VPTOCNP(). Am I right that this is > >> where > >> it broke for you ? > >> > >> diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c > >> index 00d064e..1dd0185 100644 > >> --- a/sys/kern/vfs_default.c > >> +++ b/sys/kern/vfs_default.c > >> @@ -856,8 +856,12 @@ vop_stdvptocnp(struct vop_vptocnp_args *ap) > >> error = ENOMEM; > >> goto out; > >> } > >> - bcopy(dp->d_name, buf + i, dp->d_namlen); > >> - error = 0; > >> + if (dp->d_namlen == 1 && dp->d_name[0] == '.') { > >> + error = ENOENT; > >> + } else { > >> + bcopy(dp->d_name, buf + i, dp->d_namlen); > >> + error = 0; > >> + } > >> goto out; > >> } > >> } while (len > 0 || !eofflag); > > > > Yes, this patch fixes the problem too. If you think it is safe to > > do this, I can commit the patch in mid-April. Maybe Sergey can > > test it? > > > > Thanks yet again, rick > > > > Hi Rick > Sorry but I am no longer able to test NFSv4. > No problem. I can reproduce the problem, so I think it's fine w.r.t. testing to see if it fixes the bug. Thanks, rick > -- > wbr, > pluknet From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 16:54:25 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B200DCEE for ; Fri, 1 Mar 2013 16:54:25 +0000 (UTC) (envelope-from dean.jones@oregonstate.edu) Received: from smtp1.oregonstate.edu (smtp1.oregonstate.edu [128.193.15.35]) by mx1.freebsd.org (Postfix) with ESMTP id 828CF990 for ; Fri, 1 Mar 2013 16:54:25 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.oregonstate.edu (Postfix) with ESMTP id 3DA573E3F4 for ; Fri, 1 Mar 2013 08:53:18 -0800 (PST) X-Virus-Scanned: amavisd-new at oregonstate.edu Received: from smtp1.oregonstate.edu ([127.0.0.1]) by localhost (smtp.oregonstate.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id wdsGWVbxZvSx for ; Fri, 1 Mar 2013 08:53:18 -0800 (PST) Received: from mail-ia0-f181.google.com (mail-ia0-f181.google.com [209.85.210.181]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by smtp1.oregonstate.edu (Postfix) with ESMTPSA id EA1973E4E8 for ; Fri, 1 Mar 2013 08:53:17 -0800 (PST) Received: by mail-ia0-f181.google.com with SMTP id w33so2770263iag.40 for ; Fri, 01 Mar 2013 08:53:17 -0800 (PST) X-Received: by 10.50.193.200 with SMTP id hq8mr13075152igc.101.1362156797089; Fri, 01 Mar 2013 08:53:17 -0800 (PST) MIME-Version: 1.0 Received: by 10.64.33.161 with HTTP; Fri, 1 Mar 2013 08:52:56 -0800 (PST) In-Reply-To: References: <512FE773.3060903@physics.umn.edu> From: Dean Jones Date: Fri, 1 Mar 2013 08:52:56 -0800 Message-ID: Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers To: Freddie Cash Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 16:54:25 -0000 On Thu, Feb 28, 2013 at 7:30 PM, Freddie Cash wrote: > You label the drive with something that tells you: > - enclosure > - column > - row > > For example, we use the following pattern: encX-A-# > > Where X tells you which enclosure it's in, A tells you which column it's in > (letters start at A increasing to the right), and # tells you the disk in > the column, numbered top-down. > > Whether you label the entire drive using glabel or just a GPT partition is > up to you. We use GPT labels. > I like your labeling convention. I'll add that glabel is FreeBSD specific, so if a pool might ever be imported under another OS that GPT labels are universal. From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 18:00:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C33ED3C1; Fri, 1 Mar 2013 18:00:54 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452]) by mx1.freebsd.org (Postfix) with ESMTP id 9DEE5D75; Fri, 1 Mar 2013 18:00:54 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id r21I0pBD034998; Fri, 1 Mar 2013 10:00:51 -0800 (PST) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201303011800.r21I0pBD034998@chez.mckusick.com> To: lev@freebsd.org Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) In-reply-to: <352538988.20130301102237@serebryakov.spb.ru> Date: Fri, 01 Mar 2013 10:00:51 -0800 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: freebsd-fs@freebsd.org, Don Lewis X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 18:00:54 -0000 > Date: Fri, 1 Mar 2013 10:22:37 +0400 > From: Lev Serebryakov > To: Don Lewis > Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- > Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org > > DL> The fact that the filesystem code called panic() indicates that the > DL> filesystem was already corrupt by that point. That's a likely reason > DL> for fsck complaining about the unexpected SU+J inconsistency. > > DL> Incorrect write ordering that allowed the filesystem to become > DL> inconsistent because some pending writes were lost because of the panic > DL> might not be necessary, but this might have allowed an earlier crash > DL> where a full fsck was skipped to leave the filesystem in this state. > As far, as I understand, if this theory is right (file system > corruption which left unnoticed by "standard" fsck), it is bug in FFS > SU+J too, as it should not be corrupted by reordered writes (if > writes is properly reported as completed even if they were > reordered). If the bitmaps are left corrupted (in particular if blocks are marked free that are actually in use), then that panic can occur. Such a state should never be possible when running with SU even if you have crashed multiple times and restarted without running fsck. To reduce the number of possible points of failure, I suggest that you try running with just SU (i.e., turn off the SU+J jornalling). you can do this with `tunefs -j disable /dev/fsdisk'. This will turn off journalling, but not soft updates. You can verify this by then running `tunefs -p /dev/fsdisk' to ensure that soft updates are still enabled. As you have already stated, the filesystem is fine with reordered writes provided that they are not completed (iodone) until they are well and truely on the disk. > DL> This panic might also be a result of the bug fixed in 246877, but I have > DL> my doubts about that. > It was not MFCed :( > > -- > // Black Lion AKA Lev Serebryakov I will MFC 246876 and 246877 once they have been in head long enough to have confidence that they will not cause trouble. That means at least a month (well more than the two weeks they have presently been there). Note these changes only pass the barrier request down to the GEOM layer. I don't know whether it actually makes it to the drive layer and if it does whether the drive layer actually implements it. My goal was to get the ball rolling. Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 20:16:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D810FDBB for ; Fri, 1 Mar 2013 20:16:53 +0000 (UTC) (envelope-from gperez@entel.upc.edu) Received: from violet.upc.es (violet.upc.es [147.83.2.51]) by mx1.freebsd.org (Postfix) with ESMTP id 694901598 for ; Fri, 1 Mar 2013 20:16:52 +0000 (UTC) Received: from ackerman2.upc.es (ackerman2.upc.es [147.83.2.244]) by violet.upc.es (8.14.1/8.13.1) with ESMTP id r21KGjGb004662 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL); Fri, 1 Mar 2013 21:16:45 +0100 Received: from [192.168.1.110] (247.Red-81-39-132.dynamicIP.rima-tde.net [81.39.132.247]) (authenticated bits=0) by ackerman2.upc.es (8.14.4/8.14.4) with ESMTP id r21KGhh4008484 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 1 Mar 2013 21:16:45 +0100 Message-ID: <51310CAA.1020701@entel.upc.edu> Date: Fri, 01 Mar 2013 21:16:42 +0100 From: =?ISO-8859-1?Q?Gustau_P=E9rez_i_Querol?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130220 Thunderbird/17.0.3 MIME-Version: 1.0 To: Graham Allan Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers References: <512FE773.3060903@physics.umn.edu> In-Reply-To: <512FE773.3060903@physics.umn.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.70 on 147.83.2.244 X-Mail-Scanned: Criba 2.0 + Clamd X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (violet.upc.es [147.83.2.51]); Fri, 01 Mar 2013 21:16:46 +0100 (CET) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 20:16:53 -0000 Al 01/03/2013 00:25, En/na Graham Allan ha escrit: > Sorry to come in late on this thread but I've been struggling with > thinking about the same issue, from a different perspective. > > Several months ago we created our first "large" ZFS storage system, > using 42 drives plus a few SSDs in one of the oft-used Supermicro > 45-drive chassis. It has been working really nicely but has led to > some puzzling over the best way to do some things when we build more. > > We made our pool using geom drive labels. Ever since, I've been > wondering if this really gives any advantage - at least for this type > of system. If you need to replace a drive, you don't really know which > enclosure slot any given da device is, and so our answer has been to > dig around using sg3_utils commands wrapped in a bit of perl, to try > and correlate the da device to the slot via the drive serial number. > > At this point, having a geom label just seems like an extra bit of > indirection to increase my confusion :-) Although setting the geom > label to the drive serial number might be a serious improvement... > > We're about to add a couple more of these shelves to the system, > giving a total of 135 drives (although each shelf would be a separate > pool), and given that they will be standard consumer grade drives, > some frequency of replacement is a given. > > Does anyone have any good tips on how to manage a large number of > drives in a zfs pool like this? > I don't have such a large array, I have about 8 or 10 drives at most but I'd go with Freddie's convention. I'd also go with GPT labels instead of geom labels because the former are universal. I'd also ensure that you can easily identify driver with leds. Either by issuing commands to the disk controller (I use mfiutil to visually identify them) or by using ses, but you probably have though. Greets, Gustau -- Salut i força, Gustau --------------------------------------------------------------------------- Prou top-posting : http://ca.wikipedia.org/wiki/Top-posting Stop top-posting : http://en.wikipedia.org/wiki/Posting_style O O O Gustau Pérez i Querol O O O Unitat de Gestió dels departaments O O O Matemàtica Aplicada IV i Enginyeria Telemàtica Universitat Politècnica de Catalunya Edifici C3 - Despatx S101-B UPC Campus Nord UPC C/ Jordi Girona, 1-3 08034 - Barcelona From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 20:23:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EDABBEBE; Fri, 1 Mar 2013 20:23:01 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7B1F115E9; Fri, 1 Mar 2013 20:23:01 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:9421:367:9d7d:512b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 65EFB4AC58; Sat, 2 Mar 2013 00:22:51 +0400 (MSK) Date: Sat, 2 Mar 2013 00:22:44 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1352492388.20130302002244@serebryakov.spb.ru> To: Kirk McKusick Subject: Re: Panic in ffs_valloc (Was: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!) In-Reply-To: <201303011800.r21I0pBD034998@chez.mckusick.com> References: <352538988.20130301102237@serebryakov.spb.ru> <201303011800.r21I0pBD034998@chez.mckusick.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, Don Lewis X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 20:23:02 -0000 Hello, Kirk. You wrote 1 =D0=BC=D0=B0=D1=80=D1=82=D0=B0 2013 =D0=B3., 22:00:51: >> As far, as I understand, if this theory is right (file system >> corruption which left unnoticed by "standard" fsck), it is bug in FFS >> SU+J too, as it should not be corrupted by reordered writes (if >> writes is properly reported as completed even if they were >> reordered). KM> If the bitmaps are left corrupted (in particular if blocks are marked KM> free that are actually in use), then that panic can occur. Such a state KM> should never be possible when running with SU even if you have crashed KM> multiple times and restarted without running fsck. I run fsck every time (ok, every half-year) when server crashes due to my awkward experiments on live system, but I run it as it runs: with journal after upgrade to 9-STABLE, not full old-fashioned run. KM> To reduce the number of possible points of failure, I suggest that KM> you try running with just SU (i.e., turn off the SU+J jornalling). KM> you can do this with `tunefs -j disable /dev/fsdisk'. This will KM> turn off journalling, but not soft updates. You can verify this KM> by then running `tunefs -p /dev/fsdisk' to ensure that soft updates KM> are still enabled. And wait another half a year :) I'm trying to reproduce this situation on VM (VirtualBox with virtual HDDs), but no luck (yet?). KM> I will MFC 246876 and 246877 once they have been in head long enough KM> to have confidence that they will not cause trouble. That means at KM> least a month (well more than the two weeks they have presently been KM> there). KM> Note these changes only pass the barrier request down to the GEOM KM> layer. I don't know whether it actually makes it to the drive layer KM> and if it does whether the drive layer actually implements it. My KM> goal was to get the ball rolling. I'm have controversial feelings about this barriers. IMHO, all writes to UFS (FFS) could and should be divided into two classes: data writes and metadata (including journal, as FFS doesn't have data journaling) writes. IMHO (it is last time I type these 4 letters, but, please, add it when you read this before and after each my sentence, as I'm not FS expert at any grade), data writes could be done as best effort till fsync() is called (or file is opened with appropriate flag, which is equivalent to automatic fsync() after each write). They could be delayed, reordered, etc. But metadata should have some strong guarantees (and fsync()'ed data too, of course). Such division could allow best possible performance & consistent FS metadata (maybe not consistent user data -- but every application which needs strong guarantees, like RDBMS, use fsync() anyway). Now you add "BARRIER" write. It looks too strong to use it often. It will force writing of ALL data from caches, even if your intention is to write only 2 or 3 blocks of metadata. It could solve problems with FS metadata, but it will degrade performance, especially in multithreaded load. Update of inode map for creating 0 bytes file flag by one process (protected with barrier) will flush whole data cache (maybe, hundred of meagbytes) of other one. It is better than noting, but, it is not best solution. Every write should be marked as "critical" or "loose" and critical-marked buffers (BIOs) must be written ASAP and before all other _crtitcal_ BIOs (not all BIOs after it with or without flag). So, barrier should affect only other barriers (ordered writes). Default, "loose" semantic (for data) will exactly what we have now. It is very hard to implement contract "It only ensure that buffers written before that buffer will get to the media before any buffers written after that buffer" in any other way but full flush, which, as I stated above, will hurt performace in such cases as effective RAID5-like implementations which gain a lot from combining wrties together by spatial (not time) property. And for full flush (which is needed sometimes, of course) we already have BIO_FLUSH command. Anyway, I'll support new semantic in geom_raid5 ASAP. But, unfortunately, now it could be supported as it is simple write followed by BIO_FLUSH -- not very effective :( --=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 20:43:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2DC6CCCC; Fri, 1 Mar 2013 20:43:53 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-ia0-x22d.google.com (mail-ia0-x22d.google.com [IPv6:2607:f8b0:4001:c02::22d]) by mx1.freebsd.org (Postfix) with ESMTP id E3B8B172A; Fri, 1 Mar 2013 20:43:52 +0000 (UTC) Received: by mail-ia0-f173.google.com with SMTP id h37so3043435iak.18 for ; Fri, 01 Mar 2013 12:43:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=2g9kPW/UNrvBQ2y31CE+9hcRlAGhq335WEw+3Nrux7U=; b=mtWTcfOt5u88QYbcR1TfnYITgplpKRngh+tX1j3IyJ9Ddgnj8QrjOAQKyuH3VQhR5N X7SBcot2NCug+7abb5BdSiXRo/Zmf8hOYe8BegYoL9IsbGqIqmRphUJbejMsGPjaA5vb zcPjSXESytJLwGdTy33JSWfc7X/Yq2dvHbEfWGbgc2O2cn1giKZA3J2QuWRR9PYfQSw7 pI553v+fvGwVDluF0JuDjbJNJTdrchXlySfDB6xV3rqp0YCsxVjzCU9PTyvkbKC4pbXE cC3vRYbwjHxa4GzLBiLMKG5Fq7WQFYg74Vox5QpyXdwnhgbSkG37aE3AsA2qRuYFN0uU bMSA== X-Received: by 10.42.126.133 with SMTP id e5mr7497058ics.17.1362170632573; Fri, 01 Mar 2013 12:43:52 -0800 (PST) MIME-Version: 1.0 Received: by 10.64.63.12 with HTTP; Fri, 1 Mar 2013 12:43:22 -0800 (PST) In-Reply-To: References: <20130121221617.GA23909@icarus.home.lan> <50FED818.7070704@FreeBSD.org> <20130125083619.GA51096@icarus.home.lan> <20130125211232.GA3037@icarus.home.lan> <20130125212559.GA1772@icarus.home.lan> <20130125213209.GA1858@icarus.home.lan> <20130126011754.GA1806@icarus.home.lan> <51267055.3040500@FreeBSD.org> From: Chris Rees Date: Fri, 1 Mar 2013 20:43:22 +0000 Message-ID: Subject: Re: disk "flipped" - a known problem? To: mav@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Jeremy Chadwick , "freebsd-fs@freebsd.org" , Andriy Gapon X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 20:43:53 -0000 On 23 February 2013 09:39, Chris Rees wrote: > > On 21 Feb 2013 19:07, "Alexander Motin" wrote: >> >> On 26.01.2013 03:17, Jeremy Chadwick wrote: >> > Okay, I've figured out the exact, 100% reproducible condition that >> > causes the situation. It took me a lot of tries and a digital pocket >> > recorder to take verbal notes (there are just too many things to look at >> > simultaneously), but I've figured it out. >> > >> > I'm sorry for the verbosity, but it's necessary. >> > >> > Assume the disk we're talking about is /dev/ada5. >> > >> > 1. Prior to any issues, we have this: >> > >> > root@icarus:~ # ls -l /dev/ada5* /dev/xpt* /dev/pass5* >> > crw-r----- 1 root operator 0x8c Jan 25 16:41 /dev/ada5 >> > crw------- 1 root operator 0x75 Jan 25 16:35 /dev/pass5 >> > crw------- 1 root operator 0x51 Jan 25 16:35 /dev/xpt0 >> > >> > 2. ada5 begins experiencing issues -- ATA commands (CDBs) submit do not >> > get a response (not going to discuss how/why that can happen). >> > >> > 3. These types of messages are seen on console (naturally the CDB and >> > request type will vary -- in this case it was because I was doing the dd >> > zero'ing, thus tickling the bad sector/naughty firmware on the drive): >> > >> > Jan 25 16:29:28 icarus kernel: ahcich5: Timeout on slot 0 port 0 >> > Jan 25 16:29:28 icarus kernel: ahcich5: is 00000000 cs 00000000 ss >> > 00000001 rs 00000001 tfd 40 serr 00000000 cmd 0004c017 >> > Jan 25 16:29:28 icarus kernel: ahcich5: AHCI reset... >> > Jan 25 16:29:28 icarus kernel: ahcich5: SATA connect time=1000us >> > status=00000113 >> > Jan 25 16:29:28 icarus kernel: ahcich5: AHCI reset: device found >> > Jan 25 16:29:28 icarus kernel: (ada5:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. >> > ACB: 61 80 80 77 01 40 00 00 00 00 00 00 >> > Jan 25 16:29:28 icarus kernel: (ada5:ahcich5:0:0:0): CAM status: Command >> > timeout >> > Jan 25 16:29:28 icarus kernel: (ada5:ahcich5:0:0:0): Retrying command >> > >> > 4. Any I/O submit to ada5 during this time blocks (this is normal). >> > >> > 5. **While this situation is happening**, something using xpt(4) >> > attempts to submit a CDB to the disk (ex. smartctl -a /dev/ada5). >> > This request also blocks (again, normal). >> > >> > 6. Physical device falls off bus, or CAM kicks the disk off the bus. >> > Doesn't matter which. We see messages resembling this (boy am I tired >> > of this interspersed output problem): >> > >> > Jan 25 16:29:32 icarus kernel: (ada5:ahcich5:0:0:0): lost device >> > Jan 25 16:29:32 icarus kernel: (pass5:ahcich5:0:0:0): lost device >> > Jan 25 16:29:32 icarus kernel: (ada5:ahcich5:0:0:0): removing device >> > entry >> > Jan 25 16:29:32 icarus kernel: (pass5:ahcich5:0:0:0): passdevgonecb: >> > devfs entry is gone >> > >> > 7. Standard I/O requests fail with errno=6 "Device not configured". >> > xpt(4) requests also fail with the same errno. >> > >> > 8. Device-wise, at this stage all we have is: >> > >> > root@icarus:~ # ls -l /dev/ada5* /dev/xpt* /dev/pass5* >> > crw------- 1 root operator 0x51 Jan 25 16:35 /dev/xpt0 >> > >> > 9. Device comes back online for whatever reason. FreeBSD sees the disk, >> > blah blah blah: >> > >> > Jan 25 16:30:16 icarus kernel: GEOM: new disk ada5 >> > Jan 25 16:30:16 icarus kernel: ada5: >> > ATA-7 SATA 1.x device >> > Jan 25 16:30:16 icarus kernel: ada5: Serial Number WD-WMAP41573589 >> > Jan 25 16:30:16 icarus kernel: ada5: 150.000MB/s transfers (SATA 1.x, >> > UDMA6, PIO 8192bytes) >> > Jan 25 16:30:16 icarus kernel: ada5: Command Queueing enabled >> > Jan 25 16:30:16 icarus kernel: ada5: 143089MB (293046768 512 byte >> > sectors: 16H 63S/T 16383C) >> > Jan 25 16:30:16 icarus kernel: ada5: Previously was known as ad14 >> > >> > ...um, where's pass5? >> > >> > 10. /dev/pass5 is now completely (permanently) missing: >> > >> > root@icarus:~ # ls -l /dev/ada5* /dev/xpt* /dev/pass5* >> > crw-r----- 1 root operator 0x99 Jan 25 16:42 /dev/ada5 >> > crw------- 1 root operator 0x51 Jan 25 16:35 /dev/xpt0 >> > >> > 11. Any further attempts to communicate via xpt(4) with ada5 fail. >> > Detaching and reattaching the disk does not fix the issue; the only fix >> > is to reboot the system. >> > >> > 12. "camcontrol debug -IPXp scbus5" results in tons and tons of output >> > all pertaining to xpt(4). It looks like xpt(4) is in some kind of >> > loop. >> > >> > Below is my verbose boot (with non-kernel things removed), which >> > also includes "camcontrol debug" output once things are in a bad state: >> > >> > http://jdc.koitsu.org/freebsd/xpt_oddity.log >> > >> > In this log you'll see that after 1 CAM timeout I yanked the drive, then >> > roughly 30 seconds later reinserted it. >> > >> > If you need me to turn on CAM debugging *prior* to the above, I can do >> > that, just let me know. >> > >> > The important step is #5. Without that, the problem shown in #9/10/11 >> > does not happen. >> > >> > It's a good thing I don't run smartd(8) -- most users I see using that >> > software set the interval to something like 180s or 60s. Imagine this >> > frustration: "okay so the disk fell off the bus, but what, now I can't >> > talk to it with SMART? Uhhh... Err, works now? Whatever". >> >> I think, the problem may already be fixed in HEAD by r244014 by ken@. >> I've just merged it to 9-STABLE at r247115. So if it is still possible >> to reproduce the situation, it would be good to try. > > I think I've been having the same troubles since upgrading from 9.0, so I'm > going to try applying that to 9.1-R and I'll also give feedback. Yup, I no longer get weird disconnects after this patch (5 days later now). Thank you very much! Chris From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 22:31:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9DF74944 for ; Fri, 1 Mar 2013 22:31:23 +0000 (UTC) (envelope-from mailinglists.tech@gmail.com) Received: from mail-ea0-x235.google.com (mail-ea0-x235.google.com [IPv6:2a00:1450:4013:c01::235]) by mx1.freebsd.org (Postfix) with ESMTP id 208CB1AAB for ; Fri, 1 Mar 2013 22:31:22 +0000 (UTC) Received: by mail-ea0-f181.google.com with SMTP id i13so417348eaa.40 for ; Fri, 01 Mar 2013 14:31:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=pZBvZY3F7igvyYzyt7hHx5fcHOhoKTgqbulafVkaIGI=; b=bf2nIe/kZBgyVx1PjUBPRoDlydin0poLWz8q7tG1BeO33XZS7B3gq2IgUgUZW/QqnE XQ0kunX34OJuuUSTnd8C4VtAip1avhphC2pd+8DfgPTudrsrkgJZt/GgaqWHiLCTJU9E rHFbN1ZQeyQpgnCXFcJXKsiKFPtetes0XGnkzkTkoddrYrLinvIfxzs+nmjli4ckBk4o IRUaar/txFFV+T80P0jFnOHhBy3Kv4z6jAnAuipvcv9qT4ObMhD59vAngbwExAILbuRc sQjGK2XMVwOG6Xe/yfkavO7u9Gf8Py4ac+Wkr6PtSJGbT6yWWM4bZwpnpEkEILfI91pR JYUQ== X-Received: by 10.14.3.70 with SMTP id 46mr32385815eeg.2.1362177082289; Fri, 01 Mar 2013 14:31:22 -0800 (PST) Received: from [127.0.0.1] (ashlynn.lippux.de. [5.9.218.242]) by mx.google.com with ESMTPS id 3sm19345585eej.6.2013.03.01.14.31.20 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 01 Mar 2013 14:31:21 -0800 (PST) Message-ID: <51312C32.6000207@gmail.com> Date: Fri, 01 Mar 2013 23:31:14 +0100 From: tech mailinglists User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Peter Maloney Subject: Re: I am to silly to mount a zpool while boot References: <513098FF.8030806@brockmann-consult.de> In-Reply-To: <513098FF.8030806@brockmann-consult.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 22:31:23 -0000 Am 01.03.2013 13:03, schrieb Peter Maloney: > For the mount, don't use fstab. use: > > zfs set mountpoint=/home poolname/path/to/dataset > > And for the import, add > > zfs_enable="YES" > > to rc.conf. > > > And I think that's it. (all my FreeBSD systems are pure zfs, so not sure > what troubles you would get if you had UFS on root) > > > On 2013-03-01 12:26, tech mailinglists wrote: >> Hello all, >> >> I think that I only can be an idiot to get in such a problem but I am >> not able to mount a zpool via fstab while boot. >> >> I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for >> /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount >> at /home). >> >> I now tried everything I could find. So my fstab entry looks like this: >> >> home /home zfs rw,late 0 0 >> >> The real problem is that after a reboot the zpool is no longer >> imported, I really don't know why I always have to reimport the pool >> via zpool import -d /dev home. Because of this the filesystem never >> can be mounted via fstab while boot and I get dropped into a shell >> where I need to do this always manually. >> >> So why the pool always isn't imported after boot and how can I solve this issue? >> >> And is the fstab entry correct itself? So would it work when the pool >> gets imported with it's name befor the fstab entry is parsed? >> >> Hope that someone give me a few hints or a solution. >> >> Best Regards >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > Hello all, a few of the things I already had done. But the real problem is I think that the pool doesn't get imported automatically. I read that ZFS searches in special directories when it tries to import. So is there a way to set an option which says that it should search in /dev? I always have to do this after reboot: zpool import -d /dev tank Than tank (pool) gets mounted at /tank and the zvol tank/home gets mounted on /home. So I think that the import of the zpool fails. I have set zfs_enable="YES" in /etc/rc.conf also zfs_load=YES as boot parameter which gets shown in kenv and commented out the fstab entry. So I read that the import normally should work automatically when the module is loaded and zfs is enabled but I think the fact that my pool is located on /dev/xbd2 is the problem. Best Regards From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 22:52:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DB0B5151 for ; Fri, 1 Mar 2013 22:52:27 +0000 (UTC) (envelope-from lkchen@k-state.edu) Received: from ksu-out.merit.edu (ksu-out.merit.edu [207.75.117.132]) by mx1.freebsd.org (Postfix) with ESMTP id A90101B6C for ; Fri, 1 Mar 2013 22:52:27 +0000 (UTC) X-Merit-ExtLoop1: 1 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAFAPgvMVHPS3TT/2dsb2JhbAA6CoZPuRCDYBZzgh8BAQUjYg8aAg0ZAlmILKA5jlWJMohogSOMLxCBWIIXgRMDiGueQ4FSgVSBTD0 X-IronPort-AV: E=Sophos;i="4.84,765,1355115600"; d="scan'208";a="905809341" X-MERIT-SOURCE: KSU Received: from ksu-sfpop-mailstore02.merit.edu ([207.75.116.211]) by sfpop-ironport05.merit.edu with ESMTP; 01 Mar 2013 17:52:20 -0500 Date: Fri, 1 Mar 2013 17:52:20 -0500 (EST) From: "Lawrence K. Chen, P.Eng." To: freebsd-fs@freebsd.org Message-ID: <1602333081.21816316.1362178340105.JavaMail.root@k-state.edu> In-Reply-To: <51310CAA.1020701@entel.upc.edu> Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [129.130.0.181] X-Mailer: Zimbra 7.2.2_GA_2852 (ZimbraWebClient - GC25 ([unknown])/7.2.2_GA_2852) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 22:52:27 -0000 ----- Original Message ----- > Al 01/03/2013 00:25, En/na Graham Allan ha escrit: > > Sorry to come in late on this thread but I've been struggling with > > thinking about the same issue, from a different perspective. > > > > Several months ago we created our first "large" ZFS storage system, > > using 42 drives plus a few SSDs in one of the oft-used Supermicro > > 45-drive chassis. It has been working really nicely but has led to > > some puzzling over the best way to do some things when we build > > more. > > > > We made our pool using geom drive labels. Ever since, I've been > > wondering if this really gives any advantage - at least for this > > type > > of system. If you need to replace a drive, you don't really know > > which > > enclosure slot any given da device is, and so our answer has been > > to > > dig around using sg3_utils commands wrapped in a bit of perl, to > > try > > and correlate the da device to the slot via the drive serial > > number. > > > > At this point, having a geom label just seems like an extra bit of > > indirection to increase my confusion :-) Although setting the geom > > label to the drive serial number might be a serious improvement... > > > > We're about to add a couple more of these shelves to the system, > > giving a total of 135 drives (although each shelf would be a > > separate > > pool), and given that they will be standard consumer grade drives, > > some frequency of replacement is a given. > > > > Does anyone have any good tips on how to manage a large number of > > drives in a zfs pool like this? > > > > I don't have such a large array, I have about 8 or 10 drives at > most but I'd go with Freddie's convention. I'd also go with GPT > labels > instead of geom labels because the former are universal. > > I'd also ensure that you can easily identify driver with leds. > Either by issuing commands to the disk controller (I use mfiutil to > visually identify them) or by using ses, but you probably have > though. > I only have 15 drives...(12 HDDs and 3 SSDs) but the ordering of drives seemed to randomize on every boot (wonder now if the controller was doing some kind of staggering in spin ups. And, their other drivers cope with it. They provide a v1.1 driver for FreeBSD 7.2 or source to the v1.0 driver.) And, then everything moved around when I changed controllers a few times. I had resorted at one point to putting device.hints to force all the drives to keep their mapping. Which caused problems elsewhere, and a mess when I added another controller. But, then I changed to more meaningful GPT labels and exported and re-imported my zpools with '-d /dev/gpt', and now things are ok. L From owner-freebsd-fs@FreeBSD.ORG Fri Mar 1 22:59:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 606F1262 for ; Fri, 1 Mar 2013 22:59:58 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com [209.85.216.48]) by mx1.freebsd.org (Postfix) with ESMTP id 269E71BBD for ; Fri, 1 Mar 2013 22:59:57 +0000 (UTC) Received: by mail-qa0-f48.google.com with SMTP id j8so62186qah.7 for ; Fri, 01 Mar 2013 14:59:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=GhBTFB57zu0Vttuf+Ri9l7x168Ijm2PqKMToODLvdPU=; b=ckpEc/1xClSQ7C06GxFafSH9eCs7OnffZPp+Lz9l1mWTDO0d4E9NBJDRk3sk9Z0B6X epQJtRcWn32a1jccLr1IhcEUWkqEneWTQSUDNzRaPCmIJ7JjtpH8Q/emifGbDW9EtQro ES1BmCZEh4r7HWtto5csh/YBDQx5RR2YPwKpydPgnDau0CzhdNKMbP1dlK/IHGq6Ly84 7w9bAf9JALZW6HK+c8EiXCel0L9a30iDlUqQCOhlNxivASQCYlG1WvH3r4UXtBn03lL3 ORg/OajsyA3BjzblKx24E/O6kzM1+wEf1i59soNLapTCWl/pet0T3V6XNAwbkjadgpk1 Ezdg== MIME-Version: 1.0 X-Received: by 10.224.203.131 with SMTP id fi3mr22428502qab.77.1362178797242; Fri, 01 Mar 2013 14:59:57 -0800 (PST) Received: by 10.49.106.233 with HTTP; Fri, 1 Mar 2013 14:59:57 -0800 (PST) In-Reply-To: <51312C32.6000207@gmail.com> References: <513098FF.8030806@brockmann-consult.de> <51312C32.6000207@gmail.com> Date: Fri, 1 Mar 2013 14:59:57 -0800 Message-ID: Subject: Re: I am to silly to mount a zpool while boot From: Freddie Cash To: tech mailinglists Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Mar 2013 22:59:58 -0000 What's the output of: zfs get mountpoint tank/home On Fri, Mar 1, 2013 at 2:31 PM, tech mailinglists < mailinglists.tech@gmail.com> wrote: > Am 01.03.2013 13:03, schrieb Peter Maloney: > >> For the mount, don't use fstab. use: >> >> zfs set mountpoint=/home poolname/path/to/dataset >> >> And for the import, add >> >> zfs_enable="YES" >> >> to rc.conf. >> >> >> And I think that's it. (all my FreeBSD systems are pure zfs, so not sure >> what troubles you would get if you had UFS on root) >> >> >> On 2013-03-01 12:26, tech mailinglists wrote: >> >>> Hello all, >>> >>> I think that I only can be an idiot to get in such a problem but I am >>> not able to mount a zpool via fstab while boot. >>> >>> I have a FreeBSD i386 PV Xen DomU running with 3 disks xbd0 (ext2 for >>> /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name home to mount >>> at /home). >>> >>> I now tried everything I could find. So my fstab entry looks like this: >>> >>> home /home zfs rw,late 0 0 >>> >>> The real problem is that after a reboot the zpool is no longer >>> imported, I really don't know why I always have to reimport the pool >>> via zpool import -d /dev home. Because of this the filesystem never >>> can be mounted via fstab while boot and I get dropped into a shell >>> where I need to do this always manually. >>> >>> So why the pool always isn't imported after boot and how can I solve >>> this issue? >>> >>> And is the fstab entry correct itself? So would it work when the pool >>> gets imported with it's name befor the fstab entry is parsed? >>> >>> Hope that someone give me a few hints or a solution. >>> >>> Best Regards >>> ______________________________**_________________ >>> freebsd-fs@freebsd.org mailing list >>> http://lists.freebsd.org/**mailman/listinfo/freebsd-fs >>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org >>> " >>> >> >> > Hello all, > > a few of the things I already had done. But the real problem is I think > that the pool doesn't get imported automatically. I read that ZFS searches > in special directories when it tries to import. So is there a way to set an > option which says that it should search in /dev? I always have to do this > after reboot: > > zpool import -d /dev tank > > Than tank (pool) gets mounted at /tank and the zvol tank/home gets mounted > on /home. > > So I think that the import of the zpool fails. I have set zfs_enable="YES" > in /etc/rc.conf also zfs_load=YES as boot parameter which gets shown in > kenv and commented out the fstab entry. So I read that the import normally > should work automatically when the module is loaded and zfs is enabled but > I think the fact that my pool is located on /dev/xbd2 is the problem. > > Best Regards > ______________________________**_________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/**mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org > " > -- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 00:53:06 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 04737FA6 for ; Sat, 2 Mar 2013 00:53:06 +0000 (UTC) (envelope-from mailinglists.tech@gmail.com) Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 945881F39 for ; Sat, 2 Mar 2013 00:53:05 +0000 (UTC) Received: by mail-ee0-f54.google.com with SMTP id c41so2845077eek.13 for ; Fri, 01 Mar 2013 16:53:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type; bh=LZfbTPfAfRNl3GvNhvGjv2sStgBu3yKwitfl7qU21MU=; b=PXseLQIXhYIFVIFYgN6N5wm7RC/fmDxeXa3wggxetX5D577iMN5BDg7KtFheFWIUwo UVkLjflPteG4jbsBQURHxsYTL/el74z6koXGktiFhVWWDDmSzTrOD6Yw3jXqqihhwEmh qYI/+yf4rdrxJXjSMYn4QazYMneuFWmNk5NufK7tBQJO9o1J3cIxnAkr3ZZ3TpEoKcp5 NAUNji4j/YUYm1E9aqEi2VEsDxObqeSnbQHgFyG07qgxmj0ktOe2gvlZjUud2KR1mRP6 vbmRCUFtAy27ecPSpkfqn2hF7gjwhGpnrJJJ66Ro5AxqD0jpCkQ9m/yi0UBacrV47J7Y CbSQ== X-Received: by 10.14.3.133 with SMTP id 5mr32752698eeh.43.1362185584336; Fri, 01 Mar 2013 16:53:04 -0800 (PST) Received: from [127.0.0.1] (ashlynn.lippux.de. [5.9.218.242]) by mx.google.com with ESMTPS id d47sm19817075eem.9.2013.03.01.16.53.01 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 01 Mar 2013 16:53:03 -0800 (PST) Message-ID: <51314D67.7040704@gmail.com> Date: Sat, 02 Mar 2013 01:52:55 +0100 From: tech mailinglists User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130215 Thunderbird/17.0.3 MIME-Version: 1.0 To: Freddie Cash Subject: Re: I am to silly to mount a zpool while boot References: <513098FF.8030806@brockmann-consult.de> <51312C32.6000207@gmail.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 00:53:06 -0000 Am 01.03.2013 23:59, schrieb Freddie Cash: > What's the output of: > > zfs get mountpoint tank/home > > > On Fri, Mar 1, 2013 at 2:31 PM, tech mailinglists > > wrote: > > Am 01.03.2013 13:03, schrieb Peter Maloney: > > For the mount, don't use fstab. use: > > zfs set mountpoint=/home poolname/path/to/dataset > > And for the import, add > > zfs_enable="YES" > > to rc.conf. > > > And I think that's it. (all my FreeBSD systems are pure zfs, > so not sure > what troubles you would get if you had UFS on root) > > > On 2013-03-01 12:26, tech mailinglists wrote: > > Hello all, > > I think that I only can be an idiot to get in such a > problem but I am > not able to mount a zpool via fstab while boot. > > I have a FreeBSD i386 PV Xen DomU running with 3 disks > xbd0 (ext2 for > /boot), xbd1 (UFS for /) and xbd2 (ZFS/zpool with name > home to mount > at /home). > > I now tried everything I could find. So my fstab entry > looks like this: > > home /home zfs rw,late 0 0 > > The real problem is that after a reboot the zpool is no longer > imported, I really don't know why I always have to > reimport the pool > via zpool import -d /dev home. Because of this the > filesystem never > can be mounted via fstab while boot and I get dropped into > a shell > where I need to do this always manually. > > So why the pool always isn't imported after boot and how > can I solve this issue? > > And is the fstab entry correct itself? So would it work > when the pool > gets imported with it's name befor the fstab entry is parsed? > > Hope that someone give me a few hints or a solution. > > Best Regards > _______________________________________________ > freebsd-fs@freebsd.org > mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to > "freebsd-fs-unsubscribe@freebsd.org > " > > > > Hello all, > > a few of the things I already had done. But the real problem is I > think that the pool doesn't get imported automatically. I read > that ZFS searches in special directories when it tries to import. > So is there a way to set an option which says that it should > search in /dev? I always have to do this after reboot: > > zpool import -d /dev tank > > Than tank (pool) gets mounted at /tank and the zvol tank/home gets > mounted on /home. > > So I think that the import of the zpool fails. I have set > zfs_enable="YES" in /etc/rc.conf also zfs_load=YES as boot > parameter which gets shown in kenv and commented out the fstab > entry. So I read that the import normally should work > automatically when the module is loaded and zfs is enabled but I > think the fact that my pool is located on /dev/xbd2 is the problem. > > Best Regards > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to > "freebsd-fs-unsubscribe@freebsd.org > " > > > > > -- > Freddie Cash > fjwcash@gmail.com The mountpoint of tank/home is set to /home. The output looks like this: NAME PROPERTY VALUE SOURCE tank/home mountpoint /home local Best Regards From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 01:50:02 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EBF69628 for ; Sat, 2 Mar 2013 01:50:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id DB4EB1A9 for ; Sat, 2 Mar 2013 01:50:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r221o1Tr096279 for ; Sat, 2 Mar 2013 01:50:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r221o1Po096271; Sat, 2 Mar 2013 01:50:01 GMT (envelope-from gnats) Date: Sat, 2 Mar 2013 01:50:01 GMT Message-Id: <201303020150.r221o1Po096271@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: "Steven Hartland" Subject: Re: kern/153695: [patch] [zfs] Booting from zpool created on 4k-sector drive doesn' t work X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Steven Hartland List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 01:50:03 -0000 The following reply was made to PR kern/153695; it has been noted by GNATS. From: "Steven Hartland" To: , Cc: Subject: Re: kern/153695: [patch] [zfs] Booting from zpool created on 4k-sector drive doesn't work Date: Sat, 2 Mar 2013 01:43:06 -0000 So this is no longer a problem and can be closed? ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 06:44:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 799142FB for ; Sat, 2 Mar 2013 06:44:51 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.21.123]) by mx1.freebsd.org (Postfix) with ESMTP id D65F7D9F for ; Sat, 2 Mar 2013 06:44:50 +0000 (UTC) Received: from [193.68.136.207] (digsys207-136.pip.digsys.bg [193.68.136.207]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.6/8.14.6) with ESMTP id r226It6G085564 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Sat, 2 Mar 2013 08:18:56 +0200 (EET) (envelope-from daniel@digsys.bg) References: <512FE773.3060903@physics.umn.edu> Mime-Version: 1.0 (1.0) In-Reply-To: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Message-Id: X-Mailer: iPad Mail (10B146) From: Daniel Kalchev Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers Date: Sat, 2 Mar 2013 08:18:56 +0200 To: Freddie Cash Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 06:44:51 -0000 On 01.03.2013, at 05:30, Freddie Cash wrote: > For example, we use the following pattern: encX-A-# >=20 > Where X tells you which enclosure it's in, A tells you which column it's i= n > (letters start at A increasing to the right), and # tells you the disk in > the column, numbered top-down. We use similar labeling, but usually rely on the vendor's drive cage labels a= nd do not use column numbers. But if your enclosures have column labels it m= akes sense. Anything that makes it obvious for the technician to locate the d= rive without consulting too much documentation makes sense. Just stick to on= e coordinate system for all enclosures in one location :) Using labels greatly simplifies ZFS management in cases of disaster - you ma= y have to boot another recovery system and no scripts or hard wired drive in= formation may be available to assist you. Daniel= From owner-freebsd-fs@FreeBSD.ORG Sat Mar 2 15:12:13 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 339F7BD5 for ; Sat, 2 Mar 2013 15:12:12 +0000 (UTC) (envelope-from gperez@entel.upc.edu) Received: from violet.upc.es (violet.upc.es [147.83.2.51]) by mx1.freebsd.org (Postfix) with ESMTP id A1C2AE6 for ; Sat, 2 Mar 2013 15:12:11 +0000 (UTC) Received: from ackerman2.upc.es (ackerman2.upc.es [147.83.2.244]) by violet.upc.es (8.14.1/8.13.1) with ESMTP id r22FC8At011156 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL); Sat, 2 Mar 2013 16:12:09 +0100 Received: from [192.168.1.110] (247.Red-81-39-132.dynamicIP.rima-tde.net [81.39.132.247]) (authenticated bits=0) by ackerman2.upc.es (8.14.4/8.14.4) with ESMTP id r22FC7L0018977 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 2 Mar 2013 16:12:08 +0100 Message-ID: <513216C6.6030108@entel.upc.edu> Date: Sat, 02 Mar 2013 16:12:06 +0100 From: =?ISO-8859-1?Q?Gustau_P=E9rez_i_Querol?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130220 Thunderbird/17.0.3 MIME-Version: 1.0 To: "Lawrence K. Chen, P.Eng." Subject: Re: benefit of GEOM labels for ZFS, was Hard drive device names... serial numbers References: <1602333081.21816316.1362178340105.JavaMail.root@k-state.edu> In-Reply-To: <1602333081.21816316.1362178340105.JavaMail.root@k-state.edu> X-Scanned-By: MIMEDefang 2.70 on 147.83.2.244 X-Mail-Scanned: Criba 2.0 + Clamd X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (violet.upc.es [147.83.2.51]); Sat, 02 Mar 2013 16:12:09 +0100 (CET) Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2013 15:12:13 -0000 Al 01/03/2013 23:52, En/na Lawrence K. Chen, P.Eng. ha escrit: > > > I only have 15 drives...(12 HDDs and 3 SSDs) but the ordering of drives seemed to randomize on every boot (wonder now if the controller was doing some kind of staggering in spin ups. And, their other drivers cope with it. They provide a v1.1 driver for FreeBSD 7.2 or source to the v1.0 driver.) And, then everything moved around when I changed controllers a few times. > > I had resorted at one point to putting device.hints to force all the drives to keep their mapping. Which caused problems elsewhere, and a mess when I added another controller. But, then I changed to more meaningful GPT labels and exported and re-imported my zpools with '-d /dev/gpt', and now things are ok. That reordering issue is what made me switch to geom labels first (IIRC I did by 5.x era) and next switched to GPT labels. The GPT allows me first to easily used those drives with ZFS and specially because those labels belong to the partition scheme and thus are filesystem independent. What I found specially useful is to be able to identify those drives. When I drive fails I know where it is because of the simple name convention I use; but having a blinking led helps a lot, specially when writing the recovery plan for the rest of the team. Gus > L > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- Salut i força, Gustau --------------------------------------------------------------------------- Prou top-posting : http://ca.wikipedia.org/wiki/Top-posting Stop top-posting : http://en.wikipedia.org/wiki/Posting_style O O O Gustau Pérez i Querol O O O Unitat de Gestió dels departaments O O O Matemàtica Aplicada IV i Enginyeria Telemàtica Universitat Politècnica de Catalunya Edifici C3 - Despatx S101-B UPC Campus Nord UPC C/ Jordi Girona, 1-3 08034 - Barcelona