From owner-freebsd-fs@FreeBSD.ORG Sun Apr 7 08:19:20 2013
From: Lev Serebryakov
Date: Sun, 7 Apr 2013 12:19:17 +0400
To: freebsd-fs@FreeBSD.org
Subject: ZFS snapshots and daily security checks
Message-ID: <8710583097.20130407121917@serebryakov.spb.ru>

Hello, freebsd-fs.

I've set up periodic ZFS snapshots with the zfSnap script, and found that
every new snapshot is reported in the daily security check output, which is
very inconvenient.

There is also a strange difference between `mount' and `mount -p' output:
`mount' doesn't show mounted ZFS snapshots, but `mount -p' does.

Is it possible to exclude these snapshots from `mount -p' output, or to not
mount them into the hierarchy by default?

--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Sun Apr 7 19:16:49 2013
From: Richard Kojedzinszky
Date: Sun, 7 Apr 2013 21:16:44 +0200 (CEST)
To: freebsd-fs@freebsd.org
Subject: zfs hang

Dear FS devs,

PR kern/161968 still seems to be alive, at least on stable/9.
Unfortunately, I am not familiar with ZFS internals, so I am just asking
whether someone knows the problem and could fix it.

If I can help somehow, I will.
Thanks in advance,

Kojedzinszky Richard

From owner-freebsd-fs@FreeBSD.ORG Sun Apr 7 22:31:02 2013
From: Larry Rosenman
Date: Sun, 07 Apr 2013 17:30:58 -0500
To: freebsd-fs@freebsd.org
Subject: Fwd: Re: [CRASH] ZFS recv (fwd)/CURRENT
Message-ID: <289c2c9fdee18a5acefcb225e28b9310@webmail.lerctr.org>

Anyone have any ideas on how to debug this?

-------- Original Message --------
Subject: Re: [CRASH] ZFS recv (fwd)/CURRENT
Date: 2013-04-07 17:04
From: Larry Rosenman
To: Martin Matuska

I updated to r249241, and got:

backups/TBH/home/unofficial-dbs@2013-04-07
received 33.2MB stream in 14 seconds (2.37MB/sec)
receiving incremental stream of vault/home/spolk@2013-04-07 into
zroot/backups/TBH/home/spolk@2013-04-07
received 103KB stream in 1 seconds (103KB/sec)
receiving incremental stream of vault/home/cb3law@2013-04-07 into
zroot/backups/TBH/home/cb3law@2013-04-07
received 18.5KB stream in 1 seconds (18.5KB/sec)
receiving incremental stream of vault/home/raymie@2013-04-07 into
zroot/backups/TBH/home/raymie@2013-04-07
received 25.8KB stream in 1 seconds (25.8KB/sec)
receiving incremental stream of vault/home/ekb@2013-04-07 into
zroot/backups/TBH/home/ekb@2013-04-07
received 1.63MB stream in 2 seconds (833KB/sec)
receiving incremental stream of vault/ldyer@2013-04-07 into
zroot/backups/TBH/ldyer@2013-04-07
received 13.0KB stream in 1 seconds (13.0KB/sec)
receiving incremental stream of vault/tmp@2013-04-07 into
zroot/backups/TBH/tmp@2013-04-07
received 214KB stream in 3 seconds (71.2KB/sec)
receiving incremental stream of vault/var@2013-04-07 into
zroot/backups/TBH/var@2013-04-07
cannot receive incremental stream: invalid backup stream

I've seen this before, but have not been able to solve it. I have copies of
both streams.

# cat backup-TBH-ZFS.sh
#!/bin/sh
DATE=`date "+%Y-%m-%d"`
DATE2=2013-03-24
#DATE2=`date -v "-1d" "+%Y-%m-%d"`
# snap the source
ssh root@tbh.lerctr.org zfs snapshot -r vault@${DATE}
# zfs copy the source to here.
ssh root@tbh.lerctr.org "zfs send -R -D -I vault@${DATE2} vault@${DATE} | \
  tee /tmp/backup.stream.send.${DATE} | \
  ssh home.lerctr.org \"tee /tmp/backup.stream.receive.${DATE} | \
  zfs recv -F -u -v -d zroot/backups/TBH\""
# make sure we NEVER allow the backup stuff to automount.
/sbin/zfs list -H -t filesystem -r zroot/backups/TBH | \
  awk '{printf "/sbin/zfs set canmount=noauto %s\n",$1}' | sh

That is the script I ran. HELP.

On 2013-04-05 15:52, Martin Matuska wrote:
> Hi Larry,
> so if you desperately need a solution, use the attached patch
> against -CURRENT; it should work correctly.
> A fix in Illumos plus a backport to -CURRENT will follow soon.
> Cheers,
> mm
>
> On 5.4.2013 21:52, Matthew Ahrens wrote:
>
>> Yep, that's basically what I'll be pushing to illumos.
>> --matt
>>
>> On Fri, Apr 5, 2013 at 12:33 PM, Martin Matuska wrote:
>>
>>> Yes, you are right. What about the attached updated patch? Here I use
>>> gmep for both holds and releases.
>>> Thanks,
>>> mm
>>>
>>> On 5.4.2013 21:07, Matthew Ahrens wrote:
>>>> I am working on integrating a fix into illumos. FYI, the patch you
>>>> attached will not quite work, because you are using different tags to
>>>> hold and release the dataset (FTAG is a macro which has a
>>>> calling-function-specific value).
>>>> --matt
>>> --
>>> Martin Matuska
>>> FreeBSD committer
>>> http://blog.vx.sk
> --
> Martin Matuska
> FreeBSD committer
> http://blog.vx.sk

--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640 (c)         E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 00:54:40 2013
From: Jeremy Chadwick
Date: Sun, 7 Apr 2013 17:54:38 -0700
To: lev@FreeBSD.org
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS snapshots and daily security checks
Message-ID: <20130408005438.GA66727@icarus.home.lan>
> There is also a strange difference between `mount' and `mount -p' output:
> `mount' doesn't show mounted ZFS snapshots, but `mount -p' does.
>
> Is it possible to exclude these snapshots from `mount -p' output, or to not
> mount them into the hierarchy by default?

Taken from my stable/9 r249160 system:

root@icarus:~ # df -k
Filesystem    1024-blocks       Used      Avail Capacity  Mounted on
/dev/ada0p2       2063900     664952    1233836    35%    /
devfs                   1          1          0   100%    /dev
/dev/ada0p4      16503324     147416   15035644     1%    /var
/dev/ada0p5      16503324         44   15183016     0%    /tmp
/dev/ada0p6      25316000    6612788   16677932    28%    /usr
backups        1915745760  469845400 1445900360    25%    /backups
data/home      1462173974   16568039 1445605935     1%    /home
data/storage   1897306539  451700604 1445605935    24%    /storage
devfs                   1          1          0   100%    /var/named/dev

root@icarus:~ # zfs snapshot -r data/home@now
root@icarus:~ # touch /home/ilikedata
root@icarus:~ # zfs list -t snapshot
NAME            USED  AVAIL  REFER  MOUNTPOINT
data/home@now   160K      -  15.8G  -

root@icarus:~ # /sbin/mount -p
/dev/ada0p2    /               ufs    rw             1 1
devfs          /dev            devfs  rw,multilabel  0 0
/dev/ada0p4    /var            ufs    rw             2 2
/dev/ada0p5    /tmp            ufs    rw             2 2
/dev/ada0p6    /usr            ufs    rw             2 2
backups        /backups        zfs    rw,nfsv4acls   0 0
data/home      /home           zfs    rw,nfsv4acls   0 0
data/storage   /storage        zfs    rw,nfsv4acls   0 0
devfs          /var/named/dev  devfs  rw,multilabel  0 0

root@icarus:~ # /sbin/mount
/dev/ada0p2 on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/dev/ada0p4 on /var (ufs, local, soft-updates)
/dev/ada0p5 on /tmp (ufs, local, soft-updates)
/dev/ada0p6 on /usr (ufs, local, soft-updates)
backups on /backups (zfs, local, nfsv4acls)
data/home on /home (zfs, local, nfsv4acls)
data/storage on /storage (zfs, local, nfsv4acls)
devfs on /var/named/dev (devfs, local, multilabel)

And now after mounting the snapshot:

root@icarus:~ # /sbin/mount -t zfs data/home@now /mnt
root@icarus:~ # df -k
Filesystem     1024-blocks       Used      Avail Capacity  Mounted on
/dev/ada0p2        2063900     664952    1233836    35%    /
devfs                    1          1          0   100%    /dev
/dev/ada0p4       16503324     147416   15035644     1%    /var
/dev/ada0p5       16503324         44   15183016     0%    /tmp
/dev/ada0p6       25316000    6612788   16677932    28%    /usr
backups         1915745760  469845400 1445900360    25%    /backups
data/home       1462173553   16568044 1445605509     1%    /home
data/storage    1897306113  451700604 1445605509    24%    /storage
devfs                    1          1          0   100%    /var/named/dev
data/home@now   1462173553   16568044 1445605509     1%    /mnt

root@icarus:~ # /sbin/mount -p
/dev/ada0p2    /               ufs    rw                    1 1
devfs          /dev            devfs  rw,multilabel         0 0
/dev/ada0p4    /var            ufs    rw                    2 2
/dev/ada0p5    /tmp            ufs    rw                    2 2
/dev/ada0p6    /usr            ufs    rw                    2 2
backups        /backups        zfs    rw,nfsv4acls          0 0
data/home      /home           zfs    rw,nfsv4acls          0 0
data/storage   /storage        zfs    rw,nfsv4acls          0 0
devfs          /var/named/dev  devfs  rw,multilabel         0 0
data/home@now  /mnt            zfs    ro,noatime,nfsv4acls  0 0

root@icarus:~ # /sbin/mount
/dev/ada0p2 on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/dev/ada0p4 on /var (ufs, local, soft-updates)
/dev/ada0p5 on /tmp (ufs, local, soft-updates)
/dev/ada0p6 on /usr (ufs, local, soft-updates)
backups on /backups (zfs, local, nfsv4acls)
data/home on /home (zfs, local, nfsv4acls)
data/storage on /storage (zfs, local, nfsv4acls)
devfs on /var/named/dev (devfs, local, multilabel)
data/home@now on /mnt (zfs, local, noatime, read-only, nfsv4acls)

It seems to me that both mount and mount -p show an explicitly mounted
snapshot.
--
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 05:19:05 2013
From: Quartz
Date: Mon, 08 Apr 2013 01:12:20 -0400
To: "Lawrence K. Chen, P.Eng."
Cc: FreeBSD FS
Subject: Re: ZFS: Failed pool causes system to hang
Message-ID: <516251B4.7050809@sneakertech.com>
In-Reply-To: <1964862508.3535448.1365199766508.JavaMail.root@k-state.edu>

> So, this thread seems to just stop... and I can't see whether it was
> resolved or not.

It wasn't. Jeremy Chadwick was the only one who really responded, but
besides confirming it wasn't specific to my hardware, there wasn't a lot he
could do. He suggested I email some of the kernel folks directly and/or
open a PR about it. (I'm planning on doing both, but haven't had time over
the weekend.)

> Anyway, my input would be: did you wait long enough to see if the
> system would boot before declaring it hung?
> I've had my system crash at bad times, which has resulted in the
> appearance that the boot is hung... but it's busy churning away....
> It seemed hung at trying to mount root.

It might not have been clear from the back and forth, but my issue isn't a
"boot hang" per se; it's that "reboots also hang". The ZFS subsystem hangs
so thoroughly that it blocks all I/O on all disks and prevents the
reboot/halt/shutdown procedure from taking the machine down gracefully.
Once I press the physical front-panel reboot button, the machine comes up
immediately (sans the offending pool). And yes, I've waited over half an
hour; it never recovers. My discussion with Jeremy indicated that the
infinite wait is an "expected failure", in the sense that ZFS would never
come back to life given the circumstances.
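A possibly related knob, though not a confirmed fix for the hang described
above: the pool-level failmode property controls what happens to I/O when a
pool becomes unavailable. The default, "wait", blocks I/O until the devices
come back, which matches the observed behavior; "continue" returns EIO to
new writes instead. A minimal sketch, assuming a hypothetical pool named
tank:

# show the current setting (defaults to "wait", i.e. block until recovery)
zpool get failmode tank
# return EIO to new write requests instead of blocking all I/O indefinitely
zpool set failmode=continue tank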
______________________________________
it has a certain smooth-brained appeal

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 07:18:52 2013
From: Joar Jegleim
Date: Mon, 8 Apr 2013 09:18:50 +0200
To: Ronald Klop
Cc: "freebsd-fs@freebsd.org"
Subject: Re: Regarding regular zfs
References: <8B0FFF01-B8CC-41C0-B0A2-58046EA4E998@my.gd> <515EB744.5000607@brockmann-consult.de>

The rsync was running from the live system. As I wrote earlier, the problem
seems to occur only while the backup server is rsync'ing from the slave
(the zfs receiving side), so I was actually trying to figure out whether
this is to be expected (as in zfs sync, where the receiving end gets a diff
and rolls 'back' to version = latest snapshot from the master) with a setup
with 1TB+ of data and 2+ million files.

On 5 April 2013 16:07, Ronald Klop wrote:
> On Fri, 05 Apr 2013 15:02:12 +0200, Joar Jegleim wrote:
>
>> You make some interesting points.
>> I don't _think_ the script causes more than 1 zfs write at a time, and
>> I'm sure 'nothing else' is doing that either. But I'm going to check
>> that out, because it does sound like a logical explanation.
>> I'm wondering if the rsync from the receiving server (that is: the backup
>> server is doing rsync from the zfs receive server) could cause the same
>> problem; it's only reading, though...
>
> Do you run the rsync from a snapshot or from the 'live' filesystem? The
> live one changes during zfs receive. I don't know if that has anything to
> do with your problem, but rsync from a snapshot gives a consistent backup
> anyway.
>
> BTW: It is probably simpler for you to test whether the rsync is related
> to the problem than for other people to theorize about it here.
>
> Ronald.
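For reference, a minimal sketch of Ronald's suggestion: rsync from a
snapshot rather than from the live filesystem. The dataset, mountpoint and
destination host below are hypothetical:

# freeze a point-in-time view so the backup source cannot change mid-run
zfs snapshot pool/data@backup-2013-04-08
# rsync from the snapshot directory; the .zfs directory is reachable by
# explicit path even when snapdir=hidden
rsync -a /pool/data/.zfs/snapshot/backup-2013-04-08/ backuphost:/backups/data/
# drop the snapshot once the backup has completed
zfs destroy pool/data@backup-2013-04-08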
--
----------------------
Joar Jegleim
Homepage: http://cosmicb.no
Linkedin: http://no.linkedin.com/in/joarjegleim
fb: http://www.facebook.com/joar.jegleim
AKA: CosmicB @Freenode
----------------------

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 07:42:04 2013
From: Lev Serebryakov
Date: Mon, 8 Apr 2013 11:42:00 +0400
To: Jeremy Chadwick
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS snapshots and daily security checks
Message-ID: <1504594172.20130408114200@serebryakov.spb.ru>
In-Reply-To: <20130408005438.GA66727@icarus.home.lan>

Hello, Jeremy.
You wrote, on 8 April 2013, at 4:54:38:

>> Is it possible to exclude these snapshots from `mount -p' output, or to
>> not mount them into the hierarchy by default?
JC> Taken from my stable/9 r249160 system:

And here is my 9.1-STABLE r244958 system (I'm filtering out all hourly
output, or this message would be endless):

% df -k
Filesystem             1024-blocks      Used     Avail Capacity  Mounted on
/dev/mirror/root           2026028    675598   1188348    36%    /
devfs                            1         1         0   100%    /dev
fdescfs                          1         1         0   100%    /dev/fd
procfs                           4         4         0   100%    /proc
/dev/mirror/var           16244332   6285320   8659466    42%    /var
/dev/mirror/tmp            1012972     12290    919646     1%    /tmp
/dev/mirror/usr           64995336  10259340  49536370    17%    /usr
/dev/mirror/databases    101554148    174252  93255566     0%    /var/databases
pool                     487184219        21 487184198     0%    /pool
pool/home                511417117  24232919 487184198     5%    /usr/home
devfs                            1         1         0   100%    /var/named/dev

% mount
/dev/mirror/root on / (ufs, local)
devfs on /dev (devfs, local)
fdescfs on /dev/fd (fdescfs)
procfs on /proc (procfs, local)
/dev/mirror/var on /var (ufs, local, soft-updates)
/dev/mirror/tmp on /tmp (ufs, local, soft-updates)
/dev/mirror/usr on /usr (ufs, local, soft-updates)
/dev/mirror/databases on /var/databases (ufs, local, soft-updates)
pool on /pool (zfs, local, nfsv4acls)
pool/home on /usr/home (zfs, local, nfsv4acls)
devfs on /var/named/dev (devfs, local)

% zfs list -t snapshot | grep -v hourly
NAME                                       USED  AVAIL  REFER  MOUNTPOINT
pool/home@daily-2013-04-05_03.01.28--1m    544K      -  23.1G  -
pool/home@daily-2013-04-06_03.01.20--1m    688K      -  23.1G  -
pool/home@weekly-2013-04-06_04.15.34--1y  1.70M      -  23.1G  -
pool/home@daily-2013-04-07_03.04.44--1m   1.15M      -  23.1G  -
pool/home@daily-2013-04-08_03.01.31--1m    437K      -  23.1G  -

% mount -p | grep -v hourly
/dev/mirror/root       /               ufs      rw            1 1
devfs                  /dev            devfs    rw            0 0
fdescfs                /dev/fd         fdescfs  rw            0 0
procfs                 /proc           procfs   rw            0 0
/dev/mirror/var        /var            ufs      rw            2 2
/dev/mirror/tmp        /tmp            ufs      rw            2 2
/dev/mirror/usr        /usr            ufs      rw            2 2
/dev/mirror/databases  /var/databases  ufs      rw            3 3
pool                   /pool           zfs      rw,nfsv4acls  0 0
pool/home              /usr/home       zfs      rw,nfsv4acls  0 0
devfs                  /var/named/dev  devfs    rw            0 0
pool/home@daily-2013-04-05_03.01.28--1m /usr/home/.zfs/snapshot/daily-2013-04-05_03.01.28--1m zfs ro,nosuid,noatime,nfsv4acls 0 0
pool/home@daily-2013-04-06_03.01.20--1m /usr/home/.zfs/snapshot/daily-2013-04-06_03.01.20--1m zfs ro,nosuid,noatime,nfsv4acls 0 0
pool/home@weekly-2013-04-06_04.15.34--1y /usr/home/.zfs/snapshot/weekly-2013-04-06_04.15.34--1y zfs ro,nosuid,noatime,nfsv4acls 0 0
pool/home@daily-2013-04-07_03.04.44--1m /usr/home/.zfs/snapshot/daily-2013-04-07_03.04.44--1m zfs ro,nosuid,noatime,nfsv4acls 0 0
%

JC> It seems to me that both mount and mount -p show an explicitly mounted
JC> snapshot.

I didn't mount the snapshots specifically; they are created by the zfSnap
script from ports (sysutils/zfsnap).
As far as I can see in that script, snapshots are created with:

/sbin/zfs snapshot -r ${fs}@${snapshot}

--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 07:43:15 2013
From: Will Andrews
Date: Mon, 8 Apr 2013 01:43:13 -0600
To: Richard Kojedzinszky
Cc: "freebsd-fs@freebsd.org"
Subject: Re: zfs hang

I've written a patch that extensively redesigns the zvol implementation to
get rid of various deadlocks in it, including this one. I am working to
push the improvements (along with various others) out.

--Will.

On Sunday, April 7, 2013, Richard Kojedzinszky wrote:
> Dear FS devs,
>
> PR kern/161968 still seems to be alive, at least on stable/9.
> Unfortunately, I am not familiar with ZFS internals, so I am just asking
> whether someone knows the problem and could fix it.
>
> If I can help somehow, I will.
>
> Thanks in advance,
>
> Kojedzinszky Richard
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 08:07:39 2013
From: Jeremy Chadwick
Date: Mon, 8 Apr 2013 01:07:38 -0700
To: Lev Serebryakov
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS snapshots and daily security checks
Message-ID: <20130408080738.GA73905@icarus.home.lan>
In-Reply-To: <1504594172.20130408114200@serebryakov.spb.ru>

On Mon, Apr 08, 2013 at 11:42:00AM +0400, Lev Serebryakov wrote:
> Hello, Jeremy.
> You wrote, on 8 April 2013, at 4:54:38:
>
> >> Is it possible to exclude these snapshots from `mount -p' output, or to
> >> not mount them into the hierarchy by default?
> JC> Taken from my stable/9 r249160 system:
>
> And here is my 9.1-STABLE r244958 system (I'm filtering out all hourly
> output, or this message would be endless):
>
> [df -k, mount, zfs list -t snapshot and mount -p output quoted from the
> previous message elided]
>
> JC> It seems to me that both mount and mount -p show an explicitly mounted
> JC> snapshot.
>
> I didn't mount the snapshots specifically; they are created by the zfSnap
> script from ports (sysutils/zfsnap).
> As far as I can see in that script, snapshots are created with:
>
> /sbin/zfs snapshot -r ${fs}@${snapshot}

I don't know what to tell you -- my output clearly shows that after
creating a snapshot with "zfs snapshot -r filesystem@snapname", neither
mount nor mount -p shows anything new.

I wonder if you have either pool- or filesystem-level attributes which are
causing your issue.
Here are mine, for the pool and filesystem I used in my previous mail
(pool "data" and filesystem "data/home"):

root@icarus:~ # zpool get all data
NAME  PROPERTY               VALUE                SOURCE
data  size                   2.72T                -
data  capacity               24%                  -
data  altroot                -                    default
data  health                 ONLINE               -
data  guid                   4221681810446459190  default
data  version                -                    default
data  bootfs                 -                    default
data  delegation             on                   default
data  autoreplace            off                  default
data  cachefile              -                    default
data  failmode               wait                 default
data  listsnapshots          off                  default
data  autoexpand             off                  default
data  dedupditto             0                    default
data  dedupratio             1.00x                -
data  free                   2.06T                -
data  allocated              671G                 -
data  readonly               off                  -
data  comment                -                    default
data  expandsize             0                    -
data  freeing                0                    default
data  feature@async_destroy  enabled              local
data  feature@empty_bpobj    active               local
data  feature@lz4_compress   enabled              local

root@icarus:~ # zfs get all data/home
NAME       PROPERTY              VALUE                  SOURCE
data/home  type                  filesystem             -
data/home  creation              Tue Jan 22 23:48 2013  -
data/home  used                  15.8G                  -
data/home  available             1.35T                  -
data/home  referenced            15.8G                  -
data/home  compressratio         1.00x                  -
data/home  mounted               yes                    -
data/home  quota                 none                   default
data/home  reservation           none                   default
data/home  recordsize            128K                   default
data/home  mountpoint            /home                  local
data/home  sharenfs              off                    default
data/home  checksum              on                     default
data/home  compression           off                    default
data/home  atime                 on                     default
data/home  devices               on                     default
data/home  exec                  on                     default
data/home  setuid                on                     default
data/home  readonly              off                    default
data/home  jailed                off                    default
data/home  snapdir               hidden                 default
data/home  aclmode               discard                default
data/home  aclinherit            restricted             default
data/home  canmount              on                     default
data/home  xattr                 off                    temporary
data/home  copies                1                      default
data/home  version               5                      -
data/home  utf8only              off                    -
data/home  normalization         none                   -
data/home  casesensitivity       sensitive              -
data/home  vscan                 off                    default
data/home  nbmand                off                    default
data/home  sharesmb              off                    default
data/home  refquota              none                   default
data/home  refreservation        none                   default
data/home  primarycache          all                    default
data/home  secondarycache        all                    default
data/home  usedbysnapshots       0                      -
data/home  usedbydataset         15.8G                  -
data/home  usedbychildren        0                      -
data/home  usedbyrefreservation  0                      -
data/home  logbias               latency                default
data/home  dedup                 off                    default
data/home  mlslabel              -
data/home  sync                  standard               default
data/home  refcompressratio      1.00x                  -
data/home  written               15.8G                  -
data/home  logicalused           15.2G                  -
data/home  logicalreferenced     15.2G                  -

--
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 08:29:53 2013
From: Joar Jegleim
Date: Mon, 8 Apr 2013 10:29:52 +0200
To: Peter Jeremy
Cc: "freebsd-fs@freebsd.org"
Subject: Re: Regarding regular zfs
In-Reply-To: <20130405211249.GB31958@server.rulingia.com>

[...] "Are you deleting old snapshots after the newer snapshots have been
sent?" [...]

Yeah, the script deletes old snapshots. The slave will usually hold 2
snapshots (1 being the initial snapshot received via zfs send from the
master, the 2nd being the latest snapshot received from the master).

[...] "Can you clarify which machine you mean by server in the last line
above. I presume you mean the slave machine running "zfs recv". If you
monitor the "server" with "vmstat -v 1", "gstat -a" and "zfs-mon -a" (the
latter is part of ports/sysutils/zfs-stats) during the "freeze", what do
you see? Are the disks saturated or idle? Are the "cache" or "free" values
close to zero?" [...]

In the last line, "Everything on the server halts / hangs completely", I'm
talking about the 'slave' (the receiving end).

I'll check how the cache is doing, but as I wrote in my previous reply, the
'slave' server is completely unresponsive; nothing works at all for 5-15
seconds, and when the server is responsive again (I can ssh in and so on) I
can't seem to find anything in dmesg or any log hinting at anything at all
that went 'wrong'.

"There was a bug in the interface between the ZFS ARC and the FreeBSD VM
that resulted in ARC starvation. This was fixed between 8.2 and 8.3/9.0."

Ah, OK.

"Do you have atime enabled or disabled? What happens when you don't run
rsync at the same time? Are you able to break into DDB?"

atime is disabled.
When I don't run rsync the server seems OK. I've tried to detect any hang
(as in: I ssh into the server and issue various commands such as top, ls
and so on) while not rsync'ing, and there might have been a really minor
'glitch', but it was hardly noticeable at all, and nothing compared to
those 5-15 seconds when the backup server is doing the rsync (from the live
volume, not a snapshot).

I could try DDB; I'm going to have to get back to you on that. I haven't
debugged a FreeBSD kernel before and the system is in production, so I
would have to be cautious. I might be able to try that out during this
week.

[...] "Apart from the rsync whilst receiving, everything sounds OK. It's
possible that the rsync whilst receiving is triggering a bug." [...]

I sort of think so too, at least since the whole OS is unresponsive / hung
for anything from 5-15 seconds.

--
----------------------
Joar Jegleim
Homepage: http://cosmicb.no
Linkedin: http://no.linkedin.com/in/joarjegleim
fb: http://www.facebook.com/joar.jegleim
AKA: CosmicB @Freenode
----------------------

On 5 April 2013 23:12, Peter Jeremy wrote:
> On 2013-Apr-05 12:17:27 +0200, Joar Jegleim wrote:
> > I've got this script that initially zfs send's a whole zfs volume, and
> > for every send after that only sends the diff. So after the initial zfs
> > send, the diffs usually take less than a minute to send over.
>
> Are you deleting old snapshots after the newer snapshots have been sent?
>
> > I've had increasing problems on the 'slave'; it seems to grind to a
> > halt for anything between 5-20 seconds after every zfs receive.
> > Everything on the server halts / hangs completely.
>
> Can you clarify which machine you mean by server in the last line above.
> I presume you mean the slave machine running "zfs recv".
>
> If you monitor the "server" with "vmstat -v 1", "gstat -a" and
> "zfs-mon -a" (the latter is part of ports/sysutils/zfs-stats) during the
> "freeze", what do you see? Are the disks saturated or idle? Are the
> "cache" or "free" values close to zero?
>
> > # 16GB arc_max (server got 30GB of ram, but had a couple 'freeze'
> > situations; suspect zfs.arc ate too much memory)
>
> There was a bug in the interface between the ZFS ARC and the FreeBSD VM
> that resulted in ARC starvation. This was fixed between 8.2 and 8.3/9.0.
>
> > I suspect it may have something to do with the zfs volume being sent
> > being mounted on the slave, and I'm also doing the backups from the
> > slave, which means a lot of the time the backup server is rsyncing the
> > zfs volume being updated.
>
> Do you have atime enabled or disabled? What happens when you don't run
> rsync at the same time?
>
> Are you able to break into DDB?
>
> > In my setup, have I taken the use case for zfs send / receive too far?
> > As in, it's not meant for this kind of syncing and this often, so
> > there's actually nothing 'wrong'.
>
> Apart from the rsync whilst receiving, everything sounds OK. It's
> possible that the rsync whilst receiving is triggering a bug.
> --
> Peter Jeremy

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 08:48:10 2013
From: Peter Jeremy
Date: Mon, 8 Apr 2013 18:47:56 +1000
To: Joar Jegleim
Cc: "freebsd-fs@freebsd.org"
Subject: Re: Regarding regular zfs
Message-ID: <20130408084756.GD31958@server.rulingia.com>

On 2013-Apr-08 10:29:52 +0200, Joar Jegleim wrote:
> I'll check how the cache is doing, but as I wrote in my previous reply,
> the 'slave' server is completely unresponsive; nothing works at all for
> 5-15 seconds, and when the server is responsive again (I can ssh in and
> so on) I can't seem to find anything in dmesg or any log hinting at
> anything at all that went 'wrong'.

If you have iostat/gstat/top/... running, does it hang (stop updating)
during this period? Is it pingable during the "hang"? How about
iostat/gstat/top/... running on the console?

Do you have compression or dedup enabled?

> I could try DDB; I'm going to have to get back to you on that. I haven't
> debugged a FreeBSD kernel before and the system is in production, so I
> would have to be cautious. I might be able to try that out during this
> week.

Do you have a test system that you can reproduce the problem on? The
reason I ask about DDB is that it would be useful to get a 'ps' whilst the
system is hung, and it sounds like DDB is the only way to get that.
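For the record, a minimal sketch of collecting that information with DDB.
This assumes a kernel built with the KDB and DDB options (stock GENERIC
kernels of that era include them) and access to the console:

# drop into the in-kernel debugger from a root shell
sysctl debug.kdb.enter=1
# then, at the db> prompt on the console:
#   ps       - list processes and their wait channels
#   trace    - stack trace of the current thread
#   c        - continue, resuming the system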
--
Peter Jeremy

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 08:50:07 2013
From: Lev Serebryakov
Date: Mon, 8 Apr 2013 12:50:02 +0400
To: Jeremy Chadwick
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS snapshots and daily security checks
Message-ID: <1884594284.20130408125002@serebryakov.spb.ru>
In-Reply-To: <20130408080738.GA73905@icarus.home.lan>

Hello, Jeremy.
You wrote, on 8 April 2013, at 12:07:38:

JC> I don't know what to tell you -- my output clearly shows that after
JC> creating a snapshot with "zfs snapshot -r filesystem@snapname", neither
JC> mount nor mount -p shows anything new.

What really puzzles me is why there is a difference between `mount' and
`mount -p' output on my system at all. It looks like the `-p' option
should be a cosmetic one...

JC> I wonder if you have either pool- or filesystem-level attributes which
JC> are causing your issue.

JC> Here are mine, for the pool and filesystem I used in my previous mail
JC> (pool "data" and filesystem "data/home"):

JC> data/home  snapdir  hidden  default

pool/home  snapdir  visible  default

That is the only difference which is not size- or date-related. So now we
know why there is a difference between my and your `mount -p' outputs!
(BTW, why are both values marked "default"?!)

And here is some conflict of interests: it is good to let users restore
their files from snapshots without my help (and that requires visible
snapshots), but it makes for very annoying output in the security
checks... And why does the output of mount depend on a visibility option?
I need to read the mount sources.
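For reference, a minimal sketch of inspecting and flipping the property in
question, using the same dataset name as above:

# show the per-dataset snapshot-directory visibility
zfs get snapdir pool/home
# hide .zfs/snapshot from directory listings; nightly tree walks then
# should no longer wander into snapshots and auto-mount them
zfs set snapdir=hidden pool/home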
--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 08:56:34 2013
From: Lev Serebryakov
Date: Mon, 8 Apr 2013 12:56:29 +0400
To: Jeremy Chadwick, freebsd-fs@freebsd.org
Subject: Re: ZFS snapshots and daily security checks
Message-ID: <1629044728.20130408125629@serebryakov.spb.ru>
In-Reply-To: <1884594284.20130408125002@serebryakov.spb.ru>

Hello, Lev.
You wrote, on 8 April 2013, at 12:50:02:

LS> And why does the output of mount depend on a visibility option? I need
LS> to read the mount sources.

OK, such mounts carry the MNT_IGNORE flag, and `-p' on `mount' changes the
output style AND implies the `-v' (verbose) flag too.
I don't know why, and I don't think it will be changed :(

--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 08:58:34 2013
From: Lev Serebryakov
Date: Mon, 8 Apr 2013 12:58:31 +0400
To: Jeremy Chadwick, freebsd-fs@freebsd.org
Subject: Re: ZFS snapshots and daily security checks
Message-ID: <593142909.20130408125831@serebryakov.spb.ru>
In-Reply-To: <1629044728.20130408125629@serebryakov.spb.ru>

Hello, Lev.
You wrote, on 8 April 2013, at 12:56:29:

LS> OK, such mounts carry the MNT_IGNORE flag, and `-p' on `mount' changes
LS> the output style AND implies the `-v' (verbose) flag too.

LS> I don't know why, and I don't think it will be changed :(

And, of course, it is described in `man mount' -- but I never look for the
shortest path, as usual!
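A quick way to see the MNT_IGNORE behavior from a shell, assuming the
pool/home snapshots shown earlier (snapshot mounts always contain '@'):

% mount | grep @       # nothing: MNT_IGNORE'd mounts are skipped
% mount -v | grep @    # lists them, which is why `mount -p' shows them too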
--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 09:30:19 2013
From: Jeremy Chadwick
Date: Mon, 8 Apr 2013 02:30:17 -0700
To: Lev Serebryakov
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS snapshots and daily security checks
Message-ID: <20130408093017.GA76398@icarus.home.lan>
In-Reply-To: <1884594284.20130408125002@serebryakov.spb.ru>

On Mon, Apr 08, 2013 at 12:50:02PM +0400, Lev Serebryakov wrote:
> Hello, Jeremy.
> You wrote, on 8 April 2013, at 12:07:38:
>
> JC> I don't know what to tell you -- my output clearly shows that after
> JC> creating a snapshot with "zfs snapshot -r filesystem@snapname",
> JC> neither mount nor mount -p shows anything new.
>
> What really puzzles me is why there is a difference between `mount' and
> `mount -p' output on my system at all. It looks like the `-p' option
> should be a cosmetic one...
>
> JC> I wonder if you have either pool- or filesystem-level attributes which
> JC> are causing your issue.
>
> JC> Here are mine, for the pool and filesystem I used in my previous mail
> JC> (pool "data" and filesystem "data/home"):
>
> JC> data/home  snapdir  hidden  default
>
> pool/home  snapdir  visible  default
>
> That is the only difference which is not size- or date-related. So now we
> know why there is a difference between my and your `mount -p' outputs!
> (BTW, why are both values marked "default"?!)

And what about the properties for the filesystem called "pool" (yes, I said
filesystem, and I mean it)?
My theory is that your "pool" filesystem has the snapdir property set to visible, and therefore all filesystems under pool (ex. "pool/home") would inherit the value. Looking at the ZFS code, hidden **is** the default, even in r244958 (which you're running): http://svnweb.freebsd.org/base/stable/9/sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c?view=annotate See line 218. The 3rd parameter, ZFS_SNAPDIR_HIDDEN, is what defines the default value. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 09:49:06 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 89063BF3 for ; Mon, 8 Apr 2013 09:49:06 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 502CBED5 for ; Mon, 8 Apr 2013 09:49:06 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:900d:c887:884e:713b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id EE05B4AC58; Mon, 8 Apr 2013 13:49:04 +0400 (MSK) Date: Mon, 8 Apr 2013 13:49:02 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <82684806.20130408134902@serebryakov.spb.ru> To: Jeremy Chadwick Subject: Re: ZFS snapshots and daily security checks In-Reply-To: <20130408093017.GA76398@icarus.home.lan> References: <20130408005438.GA66727@icarus.home.lan> <1504594172.20130408114200@serebryakov.spb.ru> <20130408080738.GA73905@icarus.home.lan> <1884594284.20130408125002@serebryakov.spb.ru> <20130408093017.GA76398@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 09:49:06 -0000 Hello, Jeremy. You wrote 8 April 2013, 13:30:17: JC> My theory is that your "pool" filesystem has the snapdir property set to JC> visible, and therefore all filesystems under pool (ex. "pool/home") JC> would inherit the value. Nope :) It is "hidden, default" JC> Looking at the ZFS code, hidden **is** the default, even in r244958 JC> (which you're running): JC> http://svnweb.freebsd.org/base/stable/9/sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c?view=annotate JC> See line 218. The 3rd parameter, ZFS_SNAPDIR_HIDDEN, is what defines JC> the default value. The pool and FS were created a long time ago :) Ok, it is not very interesting why it was set to "visible". Now we understand why the snapshots were "mounted" and why only `mount -p' shows them. The last question is how to keep them mounted (to allow users to use them) without a bogus 25-line difference (24 hourly snapshots and 1 daily snapshot) in each daily security report...
It looks like I simply need to add a properly crafted "grep -v" to the security script. -- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 11:06:43 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E36041DF for ; Mon, 8 Apr 2013 11:06:43 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id C5810350 for ; Mon, 8 Apr 2013 11:06:43 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r38B6h52057214 for ; Mon, 8 Apr 2013 11:06:43 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r38B6hQe057212 for freebsd-fs@FreeBSD.org; Mon, 8 Apr 2013 11:06:43 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 8 Apr 2013 11:06:43 GMT Message-Id: <201304081106.r38B6hQe057212@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 11:06:43 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/177658 fs [ufs] FreeBSD panics after get full filesystem with uf o kern/177536 fs [zfs] zfs livelock (deadlock) with high write-to-disk o kern/177445 fs [hast] HAST panic o kern/177240 fs [zfs] zpool import failed with state UNAVAIL but all d o kern/176978 fs [zfs] [panic] zfs send -D causes "panic: System call i o kern/176857 fs [softupdates] [panic] 9.1-RELEASE/amd64/GENERIC panic o bin/176253 fs zpool(8): zfs pool indentation is misleading/wrong o kern/176141 fs [zfs] sharesmb=on makes errors for sharenfs, and still o kern/175950 fs [zfs] Possible deadlock in zfs after long uptime o kern/175897 fs [zfs] operations on readonly zpool hang o kern/175179 fs [zfs] ZFS may attach wrong device on move o kern/175071 fs [ufs] [panic] softdep_deallocate_dependencies: unrecov o kern/174372 fs [zfs] Pagefault appears to be related to ZFS o kern/174315 fs [zfs] chflags uchg not supported o kern/174310 fs [zfs] root point mounting broken on CURRENT with multi o kern/174279 fs [ufs] UFS2-SU+J journal and filesystem corruption o kern/174060 fs [ext2fs] Ext2FS system crashes (buffer overflow?)
o kern/173830 fs [zfs] Brain-dead simple change to ZFS error descriptio o kern/173718 fs [zfs] phantom directory in zraid2 pool f kern/173657 fs [nfs] strange UID map with nfsuserd o kern/173363 fs [zfs] [panic] Panic on 'zpool replace' on readonly poo o kern/173136 fs [unionfs] mounting above the NFS read-only share panic o kern/172942 fs [smbfs] Unmounting a smb mount when the server became o kern/172348 fs [unionfs] umount -f of filesystem in use with readonly o kern/172334 fs [unionfs] unionfs permits recursive union mounts; caus o kern/171626 fs [tmpfs] tmpfs should be noisier when the requested siz o kern/171415 fs [zfs] zfs recv fails with "cannot receive incremental o kern/170945 fs [gpt] disk layout not portable between direct connect o bin/170778 fs [zfs] [panic] FreeBSD panics randomly o kern/170680 fs [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA o kern/170497 fs [xfs][panic] kernel will panic whenever I ls a mounted o kern/169945 fs [zfs] [panic] Kernel panic while importing zpool (afte o kern/169480 fs [zfs] ZFS stalls on heavy I/O o kern/169398 fs [zfs] Can't remove file with permanent error o kern/169339 fs panic while " : > /etc/123" o kern/169319 fs [zfs] zfs resilver can't complete o kern/168947 fs [nfs] [zfs] .zfs/snapshot directory is messed up when o kern/168942 fs [nfs] [hang] nfsd hangs after being restarted (not -HU o kern/168158 fs [zfs] incorrect parsing of sharenfs options in zfs (fs o kern/167979 fs [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste o kern/167977 fs [smbfs] mount_smbfs results are differ when utf-8 or U o kern/167688 fs [fusefs] Incorrect signal handling with direct_io o kern/167685 fs [zfs] ZFS on USB drive prevents shutdown / reboot o kern/167612 fs [portalfs] The portal file system gets stuck inside po o kern/167272 fs [zfs] ZFS Disks reordering causes ZFS to pick the wron o kern/167260 fs [msdosfs] msdosfs disk was mounted the second time whe o kern/167109 fs [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene o kern/167105 fs [nfs] mount_nfs can not handle source exports wiht mor o kern/167067 fs [zfs] [panic] ZFS panics the server o kern/167065 fs [zfs] boot fails when a spare is the boot disk o kern/167048 fs [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF o kern/166912 fs [ufs] [panic] Panic after converting Softupdates to jo o kern/166851 fs [zfs] [hang] Copying directory from the mounted UFS di o kern/166477 fs [nfs] NFS data corruption. 
o kern/165950 fs [ffs] SU+J and fsck problem o kern/165521 fs [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31 o kern/165392 fs Multiple mkdir/rmdir fails with errno 31 o kern/165087 fs [unionfs] lock violation in unionfs o kern/164472 fs [ufs] fsck -B panics on particular data inconsistency o kern/164370 fs [zfs] zfs destroy for snapshot fails on i386 and sparc o kern/164261 fs [nullfs] [patch] fix panic with NFS served from NULLFS o kern/164256 fs [zfs] device entry for volume is not created after zfs o kern/164184 fs [ufs] [panic] Kernel panic with ufs_makeinode o kern/163801 fs [md] [request] allow mfsBSD legacy installed in 'swap' o kern/163770 fs [zfs] [hang] LOR between zfs&syncer + vnlru leading to o kern/163501 fs [nfs] NFS exporting a dir and a subdir in that dir to o kern/162944 fs [coda] Coda file system module looks broken in 9.0 o kern/162860 fs [zfs] Cannot share ZFS filesystem to hosts with a hyph o kern/162751 fs [zfs] [panic] kernel panics during file operations o kern/162591 fs [nullfs] cross-filesystem nullfs does not work as expe o kern/162519 fs [zfs] "zpool import" relies on buggy realpath() behavi o kern/161968 fs [zfs] [hang] renaming snapshot with -r including a zvo o kern/161864 fs [ufs] removing journaling from UFS partition fails on o bin/161807 fs [patch] add option for explicitly specifying metadata o kern/161579 fs [smbfs] FreeBSD sometimes panics when an smb share is o kern/161533 fs [zfs] [panic] zfs receive panic: system ioctl returnin o kern/161438 fs [zfs] [panic] recursed on non-recursive spa_namespace_ o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou o kern/161280 fs [zfs] Stack overflow in gptzfsboot o kern/161205 fs [nfs] [pfsync] [regression] [build] Bug report freebsd o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic o kern/160860 fs [ufs] Random UFS root filesystem corruption with SU+J o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists o kern/160591 fs [zfs] Fail to boot on zfs root with degraded raidz2 [r o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha o kern/159930 fs [ufs] [panic] kernel core o kern/159402 fs [zfs][loader] symlinks cause I/O errors o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by- o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs() o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option o kern/159077 fs [zfs] Can't cd .. with latest zfs version o kern/159048 fs [smbfs] smb mount corrupts large files o kern/159045 fs [zfs] [hang] ZFS scrub freezes system o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs amd(8) ICMP storm and unkillable process. 
o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs p kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o bin/153142 fs [zfs] ls -l outputs `ls: ./.zfs: Operation not support o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o 
kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an f bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis p kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121072 fs [smbfs] mount_smbfs(8) cannot 
normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118318 fs [nfs] NFS server hangs under special circumstances o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 305 problems total. From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 13:08:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 43D30474 for ; Mon, 8 Apr 2013 13:08:58 +0000 (UTC) (envelope-from gallasch@free.de) Received: from smtp.free.de (smtp.free.de [91.204.6.103]) by mx1.freebsd.org (Postfix) with ESMTP id AFDB7D11 for ; Mon, 8 Apr 2013 13:08:57 +0000 (UTC) Received: (qmail 16853 invoked from network); 8 Apr 2013 15:08:56 +0200 Received: from smtp.free.de (HELO orwell.free.de) (gallasch@free.de@[91.204.4.103]) (envelope-sender ) by smtp.free.de (qmail-ldap-1.03) with AES128-SHA encrypted SMTP for ; 8 Apr 2013 15:08:56 +0200 From: Kai Gallasch Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: FreeBSD 9.1 and swap on zfs Date: Mon, 8 Apr 2013 15:08:55 +0200 Message-Id: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> To: freebsd-fs@freebsd.org Mime-Version: 1.0 (Apple Message framework v1085) X-Mailer: Apple Mail (2.1085) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 13:08:58 -0000 Hi. When running a ZFS-on-root FreeBSD install.. is it still not advisable for FreeBSD 9.1 (ZFS v28) to use a vdev as swapspace? - like: # zfs create -V 8G \ -o org.freebsd:swap=on \ -o sync=disabled \ -o primarycache=none \ -o secondarycache=none rpool/swap # swapon /dev/zvol/rpool/swap The often-voiced fear about swapping on ZFS is that at the moment the server starts to swap, ZFS will start to compete for memory with the short-on-memory server itself (the reason for swapping) and the system will lock up shortly after. It seems a lot of ZFS-on-root single-disk setups make use of an extra swap partition on the root disk. People booting off a mirrored zpool often seem to have a swap partition on both disks forming the zpool and use those as devices for a gmirror. They then use the gmirror device as swap. Which approach is the recommended one? Swapping to ZFS *or* a swap partition / gmirror on top of two partitions? Regards, Kai Gallasch.
From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 13:13:54 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2EAD6647; Mon, 8 Apr 2013 13:13:54 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 0A052D73; Mon, 8 Apr 2013 13:13:54 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r38DDrEf085539; Mon, 8 Apr 2013 13:13:53 GMT (envelope-from emaste@freefall.freebsd.org) Received: (from emaste@localhost) by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r38DDrxS085538; Mon, 8 Apr 2013 13:13:53 GMT (envelope-from emaste) Date: Mon, 8 Apr 2013 13:13:53 GMT Message-Id: <201304081313.r38DDrxS085538@freefall.freebsd.org> To: nigel@netmsi.com, emaste@FreeBSD.org, freebsd-fs@FreeBSD.org From: emaste@FreeBSD.org Subject: Re: kern/160860: [ufs] Random UFS root filesystem corruption with SU+J [regression] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 13:13:54 -0000 Synopsis: [ufs] Random UFS root filesystem corruption with SU+J [regression] State-Changed-From-To: open->feedback State-Changed-By: emaste State-Changed-When: Mon Apr 8 13:13:16 UTC 2013 State-Changed-Why: Feedback has been requested from submitter http://www.freebsd.org/cgi/query-pr.cgi?pr=160860 From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 13:57:04 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 38C79EE3 for ; Mon, 8 Apr 2013 13:57:04 +0000 (UTC) (envelope-from krichy@cflinux.hu) Received: from pi.nmdps.net (pi.nmdps.net [IPv6:2a01:be00:10:201:0:80:0:1]) by mx1.freebsd.org (Postfix) with ESMTP id EBB77F45 for ; Mon, 8 Apr 2013 13:57:03 +0000 (UTC) Received: from pi.nmdps.net (pi.nmdps.net [109.61.102.5]) (Authenticated sender: krichy@cflinux.hu) by pi.nmdps.net (Postfix) with ESMTPSA id E92F6119B; Mon, 8 Apr 2013 15:57:01 +0200 (CEST) Date: Mon, 8 Apr 2013 15:56:59 +0200 (CEST) From: Richard Kojedzinszky X-X-Sender: krichy@pi.nmdps.net To: Will Andrews Subject: Re: zfs hang In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="2628712688-2030582202-1365429421=:32515" Cc: "freebsd-fs@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 13:57:04 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --2628712688-2030582202-1365429421=:32515 Content-Type: TEXT/PLAIN; charset=utf-8; format=flowed Content-Transfer-Encoding: 8BIT Dear Will, Thank you for your response; I think many of us have been waiting for it. Thanks in advance, Kojedzinszky Richard On Mon, 8 Apr 2013, Will Andrews wrote: > I've written a patch that extensively redesigns the zvol implementation to get rid of various > deadlocks in it, including this one.
> I am working to push the improvements (along with various others) out. > > --Will. > > On Sunday, April 7, 2013, Richard Kojedzinszky wrote: > Dear FS devs, > > Pr kern/161968 still seems to be alive, at least on stable/9. > Unfortunately, I am not familiar with zfs internals, so I just ask if someone knows > the problem, and could fix it? > > If I can help somehow, I will. > > Thanks in advance, > > Kojedzinszky Richard > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > > --2628712688-2030582202-1365429421=:32515-- From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 13:58:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2C420F7E for ; Mon, 8 Apr 2013 13:58:58 +0000 (UTC) (envelope-from mad@madpilot.net) Received: from winston.madpilot.net (winston.madpilot.net [78.47.75.155]) by mx1.freebsd.org (Postfix) with ESMTP id AC302F6E for ; Mon, 8 Apr 2013 13:58:57 +0000 (UTC) Received: from winston.madpilot.net (localhost [127.0.0.1]) by winston.madpilot.net (Postfix) with ESMTP id 3ZktQN4VF9zFTXB; Mon, 8 Apr 2013 15:53:48 +0200 (CEST) X-Virus-Scanned: amavisd-new at madpilot.net Received: from winston.madpilot.net ([127.0.0.1]) by winston.madpilot.net (winston.madpilot.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id g9n1kekShwbJ; Mon, 8 Apr 2013 15:53:43 +0200 (CEST) Received: from vwg82.hq.ignesti.it (unknown [77.246.14.1]) by winston.madpilot.net (Postfix) with ESMTPSA; Mon, 8 Apr 2013 15:53:43 +0200 (CEST) Message-ID: <5162CBE8.5050104@madpilot.net> Date: Mon, 08 Apr 2013 15:53:44 +0200 From: Guido Falsi User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130404 Thunderbird/17.0.5 MIME-Version: 1.0 To: Kai Gallasch Subject: Re: FreeBSD 9.1 and swap on zfs References: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> In-Reply-To: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 13:58:58 -0000 On 04/08/13 15:08, Kai Gallasch wrote: > Hi. > > When running a ZFS on root FreeBSD install.. > > is it for FreeBSD 9.1 (ZFS v28) still not advisable to use a vdev as swapspace? - like: > > # zfs create -V 8G \ > -o org.freebsd:swap=on \ > -o sync=disabled \ > -o primarycache=none \ > -o secondarycache=none rpool/swap > > # swapon /dev/zvol/rpool/swap > > Often voiced fears for swapping on zfs are, that at the moment the server starts to swap, ZFS will start to compete for memory with the short-on-memory server itself (reason for swapping) and the system will lock up shortly after. > > Seems a lot of ZFS-on-root single disk setups make use of an extra swap partition on the root-disk. > > People booting of a mirrored-zpool often seem to have a swap partition on both disks forming the zpool and use those as devices for a gmirror. They then use the gmirror device as swap. > > Which approach is the recommended one? > Swapping to ZFS *or* a swap partition / gmirror on top of two partitions? > I can share my experience, which is not definitive but I hope can help. 
I have various machines with ZFS on root, some with swap on a ZVOL and some with swap on separate partitions (none are mirroring the swap using gmirror though). There is a race condition between ZFS's ARC and the VM system: when very low memory conditions arise, the machine can simply starve. I've seen this happen on machines running buildworld -j without enough RAM and also on machines running ports tinderbox or poudriere. This does not happen when using a separate swap partition; in that case the machine swaps happily and just slows down, as naturally expected when swapping a lot. I have noticed that setting the following properties on the swap ZVOL can help some: checksum=off; compression=off (usually the default anyway); primarycache=metadata (maybe none would be even better); secondarycache=none; sync=disabled (in case of a system reset, swap data is not valuable anyway). (I'm also not really sure whether setting secondarycache has any effect when primarycache is metadata or none.) But this will not completely solve the problem. I'm also quite sure that tuning the ARC not to take all available memory can mitigate the problem too, but the basic race condition remains. Also - this is just an idea, I have no data to corroborate it - it looks to me like ZVOL swap is somewhat slower than a separate partition at recovering swapped data back to RAM. But again, I don't know how to properly test this. I have never used mirrored swap because I don't think swap data is valuable in itself, but I understand that having a machine die just because it lost half its swap can be a problem, so if you need the machine to run rock solid through a disk failure, having swap mirrored could be a really good idea. My suggestion: if you want stability and don't have specific disk-layout constraints, create a separate swap partition. If you can afford to hard-reset the machine sometimes when high load sends it into a lockup, or really can't make separate partitions, just go for ZVOLs. Just my 2 cents!
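Spelled out, the tuning above would look something like this on a hypothetical rpool/swap zvol (an untested sketch, not a definitive recipe; the property names are standard ZFS ones):

# zfs set checksum=off rpool/swap
# zfs set compression=off rpool/swap
# zfs set primarycache=metadata rpool/swap
# zfs set secondarycache=none rpool/swap
# zfs set sync=disabled rpool/swap
# zfs get checksum,compression,primarycache,secondarycache,sync rpool/swap

(primarycache=none may be worth trying instead of metadata, as noted above.)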
-- Guido Falsi From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 16:07:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 3F78C830 for ; Mon, 8 Apr 2013 16:07:54 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) by mx1.freebsd.org (Postfix) with ESMTP id 00ABF7FC for ; Mon, 8 Apr 2013 16:07:53 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1UPEbL-0002D2-Cm for freebsd-fs@freebsd.org; Mon, 08 Apr 2013 18:07:23 +0200 Received: from jtotz2.cs.ucl.ac.uk ([128.16.6.56]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Apr 2013 18:07:23 +0200 Received: from johannes by jtotz2.cs.ucl.ac.uk with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Apr 2013 18:07:23 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Johannes Totz Subject: Re: ZFS snapshots and daily security checks Date: Mon, 08 Apr 2013 17:06:59 +0100 Lines: 73 Message-ID: References: <20130408005438.GA66727@icarus.home.lan> <1504594172.20130408114200@serebryakov.spb.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: jtotz2.cs.ucl.ac.uk User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20130307 Thunderbird/17.0.4 In-Reply-To: <1504594172.20130408114200@serebryakov.spb.ru> X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 16:07:54 -0000 On 08/04/2013 08:42, Lev Serebryakov wrote: > Hello, Jeremy. > You wrote 8 April 2013, 4:54:38: > >>> Is it possible to exclude these snapshots from `mount -p' output or >>> don't mount them to hierarchy by default?
> JC> Taken from my stable/9 r249160 system: > And here is my 9.1-STABLE r244958 (I'm filtering out all hourly > output, or this message will be infinite): > > % df -k > Filesystem 1024-blocks Used Avail Capacity Mounted on > /dev/mirror/root 2026028 675598 1188348 36% / > devfs 1 1 0 100% /dev > fdescfs 1 1 0 100% /dev/fd > procfs 4 4 0 100% /proc > /dev/mirror/var 16244332 6285320 8659466 42% /var > /dev/mirror/tmp 1012972 12290 919646 1% /tmp > /dev/mirror/usr 64995336 10259340 49536370 17% /usr > /dev/mirror/databases 101554148 174252 93255566 0% /var/databases > pool 487184219 21 487184198 0% /pool > pool/home 511417117 24232919 487184198 5% /usr/home > devfs 1 1 0 100% /var/named/dev > % mount > /dev/mirror/root on / (ufs, local) > devfs on /dev (devfs, local) > fdescfs on /dev/fd (fdescfs) > procfs on /proc (procfs, local) > /dev/mirror/var on /var (ufs, local, soft-updates) > /dev/mirror/tmp on /tmp (ufs, local, soft-updates) > /dev/mirror/usr on /usr (ufs, local, soft-updates) > /dev/mirror/databases on /var/databases (ufs, local, soft-updates) > pool on /pool (zfs, local, nfsv4acls) > pool/home on /usr/home (zfs, local, nfsv4acls) > devfs on /var/named/dev (devfs, local) > % zfs list -t snapshot | grep -v hourly > NAME USED AVAIL REFER MOUNTPOINT > pool/home@daily-2013-04-05_03.01.28--1m 544K - 23.1G - > pool/home@daily-2013-04-06_03.01.20--1m 688K - 23.1G - > pool/home@weekly-2013-04-06_04.15.34--1y 1.70M - 23.1G - > pool/home@daily-2013-04-07_03.04.44--1m 1.15M - 23.1G - > pool/home@daily-2013-04-08_03.01.31--1m 437K - 23.1G - > % mount -p | grep -v hourly > /dev/mirror/root / ufs rw 1 1 > devfs /dev devfs rw 0 0 > fdescfs /dev/fd fdescfs rw 0 0 > procfs /proc procfs rw 0 0 > /dev/mirror/var /var ufs rw 2 2 > /dev/mirror/tmp /tmp ufs rw 2 2 > /dev/mirror/usr /usr ufs rw 2 2 > /dev/mirror/databases /var/databases ufs rw 3 3 > pool /pool zfs rw,nfsv4acls 0 0 > pool/home /usr/home zfs rw,nfsv4acls 0 0 > devfs /var/named/dev devfs rw 0 0 > pool/home@daily-2013-04-05_03.01.28--1m /usr/home/.zfs/snapshot/daily-2013-04-05_03.01.28--1m zfs ro,nosuid,noatime,nfsv4acls 0 0 > pool/home@daily-2013-04-06_03.01.20--1m /usr/home/.zfs/snapshot/daily-2013-04-06_03.01.20--1m zfs ro,nosuid,noatime,nfsv4acls 0 0 > pool/home@weekly-2013-04-06_04.15.34--1y /usr/home/.zfs/snapshot/weekly-2013-04-06_04.15.34--1y zfs ro,nosuid,noatime,nfsv4acls 0 0 > pool/home@daily-2013-04-07_03.04.44--1m /usr/home/.zfs/snapshot/daily-2013-04-07_03.04.44--1m zfs ro,nosuid,noatime,nfsv4acls 0 0 > % > > JC> It seems to me mount and mount -p show the mounted snapshot. > I didn't mount snapshot specifically, and they are created by zfSnap > script from ports (sysutils/zfsnap). > As I can see in this script, snapshots are created with > > /sbin/zfs snapshot -r ${fs}@${snapshot} > Are your snapshots set to visible? zpool get listsnapshots pool If I remember correctly, daily security uses find to walk the file system tree... 
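(For reference, both knobs mentioned in this thread can be inspected and changed per pool or dataset; "pool" and "pool/home" are the names from the output above:

% zpool get listsnapshots pool
% zfs get -r snapdir pool
# zfs set snapdir=hidden pool/home
# zfs inherit snapdir pool/home

zfs set forces a value, while zfs inherit reverts the dataset to the inherited/default one.)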
From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 19:11:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 83B58B07; Mon, 8 Apr 2013 19:11:10 +0000 (UTC) (envelope-from john@theusgroup.com) Received: from theusgroup.com (theusgroup.com [64.122.243.222]) by mx1.freebsd.org (Postfix) with ESMTP id 6CB9F24F; Mon, 8 Apr 2013 19:11:10 +0000 (UTC) From: John Theus To: lev@FreeBSD.org Subject: Re: ZFS snapshots and daily security checks In-reply-to: <1884594284.20130408125002@serebryakov.spb.ru> References: <20130408005438.GA66727@icarus.home.lan> <1504594172.20130408114200@serebryakov.spb.ru> <20130408080738.GA73905@icarus.home.lan> <1884594284.20130408125002@serebryakov.spb.ru> Comments: In-reply-to Lev Serebryakov message dated "Mon, 08 Apr 2013 12:50:02 +0400." Date: Mon, 08 Apr 2013 12:11:04 -0700 Message-Id: <20130408191104.98B90F1A@server.theusgroup.com> Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 19:11:10 -0000 > >JC> I don't know what to tell you -- my output clearly shows that after >JC> creating a snapshot with "zfs snapshot -r filesystem@snapname" that >JC> neither mount nor mount -p shows anything. > What really puzzles me is why there is a difference between `mount' and > `mount -p' output on my system. It looks like the `-p' option should be > a cosmetic one... > >JC> I wonder if you have either pool or filesystem-level attributes which >JC> are causing your issue. > >JC> Here are mine, for the pool and filesystem I used in my previous mail >JC> (pool "data" and filesystem "data/home"): > >JC> data/home snapdir hidden default >pool/home snapdir visible default > > It is the only difference that is not size- or date-related. So now we know why > my `mount -p' output differs from yours! (BTW, why > are both values "default"?!) > > And here is some conflict of interests: it is good to allow users to >restore their files from snapshots without my help (and that requires >visible snapshots), but it is very annoying output in the security >checks... > > And why does the output of mount depend on a display option? I need to read the > mount sources. > It doesn't. Snapdir is hidden and listsnapshots is off on all my pools and filesystems, and I see snapshots listed on mount -p, but NOT all snapshots. Running 9.1-STABLE #1 r248540M: Wed Mar 20 00:48:58 PDT 2013, but I've seen this behavior since zfs version 15. All my snapshots use the same format as zfSnap, and show their creation time and time-to-live. On some filesystems, snapshots are made as frequently as every 5 minutes, but only live a couple of hours. Other snapshots are made daily and live for weeks. When I do a mount -p, the only snapshots that show up are the ones that were made on the once-per-day and once-per-week schedule. These snapshots were used for daily backups using zfs send. The snapshots that live for multiple days but are not used for a backup do not show up. I have not looked any deeper, and took the easy route to clean up the security reports by setting daily_status_security_chkmounts_enable="NO" in periodic.conf.
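A less drastic variant, if your /etc/defaults/periodic.conf has it (worth checking there, since I am going from memory), is the ignore pattern honoured by the chkmounts security script, which keeps the check enabled but filters the snapshot mounts out:

daily_status_security_chkmounts_enable="YES"
daily_status_security_chkmounts_ignore="\.zfs/snapshot"

The value is treated as an egrep(1) pattern applied to the mount output, so it should swallow the zfSnap entries while still reporting real mount changes.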
John Theus TheUsGroup.com From owner-freebsd-fs@FreeBSD.ORG Mon Apr 8 22:22:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 56E8EF06 for ; Mon, 8 Apr 2013 22:22:51 +0000 (UTC) (envelope-from lkchen@k-state.edu) Received: from ksu-out.merit.edu (ksu-out.merit.edu [207.75.117.132]) by mx1.freebsd.org (Postfix) with ESMTP id 22432F7E for ; Mon, 8 Apr 2013 22:22:50 +0000 (UTC) X-Merit-ExtLoop1: 1 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AisFAPJBY1HPS3TT/2dsb2JhbABOAxaCcDaDKr8TFnSCHwEBBSNWDAINGgINGQIdPAYTGYd7DKsKiUuJEQSBH4w+gVqCHYETA6gCgyeBVzU X-IronPort-AV: E=Sophos;i="4.87,433,1363147200"; d="scan'208";a="40307106" X-MERIT-SOURCE: KSU Received: from ksu-sfpop-mailstore02.merit.edu ([207.75.116.211]) by sfpop-ironport03.merit.edu with ESMTP; 08 Apr 2013 18:22:44 -0400 Date: Mon, 8 Apr 2013 18:22:44 -0400 (EDT) From: "Lawrence K. Chen, P.Eng." To: Quartz Message-ID: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> In-Reply-To: <516251B4.7050809@sneakertech.com> Subject: Re: ZFS: Failed pool causes system to hang MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [129.130.0.181] X-Mailer: Zimbra 7.2.2_GA_2852 (ZimbraWebClient - GC25 ([unknown])/7.2.2_GA_2852) Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Apr 2013 22:22:51 -0000 ----- Original Message ----- > > So, this thread seems to just stop....and I can't see if it was > > resolved or not. > > It wasn't. Jeremy Chadwick was the only one who really responded, but > besides confirming it wasn't specific to my hardware, there wasn't a > lot > he could do. He suggested I email some of the kernel folks directly > and/or open a PR about it. (I'm planning on doing both, but haven't > had > time over the weekend). > > > > Anyways, my input would be: did you wait long enough to see if the > > system will boot before declaring it hung? > > > I've had my system crash at bad times, which has resulted in the > > appearance that the boot is hung...but it's busy churning away.... > > > It seemed hung at trying to mount root > > It might not have been clear from the back and forth, but my issue isn't > a "boot hang" per se, but that "reboots also hang". The zfs subsystem > hangs so thoroughly it blocks all I/O on all disks and prevents the > reboot/halt/shutdown procedure from taking the machine down > gracefully. > Once I press the physical front-panel reboot button the machine comes > up > immediately (sans the offending pool). And yes I've waited over half > an > hour and it never recovers. My discussion with Jeremy indicated that > the > infinite wait is an "expected failure" in the sense that zfs would > not > come back to life given the circumstances. > So, you're not really waiting a long time.... Granted, the first time it happened to me I would wait 10-30 minutes, depending on what the Internet searching I was doing turned up.... But then I just left it and watched some TV...when out of the corner of my eye, I saw it come back up (after about 2.5 hours). The next time it happened....it seemed hung, but I left it....and it wasn't up the next morning...but I got notification later in the day that it had come back up.
Both times it involved zpools that weren't my root pool. Though both times it was an unexpected reboot while destroying a large dataset....the first time was a 384G zvol with lots of snapshots (it had been serving blocks up for iSCSI). The second time was just a 1TB filesystem. While I wasn't doing dedup on the zvol or filesystem, I was doing dedup in the pool....and I found that dedup does consider data in non-dedup-enabled filesystems for dedup. Since I had copied the filesystem to another in the same zpool with dedup on to see if dedup would help....it seemed to, until I removed the original filesystem. So, not doing dedup in that pool anymore. -- Who: Lawrence K. Chen, P.Eng. - W0LKC - Senior Unix Systems Administrator For: Enterprise Server Technologies (EST) -- & SafeZone Ally Snail: Computing and Telecommunications Services (CTS) Kansas State University, 109 East Stadium, Manhattan, KS 66506-3102 Phone: (785) 532-4916 - Fax: (785) 532-3515 - Email: lkchen@ksu.edu Web: http://www-personal.ksu.edu/~lkchen - Where: 11 Hale Library From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 02:06:00 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4333273F; Tue, 9 Apr 2013 02:06:00 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id 01408964; Tue, 9 Apr 2013 02:05:59 +0000 (UTC) Received: from Julian-MBP3.local (50-196-156-133-static.hfc.comcastbusiness.net [50.196.156.133]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id r3925ljm085516 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Mon, 8 Apr 2013 19:05:51 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <51637776.8010001@freebsd.org> Date: Tue, 09 Apr 2013 10:05:42 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: Pawel Jakub Dawidek Subject: Re: When will we see TRIM support for GELI volumes ? References: <51479D54.1040509@gibfest.dk> <20130319082732.GB1367@garage.freebsd.pl> In-Reply-To: <20130319082732.GB1367@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 02:06:00 -0000 On 3/19/13 4:27 PM, Pawel Jakub Dawidek wrote: > This is not what I see. On one of my SSDs in my laptop I've two > partitions, both running ZFS, but one of them on top of GELI. I > don't use ZFS TRIM yet, as I see no slowdown whatsoever. How can you > say this is lack of TRIM slowing your writes? The performance > degraded over time? For those readers wondering what all the fuss about TRIM and SSDs is about: A quick description as to why TRIM is useful..... When doing a long sequence of writes on an SSD you eventually run out of free chunks of flash. When you do, an existing written chunk (or actually a bunch of them) is rewritten and the free space recovered using regular garbage collection techniques and defragmentation. If the chunks rewritten are 80% full (20% is data that has been replaced and is no longer of interest), then you need to rewrite 5 of them to get a complete new empty chunk of flash.
This is called "Write Amplification" (WA). SO if you are writing,youare competing with the internal garbage collection process, except that it is producing 8 times as much data as you are (reading AND writing 80% of 5 chunks in the same time that you write one chunk. Thus your through put is going to be limited to 1/9th of the bandwidth of the flash. This is a huge drop in performance when you "hit the wall". SSDs have an internal path for this which is higher performance, so the drop may be less in practice, but it will still be there. If you have TRIM, and assume that an additional 20% of the drive is 'available' because it is not in use by the filesystem, then when we re-do the calculations we find that we only need to rewrite 2.5 chunks, and each one has only 60% data, so WA is 2 x 2.5 x 0.6, = 3.. in other words the internal process is using only 3 times as much of the flash bandwidth as you are. so your throughput can be 1/4 of the max instead of 1/9 of the max. this is a > 100% improvement. In addition, you are rewriting and erasing each chunk half as many times. so the SSD should last twice as long.. Julian From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 10:47:40 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1CD10927 for ; Tue, 9 Apr 2013 10:47:40 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay03.pair.com (relay03.pair.com [209.68.5.17]) by mx1.freebsd.org (Postfix) with SMTP id B4DD3D0 for ; Tue, 9 Apr 2013 10:47:39 +0000 (UTC) Received: (qmail 90764 invoked by uid 0); 9 Apr 2013 10:40:58 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay03.pair.com with SMTP; 9 Apr 2013 10:40:58 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <5163F03B.9060700@sneakertech.com> Date: Tue, 09 Apr 2013 06:40:59 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: "Lawrence K. Chen, P.Eng." Subject: Re: ZFS: Failed pool causes system to hang References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> In-Reply-To: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 10:47:40 -0000 > So, you're not really waiting a long time.... I still don't think you're 100% clear on what's happening in my case. I'm trying to explain that my problem is *prior* to the motherboard resetting, NOT after. If I hard-reset the machine with the front panel switch, it boots just fine every time. When my pool *FAILS* (ie; is unrecoverable because I lost too many drives) it hangs effectively all io on the entire machine. I can't cd or ls directories, I can't run any zfs commands, and I can't issue a reboot or halt. This is a hang. The machine is completely useless in this state. There is no disk or cpu activity churning. There's no pool (anymore) to be trying to resilver or whatever anyway. I'm not going to wait 3+ hours for "shutdown -r now" to bring the machine down. Especially not when I already know that zfs won't let it. 
______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 11:52:03 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 38AF428E for ; Tue, 9 Apr 2013 11:52:03 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-la0-x229.google.com (mail-la0-x229.google.com [IPv6:2a00:1450:4010:c03::229]) by mx1.freebsd.org (Postfix) with ESMTP id BAC4F33D for ; Tue, 9 Apr 2013 11:52:02 +0000 (UTC) Received: by mail-la0-f41.google.com with SMTP id er20so1230154lab.28 for ; Tue, 09 Apr 2013 04:52:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=9ESf9FC4iRlKb3EfZkSGxosLNSlpWhg2McMQZpcNdCM=; b=KAskdekUEAaWQBa1pJL7CC1DHXlg+hIFMqXJCxFEAXdapmtptnfZxql4rK7XyvIPJP mJd0XbcI6fC0X1TdOD1ITROyB8X3Cz/hYyBWTeRyfHp4xoA0Q+xaJ+1LnvMccVMnFDnk JIV4/HVRzRoBfFy9SpD9HwJPVOpo6vSBiagTXZXSU9REf48D/0R3fZ/Iv2fib5m/N819 516MXQxzehTI7jdv/60EajavIMBfR3OuDLmLVPtWu2kdK8eDuyZtUBH0I9l236uJHhIb cZS754iOugqDjgDrFNamPItrP5TOOhXtu40kjp4uecJ/bYMi8ENCoO5LbF5Dufji1tqS aSNA== MIME-Version: 1.0 X-Received: by 10.112.128.231 with SMTP id nr7mr1643lbb.26.1365508321570; Tue, 09 Apr 2013 04:52:01 -0700 (PDT) Received: by 10.112.198.201 with HTTP; Tue, 9 Apr 2013 04:52:01 -0700 (PDT) In-Reply-To: <5163F03B.9060700@sneakertech.com> References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> <5163F03B.9060700@sneakertech.com> Date: Tue, 9 Apr 2013 12:52:01 +0100 Message-ID: Subject: Re: ZFS: Failed pool causes system to hang From: Tom Evans To: Quartz Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 11:52:03 -0000 On Tue, Apr 9, 2013 at 11:40 AM, Quartz wrote: > >> So, you're not really waiting a long time.... > > > I still don't think you're 100% clear on what's happening in my case. I'm > trying to explain that my problem is *prior* to the motherboard resetting, > NOT after. If I hard-reset the machine with the front panel switch, it boots > just fine every time. > > When my pool *FAILS* (ie; is unrecoverable because I lost too many drives) > it hangs effectively all io on the entire machine. I can't cd or ls > directories, I can't run any zfs commands, and I can't issue a reboot or > halt. This is a hang. The machine is completely useless in this state. There > is no disk or cpu activity churning. There's no pool (anymore) to be trying > to resilver or whatever anyway. > > I'm not going to wait 3+ hours for "shutdown -r now" to bring the machine > down. Especially not when I already know that zfs won't let it. > I think what Lawrence is trying to explain is that a "hang" is not necessarily a deadlock. Leaving the system for an extended period may bring it back. What you are saying is also valid, that a hang that long is equivalent to a deadlock in your usage. Computers, even essential dedicated servers sometimes hang, which is why it is common to have some way of remotely power cycling. If your server is important, you need some sort of RAC for these scenarios. So, how to find out where the hang is. 
Your ZFS pools and your root disk probably - I've not seen a dmesg - share one thing in common, ATA/AHCI. If root does not also use this, does losing the pool still cause problems with root? Perhaps breaking into ddb at this point could tell us something.

Cheers

Tom

From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 12:06:28 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D623EA9C for ; Tue, 9 Apr 2013 12:06:28 +0000 (UTC) (envelope-from etnapierala@gmail.com) Received: from mail-ea0-x229.google.com (mail-ea0-x229.google.com [IPv6:2a00:1450:4013:c01::229]) by mx1.freebsd.org (Postfix) with ESMTP id 6DA79650 for ; Tue, 9 Apr 2013 12:06:28 +0000 (UTC) Received: by mail-ea0-f169.google.com with SMTP id n15so2874094ead.28 for ; Tue, 09 Apr 2013 05:06:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:content-transfer-encoding:message-id:references :to:x-mailer; bh=8vPwiXDt4mEnTWuCNQKajFxjMBb6lPuH0nZEbdi1ohs=; b=kC+yrSjFMTnWXsHYcfz4u4ShMfZkEl4IJIxoxzh2qpkHkGMV3QEIc7HrJwF8ik6fPl 8s/WRvtHmFb7WQdWSZh0WP3EQY1qOVzONgYbiL0eH/kla0gydxIU1S/ZRLHX4Q37oolq Pp6KnU+No+ZECCP6g6OAjfEpDIvpHzY+PN1y4vWPklNvGI9tKT3jLIxt0uWavQBEM5gJ inFW2+hOdMi27KliccVIcd8x4T7EmyZ5KgmNNDG4gq/+1wqYrUvJuD5P+/8T1/GLL0nW NFAsCi2iTvYzqPX+Zo7MFeRZBhHF1zVJsejPc/bX9fdeFxtV/DbBKi8J1mVmoWpJILU+ 2rJA== X-Received: by 10.14.107.69 with SMTP id n45mr37587106eeg.23.1365509187473; Tue, 09 Apr 2013 05:06:27 -0700 (PDT) Received: from [192.168.1.104] (45.81.datacomsa.pl. [195.34.81.45]) by mx.google.com with ESMTPS id b5sm5894397eew.16.2013.04.09.05.06.26 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Tue, 09 Apr 2013 05:06:26 -0700 (PDT) Sender: Edward Tomasz Napierała Subject: Re: ZFS: Failed pool causes system to hang Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: text/plain; charset=iso-8859-2 From: Edward Tomasz Napierała In-Reply-To: <5163F03B.9060700@sneakertech.com> Date: Tue, 9 Apr 2013 14:06:24 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> <5163F03B.9060700@sneakertech.com> To: Quartz X-Mailer: Apple Mail (2.1283) Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 12:06:28 -0000

On 9 Apr 2013, at 12:40, Quartz wrote:

>
>> So, you're not really waiting a long time....
>
> I still don't think you're 100% clear on what's happening in my case. I'm trying to explain that my problem is *prior* to the motherboard resetting, NOT after. If I hard-reset the machine with the front panel switch, it boots just fine every time.
>
> When my pool *FAILS* (ie; is unrecoverable because I lost too many drives) it hangs effectively all io on the entire machine. I can't cd or ls directories, I can't run any zfs commands, and I can't issue a reboot or halt. This is a hang. The machine is completely useless in this state. There is no disk or cpu activity churning. There's no pool (anymore) to be trying to resilver or whatever anyway.

I hadn't followed the entire discussion, but do you have the "failmode" zpool property set to "wait" (the default)? If so, can you reproduce it with "failmode" set to "continue"?

--
If you cut off my head, what would I say? Me and my head, or me and my body?

From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 12:15:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D1EEBBB6 for ; Tue, 9 Apr 2013 12:15:15 +0000 (UTC) (envelope-from gallasch@free.de) Received: from smtp.free.de (smtp.free.de [91.204.6.103]) by mx1.freebsd.org (Postfix) with ESMTP id 478B26B4 for ; Tue, 9 Apr 2013 12:15:15 +0000 (UTC) Received: (qmail 73868 invoked from network); 9 Apr 2013 14:15:08 +0200 Received: from smtp.free.de (HELO orwell.free.de) (gallasch@free.de@[91.204.4.103]) (envelope-sender ) by smtp.free.de (qmail-ldap-1.03) with AES128-SHA encrypted SMTP for ; 9 Apr 2013 14:15:08 +0200 Subject: Re: FreeBSD 9.1 and swap on zfs Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: Kai Gallasch In-Reply-To: <5162CBE8.5050104@madpilot.net> Date: Tue, 9 Apr 2013 14:15:06 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> <5162CBE8.5050104@madpilot.net> To: Guido Falsi X-Mailer: Apple Mail (2.1085) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 12:15:15 -0000

On 08.04.2013 at 15:53, Guido Falsi wrote:

> On 04/08/13 15:08, Kai Gallasch wrote:
>> Hi.
>>
>> When running a ZFS on root FreeBSD install..
>>
>> is it for FreeBSD 9.1 (ZFS v28) still not advisable to use a vdev as swapspace? - like:

> I can share my experience, which is not definitive but I hope it can help.

thank you!

> I have various machines with ZFS on root, some with swap on ZVOL and some with swap on separate partitions (none are mirroring the swap using gmirror though).
>
> There is a race condition between ZFS' ARC and the VM system when very low memory conditions arise; when this happens the machine just starves. I've seen this happen on machines running buildworld -j without enough ram and also on machines running ports tinderbox or poudriere. This is not happening when using a separate swap partition. In such a case the machine swaps happily and just slows down as naturally expected when swapping a lot.

Do you also have machines running on 9.1 that swap on ZVOL and which show this behaviour?

> My suggestion is:
>
> if you want stability and don't have specific disk layout problems, create a separate swap.

I think I'll repartition and swap to a gmirror device then.

This also has the advantage of being able to write kernel dumps to the swap (not possible with ZVOL-based swap, AFAIK)

Kai.

From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 12:38:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2274EF3E for ; Tue, 9 Apr 2013 12:38:53 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay00.pair.com (relay00.pair.com [209.68.5.9]) by mx1.freebsd.org (Postfix) with SMTP id BB7E07ED for ; Tue, 9 Apr 2013 12:38:52 +0000 (UTC) Received: (qmail 33258 invoked by uid 0); 9 Apr 2013 12:38:50 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay00.pair.com with SMTP; 9 Apr 2013 12:38:50 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <51640BDB.1020403@sneakertech.com> Date: Tue, 09 Apr 2013 08:38:51 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: Tom Evans Subject: Re: ZFS: Failed pool causes system to hang References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> <5163F03B.9060700@sneakertech.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 12:38:53 -0000

> I think what Lawrence is trying to explain is that a "hang" is not
> necessarily a deadlock. Leaving the system for an extended period may
> bring it back.

In other cases perhaps, but at least according to my experience and what Jeremy Chadwick has said, it won't in this instance. However either way....

> What you are saying is also valid, that a hang that
> long is equivalent to a deadlock in your usage.

.. yes. Hard-resetting a machine is bad, but having a machine offline for the better part of a day just isn't workable.

> I've not seen a dmesg

http://sneakertech.com/-/dmesg.txt

> does losing the pool still
> cause problems with root?

Yes. At the moment, there's a single ufs disk that houses all the system/home/var/swap/etc stuff (no raid or dual boot or anything special). The zfs pool is a collection of six other disks in a raidz2 configuration. If three of those disks go out to lunch (ie; the pool is no longer solvent) most io across the board hangs, including io that should be confined to the boot drive. I've had hangs when trying to cd to my home folder.

______________________________________
it has a certain smooth-brained appeal

From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 12:46:00 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id ADE21265 for ; Tue, 9 Apr 2013 12:46:00 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay01.pair.com (relay01.pair.com [209.68.5.15]) by mx1.freebsd.org (Postfix) with SMTP id 5212982B for ; Tue, 9 Apr 2013 12:46:00 +0000 (UTC) Received: (qmail 9279 invoked by uid 0); 9 Apr 2013 12:45:58 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?)
(173.48.104.62) by relay01.pair.com with SMTP; 9 Apr 2013 12:45:58 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <51640D87.7070307@sneakertech.com> Date: Tue, 09 Apr 2013 08:45:59 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: =?ISO-8859-2?Q?Edward_Tomasz_Napiera=B3a?= Subject: Re: ZFS: Failed pool causes system to hang References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> <5163F03B.9060700@sneakertech.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 12:46:00 -0000 > I hadn't followed the entire discussion, but do you have the "failmode" > zpool property set to "wait" (the default)? If so, can you reproduce it > with "failmode" set to "continue"? Yes. Changing the failmode was the first thing research led me to try. "wait" hangs almost immediately while "continue" gives me some seconds to a minute and then hangs. -STABLE has some new commits that will eventually panic the machine if this type of zfs hang goes on for too long, but that doesn't help me at all. (If I was ok with a panic, I'd set failmode to "panic" and be done with it). ______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 13:03:12 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9FB145BE for ; Tue, 9 Apr 2013 13:03:12 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-la0-x236.google.com (mail-la0-x236.google.com [IPv6:2a00:1450:4010:c03::236]) by mx1.freebsd.org (Postfix) with ESMTP id 2A523907 for ; Tue, 9 Apr 2013 13:03:11 +0000 (UTC) Received: by mail-la0-f54.google.com with SMTP id ec20so773665lab.27 for ; Tue, 09 Apr 2013 06:03:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=ffL/glaK8ER2mCkdjb77WR2/bTJYHzeeHdlnWQO0b8I=; b=uvOnZSVrboSJuZqxQNc3UaSyOhUSdT8DDp+ZcnGP4HkmzWrxYJX3NS04DM/ofPlo/e fVUvHR6gAJhYR6asgeialL3TiIxfeyyeubYiaUQn8QM1Vju5mTF0T/VxEgopjiVLx1fg svBJGCy33faWeFclxDIfuOw6MZduinzDVqGzHOcP6O0MGhNTpUAfflL5eY/T4IY8Qy3w yFI+es168Mc4DvZgYHxhJ5vhHynUR4hN4kJ6oaAIV9NbGDvrKeTZttkvEbZl98y0U+1U ShQ4OlnXYr+OmpzoC1VjYhxAS0XW9v10JB7ZNeci1+FhU8HAxyzk74hzvzRub+CvM3Yn 0KIA== MIME-Version: 1.0 X-Received: by 10.152.6.10 with SMTP id w10mr5925475law.30.1365512590902; Tue, 09 Apr 2013 06:03:10 -0700 (PDT) Received: by 10.112.198.201 with HTTP; Tue, 9 Apr 2013 06:03:10 -0700 (PDT) In-Reply-To: <51640BDB.1020403@sneakertech.com> References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> <5163F03B.9060700@sneakertech.com> <51640BDB.1020403@sneakertech.com> Date: Tue, 9 Apr 2013 14:03:10 +0100 Message-ID: Subject: Re: ZFS: Failed pool causes system to hang From: Tom Evans To: Quartz Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 13:03:12 -0000 On Tue, 
Apr 9, 2013 at 1:38 PM, Quartz wrote: >> does losing the pool still >> cause problems with root? > > > Yes. At the moment, there's a single ufs disk that houses all the > system/home/var/swap/etc stuff (no raid or dual boot or anything special). Sorry, but you've not tested this. Your root is hanging off a different controller to the others, but it is still using the same ahci/cam stack. Is ahci/cam getting wedged, causing your root to get wedged - irrespective of running on a different controller - or is ZFS causing a deadlock. Cheers Tom From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 15:05:16 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1E7135B7 for ; Tue, 9 Apr 2013 15:05:16 +0000 (UTC) (envelope-from c.kworr@gmail.com) Received: from mail-ee0-f52.google.com (mail-ee0-f52.google.com [74.125.83.52]) by mx1.freebsd.org (Postfix) with ESMTP id ABFDC74 for ; Tue, 9 Apr 2013 15:05:15 +0000 (UTC) Received: by mail-ee0-f52.google.com with SMTP id d17so2759906eek.25 for ; Tue, 09 Apr 2013 08:05:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=HgxSaGcNJTzZ1tRg7Nt+YkQqpB7iUCWIvmJYD6QjnMw=; b=Gy6AsRQiNiccx29C3NJivl9Y3RbbGndWtuipunguOuXke3vVKQVbd6bF/eySVyYDrh t1X+NpBMfQ9eo+oAcW3DYJBKiNhIOnmNIHHDSvgk4ym5nhqR9S7yQhLxxqNPwjP+3NT5 QZ9Tj8LKhHTfL0q4SxctAoz0XnhBdbSBzNuilPm6RyiAsUBbNXM9x+Ma8NIRkcCed1nZ nNjwoZVqVesIs0duh1zc8c1ywF1e50ME+RF1Y5TxhX24j//fuRXPVStqZFzbfXBGy4xA Hszm+Zl1JJNPiTyJRbE4pH4XzElIXgK/LPEq/Bm5Hh5LBE/ZCKIWmftui7ObDUuSot5C xD1Q== X-Received: by 10.15.34.199 with SMTP id e47mr50956771eev.35.1365519909393; Tue, 09 Apr 2013 08:05:09 -0700 (PDT) Received: from [192.168.1.128] ([91.196.229.122]) by mx.google.com with ESMTPS id t4sm38545714eel.0.2013.04.09.08.05.07 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Tue, 09 Apr 2013 08:05:08 -0700 (PDT) Message-ID: <51642E22.404@gmail.com> Date: Tue, 09 Apr 2013 18:05:06 +0300 From: Volodymyr Kostyrko User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:20.0) Gecko/20100101 Firefox/20.0 SeaMonkey/2.17 MIME-Version: 1.0 To: Kai Gallasch , Guido Falsi Subject: Re: FreeBSD 9.1 and swap on zfs References: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> <5162CBE8.5050104@madpilot.net> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 15:05:16 -0000 09.04.2013 15:15, Kai Gallasch: >> My suggestion is: >> >> if you want stability and don't have specific disk layout problems create a separate swap. > > > I think I'll repartition and swap to a gmirror device then. > This also has the advantage of being able to write kernel dumps to the swap (not possible with a ZVOL bases swap AFAIK) Be sure to use -b prefer for that as when dump is written it goes to the first component. Man page is a bit outdated as we currently miss /etc/rc.early. -- Sphinx of black quartz, judge my vow. 
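For anyone wanting to try the gmirror swap setup Kai and Volodymyr are discussing above, a minimal sketch; the partition names (ada0p3, ada1p3) and the label "swap" are placeholders, not something taken from this thread:

    # Load gmirror now and on every boot:
    kldload geom_mirror
    echo 'geom_mirror_load="YES"' >> /boot/loader.conf

    # Label a mirror over the two swap partitions. -b prefer sends I/O,
    # including a crash dump, to one preferred component, per Volodymyr's
    # advice above.
    gmirror label -b prefer swap /dev/ada0p3 /dev/ada1p3

    # Put swap and kernel dumps on the mirror:
    echo '/dev/mirror/swap none swap sw 0 0' >> /etc/fstab
    swapon /dev/mirror/swap
    dumpon /dev/mirror/swap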
From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 15:15:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D784E9CE; Tue, 9 Apr 2013 15:15:57 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from noop.in-addr.com (mail.in-addr.com [IPv6:2001:470:8:162::1]) by mx1.freebsd.org (Postfix) with ESMTP id AD73C12B; Tue, 9 Apr 2013 15:15:57 +0000 (UTC) Received: from gjp by noop.in-addr.com with local (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1UPaH7-0000RQ-3W; Tue, 09 Apr 2013 11:15:57 -0400 Date: Tue, 9 Apr 2013 11:15:56 -0400 From: Gary Palmer To: pjd@freebsd.org Subject: ZFS trim MFC? Message-ID: <20130409151556.GB96431@in-addr.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: gpalmer@freebsd.org X-SA-Exim-Scanned: No (on noop.in-addr.com); SAEximRunCond expanded to false Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 15:15:57 -0000

Hi Pawel,

I notice in r240868 there is no MFC tag. Are there any plans to merge ZFS TRIM support back to stable/8 or stable/9?

Has testing in -current thrown up any issues?

Thanks,

Gary

From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 15:26:37 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8FE20C91; Tue, 9 Apr 2013 15:26:37 +0000 (UTC) (envelope-from prvs=18115ede16=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id E449E1AF; Tue, 9 Apr 2013 15:26:36 +0000 (UTC) Received: from r2d2 ([46.65.172.4]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50003175261.msg; Tue, 09 Apr 2013 16:26:35 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Tue, 09 Apr 2013 16:26:35 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 46.65.172.4 X-Return-Path: prvs=18115ede16=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: From: "Steven Hartland" To: "Gary Palmer" , References: <20130409151556.GB96431@in-addr.com> Subject: Re: ZFS trim MFC? Date: Tue, 9 Apr 2013 16:26:39 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 15:26:37 -0000

We've not experienced any major issues; there have been a few incremental improvements, which have been in testing recently, nothing major. I plan to MFC it relatively soon.

----- Original Message ----- From: "Gary Palmer"

> Hi Pawel,
>
> I notice in r240868 there is no MFC tag. Are there any plans to merge
> ZFS TRIM support back to stable/8 or stable/9?
>
> Has testing in -current thrown up any issues?
Regards

Steve

================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.

From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 16:19:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C6436E35 for ; Tue, 9 Apr 2013 16:19:31 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay02.pair.com (relay02.pair.com [209.68.5.16]) by mx1.freebsd.org (Postfix) with SMTP id 68BFE694 for ; Tue, 9 Apr 2013 16:19:31 +0000 (UTC) Received: (qmail 84655 invoked by uid 0); 9 Apr 2013 16:19:29 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay02.pair.com with SMTP; 9 Apr 2013 16:19:29 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <51643F91.30704@sneakertech.com> Date: Tue, 09 Apr 2013 12:19:29 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: Tom Evans Subject: Re: ZFS: Failed pool causes system to hang References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> <5163F03B.9060700@sneakertech.com> <51640BDB.1020403@sneakertech.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 16:19:31 -0000

> Sorry, but you've not tested this. Your root is hanging off a
> different controller to the others, but it is still using the same
> ahci/cam stack. Is ahci/cam getting wedged, causing your root to get
> wedged - irrespective of running on a different controller - or is ZFS
> causing a deadlock.

If I simulate failures by yanking the sata cable to various drives in the pool, I can disconnect any two (raidz2) at random and everything hums along just fine. Status tells me the pool is degraded, and if I reconnect them I can resilver and whatnot with no problems. However, if I have three drives yanked simultaneously, everything goes to shit.

I don't know the ahci/cam stack from a hole in the wall, but it seems to me that if it can gracefully handle two drives dropping out and coming back randomly, it ought to be able to handle three. I suppose it's possible that zfs itself is not the root cause of the problem, but one way or another there's some kind of interaction here, as I only experience an issue when the pool is no longer solvent.
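A rough transcript of the kind of isolation test being discussed, for anyone who wants to reproduce it; the pool name, the disk names, and the use of camcontrol rather than physically pulling cables are all placeholders for the real setup:

    # Throwaway raidz2 pool on six disks:
    zpool create test raidz2 da0 da1 da2 da3 da4 da5

    # Check and change the failure policy:
    zpool get failmode test             # default is "wait"
    zpool set failmode=continue test    # reported above to merely delay the hang

    # Take three members away, one more than raidz2 can tolerate:
    camcontrol stop da3
    camcontrol stop da4
    camcontrol stop da5

    # Then try I/O that never touches the pool, e.g. on the UFS boot disk:
    ls /home

If the last command also wedges, the problem is not confined to ZFS; if it keeps working, the deadlock is more likely in ZFS itself.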
______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 16:22:12 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C79D5F00 for ; Tue, 9 Apr 2013 16:22:12 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay03.pair.com (relay03.pair.com [209.68.5.17]) by mx1.freebsd.org (Postfix) with SMTP id 6B6406BE for ; Tue, 9 Apr 2013 16:22:12 +0000 (UTC) Received: (qmail 1384 invoked by uid 0); 9 Apr 2013 16:22:10 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay03.pair.com with SMTP; 9 Apr 2013 16:22:10 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <51644032.7070305@sneakertech.com> Date: Tue, 09 Apr 2013 12:22:10 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: Tom Evans Subject: Re: ZFS: Failed pool causes system to hang References: <2092374421.4491514.1365459764269.JavaMail.root@k-state.edu> <5163F03B.9060700@sneakertech.com> <51640BDB.1020403@sneakertech.com> <51643F91.30704@sneakertech.com> In-Reply-To: <51643F91.30704@sneakertech.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 16:22:12 -0000 > I don't know the ahci/cam stack from a hole in the wall, but it seems to > me that if it can gracefully handle two drives dropping out and coming > back randomly, it ought to be able to handle three. I suppose it's > possible that zfs itself is not the root cause of the problem, but one > way or another there's some kind of interaction here, as I only > experience an issue when the pool is no longer solvent. Although, if there's some way I can definitively test to see if it's an ahci/cam issue or not, I'll be glad to try it. ______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 19:08:46 2013 Return-Path: Delivered-To: fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1B6D47DD; Tue, 9 Apr 2013 19:08:46 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id 4ECBE3EF; Tue, 9 Apr 2013 19:08:45 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id r39J8dN3004006; Tue, 9 Apr 2013 13:08:39 -0600 (MDT) (envelope-from ken@nargothrond.kdm.org) Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id r39J8cJh004005; Tue, 9 Apr 2013 13:08:38 -0600 (MDT) (envelope-from ken) Date: Tue, 9 Apr 2013 13:08:38 -0600 From: "Kenneth D. 
Merry" To: Bruce Evans Subject: Re: patches to add new stat(2) file flags Message-ID: <20130409190838.GA60733@nargothrond.kdm.org> References: <20130307000533.GA38950@nargothrond.kdm.org> <20130307222553.P981@besplex.bde.org> <20130308232155.GA47062@nargothrond.kdm.org> <20130310181127.D2309@besplex.bde.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="rwEMma7ioTxnRzrJ" Content-Disposition: inline In-Reply-To: <20130310181127.D2309@besplex.bde.org> User-Agent: Mutt/1.4.2i Cc: arch@FreeBSD.org, fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 19:08:46 -0000 --rwEMma7ioTxnRzrJ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Sun, Mar 10, 2013 at 19:21:57 +1100, Bruce Evans wrote: > On Fri, 8 Mar 2013, Kenneth D. Merry wrote: > > >On Fri, Mar 08, 2013 at 00:37:15 +1100, Bruce Evans wrote: > >>On Wed, 6 Mar 2013, Kenneth D. Merry wrote: > >> > >>>I have attached diffs against head for some additional stat(2) file > >>>flags. > >>> > >>>The primary purpose of these flags is to improve compatibility with CIFS, > >>>both from the client and the server side. > >>>... > >> > >>I missed looking at the diffs in my previous reply. > >> > >>% --- //depot/users/kenm/FreeBSD-test3/bin/chflags/chflags.1 2013-03-04 > >>17:51:12.000000000 -0700 > >>% +++ /usr/home/kenm/perforce4/kenm/FreeBSD-test3/bin/chflags/chflags.1 > >>2013-03-04 17:51:12.000000000 -0700 > >>% --- /tmp/tmp.49594.86 2013-03-06 16:42:43.000000000 -0700 > >>% +++ /usr/home/kenm/perforce4/kenm/FreeBSD-test3/bin/chflags/chflags.1 > >>2013-03-06 14:47:25.987128763 -0700 > >>% @@ -117,6 +117,16 @@ > >>% set the user immutable flag (owner or super-user only) > >>% .It Cm uunlnk , uunlink > >>% set the user undeletable flag (owner or super-user only) > >>% +.It Cm system , usystem > >>% +set the Windows system flag (owner or super-user only) > >> > >>This begins unsorting of the list. > > > >Fixed. > > > >>It's not just a Windows flag, since it also works in DOS. > > > >Fixed. > > Thanks. Hopefully all the simple bugs are fixed now. > > >>"Owner or" is too strict for msdosfs, since files can only have a > >>single owner so it is controlling access using groups is needed. I > >>use owner root and group msdosfs for msdosfs mounts. This works for > >>normal operations like open/read/write, but fails for most attributes > >>including file flags. msdosfs doesn't support many attributes but > >>this change is supposed to add support for 3 new file flags so it would > >>be good if it didn't restrict the support to root. > > > >I wasn't trying to change the existing security model for msdosfs, but if > >you've got a suggested patch to fix it I can add that in. > > I can't think of anything better than making group write permission enough > for attributes. > > msdosfs also has some style bugs in this area. It uses VOP_ACCESS() > with VADMIN for the non-VA_UTIMES_NULL case of utimes(), but for all > other attributes it hard-codes a direct uid check followed a > priv_check_cred() with PRIV_VFS_ADMIN. VADMIN requires even more than > user write permission for POSIX file systems and using it unchanged > for all the attributes would be even more restrictive unless we changed > it, but it would be easier to make it uniformly less restrictive for > msdosfs by using it consistently. > > Oops, that was in the old version of ffs. 
ffs now has related > complications and unnecessary style bugs (verboseness and misformatting) > to support ACLs. It now uses VOP_ACCESSX() with VWRITE_ATTRIBUTES for > utimes(), and VOP_ACCESSX() with other VFOO for all attributes except > flags. It still uses VOP_ACCESS() with VADMIN() for flags. > > >>... > >>% .It Dv SF_ARCHIVED > >>... > >>% +Filesystems in FreeBSD may or may not have special handling for this > >>flag. > >>% +For instance, ZFS tracks changes to files and will clear this bit when > >>a > >>% +file is updated. > >>% +UFS only stores the flag, and relies on the application to change it > >>when > >>% +needed. > >> > >>I think that is useless, since changing it is needed whenever the file > >>changes, and applications can do that (short of running as daemons and > >>watching for changes). > > > >Do you mean applications can't do that or can? > > Oops, can't. > > It is still hard for users to know how their file system supports. > Even programmers don't know that it is backwards :-). > > >>% --- //depot/users/kenm/FreeBSD-test3/sys/fs/msdosfs/msdosfs_vnops.c > >>2013-03-04 17:51:12.000000000 -0700 > >>% +++ > >>/usr/home/kenm/perforce4/kenm/FreeBSD-test3/sys/fs/msdosfs/msdosfs_vnops.c > >>2013-03-04 17:51:12.000000000 -0700 > >>% --- /tmp/tmp.49594.370 2013-03-06 16:42:43.000000000 -0700 > >>% +++ > >>/usr/home/kenm/perforce4/kenm/FreeBSD-test3/sys/fs/msdosfs/msdosfs_vnops.c > >>2013-03-06 14:49:47.179130318 -0700 > >>% @@ -345,8 +345,17 @@ > >>% vap->va_birthtime.tv_nsec = 0; > >>% } > >>% vap->va_flags = 0; > >>% + /* > >>% + * The DOS Archive attribute means that a file needs to be > >>% + * archived. The BSD SF_ARCHIVED attribute means that a file has > >>% + * been archived. Thus the inversion here. > >>% + */ > >> > >>No need to document it again. It goes without saying that ARCHIVE > >>!= ARCHIVED. > > > >I disagree. It wasn't immediately obvious to me that SF_ARCHIVED was > >generally used as the inverse of the DOS Archived bit until I started > >digging into this. If this helps anyone figure that out more quickly, it's > >useful. > > The surprising thing is that it is backwards in FreeBSD and not really > supported except in msdosfs. Now several file systems have the comment > about it being inverted, but man pages still don't. I made the change to UF_ARCHIVE, and updated the man pages. > >>% @@ -420,12 +429,21 @@ > >>% if (error) > >>% return (error); > >>% } > >> > >>The permissions check before this is delicate and was broken and is > >>more broken now. It is still short-circuit to handle setting the > >>single flag that used to be supported, and is slightly broken for that: > >>- unprivileged user asks to set ARCHIVE by passing !SF_ARCHIVED. We > >> allow that, although this may toggle the flag and normal semantics > >> for SF flags is to not allow toggling. > >>- unprivileged user asks to clear ARCHIVE by passing SF_ARCHIVED. We > >> don't allow that. But we should allow preserving ARCHIVE if it is > >> already clear. > >>The bug wasn't very important when only 1 flag was supported. Now it > >>prevents unprivileged users managing the new UF flags if ARCHIVE is > >>clear. Fortunately, this is the unusual case. Anyway, unprivileged > >>users can set ARCHIVE by doing some other operation. Even the chflags() > >>operation should set ARCHIVE and thus allow further chflags()'s that now > >>keep ARCHIVE set. Except it is very confusing if a chflags() asks for > >>ARCHIVE to be clear. 
This request might be just to try to preserve > >>the current setting and not want it if other things are changed, or > >>it might be to purposely clear it. Changing it from set to clear should > >>still be privileged. > > > >I changed it to allow setting or clearing SF_ARCHIVED. Now I can set or > >clear the flag as non-root: > > Actually, it seems OK, since there are no old or new SF_ immututable flags. > Some of the actions are broken in the old and new code for directories -- > see below. > > >>See the more complicated permissions check in ffs. It would be safest > >>to duplicate most of it, to get different permissions checking for the > >>SF and UF flags. Then decide if we want to keep allowing setting > >>ARCHIVE without privilege. > > > >I think we should allow getting and setting SF_ARCHIVED without special > >privileges. Given how it is generally used, I don't think it should be > >restricted to the super-user. > > I don't really like that since changing the flags is mainly needed for > the failry privileged operation of managing other OS's file systems. > However, since we're mapping the SYSTEM flag to a UF_ flag, the SYSTEM > flag will require less privilege than the ARCHIVE flag. This is backwards, > so we might as well require less privilege for ARCHIVE too. I think we, > that is, you should use a new UF_ARCHIVE flag with the correct sense. Okay, done. The patches are attached with UF_ARCHIVE used instead of SF_ARCHIVED, with the sense reversed. > >Can you provide some code demonstrating how the permissions code should > >be changed in msdosfs? I don't know that much about that sort of thing, > >so I'll probably spend an inordinate amount of time stumbling > >through it. > > Now I think only cleanups are needed. Okay. > >>% return EOPNOTSUPP; > >>% if (vap->va_flags & SF_ARCHIVED) > >>% dep->de_Attributes &= ~ATTR_ARCHIVE; > >>% else if (!(dep->de_Attributes & ATTR_DIRECTORY)) > >>% dep->de_Attributes |= ATTR_ARCHIVE; > >> > >>The comment before this says that we ignore attmps to set ATTR_ARCHIVED > >>for directories. However, it is out of date. WinXP allows setting it > >>and all the new flags for directories, and so do we. > > > >Do you mean we allow setting it in UFS, or where? Obviously the code above > >won't set it on a directory. > > I meant it here. Actually, the comment matches the code -- I somehow missed > the test in the code. However, the code is wrong. All directories except > the root directory have this and other attributes, but FreeBSD refuses to > set them. More below. > > >>The WinXP attrib command (at least on a FAT32 fs) doesn't allow setting > >>or clearing ARCHIVE (even if it is already set or clear) if any of > >>HIDDEN, READONLY or SYSTEM is already set and remains set after the > >>command. Thus the HRS attributes act a bit like immutable flags, but > >>subtly differently. (ffs has the quite different and worse behaviour > >>of allowing chflags() irrespective of immutable flags being set before > >>or after, provided there is enough privilege to change the immutable > >>flags.) Anyway, they should all give some aspects of immutability. > > > >We could do that for msdosfs, but why add more things for the user to trip > >over given how the filesystem is typically used? Most people probably > >use it for USB thumb drives these days. Or perahps on a dual boot system > >to access their Windows partition. > > The small data drives won't have many files with attributes (except > ARCHIVE). 
For multiple-boot, I think the permssions shouldn't be too > much different than the foreign OS's. I used not to worry about this > and liked deleting WinXP files without asking it, but recently I spent > a lot of time recovering a WinXP ntfs partition and changed a bit too > much using FreeBSD and Cygwin because I didn't understand the > permissions (especially ACLs). ntfs in FreeBSD was less than r/o so it > couldn't even back up the permissions (for file flags, it returned the > garbage in its internal inode flags without translation...). > > >*** src/bin/chflags/chflags.1.orig > >--- src/bin/chflags/chflags.1 > >*************** > >*** 101,120 **** > > .Bl -tag -offset indent -width ".Cm opaque" > > .It Cm arch , archived > > set the archived flag (super-user only) > > .It Cm opaque > > set the opaque flag (owner or super-user only) > >- .It Cm nodump > >- set the nodump flag (owner or super-user only) > > .It Cm sappnd , sappend > > The opaque flag is UF_ too. Yes, but all of the flag descriptions are sorted in alphabetical order. How would you suggest sorting them instead? (SF first and then UF, both in some version of alphabetical order?) > >+ .It Cm snapshot > >+ set the snapshot flag (most filesystems do not allow changing this flag) > > I think none do. It can only be displayed. Fixed. > chflags(1) doesn't display flags, so this shouldn't be here. The problem > is that this man page is the only place where the flag names are documented. > ls(1) and strtofflags(3) just point to here. strtofflags(3) says that the > flag names are documented here, but ls(1) just has an Xref to here. I fixed ls(1) at least. > >*** src/lib/libc/sys/chflags.2.orig > >--- src/lib/libc/sys/chflags.2 > >--- 71,127 ---- > > the following values > > .Pp > > .Bl -tag -width ".Dv SF_IMMUTABLE" -compact -offset indent > >! .It Dv SF_APPEND > > The file may only be appended to. > > .It Dv SF_ARCHIVED > >! The file has been archived. > >! This flag means the opposite of the Windows and CIFS > >FILE_ATTRIBUTE_ARCHIVE > > DOS, Windows and CIFS... Fixed. > >! attribute. > >! That attribute means that the file should be archived, whereas > >! .Dv SF_ARCHIVED > >! means that the file has been archived. > >! Filesystems in FreeBSD may or may not have special handling for this > >flag. > >! For instance, ZFS tracks changes to files and will clear this bit when a > >! file is updated. > > Does zfs clear it in other circumstances? WinXP doesn't for msdosfs (or > ntfs?), but FreeBSD clears it when changing some attributes, even for > null changes (these are: times except for atimes, and the HIDDEN attribute > when it is changed by chmod() -- even for null changes --, but not for > the HIDDEN attribute when it is changed (or preserved) by chflags() in > your new code). I want to to be cleared for metadata so that backup > utilities can trust the ARCHIVE flag for metadata changes. Well, it does look like changing a file or touching it causes the archive flag to get set with ZFS: # touch foo # ls -lao foo -rw-r--r-- 1 root wheel uarch 0 Apr 8 21:45 foo # chflags 0 foo # ls -lao foo -rw-r--r-- 1 root wheel - 0 Apr 8 21:45 foo # echo "hello" >> foo # ls -lao foo -rw-r--r-- 1 root wheel uarch 6 Apr 8 21:46 foo # chflags 0 foo # ls -lao foo -rw-r--r-- 1 root wheel - 6 Apr 8 21:46 foo # touch foo # ls -lao foo -rw-r--r-- 1 root wheel uarch 6 Apr 8 21:46 foo > >+ .It Dv UF_IMMUTABLE > >+ The file may not be changed. 
> >+ Filesystems may use this flag to maintain compatibility with the Windows > >and > >+ CIFS FILE_ATTRIBUTE_READONLY attribute. > > So READONLY is only mapped to UFS_IMMUTABLE if it gives immutability? No, it's mapped to whatever the CIFS server decides. In my changes to Likewise, I mapped it to UF_IMMUTABLE. I mapped UF_IMMUTABLE to the ZFS READONLY flag. As Pawel pointed out, there has been some talk on the Illumos developers list about just storing the READONLY bit and not enforcing it in ZFS: http://www.listbox.com/member/archive/182179/2013/03/sort/time_rev/page/2/?search_for=readonly That complicates things somewhat in the Illumos CIFS server, and so I think it's a reasonable thing to just record the bit and let the CIFS server enforce things where it needs to. UFS does honor the UF_IMMUTABLE flag, so it may be that we need to create a UF_READONLY flag that corresponds to the DOS readonly flag and is only stored, and the enforcement would happen in the CIFS server. > >*** src/sys/fs/msdosfs/msdosfs_vnops.c.orig > >--- src/sys/fs/msdosfs/msdosfs_vnops.c > >*************** > >*** 415,431 **** > > * set ATTR_ARCHIVE for directories `cp -pr' from a more > > * sensible filesystem attempts it a lot. > > */ > >! if (vap->va_flags & SF_SETTABLE) { > > error = priv_check_cred(cred, PRIV_VFS_SYSFLAGS, 0); > > if (error) > > return (error); > > } > >! if (vap->va_flags & ~SF_ARCHIVED) > > return EOPNOTSUPP; > > if (vap->va_flags & SF_ARCHIVED) > > dep->de_Attributes &= ~ATTR_ARCHIVE; > > else if (!(dep->de_Attributes & ATTR_DIRECTORY)) > > dep->de_Attributes |= ATTR_ARCHIVE; > > dep->de_flag |= DE_MODIFIED; > > } > > > >--- 424,448 ---- > > * set ATTR_ARCHIVE for directories `cp -pr' from a more > > * sensible filesystem attempts it a lot. > > */ > >! if (vap->va_flags & (SF_SETTABLE & ~(SF_ARCHIVED))) { > > Excessive parentheses. Fixed, by moving to UF_ARCHIVE. > > error = priv_check_cred(cred, PRIV_VFS_SYSFLAGS, 0); > > if (error) > > return (error); > > } > > VADMIN is still needed, and that is too strict. This is a general problem > and should be fixed separately. I took out the check, since I changed the code to use UF_ARCHIVE instead of SF_ARCHIVED. > >! if (vap->va_flags & ~(SF_ARCHIVED | UF_HIDDEN | UF_SYSTEM)) > > return EOPNOTSUPP; > > if (vap->va_flags & SF_ARCHIVED) > > dep->de_Attributes &= ~ATTR_ARCHIVE; > > else if (!(dep->de_Attributes & ATTR_DIRECTORY)) > > dep->de_Attributes |= ATTR_ARCHIVE; > >+ if (vap->va_flags & UF_HIDDEN) > >+ dep->de_Attributes |= ATTR_HIDDEN; > >+ else > >+ dep->de_Attributes &= ~ATTR_HIDDEN; > >+ if (vap->va_flags & UF_SYSTEM) > >+ dep->de_Attributes |= ATTR_SYSTEM; > >+ else > >+ dep->de_Attributes &= ~ATTR_SYSTEM; > > dep->de_flag |= DE_MODIFIED; > > } > > Technical old and new problems with msdosfs: > - all directories except the root directory support the 3 attributes > handled above, and READONLY > - the special case for the root directory is because before FAT32, the > root directory didn't have an entry for itself (and was otherwise > special). With FAT32, the root directory is not so special, but > still doesn't have an entry for itself. > - thus the old code in the above is wrong for all directories except > the root directory > - thus the new code in the above is wrong for the root directory. It > will make changes to the in-core denode. These can be seen by stat() > for a while, but go away when the vnode is recycled. > - other code is wrong for directories too. 
deupdat() refuses to > convert from the in-core denode to the disk directory entry for > directories. So even when the above changes values for directories, > the changes only get synced to the disk accidentally when there is > a large change (such as for extending the directory), to the directory > entry. > - being the root directory is best tested for using VV_ROOT. I use the > following to fix the corresponding bugs in utimes(): > > /* Was: silently ignore the non-error or error for all dirs. > */ > if (DETOV(dep)->v_vflag & VV_ROOT) > return (EINVAL); > /* Otherwise valid. */ > > deupdat() needs a similar change to not ignore all directories. Okay, I think these issues should now be fixed. We now refuse to change attributes only on the root directory. And I updatd deupdat() to do the same. When a directory is created or a file is added, the archive bit is not changed on the directory. Not sure if we need to do that or not. (Simply changing msdosfs_mkdir() to set ATTR_ARCHIVE was not enough to get the archive bit set on directory creation.) Ken -- Kenneth Merry ken@FreeBSD.ORG --rwEMma7ioTxnRzrJ Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="file_flags_head.20130409.3.txt" *** src/bin/chflags/chflags.1.orig --- src/bin/chflags/chflags.1 *************** *** 32,38 **** .\" @(#)chflags.1 8.4 (Berkeley) 5/2/95 .\" $FreeBSD: head/bin/chflags/chflags.1 213573 2010-10-08 12:40:16Z uqs $ .\" ! .Dd March 3, 2006 .Dt CHFLAGS 1 .Os .Sh NAME --- 32,38 ---- .\" @(#)chflags.1 8.4 (Berkeley) 5/2/95 .\" $FreeBSD: head/bin/chflags/chflags.1 213573 2010-10-08 12:40:16Z uqs $ .\" ! .Dd April 8, 2013 .Dt CHFLAGS 1 .Os .Sh NAME *************** *** 101,120 **** .Bl -tag -offset indent -width ".Cm opaque" .It Cm arch , archived set the archived flag (super-user only) .It Cm opaque set the opaque flag (owner or super-user only) - .It Cm nodump - set the nodump flag (owner or super-user only) .It Cm sappnd , sappend set the system append-only flag (super-user only) .It Cm schg , schange , simmutable set the system immutable flag (super-user only) .It Cm sunlnk , sunlink set the system undeletable flag (super-user only) .It Cm uappnd , uappend set the user append-only flag (owner or super-user only) .It Cm uchg , uchange , uimmutable set the user immutable flag (owner or super-user only) .It Cm uunlnk , uunlink set the user undeletable flag (owner or super-user only) .El --- 101,134 ---- .Bl -tag -offset indent -width ".Cm opaque" .It Cm arch , archived set the archived flag (super-user only) + .It Cm nodump + set the nodump flag (owner or super-user only) .It Cm opaque set the opaque flag (owner or super-user only) .It Cm sappnd , sappend set the system append-only flag (super-user only) .It Cm schg , schange , simmutable set the system immutable flag (super-user only) + .It Cm snapshot + set the snapshot flag (filesystems do not allow changing this flag) .It Cm sunlnk , sunlink set the system undeletable flag (super-user only) .It Cm uappnd , uappend set the user append-only flag (owner or super-user only) + .It Cm uarch , uarchive + set the archive flag (owner or super-user only) .It Cm uchg , uchange , uimmutable set the user immutable flag (owner or super-user only) + .It Cm uhidden , hidden + set the hidden file attribute (owner or super-user only) + .It Cm uoffline , offline + set the offline file attribute (owner or super-user only) + .It Cm usparse , sparse + set the sparse file attribute (owner or super-user only) + .It Cm usystem , system + set the DOS and 
Windows system flag (owner or super-user only) + .It Cm ureparse , reparse + set the Windows reparse point file attribute (owner or super-user only) .It Cm uunlnk , uunlink set the user undeletable flag (owner or super-user only) .El *** src/bin/ls/ls.1.orig --- src/bin/ls/ls.1 *************** *** 232,237 **** --- 232,240 ---- Include the file flags in a long .Pq Fl l output. + See + .Xr chflags 1 + for a list of file flags and their meanings. .It Fl p Write a slash .Pq Ql / *** src/lib/libc/gen/strtofflags.c.orig --- src/lib/libc/gen/strtofflags.c *************** *** 62,74 **** #endif { "nouappnd", 0, UF_APPEND }, { "nouappend", 0, UF_APPEND }, { "nouchg", 0, UF_IMMUTABLE }, { "nouchange", 0, UF_IMMUTABLE }, { "nouimmutable", 0, UF_IMMUTABLE }, { "nodump", 1, UF_NODUMP }, { "noopaque", 0, UF_OPAQUE }, ! { "nouunlnk", 0, UF_NOUNLINK }, ! { "nouunlink", 0, UF_NOUNLINK } }; #define nmappings (sizeof(mapping) / sizeof(mapping[0])) --- 62,86 ---- #endif { "nouappnd", 0, UF_APPEND }, { "nouappend", 0, UF_APPEND }, + { "nouarch", 0, UF_ARCHIVE }, + { "nouarchive", 0, UF_ARCHIVE }, + { "nohidden", 0, UF_HIDDEN, }, + { "nouhidden", 0, UF_HIDDEN, }, { "nouchg", 0, UF_IMMUTABLE }, { "nouchange", 0, UF_IMMUTABLE }, { "nouimmutable", 0, UF_IMMUTABLE }, { "nodump", 1, UF_NODUMP }, + { "nouunlnk", 0, UF_NOUNLINK }, + { "nouunlink", 0, UF_NOUNLINK }, + { "nooffline", 0, UF_OFFLINE, }, + { "nouoffline", 0, UF_OFFLINE, }, { "noopaque", 0, UF_OPAQUE }, ! { "noreparse", 0, UF_REPARSE, }, ! { "noureparse", 0, UF_REPARSE, }, ! { "nosparse", 0, UF_SPARSE, }, ! { "nousparse", 0, UF_SPARSE, }, ! { "nosystem", 0, UF_SYSTEM, }, ! { "nousystem", 0, UF_SYSTEM, } }; #define nmappings (sizeof(mapping) / sizeof(mapping[0])) *** src/lib/libc/sys/chflags.2.orig --- src/lib/libc/sys/chflags.2 *************** *** 112,137 **** the following values .Pp .Bl -tag -width ".Dv SF_IMMUTABLE" -compact -offset indent ! .It Dv UF_NODUMP ! Do not dump the file. ! .It Dv UF_IMMUTABLE ! The file may not be changed. ! .It Dv UF_APPEND The file may only be appended to. - .It Dv UF_NOUNLINK - The file may not be renamed or deleted. - .It Dv UF_OPAQUE - The directory is opaque when viewed through a union stack. .It Dv SF_ARCHIVED ! The file may be archived. .It Dv SF_IMMUTABLE The file may not be changed. - .It Dv SF_APPEND - The file may only be appended to. .It Dv SF_NOUNLINK The file may not be renamed or deleted. .It Dv SF_SNAPSHOT The file is a snapshot file. .El .Pp If one of --- 112,170 ---- the following values .Pp .Bl -tag -width ".Dv SF_IMMUTABLE" -compact -offset indent ! .It Dv SF_APPEND The file may only be appended to. .It Dv SF_ARCHIVED ! The file has been archived. ! This flag means the opposite of the DOS, Windows and CIFS ! FILE_ATTRIBUTE_ARCHIVE attribute. ! This flag has been deprecated, and may be removed in a future release. .It Dv SF_IMMUTABLE The file may not be changed. .It Dv SF_NOUNLINK The file may not be renamed or deleted. .It Dv SF_SNAPSHOT The file is a snapshot file. + .It Dv UF_APPEND + The file may only be appended to. + .It Dv UF_ARCHIVE + The file needs to be archived. + This flag has the same meaning as the DOS, Windows and CIFS + FILE_ATTRIBUTE_ARCHIVE attribute. + Filesystems in FreeBSD may or may not have special handling for this flag. + For instance, ZFS tracks changes to files and will set this bit when a + file is updated. + UFS only stores the flag, and relies on the application to change it when + needed. 
+ .It Dv UF_HIDDEN + The file may be hidden from directory listings at the application's + discretion. + The file has the DOS, Windows and CIFS FILE_ATTRIBUTE_HIDDEN attribute. + .It Dv UF_IMMUTABLE + The file may not be changed. + Filesystems may use this flag to maintain compatibility with the DOS, Windows + and CIFS FILE_ATTRIBUTE_READONLY attribute. + .It Dv UF_NODUMP + Do not dump the file. + .It Dv UF_NOUNLINK + The file may not be renamed or deleted. + .It Dv UF_OFFLINE + The file is offline, or has the Windows and CIFS FILE_ATTRIBUTE_OFFLINE + attribute. + Filesystems in FreeBSD store and display this flag, but do not provide any + special handling when it is set. + .It Dv UF_OPAQUE + The directory is opaque when viewed through a union stack. + .It Dv UF_REPARSE + The file contains a Windows reparse point and has the Windows and CIFS + FILE_ATTRIBUTE_REPARSE_POINT attribute. + .It Dv UF_SPARSE + The file has the Windows FILE_ATTRIBUTE_SPARSE_FILE attribute. + This may also be used by a filesystem to indicate a sparse file. + .It Dv UF_SYSTEM + The file has the DOS, Windows and CIFS FILE_ATTRIBUTE_SYSTEM attribute. + Filesystems in FreeBSD may store and display this flag, but do not provide + any special handling when it is set. .El .Pp If one of *************** *** 162,167 **** --- 195,207 ---- .Xr init 8 for details.) .Pp + The implementation of all flags is filesystem-dependent. + See the description of the + .Dv UF_ARCHIVE + flag above for one example of the differences in behavior. + Care should be exercised when writing applications to account for + support or lack of support of these flags in various filesystems. + .Pp The .Dv SF_SNAPSHOT flag is maintained by the system and cannot be toggled. *** src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c.orig --- src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c *************** *** 6067,6072 **** --- 6067,6080 ---- XVA_SET_REQ(&xvap, XAT_APPENDONLY); XVA_SET_REQ(&xvap, XAT_NOUNLINK); XVA_SET_REQ(&xvap, XAT_NODUMP); + XVA_SET_REQ(&xvap, XAT_READONLY); + XVA_SET_REQ(&xvap, XAT_ARCHIVE); + XVA_SET_REQ(&xvap, XAT_SYSTEM); + XVA_SET_REQ(&xvap, XAT_HIDDEN); + XVA_SET_REQ(&xvap, XAT_REPARSE); + XVA_SET_REQ(&xvap, XAT_OFFLINE); + XVA_SET_REQ(&xvap, XAT_SPARSE); + error = zfs_getattr(ap->a_vp, (vattr_t *)&xvap, 0, ap->a_cred, NULL); if (error != 0) return (error); *************** *** 6082,6089 **** --- 6090,6112 ---- xvap.xva_xoptattrs.xoa_appendonly); FLAG_CHECK(SF_NOUNLINK, XAT_NOUNLINK, xvap.xva_xoptattrs.xoa_nounlink); + FLAG_CHECK(UF_ARCHIVE, XAT_ARCHIVE, + xvap.xva_xoptattrs.xoa_archive); FLAG_CHECK(UF_NODUMP, XAT_NODUMP, xvap.xva_xoptattrs.xoa_nodump); + FLAG_CHECK(UF_IMMUTABLE, XAT_READONLY, + xvap.xva_xoptattrs.xoa_readonly); + FLAG_CHECK(UF_SYSTEM, XAT_SYSTEM, + xvap.xva_xoptattrs.xoa_system); + FLAG_CHECK(UF_HIDDEN, XAT_HIDDEN, + xvap.xva_xoptattrs.xoa_hidden); + FLAG_CHECK(UF_REPARSE, XAT_REPARSE, + xvap.xva_xoptattrs.xoa_reparse); + FLAG_CHECK(UF_OFFLINE, XAT_OFFLINE, + xvap.xva_xoptattrs.xoa_offline); + FLAG_CHECK(UF_SPARSE, XAT_SPARSE, + xvap.xva_xoptattrs.xoa_sparse); + #undef FLAG_CHECK *vap = xvap.xva_vattr; vap->va_flags = fflags; *************** *** 6121,6127 **** return (EOPNOTSUPP); fflags = vap->va_flags; ! if ((fflags & ~(SF_IMMUTABLE|SF_APPEND|SF_NOUNLINK|UF_NODUMP)) != 0) return (EOPNOTSUPP); /* * Unprivileged processes are not permitted to unset system --- 6144,6159 ---- return (EOPNOTSUPP); fflags = vap->va_flags; ! /* ! * XXX KDM ! * We need to figure out whether it makes sense to allow ! 
* UF_REPARSE through, since we don't really have other ! * facilities to handle reparse points and zfs_setattr() ! * doesn't currently allow setting that attribute anyway. ! */ ! if ((fflags & ~(SF_IMMUTABLE|SF_APPEND|SF_NOUNLINK|UF_ARCHIVE| ! UF_NODUMP|UF_IMMUTABLE|UF_SYSTEM|UF_HIDDEN|UF_REPARSE| ! UF_OFFLINE|UF_SPARSE)) != 0) return (EOPNOTSUPP); /* * Unprivileged processes are not permitted to unset system *************** *** 6173,6180 **** --- 6205,6226 ---- xvap.xva_xoptattrs.xoa_appendonly); FLAG_CHANGE(SF_NOUNLINK, ZFS_NOUNLINK, XAT_NOUNLINK, xvap.xva_xoptattrs.xoa_nounlink); + FLAG_CHANGE(UF_ARCHIVE, ZFS_ARCHIVE, XAT_ARCHIVE, + xvap.xva_xoptattrs.xoa_archive); FLAG_CHANGE(UF_NODUMP, ZFS_NODUMP, XAT_NODUMP, xvap.xva_xoptattrs.xoa_nodump); + FLAG_CHANGE(UF_IMMUTABLE, ZFS_READONLY, XAT_READONLY, + xvap.xva_xoptattrs.xoa_readonly); + FLAG_CHANGE(UF_SYSTEM, ZFS_SYSTEM, XAT_SYSTEM, + xvap.xva_xoptattrs.xoa_system); + FLAG_CHANGE(UF_HIDDEN, ZFS_HIDDEN, XAT_HIDDEN, + xvap.xva_xoptattrs.xoa_hidden); + FLAG_CHANGE(UF_REPARSE, ZFS_REPARSE, XAT_REPARSE, + xvap.xva_xoptattrs.xoa_hidden); + FLAG_CHANGE(UF_OFFLINE, ZFS_OFFLINE, XAT_OFFLINE, + xvap.xva_xoptattrs.xoa_offline); + FLAG_CHANGE(UF_SPARSE, ZFS_SPARSE, XAT_SPARSE, + xvap.xva_xoptattrs.xoa_sparse); #undef FLAG_CHANGE } return (zfs_setattr(vp, (vattr_t *)&xvap, 0, cred, NULL)); *** src/sys/fs/msdosfs/msdosfs_denode.c.orig --- src/sys/fs/msdosfs/msdosfs_denode.c *************** *** 300,307 **** if ((dep->de_flag & DE_MODIFIED) == 0) return (0); dep->de_flag &= ~DE_MODIFIED; ! if (dep->de_Attributes & ATTR_DIRECTORY) ! return (0); if (dep->de_refcnt <= 0) return (0); error = readde(dep, &bp, &dirp); --- 300,309 ---- if ((dep->de_flag & DE_MODIFIED) == 0) return (0); dep->de_flag &= ~DE_MODIFIED; ! /* Was: silently ignore attribute changes for all dirs. */ ! if (DETOV(dep)->v_vflag & VV_ROOT) ! return (EINVAL); ! /* Otherwise valid. */ if (dep->de_refcnt <= 0) return (0); error = readde(dep, &bp, &dirp); *** src/sys/fs/msdosfs/msdosfs_vnops.c.orig --- src/sys/fs/msdosfs/msdosfs_vnops.c *************** *** 345,352 **** vap->va_birthtime.tv_nsec = 0; } vap->va_flags = 0; ! if ((dep->de_Attributes & ATTR_ARCHIVE) == 0) ! vap->va_flags |= SF_ARCHIVED; vap->va_gen = 0; vap->va_blocksize = pmp->pm_bpcluster; vap->va_bytes = --- 345,356 ---- vap->va_birthtime.tv_nsec = 0; } vap->va_flags = 0; ! if (dep->de_Attributes & ATTR_ARCHIVE) ! vap->va_flags |= UF_ARCHIVE; ! if (dep->de_Attributes & ATTR_HIDDEN) ! vap->va_flags |= UF_HIDDEN; ! if (dep->de_Attributes & ATTR_SYSTEM) ! vap->va_flags |= UF_SYSTEM; vap->va_gen = 0; vap->va_blocksize = pmp->pm_bpcluster; vap->va_bytes = *************** *** 398,403 **** --- 402,418 ---- if (vap->va_flags != VNOVAL) { if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); + /* + * We don't allow setting attributes on the root directory, + * because according to Bruce Evans: "The special case for + * the root directory is because before FAT32, the root + * directory didn't have an entry for itself (and was + * otherwise special). With FAT32, the root directory is + * not so special, but still doesn't have an entry for itself." + */ + if (vp->v_vflag & VV_ROOT) + return (EINVAL); + if (cred->cr_uid != pmp->pm_uid) { error = priv_check_cred(cred, PRIV_VFS_ADMIN, 0); if (error) *************** *** 408,431 **** * attributes. We ignored the access time and the * read and execute bits. We were strict for the other * attributes. 
- * - * Here we are strict, stricter than ufs in not allowing - * users to attempt to set SF_SETTABLE bits or anyone to - * set unsupported bits. However, we ignore attempts to - * set ATTR_ARCHIVE for directories `cp -pr' from a more - * sensible filesystem attempts it a lot. */ ! if (vap->va_flags & SF_SETTABLE) { ! error = priv_check_cred(cred, PRIV_VFS_SYSFLAGS, 0); ! if (error) ! return (error); ! } ! if (vap->va_flags & ~SF_ARCHIVED) return EOPNOTSUPP; ! if (vap->va_flags & SF_ARCHIVED) dep->de_Attributes &= ~ATTR_ARCHIVE; ! else if (!(dep->de_Attributes & ATTR_DIRECTORY)) ! dep->de_Attributes |= ATTR_ARCHIVE; dep->de_flag |= DE_MODIFIED; } --- 423,443 ---- * attributes. We ignored the access time and the * read and execute bits. We were strict for the other * attributes. */ ! if (vap->va_flags & ~(UF_ARCHIVE | UF_HIDDEN | UF_SYSTEM)) return EOPNOTSUPP; ! if (vap->va_flags & UF_ARCHIVE) ! dep->de_Attributes |= ATTR_ARCHIVE; ! else dep->de_Attributes &= ~ATTR_ARCHIVE; ! if (vap->va_flags & UF_HIDDEN) ! dep->de_Attributes |= ATTR_HIDDEN; ! else ! dep->de_Attributes &= ~ATTR_HIDDEN; ! if (vap->va_flags & UF_SYSTEM) ! dep->de_Attributes |= ATTR_SYSTEM; ! else ! dep->de_Attributes &= ~ATTR_SYSTEM; dep->de_flag |= DE_MODIFIED; } *** src/sys/fs/smbfs/smbfs_node.c.orig --- src/sys/fs/smbfs/smbfs_node.c *************** *** 370,379 **** if (diff > 2) /* XXX should be configurable */ return ENOENT; va->va_type = vp->v_type; /* vnode type (for create) */ if (vp->v_type == VREG) { va->va_mode = smp->sm_file_mode; /* files access mode and type */ ! if (np->n_dosattr & SMB_FA_RDONLY) va->va_mode &= ~(S_IWUSR|S_IWGRP|S_IWOTH); } else if (vp->v_type == VDIR) { va->va_mode = smp->sm_dir_mode; /* files access mode and type */ } else --- 370,382 ---- if (diff > 2) /* XXX should be configurable */ return ENOENT; va->va_type = vp->v_type; /* vnode type (for create) */ + va->va_flags = 0; /* flags defined for file */ if (vp->v_type == VREG) { va->va_mode = smp->sm_file_mode; /* files access mode and type */ ! if (np->n_dosattr & SMB_FA_RDONLY) { va->va_mode &= ~(S_IWUSR|S_IWGRP|S_IWOTH); + va->va_flags |= UF_IMMUTABLE; + } } else if (vp->v_type == VDIR) { va->va_mode = smp->sm_dir_mode; /* files access mode and type */ } else *************** *** 390,396 **** va->va_mtime = np->n_mtime; va->va_atime = va->va_ctime = va->va_mtime; /* time file changed */ va->va_gen = VNOVAL; /* generation number of file */ ! va->va_flags = 0; /* flags defined for file */ va->va_rdev = NODEV; /* device the special file represents */ va->va_bytes = va->va_size; /* bytes of disk space held by file */ va->va_filerev = 0; /* file modification number */ --- 393,407 ---- va->va_mtime = np->n_mtime; va->va_atime = va->va_ctime = va->va_mtime; /* time file changed */ va->va_gen = VNOVAL; /* generation number of file */ ! if (np->n_dosattr & SMB_FA_HIDDEN) ! va->va_flags |= UF_HIDDEN; ! if (np->n_dosattr & SMB_FA_SYSTEM) ! va->va_flags |= UF_SYSTEM; ! /* ! * We don't set the archive bit for directories. ! */ ! if ((vp->v_type != VDIR) && (np->n_dosattr & SMB_FA_ARCHIVE)) ! 
va->va_flags |= UF_ARCHIVE; va->va_rdev = NODEV; /* device the special file represents */ va->va_bytes = va->va_size; /* bytes of disk space held by file */ va->va_filerev = 0; /* file modification number */ *** src/sys/fs/smbfs/smbfs_vnops.c.orig --- src/sys/fs/smbfs/smbfs_vnops.c *************** *** 305,320 **** int old_n_dosattr; SMBVDEBUG("\n"); - if (vap->va_flags != VNOVAL) - return EOPNOTSUPP; isreadonly = (vp->v_mount->mnt_flag & MNT_RDONLY); /* * Disallow write attempts if the filesystem is mounted read-only. */ if ((vap->va_uid != (uid_t)VNOVAL || vap->va_gid != (gid_t)VNOVAL || vap->va_atime.tv_sec != VNOVAL || vap->va_mtime.tv_sec != VNOVAL || ! vap->va_mode != (mode_t)VNOVAL) && isreadonly) return EROFS; scred = smbfs_malloc_scred(); smb_makescred(scred, td, ap->a_cred); if (vap->va_size != VNOVAL) { --- 305,334 ---- int old_n_dosattr; SMBVDEBUG("\n"); isreadonly = (vp->v_mount->mnt_flag & MNT_RDONLY); /* * Disallow write attempts if the filesystem is mounted read-only. */ if ((vap->va_uid != (uid_t)VNOVAL || vap->va_gid != (gid_t)VNOVAL || vap->va_atime.tv_sec != VNOVAL || vap->va_mtime.tv_sec != VNOVAL || ! vap->va_mode != (mode_t)VNOVAL || vap->va_flags != VNOVAL) && ! isreadonly) return EROFS; + + /* + * We only support setting five flags. Don't allow setting others. + * + * We map both SF_IMMUTABLE and UF_IMMUTABLE to SMB_FA_RDONLY for + * setting attributes. This is compatible with the MacOS X version + * of this code. SMB_FA_RDONLY translates only to UF_IMMUTABLE + * when getting attributes. + */ + if (vap->va_flags != VNOVAL) { + if (vap->va_flags & ~(UF_IMMUTABLE|UF_HIDDEN|UF_SYSTEM| + UF_ARCHIVE|SF_IMMUTABLE)) + return EINVAL; + } + scred = smbfs_malloc_scred(); smb_makescred(scred, td, ap->a_cred); if (vap->va_size != VNOVAL) { *************** *** 353,364 **** goto out; } } ! if (vap->va_mode != (mode_t)VNOVAL) { old_n_dosattr = np->n_dosattr; ! if (vap->va_mode & S_IWUSR) ! np->n_dosattr &= ~SMB_FA_RDONLY; ! else ! np->n_dosattr |= SMB_FA_RDONLY; if (np->n_dosattr != old_n_dosattr) { error = smbfs_smb_setpattr(np, np->n_dosattr, NULL, scred); if (error) --- 367,413 ---- goto out; } } ! if ((vap->va_flags != VNOVAL) || (vap->va_mode != (mode_t)VNOVAL)) { old_n_dosattr = np->n_dosattr; ! ! if (vap->va_mode != (mode_t)VNOVAL) { ! if (vap->va_mode & S_IWUSR) ! np->n_dosattr &= ~SMB_FA_RDONLY; ! else ! np->n_dosattr |= SMB_FA_RDONLY; ! } ! ! if (vap->va_flags != VNOVAL) { ! if (vap->va_flags & UF_HIDDEN) ! np->n_dosattr |= SMB_FA_HIDDEN; ! else ! np->n_dosattr &= ~SMB_FA_HIDDEN; ! ! if (vap->va_flags & UF_SYSTEM) ! np->n_dosattr |= SMB_FA_SYSTEM; ! else ! np->n_dosattr &= ~SMB_FA_SYSTEM; ! ! if (vap->va_flags & UF_ARCHIVE) ! np->n_dosattr |= SMB_FA_ARCHIVE; ! else ! np->n_dosattr &= ~SMB_FA_ARCHIVE; ! ! /* ! * We only support setting the immutable / readonly ! * bit for regular files. According to comments in ! * the MacOS X version of this code, supporting the ! * readonly bit on directories doesn't do the same ! * thing in Windows as in Unix. ! */ ! if (vp->v_type == VREG) { ! if (vap->va_flags & (UF_IMMUTABLE|SF_IMMUTABLE)) ! np->n_dosattr |= SMB_FA_RDONLY; ! else ! np->n_dosattr &= ~SMB_FA_RDONLY; ! } ! } ! 
if (np->n_dosattr != old_n_dosattr) { error = smbfs_smb_setpattr(np, np->n_dosattr, NULL, scred); if (error) *** src/sys/sys/stat.h.orig --- src/sys/sys/stat.h *************** *** 265,272 **** #define UF_NODUMP 0x00000001 /* do not dump file */ #define UF_IMMUTABLE 0x00000002 /* file may not be changed */ #define UF_APPEND 0x00000004 /* writes to file may only append */ ! #define UF_OPAQUE 0x00000008 /* directory is opaque wrt. union */ ! #define UF_NOUNLINK 0x00000010 /* file may not be removed or renamed */ /* * Super-user changeable flags. */ --- 265,289 ---- #define UF_NODUMP 0x00000001 /* do not dump file */ #define UF_IMMUTABLE 0x00000002 /* file may not be changed */ #define UF_APPEND 0x00000004 /* writes to file may only append */ ! #define UF_OPAQUE 0x00000008 /* directory is opaque wrt. union */ ! #define UF_NOUNLINK 0x00000010 /* file may not be removed or renamed */ ! /* ! * These two bits are defined in MacOS X. They are not currently used in ! * FreeBSD. ! */ ! #if 0 ! #define UF_COMPRESSED 0x00000020 /* file is compressed */ ! #define UF_TRACKED 0x00000040 /* renames and deletes are tracked */ ! #endif ! ! #define UF_SYSTEM 0x00000080 /* Windows system file bit */ ! #define UF_SPARSE 0x00000100 /* sparse file */ ! #define UF_OFFLINE 0x00000200 /* file is offline */ ! #define UF_REPARSE 0x00000400 /* Windows reparse point file bit */ ! #define UF_ARCHIVE 0x00000800 /* file needs to be archived */ ! /* This is the same as the MacOS X definition of UF_HIDDEN. */ ! #define UF_HIDDEN 0x00008000 /* file is hidden */ ! /* * Super-user changeable flags. */ *** src/sys/ufs/ufs/ufs_vnops.c.orig --- src/sys/ufs/ufs/ufs_vnops.c *************** *** 528,536 **** return (EINVAL); } if (vap->va_flags != VNOVAL) { ! if ((vap->va_flags & ~(UF_NODUMP | UF_IMMUTABLE | UF_APPEND | ! UF_OPAQUE | UF_NOUNLINK | SF_ARCHIVED | SF_IMMUTABLE | ! SF_APPEND | SF_NOUNLINK | SF_SNAPSHOT)) != 0) return (EOPNOTSUPP); if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); --- 528,538 ---- return (EINVAL); } if (vap->va_flags != VNOVAL) { ! if ((vap->va_flags & ~(SF_APPEND | SF_ARCHIVED | SF_IMMUTABLE | ! SF_NOUNLINK | SF_SNAPSHOT | UF_APPEND | UF_ARCHIVE | ! UF_HIDDEN | UF_IMMUTABLE | UF_NODUMP | UF_NOUNLINK | ! UF_OFFLINE | UF_OPAQUE | UF_REPARSE | UF_SPARSE | ! UF_SYSTEM)) != 0) return (EOPNOTSUPP); if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); --rwEMma7ioTxnRzrJ-- From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 19:16:32 2013 Return-Path: Delivered-To: fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id CAE679D9; Tue, 9 Apr 2013 19:16:32 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id 9643E65A; Tue, 9 Apr 2013 19:16:32 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id r39JGW2j004514; Tue, 9 Apr 2013 13:16:32 -0600 (MDT) (envelope-from ken@nargothrond.kdm.org) Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id r39JGW6n004513; Tue, 9 Apr 2013 13:16:32 -0600 (MDT) (envelope-from ken) Date: Tue, 9 Apr 2013 13:16:32 -0600 From: "Kenneth D. 
Merry" To: Pawel Jakub Dawidek Subject: Re: patches to add new stat(2) file flags Message-ID: <20130409191632.GA4480@nargothrond.kdm.org> References: <20130307000533.GA38950@nargothrond.kdm.org> <20130307214649.X981@besplex.bde.org> <20130314232449.GC1446@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130314232449.GC1446@garage.freebsd.pl> User-Agent: Mutt/1.4.2i Cc: arch@FreeBSD.org, fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 19:16:32 -0000 On Fri, Mar 15, 2013 at 00:24:50 +0100, Pawel Jakub Dawidek wrote: > On Thu, Mar 07, 2013 at 10:21:38PM +1100, Bruce Evans wrote: > > On Wed, 6 Mar 2013, Kenneth D. Merry wrote: > > > > > I have attached diffs against head for some additional stat(2) file flags. > > > > > > The primary purpose of these flags is to improve compatibility with CIFS, > > > both from the client and the server side. > > > ... > > > UF_IMMUTABLE: Command line name: "uchg", "uimmutable" > > > ZFS name: XAT_READONLY, ZFS_READONLY > > > Windows: FILE_ATTRIBUTE_READONLY > > > > > > This flag means that the file may not be modified. > > > This is not a new flag, but where applicable it is > > > mapped to the Windows readonly bit. ZFS and UFS > > > now both support the flag and enforce it. > > > > > > The behavior of this flag is compatible with MacOS X. > > > > This is incompatible with mapping the DOS read-only attribute to the > > non-writeable file permission in msdosfs. msdosfs does this mainly to > > get at least one useful file permission, but the semantics are subtly > > different from all of file permissions, UF_IMMUTABLE and SF_IMMUTABLE. > > I think it should be a new flag. > > I agree, especially that I saw some discussion recently on Illumos > mailing lists to not enforce this flag in ZFS, which would be confusing > to FreeBSD users if we forget to _not_ merge that change. Do we know whether the change to disable enforcement of the ZFS readonly attribute actually went into Illumos? I'm fine with creating a new flag, say UF_READONLY, and mapping it to a disabled ZFS readonly attribute. We can let the CIFS servers enforce it, as Gordon Ross proposed for the Illumos CIFS server. 
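For anyone who wants to poke at the new bits from userland once this goes in, here is a minimal sketch (my illustration, not part of the patch; "testfile" is a placeholder, and a filesystem that doesn't support a given bit will typically bounce the request, EOPNOTSUPP or EINVAL in the hunks above):

/*
 * Set and read back the new DOS-ish flags via chflags(2)/stat(2).
 * UF_HIDDEN and UF_ARCHIVE come from the sys/stat.h hunk in the patch.
 */
#include <sys/stat.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	struct stat sb;

	if (stat("testfile", &sb) == -1)
		err(1, "stat");
	/* chflags(2) replaces the whole flag word, so OR into the old one. */
	if (chflags("testfile", sb.st_flags | UF_HIDDEN | UF_ARCHIVE) == -1)
		err(1, "chflags");
	if (stat("testfile", &sb) == -1)
		err(1, "stat");
	printf("st_flags now 0x%08lx\n", (unsigned long)sb.st_flags);
	return (0);
}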
Ken -- Kenneth Merry ken@FreeBSD.ORG From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 19:36:11 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2E15CDC6 for ; Tue, 9 Apr 2013 19:36:11 +0000 (UTC) (envelope-from marck@rinet.ru) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) by mx1.freebsd.org (Postfix) with ESMTP id AB59B7D4 for ; Tue, 9 Apr 2013 19:36:10 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r39Ja1r8068356; Tue, 9 Apr 2013 23:36:01 +0400 (MSK) (envelope-from marck@rinet.ru) Date: Tue, 9 Apr 2013 23:36:01 +0400 (MSK) From: Dmitry Morozovsky To: Kai Gallasch Subject: Re: FreeBSD 9.1 and swap on zfs In-Reply-To: Message-ID: References: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> <5162CBE8.5050104@madpilot.net> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 19:36:11 -0000 On Tue, 9 Apr 2013, Kai Gallasch wrote: > I think I'll repartition and swap to a gmirror device then. > This also has the advantage of being able to write kernel dumps to the swap (not possible with a ZVOL bases swap AFAIK) I'm afraid this had been broken for stable/9, written dumps are inconsistent :( -- Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN] [ FreeBSD committer: marck@FreeBSD.org ] ------------------------------------------------------------------------ *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru *** ------------------------------------------------------------------------ From owner-freebsd-fs@FreeBSD.ORG Tue Apr 9 20:34:49 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9D42E836 for ; Tue, 9 Apr 2013 20:34:49 +0000 (UTC) (envelope-from jamebus@gmail.com) Received: from mail-ve0-f169.google.com (mail-ve0-f169.google.com [209.85.128.169]) by mx1.freebsd.org (Postfix) with ESMTP id 62A43C3B for ; Tue, 9 Apr 2013 20:34:49 +0000 (UTC) Received: by mail-ve0-f169.google.com with SMTP id d10so6961851vea.28 for ; Tue, 09 Apr 2013 13:34:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=FtOTs/76xizFkmSlPCOyM1NkV1DAF3S21b0U3Ibd94c=; b=f04Mj8GQYF1YJSePnsIOoiipnFK50MnmMvluccdDHQl7LsqtezM9Mpay37ZdudjuSe a01q9uuO7xioMhPH/rIQzHGJ4KnDYt0Z83v+jQEpVqlUMPyFOeE6VcJcsLxG3qw7mePA ROm7j7m06mQIN+O47o5MFWYWkzcNDFidxDLlHhMQx2Dia/0w4Ah2lVVzAPTThfXRg7rH 7pMIbzvi0/SV0dOVMkuoHCPiXVporztyDhdKpXM1C2OgXfvDjxG0BLnMH4F9g8KBRj30 4HRpR3Ecb+f5d1o1FvCILjWHM1/FGvxZlDeksY6hJPTaHOCr+0XGmnwSA3PBHSbDhoMH eXqA== MIME-Version: 1.0 X-Received: by 10.52.66.229 with SMTP id i5mr13079586vdt.131.1365539681904; Tue, 09 Apr 2013 13:34:41 -0700 (PDT) Sender: jamebus@gmail.com Received: by 10.58.34.243 with HTTP; Tue, 9 Apr 2013 13:34:41 -0700 (PDT) In-Reply-To: <51642E22.404@gmail.com> References: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> <5162CBE8.5050104@madpilot.net> <51642E22.404@gmail.com> Date: 
Tue, 9 Apr 2013 15:34:41 -0500 X-Google-Sender-Auth: Un56fCTGKzASobc6RpNDu8hmdsk Message-ID: Subject: Re: FreeBSD 9.1 and swap on zfs From: James To: Volodymyr Kostyrko Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Apr 2013 20:34:49 -0000 On Tue, Apr 9, 2013 at 10:05 AM, Volodymyr Kostyrko wrote: > Be sure to use -b prefer for that as when dump is written it goes to > the first component. Man page is a bit outdated as we currently miss > /etc/rc.early. Yes. rc.early hasn't worked in ages. But here's another way to do it. This assumes your swap mirror is called swap.

/etc/rc.d/gmirror_savecore_pre:

#!/bin/sh
#
# $Tilted$
#
# PROVIDE: gmirror_savecore_pre
# BEFORE: savecore
# KEYWORD: nojail

name='gmirror_savecore_pre'
start_cmd='gmirror_savecore_pre_start'
start_precmd='gmirror_savecore_pre_prestart'
stop_cmd=':'

gmirror_savecore_pre_prestart () {
	if ! gmirror status swap >/dev/null 2>&1; then
		debug 'No gmirror swap. Skipping.'
		return 1
	fi
	return 0
}

gmirror_savecore_pre_start () {
	gmirror configure -b prefer swap
}

load_rc_config $name
run_rc_command "$1"

/etc/rc.d/gmirror_savecore_post:

#!/bin/sh
#
# $Tilted$
#
# PROVIDE: gmirror_savecore_post
# REQUIRE: savecore
# KEYWORD: nojail

name='gmirror_savecore_post'
start_cmd='gmirror_savecore_post_start'
start_precmd='gmirror_savecore_post_prestart'
stop_cmd=':'

gmirror_savecore_post_prestart () {
	if ! gmirror status swap >/dev/null 2>&1; then
		debug 'No gmirror swap. Skipping.'
		return 1
	fi
	return 0
}

gmirror_savecore_post_start () {
	gmirror configure -b load swap
}

load_rc_config $name
run_rc_command "$1"

I would prefer swap on a zvol, but it doesn't appear ready yet. At this time I allocate a swap partition in the gpt of every disk and gmirror the first two. If a disk failure happens, or looks like it might, I can simply add one of the spares. It works well. -- James.
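P.S. For completeness, a sketch of the pieces these scripts assume elsewhere in the config; the device and path names here are illustrative guesses, adjust to taste:

# /etc/fstab: swap lives on the mirror
/dev/mirror/swap	none	swap	sw	0	0

# /etc/rc.conf: tell the dump machinery where to write and read crash dumps
dumpdev="/dev/mirror/swap"
dumpdir="/var/crash"

# and the scripts above must be executable to be picked up at boot:
chmod 555 /etc/rc.d/gmirror_savecore_pre /etc/rc.d/gmirror_savecore_post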
From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 07:38:44 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 70F42985 for ; Wed, 10 Apr 2013 07:38:44 +0000 (UTC) (envelope-from joar.jegleim@gmail.com) Received: from mail-wg0-f45.google.com (mail-wg0-f45.google.com [74.125.82.45]) by mx1.freebsd.org (Postfix) with ESMTP id 0FCA99B3 for ; Wed, 10 Apr 2013 07:38:43 +0000 (UTC) Received: by mail-wg0-f45.google.com with SMTP id l18so143134wgh.12 for ; Wed, 10 Apr 2013 00:38:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=g8QAZEFKaNQAClV/1J8O64T2CWgGgHwJ3j5OBTDHiYw=; b=huinxhB7yHNtF519s3dFh0pIfa4vRNkcAHT0tbupbl20fFgupKW00XOMzJ2/jdD4qC hHeQALHepqMLyBPZhlidXV6UcKwmtgX4BiSL9OKh5OiH2nim9xGny0PKChAN0qaZfloD dJVl8tOZ2iJ5zx8QNLyH1VpXftbIJz8PyhmYerkwHMvDUcISXXxeH96QcwqhV95+oOkk v8whbb3SUXYAeCjBptXFxyBib3onNUt2yYrsoC+ROx7dO11PMLPviUZoJS1kKs+umZ7X yYLPTg/CZshz3VTPsQFNrMRsRUJdu2/qpjBff5vJHogJeyiEssCEEnIiFnEF3fBk3Lz8 PISg== MIME-Version: 1.0 X-Received: by 10.180.97.233 with SMTP id ed9mr2939892wib.32.1365579516977; Wed, 10 Apr 2013 00:38:36 -0700 (PDT) Received: by 10.216.34.9 with HTTP; Wed, 10 Apr 2013 00:38:36 -0700 (PDT) In-Reply-To: <20130408084756.GD31958@server.rulingia.com> References: <20130405211249.GB31958@server.rulingia.com> <20130408084756.GD31958@server.rulingia.com> Date: Wed, 10 Apr 2013 09:38:36 +0200 Message-ID: Subject: Re: Regarding regular zfs From: Joar Jegleim To: Peter Jeremy Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-fs@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 07:38:44 -0000 no compression or dedup . I've found some hardware to setup a test system where I can debug this thorough, but it's gonna take some time . Appreciate all help, I'll get back to you when I've got a test environment and have done some more tests. -- ---------------------- Joar Jegleim Homepage: http://cosmicb.no Linkedin: http://no.linkedin.com/in/joarjegleim fb: http://www.facebook.com/joar.jegleim AKA: CosmicB @Freenode ---------------------- On 8 April 2013 10:47, Peter Jeremy wrote: > On 2013-Apr-08 10:29:52 +0200, Joar Jegleim > wrote: > >I'll check how cache is doing, but as I wrote in my previous reply, the > >'slave' server is completely unresponsive, nothing works at all for 5-15 > >seconds, when the server is responsive again (can ssh in and so on) I > can't > >seem to find anything in dmesg or any log hinting about anything at all > >that went 'wrong' . > > If you have iostat/gstat/top/... running, does it hang (stop updating) > during this period? Is it pingable during the "hang"? > > How about iostat/gstat/top/... running on the console? > > Do you have compression or dedup enabled? > > >I could try DDB, I'm gonna have to get back to you on that, I haven't > >debug'ed FreeBSD kernel before and the system is in production, so I would > >have to be cautious. I might be able to try out that during this week . > > Do you have a test system that you can reproduce the problem on? 
The > reason I ask about DDB is that it would be useful to get a 'ps' whilst > the system is hung and it sounds like DDB is the only way to get that. > > -- > Peter Jeremy > From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 13:23:54 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 068EB44C for ; Wed, 10 Apr 2013 13:23:54 +0000 (UTC) (envelope-from marck@rinet.ru) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) by mx1.freebsd.org (Postfix) with ESMTP id 877A0ED7 for ; Wed, 10 Apr 2013 13:23:53 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r3ADNp9n041411 for ; Wed, 10 Apr 2013 17:23:51 +0400 (MSK) (envelope-from marck@rinet.ru) Date: Wed, 10 Apr 2013 17:23:51 +0400 (MSK) From: Dmitry Morozovsky To: freebsd-fs@FreeBSD.org Subject: ZFS-inly server and dedicated ZIL Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 13:23:54 -0000 Dear colleagues, I'm planning to make a new PostgreSQL server using raid10-like ZFS with two SSDs split into a mirrored ZIL and striped L2ARC. However, it seems the current ZFS implementation does not support this:

./lib/libzfs/common/libzfs_pool.c-	case EDOM:
./lib/libzfs/common/libzfs_pool.c-		zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
./lib/libzfs/common/libzfs_pool.c:		    "root pool can not have multiple vdevs"
./lib/libzfs/common/libzfs_pool.c-		    " or separate logs"));
./lib/libzfs/common/libzfs_pool.c-		(void) zfs_error(hdl, EZFS_POOL_NOTSUP, msg);

Am I right, or did I miss something obvious? Ok, if so: In this situation, I see two possibilities:
- make the system boot from an internal USB stick (only /bootdisk with /boot and /rescue) with the rest as ZFS-on-root
- use a dedicated pair of disks for a ZFS system pool without a separate ZIL.

What would you recommend? Thanks!
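To be concrete, the layout I have in mind is roughly the following (partition names are placeholders), and it is exactly this shape that the quoted check rejects for a root pool:

# four data disks as striped mirrors, the two SSDs carrying log + cache
zpool create tank \
    mirror ada0p3 ada1p3 mirror ada2p3 ada3p3 \
    log mirror ssd0p1 ssd1p1 \
    cache ssd0p2 ssd1p2

As a plain data pool this is fine; it is only booting from it that is refused.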
-- Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN] [ FreeBSD committer: marck@FreeBSD.org ] ------------------------------------------------------------------------ *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru *** ------------------------------------------------------------------------ From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 15:42:36 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DACC6F88 for ; Wed, 10 Apr 2013 15:42:36 +0000 (UTC) (envelope-from cryx-freebsd@h3q.com) Received: from mail.h3q.com (mail.h3q.com [213.73.89.199]) by mx1.freebsd.org (Postfix) with ESMTP id 34E4E6E6 for ; Wed, 10 Apr 2013 15:42:35 +0000 (UTC) Received: (qmail 11535 invoked from network); 10 Apr 2013 08:35:53 -0000 Received: from mail.h3q.com (HELO mail.h3q.com) (cryx) by mail.h3q.com with CAMELLIA256-SHA encrypted SMTP; 10 Apr 2013 08:35:53 -0000 Message-ID: <51652468.6010806@h3q.com> Date: Wed, 10 Apr 2013 10:35:52 +0200 From: Philipp Wuensche User-Agent: Postbox 3.0.7 (Macintosh/20130119) MIME-Version: 1.0 To: Volodymyr Kostyrko Subject: Re: FreeBSD 9.1 and swap on zfs References: <9407C6ED-3B4C-4BA2-8B88-F8A998E0A847@free.de> <5162CBE8.5050104@madpilot.net> <51642E22.404@gmail.com> In-Reply-To: <51642E22.404@gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 15:42:36 -0000 Volodymyr Kostyrko wrote: > 09.04.2013 15:15, Kai Gallasch: >>> My suggestion is: >>> >>> if you want stability and don't have specific disk layout problems >>> create a separate swap. >> >> >> I think I'll repartition and swap to a gmirror device then. >> This also has the advantage of being able to write kernel dumps to the >> swap (not possible with a ZVOL bases swap AFAIK) > > Be sure to use -b prefer for that as when dump is written it goes to the > first component. Man page is a bit outdated as we currently miss > /etc/rc.early. And I guess you should make sure to use -F so gmirror doesn't try to resync your invalid swap data after a crash! 
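Something like this at label time, i.e. (provider names invented for the example):

# -F: skip the post-crash resync entirely for the swap mirror
gmirror label -F -b load swap ada0p2 ada1p2

with James's rc.d scripts earlier in the thread still flipping the balance algorithm to "prefer" around savecore and back to "load" afterwards. (If I read gmirror(8) right, gmirror configure takes the same -F flag to set this on an existing mirror.)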
greetings, Philipp From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 16:07:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C1303809 for ; Wed, 10 Apr 2013 16:07:31 +0000 (UTC) (envelope-from wolfgang@riegler.homeip.net) Received: from slave2.cbt-l.de (slave2.cbt-l.de [91.205.173.116]) by mx1.freebsd.org (Postfix) with ESMTP id 8026E869 for ; Wed, 10 Apr 2013 16:07:31 +0000 (UTC) Received: from localhost (slave2.cbt-l.de [127.0.0.1]) by slave2.cbt-l.de (Postfix) with ESMTP id 9FE1D12016D1 for ; Wed, 10 Apr 2013 18:00:37 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at slave2.cbt-l.de Received: from slave2.cbt-l.de ([127.0.0.1]) by localhost (slave2.cbt-l.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id MaQDHMJYVQgK for ; Wed, 10 Apr 2013 18:00:37 +0200 (CEST) Received: from mail.cbt-l.de (mail.cbt-l.de [212.185.49.146]) by slave2.cbt-l.de (Postfix) with ESMTP id 6F3AD1201275 for ; Wed, 10 Apr 2013 18:00:37 +0200 (CEST) Received: (qmail 43224 invoked by uid 1009); 10 Apr 2013 16:00:37 -0000 Received: from 192.168.40.46 by mail.cbt-l.de (envelope-from , uid 1008) with qmail-scanner-1.25-st-qms (clamdscan: ClamAV 0.97.7/16876. spamassassin: 3.3.0. perlscan: 1.25-st-qms. Clear:RC:1(192.168.40.46):. Processed in 0.038787 secs); 10 Apr 2013 16:00:37 -0000 X-Antivirus-CBTL-Mail-From: wolfgang@riegler.homeip.net via mail.cbt-l.de X-Antivirus-CBTL: 1.25-st-qms (Clear:RC:1(192.168.40.46):. Processed in 0.038787 secs Process 43218) Received: from wolfgang.cbt-l.de (HELO wolfgang.localnet) (w.riegler@cbt-l.de@192.168.40.46) by mail.cbt-l.de with SMTP; 10 Apr 2013 16:00:37 -0000 From: wolfgang@riegler.homeip.net To: freebsd-fs@freebsd.org Subject: Is it possible to use a zvol as log or cache device? Date: Wed, 10 Apr 2013 18:00:37 +0200 Message-ID: <3766985.dKTvSZr3to@wolfgang> User-Agent: KMail/4.10.1 (Linux/3.8.2-pf-sepp1; KDE/4.10.1; x86_64; ; ) MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="utf-8" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 16:07:31 -0000 Hi, is it possible to create a zvol on one zpool for using it as log or cache device on another zpool? On my testsystem I have a zpool called "system" with a FreeBSD 9.1 installed and a second zpool called "test". 
Trying to add the log device I get the following error: # zfs create -V 128M system/zil # zpool add test log /dev/zvol/system/zil cannot add to 'test': one or more devices is currently unavailable kind regards Wolfgang From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 19:37:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5AF31F0E for ; Wed, 10 Apr 2013 19:37:57 +0000 (UTC) (envelope-from lkchen@k-state.edu) Received: from ksu-out.merit.edu (ksu-out.merit.edu [207.75.117.133]) by mx1.freebsd.org (Postfix) with ESMTP id 284173E5 for ; Wed, 10 Apr 2013 19:37:56 +0000 (UTC) X-Merit-ExtLoop1: 1 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Av8EAFy+ZVHPS3TT/2dsb2JhbABQhme/JRZ0gh8BAQUjVgwPGgINdAaIJ6xOiWyJEYEjjCgDF4R5A6gOgyeCDA X-IronPort-AV: E=Sophos;i="4.87,449,1363147200"; d="scan'208";a="916200004" X-MERIT-SOURCE: KSU Received: from ksu-sfpop-mailstore02.merit.edu ([207.75.116.211]) by sfpop-ironport05.merit.edu with ESMTP; 10 Apr 2013 15:17:50 -0400 Date: Wed, 10 Apr 2013 15:17:50 -0400 (EDT) From: "Lawrence K. Chen, P.Eng." To: Quartz Message-ID: <499967956.5577199.1365621470123.JavaMail.root@k-state.edu> In-Reply-To: <5163F03B.9060700@sneakertech.com> Subject: Re: ZFS: Failed pool causes system to hang MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [129.130.0.181] X-Mailer: Zimbra 7.2.2_GA_2852 (ZimbraWebClient - GC25 ([unknown])/7.2.2_GA_2852) Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 19:37:57 -0000 ----- Original Message ----- > > > So, you're not really waiting a long time.... > > I still don't think you're 100% clear on what's happening in my case. > I'm trying to explain that my problem is *prior* to the motherboard > resetting, NOT after. If I hard-reset the machine with the front > panel > switch, it boots just fine every time. > > When my pool *FAILS* (ie; is unrecoverable because I lost too many > drives) it hangs effectively all io on the entire machine. I can't cd > or > ls directories, I can't run any zfs commands, and I can't issue a > reboot > or halt. This is a hang. The machine is completely useless in this > state. There is no disk or cpu activity churning. There's no pool > (anymore) to be trying to resilver or whatever anyway. > > I'm not going to wait 3+ hours for "shutdown -r now" to bring the > machine down. Especially not when I already know that zfs won't let > it. > Well, that's a different kind of hang....that's the same kind of hang when an NFS fileserver goes away....and anything that accesses the non-responding mounts will block until the server responds again. And apparently by design, since zpool failmode=wait is default, which means all I/O on the system attempts to retry the devices. Other options are, failmode=continue is described that the system will continue on as if nothing has changed. And, failmode=panic...cause system to panic and dump core. Then it depends on what you have set for system to do after a panic. Not sure what it means the system will continue...suppose it just immediately errors out the I/O operations to the affected mounts. The other option is failmode=panic. 
Which might be what you want in this case, since you can't shutdown gracefully with all I/O hanging. Though the shutdown timer seems to still kick in when I've had this happen. Though watchdog seems to get turned off early on in the shutdown process. Though panic doesn't doesn't seem to reboot always....though maybe I should see about having it not reboot. I suppose I could try failmode=panic or failmode=continue....have a problem where if there's a power dropout, the transition to and from UPS battery will sometimes lockup one enclosure or another....and there's no way to redistribute disks such that I won't lose one zpool or another. Can't seem to get FreeBSD to redetect the enclosure, so rebooting gets it seeing the drives again. I've changed out enclosures....so it might be something about this particular UPS or that there's a power-conditioner on this circuit. Since another system at the other end of building has never had this kind of problem before...and it had more hanging off it. I'm sure there'll be a dropout or worse in the near future...especially with spring thunderstorms lurking already. Been thinking of getting a double conversion UPS for this server... Otherwise, the filesystems on these zpools aren't really critical to the operation of the server (though the contents are important to me) one is my backup pool (backuppc) and another has archival/replication data (plus data that I extracted from a corrupt drive that I need to go through and see what's usable....) From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 19:46:19 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 12D9D26E for ; Wed, 10 Apr 2013 19:46:19 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay00.pair.com (relay00.pair.com [209.68.5.9]) by mx1.freebsd.org (Postfix) with SMTP id ABC93643 for ; Wed, 10 Apr 2013 19:46:18 +0000 (UTC) Received: (qmail 71529 invoked by uid 0); 10 Apr 2013 19:46:11 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay00.pair.com with SMTP; 10 Apr 2013 19:46:11 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <5165C183.1020704@sneakertech.com> Date: Wed, 10 Apr 2013 15:46:11 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: "Lawrence K. Chen, P.Eng." Subject: Re: ZFS: Failed pool causes system to hang References: <499967956.5577199.1365621470123.JavaMail.root@k-state.edu> In-Reply-To: <499967956.5577199.1365621470123.JavaMail.root@k-state.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 19:46:19 -0000 > Other options are, failmode=continue is described that the system > will continue on as if nothing has changed. >Not > sure what it means the system will continue I'm not sure ether. In my experience, there's effectively no difference between "wait" and "continue". > The other option is failmode=panic. Which might be what you want in > this case, since you can't shutdown gracefully with all I/O hanging. Well, what I *WANT* is a way to kick the pool or something so that the rest of the machine still functions. 
As it stands, I can't even reliably get the status of the pool, as I can't reliably run half the zfs commands.

______________________________________
it has a certain smooth-brained appeal

From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 20:03:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D2A4078E for ; Wed, 10 Apr 2013 20:03:01 +0000 (UTC) (envelope-from toasty@dragondata.com) Received: from mail-ia0-x22a.google.com (mail-ia0-x22a.google.com [IPv6:2607:f8b0:4001:c02::22a]) by mx1.freebsd.org (Postfix) with ESMTP id A41B3774 for ; Wed, 10 Apr 2013 20:03:01 +0000 (UTC) Received: by mail-ia0-f170.google.com with SMTP id j38so776236iad.29 for ; Wed, 10 Apr 2013 13:03:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dragondata.com; s=google; h=x-received:from:content-type:content-transfer-encoding:subject :message-id:date:to:mime-version:x-mailer; bh=y2WhLqDBEZBbTbg+3DvexkctOVVJXTmdvBiFqKRZNTA=; b=MCU8pkQbcpjhcvQ4Zzr4tbWv8hGP+2tmHDH9gFl4livu2EeERJGcATCM2tokhkt+8v kcw6/MNd+4y788bDffjjxocGWggn69wtyeFsB5LDZM0fFLP90iRGVmRAIh3pKg0dy8iu cA7Ir6iNw8OeAgTzqPtfVuteuNeT+/3uGkP6g= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:content-type:content-transfer-encoding:subject :message-id:date:to:mime-version:x-mailer:x-gm-message-state; bh=y2WhLqDBEZBbTbg+3DvexkctOVVJXTmdvBiFqKRZNTA=; b=elkFXlRiRCBhDBPl1GvqX3BsO4N2tkiNKWV7X6wbMHOp9Rlvv+HWdxpSlUHAREwSWZ L77JN/ZaWQOc52ryuy2sjeMqQYjixPxjemp4y40Lm8I9K0XvDrLK1zt1m6SNo3JeFki3 2AwfRsDzhdgF5OB9VzspTaCwFwHlgY3ZnTF6xPI/lRZt7toncionTGsNeR23otbazPWD Nu5iGwvOky9DCX/1Mwiq42+VPnhCmJFZzjJusfEIZjSFYRGWHGDyf+k3T4HHEZf43kBA b6kXwY30DdodbPpG847oAKm9pkwAyNRzGsuZ6elNzFO+2zf+thJThXJuYpOxsFZBG7j1 NnkQ== X-Received: by 10.50.62.66 with SMTP id w2mr2393191igr.81.1365624180882; Wed, 10 Apr 2013 13:03:00 -0700 (PDT) Received: from vpn132.rw1.your.org (vpn132.rw1.your.org. [204.9.51.132]) by mx.google.com with ESMTPS id vb15sm1490041igb.9.2013.04.10.13.02.59 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 10 Apr 2013 13:02:59 -0700 (PDT) From: Kevin Day Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: Does sync(8) really flush everything? Lost writes with journaled SU after sync+power cycle Message-Id: <87CC14D8-7DC6-481A-8F85-46629F6D2249@dragondata.com> Date: Wed, 10 Apr 2013 15:02:56 -0500 To: "freebsd-fs@FreeBSD.org Filesystems" Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) X-Mailer: Apple Mail (2.1503) X-Gm-Message-State: ALoCoQlF3XKWKT29XUz7Phkn6qEmwN88x2hrhMBu31rqlhUIe8AHQ+tRJkLCAkzeJNue3A3aDqgI X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 20:03:01 -0000

Working with an environment where a system (with journaled soft-updates) is going to be notified that it's going to be losing power shortly, and needs to shut down daemons and flush everything to disk. It doesn't actually shut down though, because the "power down now" command may get cancelled and we need to bring things back up. My understanding was that we could call sync(8), then just wait for the power to drop.

The problem is that we were frequently losing the last 30-60 seconds worth of filesystem changes prior to the shutdown. i.e. newly created directories would disappear or fsck would reclaim them and throw them into lost+found.

I confirmed that there is no caching disk controller, and write caching is disabled on the drives themselves, and the problem continued.

On a whim, after running sync(8) once and waiting 10 seconds, I did "mount -u -o ro -f /" to force the filesystem into read-only mode. It took about 8 seconds to finish, gstat showed a lot of write activity, and SIGINFO on the mount command showed:

load: 0.01 cmd: mount 15775 [biowr] 3.62r 0.00u 0.55s 5% 1644k
load: 0.03 cmd: mount 15775 [runnable] 4.41r 0.00u 0.65s 6% 1644k
load: 0.03 cmd: mount 15775 [biowr] 5.00r 0.00u 0.72s 6% 1644k
load: 0.03 cmd: mount 15775 [biowr] 5.70r 0.00u 0.80s 6% 1644k
load: 0.03 cmd: mount 15775 [biowr] 6.03r 0.00u 0.84s 6% 1644k
load: 0.03 cmd: mount 15775 [running] 6.27r 0.00u 0.87s 6% 1644k
load: 0.03 cmd: mount 15775 [biowr] 6.51r 0.00u 0.90s 7% 1644k
load: 0.03 cmd: mount 15775 [biowr] 6.69r 0.00u 0.92s 6% 1644k
load: 0.03 cmd: mount 15775 [biowr] 6.90r 0.00u 0.94s 6% 1644k
load: 0.03 cmd: mount 15775 [biowr] 7.04r 0.00u 0.96s 7% 1644k
load: 0.03 cmd: mount 15775 [biowr] 7.20r 0.00u 0.98s 7% 1644k

If sync's man page is true (force completion of pending disk writes (flush cache)), and there is zero filesystem activity occurring, shouldn't that be enough to ensure no corruption after a power cycle? If sync really is flushing everything, what's all the write activity happening when degrading from rw to ro?

Is there a better way to get things into a stable state on disk, yet not fully shut down so that we can recover from this if the shutdown order is cancelled?

For me, this is easily reproducible with:

mkdir /root/test
sync
sleep 10
(hit reset button)

The problem doesn't happen with:

mkdir /root/test
mount -u -o ro -f /
(hit reset button)

It's great that we're not ending up in an inconsistent state, but I was expecting sync to prevent this.

-- Kevin

From owner-freebsd-fs@FreeBSD.ORG Wed Apr 10 21:39:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DA4CAF19 for ; Wed, 10 Apr 2013 21:39:27 +0000 (UTC) (envelope-from lkchen@k-state.edu) Received: from ksu-out.merit.edu (ksu-out.merit.edu [207.75.117.133]) by mx1.freebsd.org (Postfix) with ESMTP id A776CC17 for ; Wed, 10 Apr 2013 21:39:27 +0000 (UTC) X-Merit-ExtLoop1: 1 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgkFADrbZVHPS3TT/2dsb2JhbABQgwaDYb8oFnSCHwEBBSNQBgwPDgwCDRkCWQYTiBSsPoluiRGBI5AogRMDqA6DJ4IM X-IronPort-AV: E=Sophos;i="4.87,450,1363147200"; d="scan'208";a="213340977" X-MERIT-SOURCE: KSU Received: from ksu-sfpop-mailstore02.merit.edu ([207.75.116.211]) by sfpop-ironport07.merit.edu with ESMTP; 10 Apr 2013 17:39:26 -0400 Date: Wed, 10 Apr 2013 17:39:26 -0400 (EDT) From: "Lawrence K. Chen, P.Eng."
To: Dmitry Morozovsky Message-ID: <802657359.5644092.1365629966726.JavaMail.root@k-state.edu> In-Reply-To: Subject: Re: ZFS-inly server and dedicated ZIL MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [129.130.0.181] X-Mailer: Zimbra 7.2.2_GA_2852 (ZimbraWebClient - GC25 ([unknown])/7.2.2_GA_2852) Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2013 21:39:27 -0000 The root pool only having one root vdev is probably a restriction on what functionality could be put into the bootblock. But, can't really tell what it is you're trying to do... You say raid10-like ZFS with two SSDs....which to me means you have at least 4 physical drives, and 2 SSDs. zpool create tank mirror disk1 disk2 mirror disk3 disk4 .... log ... cache .... perhaps as: .... log mirror ssd1a ssd2a cache ssd1b ssd2b tank is load-balancing onto sets of mirrors, with a mirror of the first partition of the two ssds for ZIL and using the second partition of the two ssds concurrently for cache. So, in this case....tank can't be used as a root pool. an Internal USB stick as your root pool would be one way to go. or you could carve out another partition on the SSDs for a root pool. On my home system, I currently have a pair of SSDs....each with 7 partitions.... 5 tiny ones for mirror ZILs for the other zpools in my system, the mirrored root pool and partitions to be L2ARC for the two zpools in my system that I'm using dedup in. (there's currently only 4 zpools in my system, including the root....but I had originally envisioned the possibility of adding additional external jbod arrays....) I used to have 6 disk raid10-like with part of an external SSD for its zil...and no cache. It has since been redone as a raidz2 with mirrored zil (from the internal SSDs)....planning to add another SSD soon. ----- Original Message ----- > Dear colleagues, > > I'm planning to make new PostgreSQL server using zaid10-like ZFS with > two SSDs > splitted into mirrored ZIL and striped arc2. However, it seems > current > ZFS implementation does not support this: > > ./lib/libzfs/common/libzfs_pool.c- case EDOM: > ./lib/libzfs/common/libzfs_pool.c- zfs_error_aux(hdl, > dgettext(TEXT_DOMAIN, > ./lib/libzfs/common/libzfs_pool.c: "root pool can not have > multiple vdevs" > ./lib/libzfs/common/libzfs_pool.c- " or separate logs")); > ./lib/libzfs/common/libzfs_pool.c- (void) zfs_error(hdl, > EZFS_POOL_NOTSUP, msg); > > Am I right, or did I missed something obvious? > > Ok, if so: In this situation, I see two possibilities: > - make system boot from internal USB stick (only /bootdisk with /boot > and > /rescue) with the rest of ZFS-on-root > - use dedicated pair of disks for ZFS pool without ZIL for system. > > what would you recommend? > > Thanks! 
> > -- > Sincerely, > D.Marck [DM5020, MCK-RIPE, > DM3-RIPN] > [ FreeBSD committer: > marck@FreeBSD.org ] > ------------------------------------------------------------------------ > *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru > *** From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 06:30:56 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B3830D4D for ; Thu, 11 Apr 2013 06:30:56 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id 7B6B2270 for ; Thu, 11 Apr 2013 06:30:55 +0000 (UTC) Received: from c211-30-173-106.carlnfd1.nsw.optusnet.com.au (c211-30-173-106.carlnfd1.nsw.optusnet.com.au [211.30.173.106]) by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id 8D5C07812EB; Thu, 11 Apr 2013 16:30:53 +1000 (EST) Date: Thu, 11 Apr 2013 16:30:52 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Kevin Day Subject: Re: Does sync(8) really flush everything? Lost writes with journaled SU after sync+power cycle In-Reply-To: <87CC14D8-7DC6-481A-8F85-46629F6D2249@dragondata.com> Message-ID: <20130411160253.V1041@besplex.bde.org> References: <87CC14D8-7DC6-481A-8F85-46629F6D2249@dragondata.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=HfxM1V48 c=1 sm=1 a=Cguo-lYZyhEA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=5GGpcXspQ0YA:10 a=TNEYMwA1_HLbvNno7u4A:9 a=CjuIK1q_8ugA:10 a=OapHOw4wc7whhmw6:21 a=wi0mfrH21KDZYzXh:21 a=TEtd8y5WR3g2ypngnwZWYw==:117 Cc: "freebsd-fs@FreeBSD.org Filesystems" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 06:30:56 -0000 On Wed, 10 Apr 2013, Kevin Day wrote: > Working with an environment where a system (with journaled soft-updates) is going to be notified that it's going to be losing power shortly, and needs to shut down daemons and flush everything to disk. It doesn't actually shutdown though, because the "power down now" command may get cancelled and we need to bring things back up. My understanding was that we could call sync(8), then just wait for the power to drop. > > The problem is that we were frequently losing the last 30-60 seconds worth of filesystem changes prior to the shutdown. i.e. newly created directories would disappear or fsck would reclaim them and throw them into lost+found. > > I confirmed that there is no caching disk controller, and write caching is disabled on the drives themselves, and the problem continued. > > On a whim, after running sync(8) once and waiting 10 seconds, I did "mount -u -o ro -f /" to force the filesystem into read-only mode. It took about 8 seconds to finish, gstat showed a lot of write activity, and SIGINFO on the mount command showed: sync(2) only schedules all writing of all modified buffers to disk. Its man page even says this. It doesn't wait for any of the writes to complete. Its man page says that this is a BUG, but it is intentional and sync() has always done this. There is no way for sync() to guarantee that all modified buffers have been written to disk when it returns, since even if it waited, buffers might be modified while it is returning. 
Perhaps even ones that would take 8 seconds to complete can be written in the few nanoseconds that it takes to return. sync(8) is just a wrapper around sync(2). One that doesn't even check for errors. Not that it could handle sync() failure. Its man page bogusly first claims that it "forces completion". This is not completely wrong, since it doesn't claim that the completion occurs before sync(8) exits. But then it claims that sync(8) is suitable "to ensure that all disk writes have been completed in a way not suitably done by reboot(8) or halt(8). This wording is poor, unless it is intentionally weaselishly worded so that it doesn't actually claim full completion. It only claims more suitable completion than with reboot or halt. Actually, completion is not guaranteed, and what sync(8) provides is just less unsuitable than what reboot and halt provide. To ensure completion, you have to freeze the file systems of interest before rebooting. I don't know of any ways to do this from userland except mount -u -o ro or unmount. There should be a syscall to cause syncing with waiting. The kernel has a wait option for syncing, but doesn't use it for sync(2). But using this would only reduce the races. Bruce From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 09:25:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 862093BC for ; Thu, 11 Apr 2013 09:25:15 +0000 (UTC) (envelope-from marck@rinet.ru) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) by mx1.freebsd.org (Postfix) with ESMTP id 1CF5CDA6 for ; Thu, 11 Apr 2013 09:25:14 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r3B9PCQ9014258; Thu, 11 Apr 2013 13:25:12 +0400 (MSK) (envelope-from marck@rinet.ru) Date: Thu, 11 Apr 2013 13:25:12 +0400 (MSK) From: Dmitry Morozovsky To: "Lawrence K. Chen, P.Eng." Subject: Re: ZFS-inly server and dedicated ZIL In-Reply-To: <802657359.5644092.1365629966726.JavaMail.root@k-state.edu> Message-ID: References: <802657359.5644092.1365629966726.JavaMail.root@k-state.edu> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 09:25:15 -0000 On Wed, 10 Apr 2013, Lawrence K. Chen, P.Eng. wrote: > The root pool only having one root vdev is probably a restriction on what functionality could be put into the bootblock. > > But, can't really tell what it is you're trying to do... > > You say raid10-like ZFS with two SSDs....which to me means you have at least 4 physical drives, and 2 SSDs. > > zpool create tank mirror disk1 disk2 mirror disk3 disk4 .... log ... cache .... > > perhaps as: .... log mirror ssd1a ssd2a cache ssd1b ssd2b Yes, exactly this way (gpt partitions, but nevermind). And before adding log I booted from this pool perfectly. > tank is load-balancing onto sets of mirrors, with a mirror of the first > partition of the two ssds for ZIL and using the second partition of the two > ssds concurrently for cache. > > So, in this case....tank can't be used as a root pool. an Internal USB stick > as your root pool would be one way to go. 
or you could carve out another > partition on the SSDs for a root pool. I'm now thinking of putting /boot on the first pair of disks to boot from, avoiding SPOF in this area too. > On my home system, I currently have a pair of SSDs....each with 7 > partitions.... 5 tiny ones for mirror ZILs for the other zpools in my system, > the mirrored root pool and partitions to be L2ARC for the two zpools in my > system that I'm using dedup in. (there's currently only 4 zpools in my > system, including the root....but I had originally envisioned the possibility > of adding additional external jbod arrays....) What is not clear for me is why one could want more than one pool in one machine? Thanks for your comments, will test further. > > I used to have 6 disk raid10-like with part of an external SSD for its zil...and no cache. It has since been redone as a raidz2 with mirrored zil (from the internal SSDs)....planning to add another SSD soon. > > ----- Original Message ----- > > Dear colleagues, > > > > I'm planning to make new PostgreSQL server using zaid10-like ZFS with > > two SSDs > > splitted into mirrored ZIL and striped arc2. However, it seems > > current > > ZFS implementation does not support this: > > > > ./lib/libzfs/common/libzfs_pool.c- case EDOM: > > ./lib/libzfs/common/libzfs_pool.c- zfs_error_aux(hdl, > > dgettext(TEXT_DOMAIN, > > ./lib/libzfs/common/libzfs_pool.c: "root pool can not have > > multiple vdevs" > > ./lib/libzfs/common/libzfs_pool.c- " or separate logs")); > > ./lib/libzfs/common/libzfs_pool.c- (void) zfs_error(hdl, > > EZFS_POOL_NOTSUP, msg); > > > > Am I right, or did I missed something obvious? > > > > Ok, if so: In this situation, I see two possibilities: > > - make system boot from internal USB stick (only /bootdisk with /boot > > and > > /rescue) with the rest of ZFS-on-root > > - use dedicated pair of disks for ZFS pool without ZIL for system. > > > > what would you recommend? > > > > Thanks! 
> > > > -- > > Sincerely, > > D.Marck [DM5020, MCK-RIPE, > > DM3-RIPN] > > [ FreeBSD committer: > > marck@FreeBSD.org ] > > ------------------------------------------------------------------------ > > *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru > > *** > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > -- Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN] [ FreeBSD committer: marck@FreeBSD.org ] ------------------------------------------------------------------------ *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru *** ------------------------------------------------------------------------ From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 10:01:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 552498EE for ; Thu, 11 Apr 2013 10:01:15 +0000 (UTC) (envelope-from tevans.uk@googlemail.com) Received: from mail-la0-x22e.google.com (mail-la0-x22e.google.com [IPv6:2a00:1450:4010:c03::22e]) by mx1.freebsd.org (Postfix) with ESMTP id D61E0EFB for ; Thu, 11 Apr 2013 10:01:14 +0000 (UTC) Received: by mail-la0-f46.google.com with SMTP id ea20so1279313lab.5 for ; Thu, 11 Apr 2013 03:01:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=jPgvSk4IdDV8RQ+bSTI+7L/YRVnsMw+dzVatoKY/eUU=; b=jD38U6KVO8P/uFWP/dC9XtoQMdlB5LAbGSi2yLjIgBCQ2XPF38YLpbTBTkQfNYgI87 afzvUhsDF4X/rbaluRt2LWna3GYpOskrxdeAjnhypL48JvmxR0ilbN3T4t9KO+fqYNUY zvNiraqEI8fnzr0gztNrjwWtuig6nCkUV6lDTw+X3vP8xbGT0u6hIKM00Xbf/RvhFvPG qCOHJXFefSgR9dAY48UbYtlqfj3TadOO4TzobwgiZAdZVwqy10CQWwZx0jcgIUwAQvIi FNVfdIY7IFq7DnaJeI15D/sXp/7I9dJIb8fUmbtdge4F0JPQi2St4IgLG1imbfaIrza+ fzlw== MIME-Version: 1.0 X-Received: by 10.112.162.65 with SMTP id xy1mr2877226lbb.105.1365674473257; Thu, 11 Apr 2013 03:01:13 -0700 (PDT) Received: by 10.112.198.201 with HTTP; Thu, 11 Apr 2013 03:01:13 -0700 (PDT) In-Reply-To: References: <802657359.5644092.1365629966726.JavaMail.root@k-state.edu> Date: Thu, 11 Apr 2013 11:01:13 +0100 Message-ID: Subject: Re: ZFS-inly server and dedicated ZIL From: Tom Evans To: Dmitry Morozovsky Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 10:01:15 -0000 On Thu, Apr 11, 2013 at 10:25 AM, Dmitry Morozovsky wrote: > What is not clear for me is why one could want more than one pool in one > machine? > For me, I have a mirrored pool on SSDs that serves as a root pool, and then a bunch of slow spinning rust in JBOD arrays for bulk storage. I don't need the secondary pools to boot up/use the system, which makes it easier if things go wrong. 
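In zpool terms the split looks roughly like this (device names invented for the example):

# small, fast, bootable
zpool create zroot mirror ada0p3 ada1p3
# big, slow, and not needed to bring the box up
zpool create tank raidz2 da0 da1 da2 da3 da4 da5

If the JBOD side misbehaves, the machine still boots from zroot and can be repaired from there.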
Cheers Tom From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 11:44:42 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5F7E71F1 for ; Thu, 11 Apr 2013 11:44:42 +0000 (UTC) (envelope-from feld@feld.me) Received: from new1-smtp.messagingengine.com (new1-smtp.messagingengine.com [66.111.4.221]) by mx1.freebsd.org (Postfix) with ESMTP id 3546C5FD for ; Thu, 11 Apr 2013 11:44:40 +0000 (UTC) Received: from compute2.internal (compute2.nyi.mail.srv.osa [10.202.2.42]) by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id 3E5E2280 for ; Thu, 11 Apr 2013 07:44:34 -0400 (EDT) Received: from frontend1.nyi.mail.srv.osa ([10.202.2.160]) by compute2.internal (MEProxy); Thu, 11 Apr 2013 07:44:34 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=feld.me; h= content-type:to:subject:references:date:mime-version :content-transfer-encoding:from:message-id:in-reply-to; s= mesmtp; bh=t5QxmgjulIU2x+pbuMtDNvqphfw=; b=l7F+zeZh42TkuJhpJChE7 JVZkbajM2o8lQXHxkaS6fac7gSmpIkXj1BSR1OjjwihIrwJwdwjNDAk3kCfBaPFs ZtaBd60+3k/EwBSZNjtvrVAqNBEU9QMWCtFMC+3+zFBT41h3MVVCCt/AVkJjJ3oU JpOvByLO0wZ8e49Q6+1njg= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=content-type:to:subject:references:date :mime-version:content-transfer-encoding:from:message-id :in-reply-to; s=smtpout; bh=t5QxmgjulIU2x+pbuMtDNvqphfw=; b=dH+q CZq9wnECNEnc9VB2VxXrRu0U9tNFxM/k9fDqjVRG3nm47gCEbZWWm97Q2A5sJ8sy M5XM0Pa59pc/erHcd+/TSBpGdaV//8KpGSZReXGafkAreKIh10/ZFd21P9xnsx4z 4/gVrI9CsyLPWOrpvEDGSUZKPLnunxgvFxw/+Eo= X-Sasl-enc: FcWW8AB1VlXRiwcHJqgccrRzCY3GKwzKMOiMgtbMB0op 1365680673 Received: from tech304.office.supranet.net (unknown [66.170.8.18]) by mail.messagingengine.com (Postfix) with ESMTPA id DD779C80004 for ; Thu, 11 Apr 2013 07:44:33 -0400 (EDT) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org Subject: Re: ZFS-inly server and dedicated ZIL References: <802657359.5644092.1365629966726.JavaMail.root@k-state.edu> Date: Thu, 11 Apr 2013 06:44:33 -0500 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: "Mark Felder" Message-ID: In-Reply-To: <802657359.5644092.1365629966726.JavaMail.root@k-state.edu> User-Agent: Opera Mail/12.14 (FreeBSD) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 11:44:42 -0000 On Wed, 10 Apr 2013 16:39:26 -0500, Lawrence K. Chen, P.Eng. wrote: > The root pool only having one root vdev is probably a restriction on > what functionality could be put into the bootblock. You can have multiple vdevs in a root pool, but only if they're mirror. You also have to go out of your way to make it work. It's certainly possible, though. 
$ zpool status pool: tank0 state: ONLINE scan: resilvered 1.51G in 0h2m with 0 errors on Mon Apr 1 07:08:24 2013 config: NAME STATE READ WRITE CKSUM tank0 ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 mfid0p2 ONLINE 0 0 0 mfid1p2 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 mfid2p2 ONLINE 0 0 0 mfid3p2 ONLINE 0 0 0 errors: No known data errors $ gpart show => 34 584843197 mfid0 GPT (278G) 34 128 1 freebsd-boot (64k) 162 6 - free - (3.0k) 168 584822784 2 freebsd-zfs (278G) 584822952 20279 - free - (9.9M) => 34 584843197 mfid1 GPT (278G) 34 128 1 freebsd-boot (64k) 162 6 - free - (3.0k) 168 584843056 2 freebsd-zfs (278G) 584843224 7 - free - (3.5k) => 34 584843197 mfid3 GPT (278G) 34 128 1 freebsd-boot (64k) 162 6 - free - (3.0k) 168 584822784 2 freebsd-zfs (278G) 584822952 20279 - free - (9.9M) => 34 584843197 mfid2 GPT (278G) 34 128 1 freebsd-boot (64k) 162 6 - free - (3.0k) 168 584822784 2 freebsd-zfs (278G) 584822952 20279 - free - (9.9M) From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 14:40:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1725E35A for ; Thu, 11 Apr 2013 14:40:18 +0000 (UTC) (envelope-from josh@signalboxes.net) Received: from mail-oa0-f46.google.com (mail-oa0-f46.google.com [209.85.219.46]) by mx1.freebsd.org (Postfix) with ESMTP id D14EEEFD for ; Thu, 11 Apr 2013 14:40:17 +0000 (UTC) Received: by mail-oa0-f46.google.com with SMTP id h2so228365oag.5 for ; Thu, 11 Apr 2013 07:40:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:mime-version:x-received:in-reply-to:references:date :message-id:subject:from:to:content-type:x-gm-message-state; bh=4wTG567QSC1HtoRTpKL8srgBv1ESqARO8K4KCqE9gdI=; b=PgvIOc1Xu5VHyG4VDI58hgahcpDdPi0ynrD4xSHyKW1xdJhPD3fl7aLHDHP0MYAPnQ PITQ/sT+v3Xu98WBoOBYhOCGplfEWeXz46kzj53dZEXeMZnz16Uzk3t0oWeDZl0YpQ7q hzPcKtKqNUQsaPWmErTl/xJw3ImB0ad1GKOD8HUYipAX6EH6GGREm43ue4CI76qucj55 GySanCvj0wQTNb5uJn0fQjdN4gXBVFQldP5qkES5kCBqOJOKaOqkIYY5DusWGPCC390G 5i8oA+AlELdc1BWnhqcwxTuALU1uXeozXax5VygM7pYv1tbK01NjMGm8jvR8Xx/ee8UF eQtA== X-Received: by 10.60.135.103 with SMTP id pr7mr2336039oeb.142.1365691216835; Thu, 11 Apr 2013 07:40:16 -0700 (PDT) Received: from mail-ob0-x230.google.com (mail-ob0-x230.google.com [2607:f8b0:4003:c01::230]) by mx.google.com with ESMTPS id do4sm925053oeb.0.2013.04.11.07.40.16 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 11 Apr 2013 07:40:16 -0700 (PDT) Received: by mail-ob0-f176.google.com with SMTP id er7so1445718obc.21 for ; Thu, 11 Apr 2013 07:40:15 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.60.155.212 with SMTP id vy20mr1036777oeb.33.1365691215672; Thu, 11 Apr 2013 07:40:15 -0700 (PDT) Received: by 10.60.140.130 with HTTP; Thu, 11 Apr 2013 07:40:15 -0700 (PDT) In-Reply-To: References: <12CCA57CCC7E4F16A1147F8422F5F151@multiplay.co.uk> Date: Thu, 11 Apr 2013 08:40:15 -0600 Message-ID: Subject: Re: ZFS + NFS poor performance after restarting from 100 day uptime From: Josh Beard To: freebsd-fs@freebsd.org X-Gm-Message-State: ALoCoQmqQMVk+pfhsN+fhXmc1sFAkeiMIk2WooDGClaFtQB0Edi229V9/+3Ui0r0cGTFMbeJyWil Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 14:40:18 -0000 I wanted to give a followup to this in case someone else 
stumbles upon this thread with search queries. I was wrong about the original (9.1-RC3) kernel performing better. It was exhibiting the same behavior under "real world" conditions. Real world for this server is 100-200 Mac clients connecting with network homes via NFS. I haven't completely confirmed anything, but disabling Spotlight indexing (a Mac client feature) helped *significantly*. It's still curious why Spotlight indexing was never an issue prior to the reboot I mentioned. I'm also unsure why the RAID controller's verifications are intermittently slow since that reboot. In any event, I don't think it's a ZFS or FreeBSD issue, based on various benchmarks, which show expected performance. Thanks. On Fri, Mar 22, 2013 at 2:24 PM, Josh Beard wrote: > > > On Fri, Mar 22, 2013 at 1:07 PM, Steven Hartland wrote: > >> >> ----- Original Message ----- From: Josh Beard >>> >>>> A snip of gstat: >>>> >>>> dT: 1.002s w: 1.000s >>>> L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name >>>> >>> ... >> >>> 4 160 126 1319 31.3 34 100 0.1 100.3| da1 >>>> 4 146 110 1289 33.6 36 98 0.1 97.8| da2 >>>> 4 142 107 1370 36.1 35 101 0.2 101.9| da3 >>>> 4 121 95 1360 35.6 26 19 0.1 95.9| da4 >>>> 4 151 117 1409 34.0 34 102 0.1 100.1| da5 >>>> 4 141 109 1366 35.9 32 101 0.1 97.9| da6 >>>> 4 136 118 1207 24.6 18 13 0.1 87.0| da7 >>>> 4 118 102 1278 32.2 16 12 0.1 89.8| da8 >>>> 4 138 116 1240 33.4 22 55 0.1 100.0| da9 >>>> 4 133 117 1269 27.8 16 13 0.1 86.5| da10 >>>> 4 121 102 1302 53.1 19 51 0.1 100.0| da11 >>>> 4 120 99 1242 40.7 21 51 0.1 99.7| da12 >>>> >>>> Your ops/s are maxing your disks. You say "only" but the ~190 ops/s >>>> is what HDs will peak at, so whatever your machine is doing is causing >>>> it to max the available IO for your disks. >>>> >>>> If you boot back to your previous kernel does the problem go away? >>>> >>>> If so you could look at the changes between the two kernel revisions >>>> for possible causes and if needed do a binary chop with kernel builds >>>> to narrow down the cause. >>>> >>> >>> Thanks for your response. I booted with the old kernel (9.1-RC3) and the >>> problem disappeared! We're getting 3x the performance with the previous >>> kernel compared to the 9.1-RELEASE-p1 kernel: >>> >>> Output from gstat: >>> >>> 1 362 0 0 0.0 345 20894 9.4 52.9| da1 >>> 1 365 0 0 0.0 348 20893 9.4 54.1| da2 >>> 1 367 0 0 0.0 350 20920 9.3 52.6| da3 >>> 1 362 0 0 0.0 345 21275 9.5 54.1| da4 >>> 1 363 0 0 0.0 346 21250 9.6 54.2| da5 >>> 1 359 0 0 0.0 342 21352 9.5 53.8| da6 >>> 1 347 0 0 0.0 330 20486 9.4 52.3| da7 >>> 1 353 0 0 0.0 336 20689 9.6 52.9| da8 >>> 1 355 0 0 0.0 338 20669 9.5 53.0| da9 >>> 1 357 0 0 0.0 340 20770 9.5 52.5| da10 >>> 1 351 0 0 0.0 334 20641 9.4 53.1| da11 >>> 1 362 0 0 0.0 345 21155 9.6 54.1| da12 >>> >>> >>> The kernels were compiled identically using GENERIC with no modification. >>> I'm no expert, but none of the stuff I've seen looking at svn commits >>> looks like it would have any impact on this. Any clues? >>> >> >> You're seeing a totally different profile there Josh, as in all writes no >> reads, whereas before you were seeing mainly reads and some writes. >> >> So I would ask if you're sure you're seeing the same workload, or has >> something external changed too? >> >> Might be worth rebooting back to the new kernel and seeing if you >> still see the issue ;-) >> >> >> Regards >> Steve >> >> > Steve, > > You're absolutely right. I didn't catch that, but the total ops/s is > reaching quite a bit higher.
Things are certainly more responsive than > they have been, for what it's worth, so it "feels right." I'm also not > seeing this thing consistently railed to 100% busy like I was before with > similar testing (that was 50 machines just pushing data with dd). I won't > be able to get a good comparison until Monday, when our students come back > (this is a file server for a public school district and used for network > homes). > > Josh > > From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 16:59:11 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C28E7DFC for ; Thu, 11 Apr 2013 16:59:11 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 6A738A0C for ; Thu, 11 Apr 2013 16:59:10 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 2CB5C47E29; Thu, 11 Apr 2013 18:52:37 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id 9D78F47E24 for ; Thu, 11 Apr 2013 18:52:37 +0200 (CEST) Message-ID: <5166EA43.7050700@platinum.linux.pl> Date: Thu, 11 Apr 2013 18:52:19 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: ZFS slow reads for unallocated blocks Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 16:59:11 -0000 This one is quite weird - reads from files that were created and resized with ftruncate (so no actual data on disk) are considerably slower and use more CPU time than files with data. If compression is enabled this will also affect files with long runs of zeroes as ZFS won't write any data to disk in this case. There is no I/O on the pool during the read tests - all fits into 10GB ARC. 
FreeBSD storage 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Sat Feb 23 15:51:26 UTC 2013 root@storage:/usr/obj/usr/src/sys/GENERIC amd64 Mem: 264M Active, 82M Inact, 12G Wired, 100M Cache, 13M Buf, 3279M Free Swap: 5120M Total, 5120M Free # zfs create -o atime=off -o recordsize=128k -o compression=off -o mountpoint=/home/testfs home/testfs --- truncated file: # time truncate -s 1G /home/testfs/trunc1g 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w # time dd if=/home/testfs/trunc1g of=/dev/null bs=1024k 1024+0 records in 1024+0 records out 1073741824 bytes transferred in 0.434817 secs (2469410435 bytes/sec) 0.000u 0.435s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w # time dd if=/home/testfs/trunc1g of=/dev/null bs=16k 65536+0 records in 65536+0 records out 1073741824 bytes transferred in 3.809560 secs (281854564 bytes/sec) 0.000u 3.779s 0:03.81 98.9% 25+2755k 0+0io 0pf+0w # time cat /home/testfs/trunc1g > /dev/null 0.070u 14.031s 0:14.19 99.3% 15+2755k 0+0io 0pf+0w ^^^^^^^ 14 seconds compared to 1 second for random data --- file filled with zeroes: # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 1024+0 records in 1024+0 records out 1073741824 bytes transferred in 2.375426 secs (452020732 bytes/sec) 0.000u 0.525s 0:02.37 21.9% 23+2533k 1+1io 0pf+0w # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k 1024+0 records in 1024+0 records out 1073741824 bytes transferred in 0.199078 secs (5393571244 bytes/sec) 0.000u 0.200s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w # time dd if=/home/testfs/zero1g of=/dev/null bs=16k 65536+0 records in 65536+0 records out 1073741824 bytes transferred in 0.436472 secs (2460046434 bytes/sec) 0.015u 0.421s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w # time cat /home/testfs/zero1g > /dev/null 0.023u 1.156s 0:01.18 99.1% 15+2779k 0+0io 0pf+0w --- file filled with random bytes: # time dd if=/dev/random of=/home/testfs/random1g bs=1024k count=1024 1024+0 records in 1024+0 records out 1073741824 bytes transferred in 16.116569 secs (66623474 bytes/sec) 0.000u 13.214s 0:16.11 81.9% 25+2750k 0+1io 0pf+0w # time dd if=/home/testfs/random1g of=/dev/null bs=1024k 1024+0 records in 1024+0 records out 1073741824 bytes transferred in 0.207115 secs (5184280044 bytes/sec) 0.000u 0.208s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w # time dd if=/home/testfs/random1g of=/dev/null bs=16k 65536+0 records in 65536+0 records out 1073741824 bytes transferred in 0.432518 secs (2482536705 bytes/sec) 0.023u 0.409s 0:00.43 97.6% 26+2828k 0+0io 0pf+0w # time cat /home/testfs/random1g > /dev/null 0.031u 1.053s 0:01.08 100.0% 15+2770k 0+0io 0pf+0w --- compression on: # zfs create -o atime=off -o recordsize=128k -o compression=lzjb -o mountpoint=/home/testfs home/testfs --- file filled with zeroes: # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 1024+0 records in 1024+0 records out 1073741824 bytes transferred in 1.007765 secs (1065468404 bytes/sec) 0.000u 0.458s 0:01.01 44.5% 26+2880k 1+1io 0pf+0w # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k 1024+0 records in 1024+0 records out 1073741824 bytes transferred in 0.630737 secs (1702360431 bytes/sec) 0.000u 0.630s 0:00.63 100.0% 25+2742k 0+0io 0pf+0w # time dd if=/home/testfs/zero1g of=/dev/null bs=16k 65536+0 records in 65536+0 records out 1073741824 bytes transferred in 4.089175 secs (262581530 bytes/sec) 0.015u 4.036s 0:04.09 98.7% 25+2758k 0+0io 0pf+0w # time cat /home/testfs/zero1g > /dev/null 0.031u 15.863s 0:15.95 99.6% 15+2754k 0+0io 0pf+0w ^^^^^^^ --- it appears recordsize has a huge effect on this (recordsize=32k): # zfs create -o 
atime=off -o recordsize=32k -o compression=off -o mountpoint=/home/testfs home/testfs # time truncate -s 1G testfs/trunc1g 0.000u 0.000s 0:00.01 0.0% 0+0k 1+0io 0pf+0w # time cat /home/testfs/trunc1g > /dev/null 0.047u 5.842s 0:05.93 99.1% 15+2761k 0+0io 0pf+0w ^^^^^^ --- recordsize=4k: # zfs create -o atime=off -o recordsize=4k -o compression=off -o mountpoint=/home/testfs home/testfs # time truncate -s 1G testfs/trunc1g 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w # time cat /home/testfs/trunc1g > /dev/null 0.047u 1.441s 0:01.52 97.3% 15+2768k 0+0io 0pf+0w ^^^^^^ From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 17:02:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id AEE99FB8 for ; Thu, 11 Apr 2013 17:02:57 +0000 (UTC) (envelope-from break19@gmail.com) Received: from mail-qa0-f50.google.com (mail-qa0-f50.google.com [209.85.216.50]) by mx1.freebsd.org (Postfix) with ESMTP id 7675AA7D for ; Thu, 11 Apr 2013 17:02:57 +0000 (UTC) Received: by mail-qa0-f50.google.com with SMTP id bv4so400267qab.2 for ; Thu, 11 Apr 2013 10:02:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=DQ1iT9Qd4XzdRJLl2E1Jpg2BPFlxxV849t28cjZxhcE=; b=xR7JqPZiC4c7Yfqr9jCNCA7b5cFatWHGGfgiqENA1OKpBELrPF+BVzlhvcm4+ZCnpS ypm2b3LKkM4qUwN8tahoXZAIHQb1EhILkWBraS/sdB3dtzyhxVlZy/iIAItXCMl62xHd ZgOA/eggjMY0jCUlVXmc/5K3SOf7fMK0hpdo3zuvk5031AKTKQWrKJ4fYYImw3Kc0mhY 1GiQkiYKD//FNX7Lq5kAzPa4l2Zrcq/4SHX7smYSpymzXq4OZkgG81wnr5Akv4oFADQ0 zSqyi++QnnPO1xvOAmk2YxtYhwBA0MPyjg7lh7gD8L6B5CCHPUWldFVL+elN1w9EanEn xoLQ== X-Received: by 10.229.120.82 with SMTP id c18mr1901163qcr.10.1365699776758; Thu, 11 Apr 2013 10:02:56 -0700 (PDT) Received: from [192.168.0.198] (231.sub-70-196-129.myvzw.com. [70.196.129.231]) by mx.google.com with ESMTPS id ds5sm8400995qab.11.2013.04.11.10.02.54 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 11 Apr 2013 10:02:55 -0700 (PDT) Message-ID: <5166ECBA.1090005@gmail.com> Date: Thu, 11 Apr 2013 12:02:50 -0500 From: Chuck Burns User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:15.0) Gecko/20120907 Thunderbird/15.0.1 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> In-Reply-To: <5166EA43.7050700@platinum.linux.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 17:02:57 -0000 On 4/11/2013 11:52 AM, Adam Nowacki wrote: > This one is quite weird - reads from files that were created and resized > with ftruncate (so no actual data on disk) are considerably slower and > use more CPU time than files with data. If compression is enabled this > will also affect files with long runs of zeroes as ZFS won't write any > data to disk in this case. There is no I/O on the pool during the read > tests - all fits into 10GB ARC. > > FreeBSD storage 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Sat Feb 23 15:51:26 > UTC 2013 root@storage:/usr/obj/usr/src/sys/GENERIC amd64 Sounds like it could be a CPU bottleneck? How about some cpu info? 
-- Chuck Burns From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 17:08:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5D81EFC for ; Thu, 11 Apr 2013 17:08:56 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 239A4AE3 for ; Thu, 11 Apr 2013 17:08:55 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id CAB4647E29; Thu, 11 Apr 2013 19:08:54 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id 9DEB047E24 for ; Thu, 11 Apr 2013 19:08:54 +0200 (CEST) Message-ID: <5166EE14.1090806@platinum.linux.pl> Date: Thu, 11 Apr 2013 19:08:36 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> <5166ECBA.1090005@gmail.com> In-Reply-To: <5166ECBA.1090005@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 17:08:56 -0000 On 2013-04-11 19:02, Chuck Burns wrote: > On 4/11/2013 11:52 AM, Adam Nowacki wrote: >> This one is quite weird - reads from files that were created and resized >> with ftruncate (so no actual data on disk) are considerably slower and >> use more CPU time than files with data. If compression is enabled this >> will also affect files with long runs of zeroes as ZFS won't write any >> data to disk in this case. There is no I/O on the pool during the read >> tests - all fits into 10GB ARC. >> >> FreeBSD storage 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Sat Feb 23 15:51:26 >> UTC 2013 root@storage:/usr/obj/usr/src/sys/GENERIC amd64 > > > Sounds like it could be a CPU bottleneck? How about some cpu info? > > hw.model: AMD FX(tm)-4100 Quad-Core Processor hw.ncpu: 4 But this doesn't matter much as CPU time should be compared relative to other tests - 1 second versus 15 seconds on the same CPU. 
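For anyone trying to reproduce this, running something like the following alongside the cat should confirm the pool sees no reads at all - the zeroes for the hole records are synthesized entirely from memory/ARC, so all the elapsed time is CPU:

# zpool iostat home 1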
From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 17:14:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 55B7C2DE for ; Thu, 11 Apr 2013 17:14:32 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta11.emeryville.ca.mail.comcast.net (qmta11.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:211]) by mx1.freebsd.org (Postfix) with ESMTP id 3B2A7B9E for ; Thu, 11 Apr 2013 17:14:32 +0000 (UTC) Received: from omta08.emeryville.ca.mail.comcast.net ([76.96.30.12]) by qmta11.emeryville.ca.mail.comcast.net with comcast id NckQ1l0010FhH24ABhEWJe; Thu, 11 Apr 2013 17:14:30 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta08.emeryville.ca.mail.comcast.net with comcast id NhEU1l00b1t3BNj8UhEU1b; Thu, 11 Apr 2013 17:14:30 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 51B6E73A33; Thu, 11 Apr 2013 10:14:28 -0700 (PDT) Date: Thu, 11 Apr 2013 10:14:28 -0700 From: Jeremy Chadwick To: Adam Nowacki Subject: Re: ZFS slow reads for unallocated blocks Message-ID: <20130411171428.GA56127@icarus.home.lan> References: <5166EA43.7050700@platinum.linux.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5166EA43.7050700@platinum.linux.pl> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365700470; bh=n139eCt+I4pSz9PE6DsND372GhSx/TOAQ9nnc+LzNJc=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=ZYxsQvm3tPhcEcjvBibP6s7ULTBz34ENeYnvZKqTk6Jt24somPUsVNiAlye++Fcdc Zkt8UHYU+C5VEbyGXy7jHRzamtK6qqG9xBMro99uFaVBjefqnoiKTneFKHWt+tXmGE HySAFUYbd3tnUy1YJZlsI35f1+kg2xXD7juyqV60g7pcWKJ/d9PhLo1ZKF9haNS8Wb L6YhkhUjEcECEBNXdHxBFa7vTBdy7fEHx6ph3ySYzU/L5uB1+mimbJ6auVwP6C4BG0 M9zBoo82eyF67c2Qq/RBkQIS7vF+rllq3vabgwdX+Cx/Qy9DFVrf7VgPL7PmRYb3lT Ni/fDTNOI5KiA== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 17:14:32 -0000 On Thu, Apr 11, 2013 at 06:52:19PM +0200, Adam Nowacki wrote: > This one is quite weird - reads from files that were created and > resized with ftruncate (so no actual data on disk) are considerably > slower and use more CPU time than files with data. If compression is > enabled this will also affect files with long runs of zeroes as ZFS > won't write any data to disk in this case. There is no I/O on the > pool during the read tests - all fits into 10GB ARC. 
> > FreeBSD storage 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Sat Feb 23 > 15:51:26 UTC 2013 root@storage:/usr/obj/usr/src/sys/GENERIC > amd64 > > Mem: 264M Active, 82M Inact, 12G Wired, 100M Cache, 13M Buf, 3279M Free > Swap: 5120M Total, 5120M Free > > # zfs create -o atime=off -o recordsize=128k -o compression=off -o > mountpoint=/home/testfs home/testfs > > --- truncated file: > > # time truncate -s 1G /home/testfs/trunc1g > 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w > > # time dd if=/home/testfs/trunc1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.434817 secs (2469410435 bytes/sec) > 0.000u 0.435s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w > > # time dd if=/home/testfs/trunc1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 3.809560 secs (281854564 bytes/sec) > 0.000u 3.779s 0:03.81 98.9% 25+2755k 0+0io 0pf+0w > > # time cat /home/testfs/trunc1g > /dev/null > 0.070u 14.031s 0:14.19 99.3% 15+2755k 0+0io 0pf+0w > ^^^^^^^ 14 seconds compared to 1 second for random data > > --- file filled with zeroes: > > # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 2.375426 secs (452020732 bytes/sec) > 0.000u 0.525s 0:02.37 21.9% 23+2533k 1+1io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.199078 secs (5393571244 bytes/sec) > 0.000u 0.200s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 0.436472 secs (2460046434 bytes/sec) > 0.015u 0.421s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w > > # time cat /home/testfs/zero1g > /dev/null > 0.023u 1.156s 0:01.18 99.1% 15+2779k 0+0io 0pf+0w > > --- file filled with random bytes: > > # time dd if=/dev/random of=/home/testfs/random1g bs=1024k count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 16.116569 secs (66623474 bytes/sec) > 0.000u 13.214s 0:16.11 81.9% 25+2750k 0+1io 0pf+0w > > # time dd if=/home/testfs/random1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.207115 secs (5184280044 bytes/sec) > 0.000u 0.208s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w > > # time dd if=/home/testfs/random1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 0.432518 secs (2482536705 bytes/sec) > 0.023u 0.409s 0:00.43 97.6% 26+2828k 0+0io 0pf+0w > > # time cat /home/testfs/random1g > /dev/null > 0.031u 1.053s 0:01.08 100.0% 15+2770k 0+0io 0pf+0w > > --- compression on: > > # zfs create -o atime=off -o recordsize=128k -o compression=lzjb -o > mountpoint=/home/testfs home/testfs > > --- file filled with zeroes: > > # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 1.007765 secs (1065468404 bytes/sec) > 0.000u 0.458s 0:01.01 44.5% 26+2880k 1+1io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k > 1024+0 records in > 1024+0 records out > 1073741824 bytes transferred in 0.630737 secs (1702360431 bytes/sec) > 0.000u 0.630s 0:00.63 100.0% 25+2742k 0+0io 0pf+0w > > # time dd if=/home/testfs/zero1g of=/dev/null bs=16k > 65536+0 records in > 65536+0 records out > 1073741824 bytes transferred in 4.089175 secs (262581530 bytes/sec) > 0.015u 4.036s 0:04.09 98.7% 
25+2758k 0+0io 0pf+0w > > # time cat /home/testfs/zero1g > /dev/null > 0.031u 15.863s 0:15.95 99.6% 15+2754k 0+0io 0pf+0w > ^^^^^^^ > > --- it appears recordsize has a huge effect on this (recordsize=32k): > > # zfs create -o atime=off -o recordsize=32k -o compression=off -o > mountpoint=/home/testfs home/testfs > > # time truncate -s 1G testfs/trunc1g > 0.000u 0.000s 0:00.01 0.0% 0+0k 1+0io 0pf+0w > > # time cat /home/testfs/trunc1g > /dev/null > 0.047u 5.842s 0:05.93 99.1% 15+2761k 0+0io 0pf+0w > ^^^^^^ > > --- recordsize=4k: > > # zfs create -o atime=off -o recordsize=4k -o compression=off -o > mountpoint=/home/testfs home/testfs > > # time truncate -s 1G testfs/trunc1g > 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w > > # time cat /home/testfs/trunc1g > /dev/null > 0.047u 1.441s 0:01.52 97.3% 15+2768k 0+0io 0pf+0w > ^^^^^^ Take a look at src/bin/cat/cat.c, function raw_cat(). Therein lies the answer. TL;DR -- cat != dd. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 18:23:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8D997B44 for ; Thu, 11 Apr 2013 18:23:23 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:16]) by mx1.freebsd.org (Postfix) with ESMTP id 73BE91088 for ; Thu, 11 Apr 2013 18:23:23 +0000 (UTC) Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71]) by qmta01.emeryville.ca.mail.comcast.net with comcast id Ndsd1l02D1Y3wxoA1iPNjD; Thu, 11 Apr 2013 18:23:22 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta15.emeryville.ca.mail.comcast.net with comcast id NiPN1l0061t3BNj8biPNx4; Thu, 11 Apr 2013 18:23:22 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 02F0973A33; Thu, 11 Apr 2013 11:23:22 -0700 (PDT) Date: Thu, 11 Apr 2013 11:23:21 -0700 From: Jeremy Chadwick To: Adam Nowacki Subject: Re: ZFS slow reads for unallocated blocks Message-ID: <20130411182321.GA57336@icarus.home.lan> References: <5166EA43.7050700@platinum.linux.pl> <20130411171428.GA56127@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130411171428.GA56127@icarus.home.lan> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365704602; bh=7fdnkFVpop8s5wcLbmBVmR8cBBHIKwLSiOjDj0i1q8w=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=ZI6SEAaEzrPJsblSlw+6h9nfYa/Nk8eTOaLFDdb4LmE5ZQLRvwgZge06PnlVdOU6M 0977VnGHwrhbamgxtRhGi4aHxTp7i2IGA3poSUjifBLdo6T3u0mh39yuje6JSMLOv1 4VOBGzk1eSbAzsMD4S3n7xeOnLSiVqU2PQoSjgutrqzVyNeRR6eDGrDWf7plJFjxs7 2W79ep2cyZ+zf3eCsEi5eMWKDv9FyzAbEoaE1c6Yh7alQxOijxKGTfrAjOOLIRigC/ Hestiizr8JbrxUu278gxaVGjUOJhjZsvaSiuUofCd7TJwEQOSnYXsDVeoq75/AJGBt KbRhW21vchHMQ== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 18:23:23 -0000 On Thu, Apr 11, 2013 at 10:14:28AM -0700, Jeremy Chadwick wrote: > On Thu, Apr 11, 2013 at 06:52:19PM +0200, Adam Nowacki wrote: > > This one is 
quite weird - reads from files that were created and > > resized with ftruncate (so no actual data on disk) are considerably > > slower and use more CPU time than files with data. If compression is > > enabled this will also affect files with long runs of zeroes as ZFS > > won't write any data to disk in this case. There is no I/O on the > > pool during the read tests - all fits into 10GB ARC. > > > > FreeBSD storage 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Sat Feb 23 > > 15:51:26 UTC 2013 root@storage:/usr/obj/usr/src/sys/GENERIC > > amd64 > > > > Mem: 264M Active, 82M Inact, 12G Wired, 100M Cache, 13M Buf, 3279M Free > > Swap: 5120M Total, 5120M Free > > > > # zfs create -o atime=off -o recordsize=128k -o compression=off -o > > mountpoint=/home/testfs home/testfs > > > > --- truncated file: > > > > # time truncate -s 1G /home/testfs/trunc1g > > 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w > > > > # time dd if=/home/testfs/trunc1g of=/dev/null bs=1024k > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes transferred in 0.434817 secs (2469410435 bytes/sec) > > 0.000u 0.435s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w > > > > # time dd if=/home/testfs/trunc1g of=/dev/null bs=16k > > 65536+0 records in > > 65536+0 records out > > 1073741824 bytes transferred in 3.809560 secs (281854564 bytes/sec) > > 0.000u 3.779s 0:03.81 98.9% 25+2755k 0+0io 0pf+0w > > > > # time cat /home/testfs/trunc1g > /dev/null > > 0.070u 14.031s 0:14.19 99.3% 15+2755k 0+0io 0pf+0w > > ^^^^^^^ 14 seconds compared to 1 second for random data > > > > --- file filled with zeroes: > > > > # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes transferred in 2.375426 secs (452020732 bytes/sec) > > 0.000u 0.525s 0:02.37 21.9% 23+2533k 1+1io 0pf+0w > > > > # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes transferred in 0.199078 secs (5393571244 bytes/sec) > > 0.000u 0.200s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w > > > > # time dd if=/home/testfs/zero1g of=/dev/null bs=16k > > 65536+0 records in > > 65536+0 records out > > 1073741824 bytes transferred in 0.436472 secs (2460046434 bytes/sec) > > 0.015u 0.421s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w > > > > # time cat /home/testfs/zero1g > /dev/null > > 0.023u 1.156s 0:01.18 99.1% 15+2779k 0+0io 0pf+0w > > > > --- file filled with random bytes: > > > > # time dd if=/dev/random of=/home/testfs/random1g bs=1024k count=1024 > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes transferred in 16.116569 secs (66623474 bytes/sec) > > 0.000u 13.214s 0:16.11 81.9% 25+2750k 0+1io 0pf+0w > > > > # time dd if=/home/testfs/random1g of=/dev/null bs=1024k > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes transferred in 0.207115 secs (5184280044 bytes/sec) > > 0.000u 0.208s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w > > > > # time dd if=/home/testfs/random1g of=/dev/null bs=16k > > 65536+0 records in > > 65536+0 records out > > 1073741824 bytes transferred in 0.432518 secs (2482536705 bytes/sec) > > 0.023u 0.409s 0:00.43 97.6% 26+2828k 0+0io 0pf+0w > > > > # time cat /home/testfs/random1g > /dev/null > > 0.031u 1.053s 0:01.08 100.0% 15+2770k 0+0io 0pf+0w > > > > --- compression on: > > > > # zfs create -o atime=off -o recordsize=128k -o compression=lzjb -o > > mountpoint=/home/testfs home/testfs > > > > --- file filled with zeroes: > > > > # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 > > 1024+0 records in > > 
1024+0 records out > > 1073741824 bytes transferred in 1.007765 secs (1065468404 bytes/sec) > > 0.000u 0.458s 0:01.01 44.5% 26+2880k 1+1io 0pf+0w > > > > # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k > > 1024+0 records in > > 1024+0 records out > > 1073741824 bytes transferred in 0.630737 secs (1702360431 bytes/sec) > > 0.000u 0.630s 0:00.63 100.0% 25+2742k 0+0io 0pf+0w > > > > # time dd if=/home/testfs/zero1g of=/dev/null bs=16k > > 65536+0 records in > > 65536+0 records out > > 1073741824 bytes transferred in 4.089175 secs (262581530 bytes/sec) > > 0.015u 4.036s 0:04.09 98.7% 25+2758k 0+0io 0pf+0w > > > > # time cat /home/testfs/zero1g > /dev/null > > 0.031u 15.863s 0:15.95 99.6% 15+2754k 0+0io 0pf+0w > > ^^^^^^^ > > > > --- it appears recordsize has a huge effect on this (recordsize=32k): > > > > # zfs create -o atime=off -o recordsize=32k -o compression=off -o > > mountpoint=/home/testfs home/testfs > > > > # time truncate -s 1G testfs/trunc1g > > 0.000u 0.000s 0:00.01 0.0% 0+0k 1+0io 0pf+0w > > > > # time cat /home/testfs/trunc1g > /dev/null > > 0.047u 5.842s 0:05.93 99.1% 15+2761k 0+0io 0pf+0w > > ^^^^^^ > > > > --- recordsize=4k: > > > > # zfs create -o atime=off -o recordsize=4k -o compression=off -o > > mountpoint=/home/testfs home/testfs > > > > # time truncate -s 1G testfs/trunc1g > > 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w > > > > # time cat /home/testfs/trunc1g > /dev/null > > 0.047u 1.441s 0:01.52 97.3% 15+2768k 0+0io 0pf+0w > > ^^^^^^ > > Take a look at src/bin/cat/cat.c, function raw_cat(). > > Therein lies the answer. > > TL;DR -- cat != dd. I wanted to follow up on this, because I received an off-list private flame basically telling me to shut the fuck up. Compression has nothing to do with this. recordsize plays a role for what should become obvious reasons -- keep reading. Again: cat is not dd. cat has its own set of logic for how it calculates the "optimal block size" to use when calling read(2). dd, operates differently, and lets you set the blocksize using bs. A modified cat binary (work/src/bin/cat/cat) which prints out the relevant "calculation" bits within raw_cat(). And yes, $cwd is a ZFS filesystem, with a default recordsize (128KB): $ truncate -s 1g testfile $ time work/src/bin/cat/cat testfile > /dev/null raw_cat() PHYSPAGES_THRESHOLD: 32768 BUFSIZE_MAX: 2097152 BUFSIZE_SMALL: 131072 _SC_PHYS_PAGES: 2088077 bsize = 4096 real 0m13.067s user 0m0.000s sys 0m13.070s cat, in my case, issues read(fd, 4096). That's a 4KByte block size. 
Now let's use dd with the same size (bs=4096): $ time dd if=testfile of=/dev/null bs=4k 262144+0 records in 262144+0 records out 1073741824 bytes transferred in 13.031543 secs (82395601 bytes/sec) real 0m13.033s user 0m0.023s sys 0m13.000s Increase the block size to dd (from 4k up to 8k) and the speed will double: $ time dd if=testfile of=/dev/null bs=8k 131072+0 records in 131072+0 records out 1073741824 bytes transferred in 6.545117 secs (164052352 bytes/sec) real 0m6.546s user 0m0.026s sys 0m6.519s Now let's move to a UFS2+SU filesystem, where the behaviour is different: $ truncate -s 1g /tmp/testfile $ time work/src/bin/cat/cat /tmp/testfile > /dev/null raw_cat() PHYSPAGES_THRESHOLD: 32768 BUFSIZE_MAX: 2097152 BUFSIZE_SMALL: 131072 _SC_PHYS_PAGES: 2088077 bsize = 4096 real 0m1.191s user 0m0.031s sys 0m1.159s $ time dd if=/tmp/testfile of=/dev/null bs=4k 262144+0 records in 262144+0 records out 1073741824 bytes transferred in 0.898440 secs (1195117846 bytes/sec) real 0m0.900s user 0m0.039s sys 0m0.860s $ time dd if=/tmp/testfile of=/dev/null bs=8k 131072+0 records in 131072+0 records out 1073741824 bytes transferred in 0.768045 secs (1398019512 bytes/sec) real 0m0.769s user 0m0.024s sys 0m0.745s My conclusion is that ZFS handles sparse/truncated files very differently than UFS. Those familiar with the internals of ZFS can probably explain this dilemma. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 19:59:38 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7C16C135 for ; Thu, 11 Apr 2013 19:59:38 +0000 (UTC) (envelope-from lkchen@k-state.edu) Received: from ksu-out.merit.edu (ksu-out.merit.edu [207.75.117.132]) by mx1.freebsd.org (Postfix) with ESMTP id 48BE316D6 for ; Thu, 11 Apr 2013 19:59:37 +0000 (UTC) X-Merit-ExtLoop1: 1 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgEFACMVZ1HPS3TT/2dsb2JhbABQgwaDZL8iFnSCHwEBBSNWDA8ODAINGQJLAQ0GiCeqYolmiRGBI4wmFQwHg1qBEwOoEYMngU4BAR4e X-IronPort-AV: E=Sophos;i="4.87,456,1363147200"; d="scan'208";a="213530502" X-MERIT-SOURCE: KSU Received: from ksu-sfpop-mailstore02.merit.edu ([207.75.116.211]) by sfpop-ironport07.merit.edu with ESMTP; 11 Apr 2013 15:59:30 -0400 Date: Thu, 11 Apr 2013 15:59:30 -0400 (EDT) From: "Lawrence K. Chen, P.Eng." To: Dmitry Morozovsky Message-ID: <1499926740.6048241.1365710370032.JavaMail.root@k-state.edu> In-Reply-To: Subject: Re: ZFS-inly server and dedicated ZIL MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [129.130.0.181] X-Mailer: Zimbra 7.2.2_GA_2852 (ZimbraWebClient - GC25 ([unknown])/7.2.2_GA_2852) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 19:59:38 -0000 ----- Original Message ----- > > On my home system, I currently have a pair of SSDs....each with 7 > > partitions.... 5 tiny ones for mirror ZILs for the other zpools in > > my system, > > the mirrored root pool and partitions to be L2ARC for the two > > zpools in my > > system that I'm using dedup in. 
(there's currently only 4 zpools > > in my > > system, including the root....but I had originally envisioned the > > possibility > > of adding additional external jbod arrays....) > > What is not clear for me is why one could want more than one pool in > one > machine? > > Thanks for your comments, will test further. > Well, before ZFS it was common (best) practice everywhere I've been that the OS is kept separate from the rest of the system: first as a separate disk, then a separate VG, and now a separate zpool. Typically at work we have a root pool, a local disk pool (typically 2 drives mirrored, though we have boxes with more than 4 internal drives), plus a zpool of SAN storage. Alternatively we have systems where the other local drives are in some kind of raidz pool. There's also the part where the root pool is limited to a single disk or a mirror; Solaris can't boot from a raidz root pool. That made things wasteful in the Thumper, until they added a compact flash option. It has only been recently that the practice is losing favor, now that systems are shipping with 300GB drives or larger. But my home system started with a single 120GB SSD for the root pool (cut up into swap, root pool, and cache for a mirrored pool), a pair of internal 1.5TB drives mirrored, and an external set of 4 1.5TB drives in a raidz pool; it later got 6 2TB drives in a raidz2 pool. I later went to a pair of 120GB SSDs, redivvied the swap into a bunch of small partitions for mirrored ZILs, and added cache to my raidz pool. The mix of drive sizes is another reason for having more than one zpool: if you mix sizes within a mirror or raidz#, the smallest drive limits the total capacity. Also, this was originally a Windows machine with just the 2 x 1.5TB disks mirrored as a single C drive (I had debated repartitioning it like I had done with my previous Windows machine... wish I had). On Feb 14, 2012 it autopatched itself overnight and the drive came up corrupt; the only option on the OEM disk was to revert to factory (I'd bought a full copy, but an upgrade install requires the system to boot first), and while I could reinstall the OS, I wanted all my data back first. The 4 1.5TB drives had been in RAID5 under Windows, but FreeBSD would only see the drives individually, not as the single RAID5 logical volume. A couple months ago the same thing happened to another Windows machine (at work). Now the only Windows I run is in VirtualBox. This system is a pair of 2TB drives, but I have it divided into two zpools. I've corrupted zpools on this system twice, until I upgraded the BIOS and it detected a bad DIMM.
From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 20:14:18 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 533175E5 for ; Thu, 11 Apr 2013 20:14:18 +0000 (UTC) (envelope-from spork@bway.net) Received: from smtp1.bway.net (smtp1.bway.net [216.220.96.27]) by mx1.freebsd.org (Postfix) with ESMTP id 335C91798 for ; Thu, 11 Apr 2013 20:14:17 +0000 (UTC) Received: from hotlap.sporklab.com (foon.sporktines.com [96.57.144.66]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: spork@bway.net) by smtp1.bway.net (Postfix) with ESMTPSA id 5EBB59586D; Thu, 11 Apr 2013 16:04:16 -0400 (EDT) References: In-Reply-To: Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii Message-Id: Content-Transfer-Encoding: quoted-printable From: Charles Sprickman Subject: Re: ZFS-inly server and dedicated ZIL Date: Thu, 11 Apr 2013 16:04:15 -0400 To: Dmitry Morozovsky X-Mailer: Apple Mail (2.1085) Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 20:14:18 -0000 On Apr 10, 2013, at 9:23 AM, Dmitry Morozovsky wrote: > Dear colleagues, > > I'm planning to make a new PostgreSQL server using raid10-like ZFS with two SSDs > split into a mirrored ZIL and striped L2ARC. This might seem like an odd suggestion, but if you're putting the pool on SSDs (is that correct?), I'd totally skip the separate ZIL device and ARC. I think you'll find the SSDs will need zero help from another log device and L2ARC is probably just not that helpful for DB loads. > However, it seems the current > ZFS implementation does not support this: > > ./lib/libzfs/common/libzfs_pool.c- case EDOM: > ./lib/libzfs/common/libzfs_pool.c- zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, > ./lib/libzfs/common/libzfs_pool.c: "root pool can not have multiple vdevs" > ./lib/libzfs/common/libzfs_pool.c- " or separate logs")); > ./lib/libzfs/common/libzfs_pool.c- (void) zfs_error(hdl, EZFS_POOL_NOTSUP, msg); > > Am I right, or did I miss something obvious? I've asked about this on this very list some time ago, but no one really had any answers: http://lists.freebsd.org/pipermail/freebsd-fs/2012-September/015142.html The last post in that thread brings up an interesting point that was not answered, which is: can our zfs boot loader handle ZIL playback on boot? I would assume so (regardless of where the ZIL device lives), but who knows? Yet another ZFS mystery. :) To summarize, yes, you can work around the root pool restriction, which is supposedly a Solaris thing that got carried over. You do this by unsetting the "bootfs" property on the pool (ie: "zpool set bootfs='' poolname"), adding your log devices to the pool, and then setting the bootfs property again. Works for me. Someone noted this only works for mirrors, but I've done it on raidz pools as well. It would be great to have someone weigh in on whether this is valid or not.
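Spelled out, the whole dance is only three commands. A rough sketch from memory - the pool name "zroot" and log partitions "gpt/zil0"/"gpt/zil1" are made-up examples, so adjust to taste and test on a disposable pool first:

# (names below are examples, not from a real box)
# zpool set bootfs='' zroot
# zpool add zroot log mirror gpt/zil0 gpt/zil1
# zpool set bootfs=zroot zroot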
The blog post most people reference regarding this issue is here: http://astralblue.livejournal.com/371755.html Charles > > Ok, if so: In this situation, I see two possibilities: > - make the system boot from an internal USB stick (only /bootdisk with /boot and /rescue) with the rest of ZFS-on-root > - use a dedicated pair of disks for a ZFS pool without ZIL for the system. > > What would you recommend? > > Thanks! > > -- > Sincerely, > D.Marck [DM5020, MCK-RIPE, DM3-RIPN] > [ FreeBSD committer: marck@FreeBSD.org ] > ------------------------------------------------------------------------ > *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru *** > ------------------------------------------------------------------------ > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 20:47:52 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7146BDFB for ; Thu, 11 Apr 2013 20:47:52 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from moh2-ve2.go2.pl (moh2-ve2.go2.pl [193.17.41.200]) by mx1.freebsd.org (Postfix) with ESMTP id 33ACA1A64 for ; Thu, 11 Apr 2013 20:47:51 +0000 (UTC) Received: from moh2-ve2.go2.pl (unknown [10.0.0.200]) by moh2-ve2.go2.pl (Postfix) with ESMTP id 79367B0156B for ; Thu, 11 Apr 2013 22:47:44 +0200 (CEST) Received: from unknown (unknown [10.0.0.108]) by moh2-ve2.go2.pl (Postfix) with SMTP for ; Thu, 11 Apr 2013 22:47:43 +0200 (CEST) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id rQjzzC; Thu, 11 Apr 2013 22:47:43 +0200 Message-ID: <51672164.1090908@o2.pl> Date: Thu, 11 Apr 2013 22:47:32 +0200 From: Radio młodych bandytów User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130324 Thunderbird/17.0.4 MIME-Version: 1.0 CC: freebsd-fs@freebsd.org Subject: A failed drive causes system to hang References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 1, 37 X-O2-SPF: neutral X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 20:47:52 -0000 Seeing a ZFS thread, I decided to write about a similar problem that I experience. I have a failing drive in my array. I need to RMA it, but don't have time, and it fails rarely enough to be yet another annoyance. The failure is simple: it fails to respond. When it happens, the only thing I've found I can do is switch consoles. Any command fails, login fails, apps hang. On the 1st console I see a series of messages like: (ada0:ahcich0:0:0:0): CAM status: Command timeout (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED I use RAIDZ1 and I'd expect that no single failure would cause the system to fail...
-- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 21:03:06 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5103AFFC for ; Thu, 11 Apr 2013 21:03:06 +0000 (UTC) (envelope-from prvs=1813db02ab=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id EB0441B35 for ; Thu, 11 Apr 2013 21:03:04 +0000 (UTC) Received: from r2d2 ([46.65.172.4]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50003224210.msg for ; Thu, 11 Apr 2013 22:02:55 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Thu, 11 Apr 2013 22:02:55 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 46.65.172.4 X-Return-Path: prvs=1813db02ab=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-fs@freebsd.org Message-ID: <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> From: "Steven Hartland" To: =?iso-8859-1?Q?Radio_mlodych_bandyt=F3w?= References: <51672164.1090908@o2.pl> Subject: Re: A failed drive causes system to hang Date: Thu, 11 Apr 2013 22:03:04 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=response Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 21:03:06 -0000 ----- Original Message ----- From: "Radio mlodych bandytw" > Seeing a ZFS thread, I decided to write about a similar problem that I experience. > I have a failing drive in my array. I need to RMA it, but don't have time and it fails rarely enough to be a yet another > annoyance. > The failure is simple: it fails to respond. > When it happens, the only thing I found I can do is switch consoles. Any command fails, login fails, apps hang. > > On the 1st console I see a series of messages like: > > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED > > I use RAIDZ1 and I'd expect that none single failure would cause the system to fail... OS version? ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 21:24:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DE755825 for ; Thu, 11 Apr 2013 21:24:10 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta14.emeryville.ca.mail.comcast.net (qmta14.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:212]) by mx1.freebsd.org (Postfix) with ESMTP id C472F1CA1 for ; Thu, 11 Apr 2013 21:24:10 +0000 (UTC) Received: from omta09.emeryville.ca.mail.comcast.net ([76.96.30.20]) by qmta14.emeryville.ca.mail.comcast.net with comcast id Nkgs1l0040S2fkCAElQ9Sb; Thu, 11 Apr 2013 21:24:09 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta09.emeryville.ca.mail.comcast.net with comcast id NlQ81l00Z1t3BNj8VlQ8Zc; Thu, 11 Apr 2013 21:24:08 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 3C40773A33; Thu, 11 Apr 2013 14:24:08 -0700 (PDT) Date: Thu, 11 Apr 2013 14:24:08 -0700 From: Jeremy Chadwick To: Radio =?unknown-8bit?B?bcU/b2R5Y2ggYmFuZHl0w7N3?= Subject: Re: A failed drive causes system to hang Message-ID: <20130411212408.GA60159@icarus.home.lan> References: <51672164.1090908@o2.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51672164.1090908@o2.pl> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365715449; bh=lgeG55F7PQugm9NysM2njoyOOna4cglwykmfgG1Balw=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=ehSizw0gWrdfgEm4dWfiL0qxjBc0SeUCqm76EktGQlLchtYlvYd65U3NBTGaWysDR 02ImKX7WxoZc79v58dawcPhbZwRzc6PEcDngDFB9OqPq8/UKGkowE9lpb+vukTEviK 6nrYEKA0nNPooz3q8oYQ5ZeBAupb8RoSOrKmTNO8VxxnKa3sRnZHoSxHxI2nYBJZMy S8uDwV8K1hc4OjTbhkgkbk9L7yTVKa+TSeslT/gd2HnVn50nZgO/hWRCH1l9QGuUr0 Nb338kItKokDQe4SEiYuD2PdzLmEat3I5k9hSFiOpcc/lMb5T5xo9C18g72XRZmEZ+ Y+odVzzhjMGPQ== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 21:24:10 -0000 On Thu, Apr 11, 2013 at 10:47:32PM +0200, Radio m?odych bandytw wrote: > Seeing a ZFS thread, I decided to write about a similar problem that > I experience. > I have a failing drive in my array. I need to RMA it, but don't have > time and it fails rarely enough to be a yet another annoyance. > The failure is simple: it fails to respond. > When it happens, the only thing I found I can do is switch consoles. > Any command fails, login fails, apps hang. > > On the 1st console I see a series of messages like: > > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED > > I use RAIDZ1 and I'd expect that none single failure would cause the > system to fail... You need to provide full output from "dmesg", and you need to define what the word "fails" means (re: "any command fails", "login fails"). I've already demonstrated that loss of a disk in raidz1 (or even 2 disks in raidz2) does not cause ""the system to fail"" on stable/9. 
However, if you lose enough members or vdevs to cause catastrophic failure, there may be anomalies depending on how your system is set up: http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html If the pool has failmode=wait, any I/O to that pool will block (wait) indefinitely. This is the default. If the pool has failmode=continue, existing write I/O operations will fail with EIO (I/O error) (and hopefully applications/daemons will handle that gracefully -- if not, that's their fault) but any subsequent I/O (read or write) to that pool will block (wait) indefinitely. If the pool has failmode=panic, the kernel will immediately panic. If the CAM layer is what's wedged, that may be a different issue (and not related to ZFS). I would suggest running stable/9 as many improvements in this regard have been committed recently (some related to CAM, others related to ZFS and its new "deadman" watcher). Bottom line: terse output of the problem does not help. Be verbose, provide all output (commands you type, everything!), as well as any physical actions you take. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 22:08:17 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5D530210 for ; Thu, 11 Apr 2013 22:08:17 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay03.pair.com (relay03.pair.com [209.68.5.17]) by mx1.freebsd.org (Postfix) with SMTP id 022201E5B for ; Thu, 11 Apr 2013 22:08:16 +0000 (UTC) Received: (qmail 8346 invoked by uid 0); 11 Apr 2013 22:08:15 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay03.pair.com with SMTP; 11 Apr 2013 22:08:15 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <5167344E.8020301@sneakertech.com> Date: Thu, 11 Apr 2013 18:08:14 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> In-Reply-To: <51672164.1090908@o2.pl> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 22:08:17 -0000 > Seeing a ZFS thread, I decided to write about a similar problem that I > experience. I'm assuming you're referring to my "Failed pool causes system to hang" thread. I wonder if there's some common issue with zfs where it locks up if it can't write to disks how it wants to. I'm not sure how similar your problem is to mine. What's your pool setup look like? Redundancy options? Are you booting from a pool? I'd be interested to know if you can just yank the cable to the drive and see if the system recovers. You seem to be worse off than me- I can still login and run at least a couple commands. I'm booting from a straight ufs drive though. 
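One other thing worth checking, given Jeremy's failmode rundown earlier in this thread: what your pool is actually set to. A rough sketch ("tank" here is just a placeholder for your pool name):

$ zpool get failmode tank
# zpool set failmode=continue tank

Whether continue helps at all when the hang is down in CAM rather than in zfs is another question.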
______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 22:12:48 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 40262598 for ; Thu, 11 Apr 2013 22:12:48 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id DA1321E9F for ; Thu, 11 Apr 2013 22:12:47 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id C1EF747E21; Fri, 12 Apr 2013 00:12:45 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (c38-073.client.duna.pl [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id EEC3247E16; Fri, 12 Apr 2013 00:12:44 +0200 (CEST) Message-ID: <5167354A.60406@platinum.linux.pl> Date: Fri, 12 Apr 2013 00:12:26 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: Jeremy Chadwick Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> <20130411171428.GA56127@icarus.home.lan> <20130411182321.GA57336@icarus.home.lan> In-Reply-To: <20130411182321.GA57336@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 22:12:48 -0000 On 2013-04-11 20:23, Jeremy Chadwick wrote: > On Thu, Apr 11, 2013 at 10:14:28AM -0700, Jeremy Chadwick wrote: >> On Thu, Apr 11, 2013 at 06:52:19PM +0200, Adam Nowacki wrote: >>> This one is quite weird - reads from files that were created and >>> resized with ftruncate (so no actual data on disk) are considerably >>> slower and use more CPU time than files with data. If compression is >>> enabled this will also affect files with long runs of zeroes as ZFS >>> won't write any data to disk in this case. There is no I/O on the >>> pool during the read tests - all fits into 10GB ARC. 
>>> >>> FreeBSD storage 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Sat Feb 23 >>> 15:51:26 UTC 2013 root@storage:/usr/obj/usr/src/sys/GENERIC >>> amd64 >>> >>> Mem: 264M Active, 82M Inact, 12G Wired, 100M Cache, 13M Buf, 3279M Free >>> Swap: 5120M Total, 5120M Free >>> >>> # zfs create -o atime=off -o recordsize=128k -o compression=off -o >>> mountpoint=/home/testfs home/testfs >>> >>> --- truncated file: >>> >>> # time truncate -s 1G /home/testfs/trunc1g >>> 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w >>> >>> # time dd if=/home/testfs/trunc1g of=/dev/null bs=1024k >>> 1024+0 records in >>> 1024+0 records out >>> 1073741824 bytes transferred in 0.434817 secs (2469410435 bytes/sec) >>> 0.000u 0.435s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w >>> >>> # time dd if=/home/testfs/trunc1g of=/dev/null bs=16k >>> 65536+0 records in >>> 65536+0 records out >>> 1073741824 bytes transferred in 3.809560 secs (281854564 bytes/sec) >>> 0.000u 3.779s 0:03.81 98.9% 25+2755k 0+0io 0pf+0w >>> >>> # time cat /home/testfs/trunc1g > /dev/null >>> 0.070u 14.031s 0:14.19 99.3% 15+2755k 0+0io 0pf+0w >>> ^^^^^^^ 14 seconds compared to 1 second for random data >>> >>> --- file filled with zeroes: >>> >>> # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 >>> 1024+0 records in >>> 1024+0 records out >>> 1073741824 bytes transferred in 2.375426 secs (452020732 bytes/sec) >>> 0.000u 0.525s 0:02.37 21.9% 23+2533k 1+1io 0pf+0w >>> >>> # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k >>> 1024+0 records in >>> 1024+0 records out >>> 1073741824 bytes transferred in 0.199078 secs (5393571244 bytes/sec) >>> 0.000u 0.200s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w >>> >>> # time dd if=/home/testfs/zero1g of=/dev/null bs=16k >>> 65536+0 records in >>> 65536+0 records out >>> 1073741824 bytes transferred in 0.436472 secs (2460046434 bytes/sec) >>> 0.015u 0.421s 0:00.43 100.0% 26+2813k 0+0io 0pf+0w >>> >>> # time cat /home/testfs/zero1g > /dev/null >>> 0.023u 1.156s 0:01.18 99.1% 15+2779k 0+0io 0pf+0w >>> >>> --- file filled with random bytes: >>> >>> # time dd if=/dev/random of=/home/testfs/random1g bs=1024k count=1024 >>> 1024+0 records in >>> 1024+0 records out >>> 1073741824 bytes transferred in 16.116569 secs (66623474 bytes/sec) >>> 0.000u 13.214s 0:16.11 81.9% 25+2750k 0+1io 0pf+0w >>> >>> # time dd if=/home/testfs/random1g of=/dev/null bs=1024k >>> 1024+0 records in >>> 1024+0 records out >>> 1073741824 bytes transferred in 0.207115 secs (5184280044 bytes/sec) >>> 0.000u 0.208s 0:00.20 100.0% 26+2808k 0+0io 0pf+0w >>> >>> # time dd if=/home/testfs/random1g of=/dev/null bs=16k >>> 65536+0 records in >>> 65536+0 records out >>> 1073741824 bytes transferred in 0.432518 secs (2482536705 bytes/sec) >>> 0.023u 0.409s 0:00.43 97.6% 26+2828k 0+0io 0pf+0w >>> >>> # time cat /home/testfs/random1g > /dev/null >>> 0.031u 1.053s 0:01.08 100.0% 15+2770k 0+0io 0pf+0w >>> >>> --- compression on: >>> >>> # zfs create -o atime=off -o recordsize=128k -o compression=lzjb -o >>> mountpoint=/home/testfs home/testfs >>> >>> --- file filled with zeroes: >>> >>> # time dd if=/dev/zero of=/home/testfs/zero1g bs=1024k count=1024 >>> 1024+0 records in >>> 1024+0 records out >>> 1073741824 bytes transferred in 1.007765 secs (1065468404 bytes/sec) >>> 0.000u 0.458s 0:01.01 44.5% 26+2880k 1+1io 0pf+0w >>> >>> # time dd if=/home/testfs/zero1g of=/dev/null bs=1024k >>> 1024+0 records in >>> 1024+0 records out >>> 1073741824 bytes transferred in 0.630737 secs (1702360431 bytes/sec) >>> 0.000u 0.630s 0:00.63 100.0% 25+2742k 0+0io 0pf+0w >>> >>> # time 
dd if=/home/testfs/zero1g of=/dev/null bs=16k >>> 65536+0 records in >>> 65536+0 records out >>> 1073741824 bytes transferred in 4.089175 secs (262581530 bytes/sec) >>> 0.015u 4.036s 0:04.09 98.7% 25+2758k 0+0io 0pf+0w >>> >>> # time cat /home/testfs/zero1g > /dev/null >>> 0.031u 15.863s 0:15.95 99.6% 15+2754k 0+0io 0pf+0w >>> ^^^^^^^ >>> >>> --- it appears recordsize has a huge effect on this (recordsize=32k): >>> >>> # zfs create -o atime=off -o recordsize=32k -o compression=off -o >>> mountpoint=/home/testfs home/testfs >>> >>> # time truncate -s 1G testfs/trunc1g >>> 0.000u 0.000s 0:00.01 0.0% 0+0k 1+0io 0pf+0w >>> >>> # time cat /home/testfs/trunc1g > /dev/null >>> 0.047u 5.842s 0:05.93 99.1% 15+2761k 0+0io 0pf+0w >>> ^^^^^^ >>> >>> --- recordsize=4k: >>> >>> # zfs create -o atime=off -o recordsize=4k -o compression=off -o >>> mountpoint=/home/testfs home/testfs >>> >>> # time truncate -s 1G testfs/trunc1g >>> 0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w >>> >>> # time cat /home/testfs/trunc1g > /dev/null >>> 0.047u 1.441s 0:01.52 97.3% 15+2768k 0+0io 0pf+0w >>> ^^^^^^ >> >> Take a look at src/bin/cat/cat.c, function raw_cat(). >> >> Therein lies the answer. >> >> TL;DR -- cat != dd. > > I wanted to follow up on this, because I received an off-list private > flame basically telling me to shut the fuck up.

Indeed. I wasn't expecting an answer like that (TL;DR) from freebsd-fs. And it angered me quite a lot, since answers like that will just scare away those with actual knowledge of the internals.

> Compression has nothing to do with this. recordsize plays a role for > what should become obvious reasons -- keep reading.

It has quite a lot to do with it, if you understand how ZFS handles compression and all-zero records. Without compression, file data is stored as is - if the file was written with all zeroes, then those zeroes end up written to the disks. With compression there is an additional step (separate from the compression itself): if a recordsize-sized block of all zeroes is written, that block remains unallocated, with no data stored on disk - there is no L0 record for that block. This produces the same behavior as if the file had been extended with ftruncate.

> Again: cat is not dd. cat has its own set of logic for how it > calculates the "optimal block size" to use when calling read(2). dd, > operates differently, and lets you set the blocksize using bs.

I've already answered you off the list - cat was used to show a "real life" example of how bad the slowdown really is. Most software in the wild won't be using large buffer sizes. I've given you another example (again sent in private, but worth repeating on the list below) where this issue is clearly visible:

# time md5 testfs/zero1g
MD5 (testfs/zero1g) = cd573cfaace07e7949bc0c46028904ff
2.495u 3.384s 0:05.88 99.8% 15+167k 0+0io 0pf+0w

# time md5 testfs/trunc1g
MD5 (testfs/trunc1g) = cd573cfaace07e7949bc0c46028904ff
2.711u 65.452s 1:08.55 99.4% 15+167k 0+0io 0pf+0w

The difference is 3 seconds versus 65 seconds. Clearly there is room for improvement - and most of the extra time is spent in the kernel.

> My conclusion is that ZFS handles sparse/truncated files very > differently than UFS. Those familiar with the internals of ZFS can > probably explain this dilemma.

And finally we are back to the root of the problem. Is this a FreeBSD-only issue, or should this be brought upstream to illumos? All my systems are FreeBSD.
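For anyone who wants to reproduce this without wading through the transcripts above, the whole test boils down to roughly the following sketch (dataset names as in the original report; "home" is assumed to be an existing pool, so adjust paths to your system):

# Create a test dataset with compression off.
zfs create -o atime=off -o recordsize=128k -o compression=off -o mountpoint=/home/testfs home/testfs

# A file that is all holes: no L0 blocks are ever allocated.
truncate -s 1G /home/testfs/trunc1g

# A file of the same size with real data behind it.
dd if=/dev/random of=/home/testfs/random1g bs=1024k count=1024

# du confirms the truncated file has (almost) no blocks allocated:
du -sh /home/testfs/trunc1g /home/testfs/random1g

# The gap shows up with small read sizes; compare the two timings.
time dd if=/home/testfs/trunc1g of=/dev/null bs=16k
time dd if=/home/testfs/random1g of=/dev/null bs=16k

On the affected system the hole-backed file reads far more slowly and burns CPU in the kernel doing it (the bzero() path under dmu_buf_hold_array_by_dnode(), per the profile in the next message).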
From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 07:03:43 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id AE3E7123 for ; Fri, 12 Apr 2013 07:03:43 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 056801333 for ; Fri, 12 Apr 2013 07:03:42 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA23251; Fri, 12 Apr 2013 10:03:37 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1UQY1J-0009cx-5d; Fri, 12 Apr 2013 10:03:37 +0300 Message-ID: <5167B1C5.8020402@FreeBSD.org> Date: Fri, 12 Apr 2013 10:03:33 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130405 Thunderbird/17.0.5 MIME-Version: 1.0 To: Adam Nowacki Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> In-Reply-To: <5166EA43.7050700@platinum.linux.pl> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 07:03:43 -0000

ENOTIME to really investigate, but here is a basic profile result for those interested:

kernel`bzero+0xa
kernel`dmu_buf_hold_array_by_dnode+0x1cf
kernel`dmu_read_uio+0x66
kernel`zfs_freebsd_read+0x3c0
kernel`VOP_READ_APV+0x92
kernel`vn_read+0x1a3
kernel`vn_io_fault+0x23a
kernel`dofileread+0x7b
kernel`sys_read+0x9e
kernel`amd64_syscall+0x238
kernel`0xffffffff80747e4b

That's where > 99% of time is spent.

-- Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 09:44:52 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 412841B5 for ; Fri, 12 Apr 2013 09:44:52 +0000 (UTC) (envelope-from marck@rinet.ru) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) by mx1.freebsd.org (Postfix) with ESMTP id C46BB1C0F for ; Fri, 12 Apr 2013 09:44:51 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r3C9ilG1007558; Fri, 12 Apr 2013 13:44:47 +0400 (MSK) (envelope-from marck@rinet.ru) Date: Fri, 12 Apr 2013 13:44:47 +0400 (MSK) From: Dmitry Morozovsky To: Charles Sprickman Subject: Re: ZFS-only server and dedicated ZIL In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 09:44:52 -0000 On Thu, 11 Apr 2013, Charles Sprickman wrote: > On Apr 10, 2013, at 9:23 AM, Dmitry Morozovsky wrote: > > Dear colleagues, > > > > I'm planning to make a new PostgreSQL server using raid10-like ZFS with two SSDs > > split into mirrored ZIL and striped l2arc.
>
> This might seem like an odd suggestion, but if you're putting the pool on
> SSDs (is that correct?), I'd totally skip the separate ZIL device and L2ARC.
> I think you'll find the SSDs will need zero help from another log device and
> L2ARC is probably just not that helpful for DB loads.

No, this will be 8*SAS in 4 mirrored pairs + 2*SSD for mirrored ZIL and striped l2arc. Like the following (this is from another machine, but similar in setup):

        NAME        STATE     READ WRITE CKSUM
        pn          ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da0     ONLINE       0     0     0
            da6     ONLINE       0     0     0
          mirror-2  ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da7     ONLINE       0     0     0
        logs
          mirror-1  ONLINE       0     0     0
            da2d    ONLINE       0     0     0
            da3d    ONLINE       0     0     0
        cache
          da2e      ONLINE       0     0     0
          da3e      ONLINE       0     0     0

-- Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN] [ FreeBSD committer: marck@FreeBSD.org ] ------------------------------------------------------------------------ *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru *** ------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 14:30:09 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D05F3193 for ; Fri, 12 Apr 2013 14:30:09 +0000 (UTC) (envelope-from paulz@vanderzwan.org) Received: from cpsmtpb-ews10.kpnxchange.com (cpsmtpb-ews10.kpnxchange.com [213.75.39.15]) by mx1.freebsd.org (Postfix) with ESMTP id 52646AB0 for ; Fri, 12 Apr 2013 14:30:09 +0000 (UTC) Received: from cpsps-ews15.kpnxchange.com ([10.94.84.182]) by cpsmtpb-ews10.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Fri, 12 Apr 2013 16:28:58 +0200 Received: from CPSMTPM-TLF101.kpnxchange.com ([195.121.3.4]) by cpsps-ews15.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Fri, 12 Apr 2013 16:28:58 +0200 Received: from mailvm.vanderzwan.org ([77.172.189.82]) by CPSMTPM-TLF101.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Fri, 12 Apr 2013 16:28:57 +0200 Received: from [IPv6:2001:1af8:fefb::12dd:b1ff:feb3:1119] ([IPv6:2001:1af8:fefb:0:12dd:b1ff:feb3:1119]) (authenticated bits=0) by mailvm.vanderzwan.org (8.14.6/8.14.6) with ESMTP id r3CESqUK004519 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO) for ; Fri, 12 Apr 2013 16:28:57 +0200 (CEST) (envelope-from paulz@vanderzwan.org) From: Paul van der Zwan Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: FreeBSD 9.1 NFSv4 client attribute cache not caching ? Message-Id: <15B91473-99F4-4B48-BC18-D47B3037E8DF@vanderzwan.org> Date: Fri, 12 Apr 2013 16:28:52 +0200 To: freebsd-fs@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) X-Mailer: Apple Mail (2.1503) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.3.9 (mailvm.vanderzwan.org [IPv6:2001:1af8:fefb::25]); Fri, 12 Apr 2013 16:28:57 +0200 (CEST) X-OriginalArrivalTime: 12 Apr 2013 14:28:57.0974 (UTC) FILETIME=[118C3960:01CE378A] X-RcptDomain: freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 14:30:09 -0000

I am running a few VirtualBox VMs with 9.1 on my OpenIndiana server, and I noticed that make buildworld seems to take much longer when the clients mount /usr/src and /usr/obj over NFS V4 than when they use V3. Unfortunately I have to use V4, as a buildworld on V3 hangs the server completely...

I noticed the number of PUTFH/GETATTR/GETFH calls is in the order of a few thousand per second, and if I snoop the traffic I see the same filenames appear over and over again. It looks like the client is not caching anything at all and is issuing a server request every time. I use the default mount options:

192.168.178.24:/data/ports on /usr/ports (nfs, nfsv4acls)
192.168.178.24:/data/src on /usr/src (nfs, nfsv4acls)
192.168.178.24:/data/obj on /usr/obj (nfs, nfsv4acls)

Any ideas, Paul

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 15:59:47 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DE859B46 for ; Fri, 12 Apr 2013 15:59:47 +0000 (UTC) (envelope-from toasty@dragondata.com) Received: from mail-ia0-x22e.google.com (mail-ia0-x22e.google.com [IPv6:2607:f8b0:4001:c02::22e]) by mx1.freebsd.org (Postfix) with ESMTP id AB690FD7 for ; Fri, 12 Apr 2013 15:59:47 +0000 (UTC) Received: by mail-ia0-f174.google.com with SMTP id r13so2502192iar.5 for ; Fri, 12 Apr 2013 08:59:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dragondata.com; s=google; h=x-received:content-type:mime-version:subject:from:in-reply-to:date :cc:content-transfer-encoding:message-id:references:to:x-mailer; bh=+tAF/14YlcVF1kxnDsVk6tFbGRdWqxL8l6nkNoDhqss=; b=Mk7zpjHGEAoOslt0atC1a4j5RFBo63S069cBMSBXFuImUkDnB0heZcrRxrMr2WZnn0 BQ9VWoAvVmtAvliAvt2kmVjJfnC8lMxB9S05deGASzZ/2dwFtSelBqPrGaG2TrOl7PqC Ru0rErdmHdMcEbjXf3bqDxm0pPOgvIjvXhhEY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:content-type:mime-version:subject:from:in-reply-to:date :cc:content-transfer-encoding:message-id:references:to:x-mailer :x-gm-message-state; bh=+tAF/14YlcVF1kxnDsVk6tFbGRdWqxL8l6nkNoDhqss=; b=hjGWgL1/cR3QjmGAB4slBVO+JL6p6c/+dhdBPv24MJZQ8Ehj0oG6RksHNy4Hs50fkp pJCyOwd+895Us/3EE14WdkskDPTWefehyjUqSOziu4btPr9L4ZhA8nE4K4wnWZl8GI2d RhYmhmo8MbZAjHJ+RyInA5tsIpzl2ROwpizNG7fIu8HktlXIRSarYrRnCMHUvHmhMGB3 LSbgIdRZoohbVW44lZ5MUFOM/l9OFqGHK63zYe0TMK8jQdlJoJA/F0l7cuFwlqn/aZx3 b0J4qOLU9k89Y4wkb8ulQM/6Zu0IvJqHG+3yiw403dPIKSvjsFHFwG7M6Zt6Vei16Dt7 lnXg== X-Received: by 10.50.154.72 with SMTP id vm8mr1563072igb.1.1365782387248; Fri, 12 Apr 2013 08:59:47 -0700 (PDT) Received: from vpn132.rw1.your.org (vpn132.rw1.your.org. [204.9.51.132]) by mx.google.com with ESMTPS id in10sm3494076igc.1.2013.04.12.08.59.44 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 12 Apr 2013 08:59:45 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) Subject: Re: Does sync(8) really flush everything? Lost writes with journaled SU after sync+power cycle From: Kevin Day In-Reply-To: <20130411160253.V1041@besplex.bde.org> Date: Fri, 12 Apr 2013 10:59:42 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: References: <87CC14D8-7DC6-481A-8F85-46629F6D2249@dragondata.com> <20130411160253.V1041@besplex.bde.org> To: Bruce Evans X-Mailer: Apple Mail (2.1503) X-Gm-Message-State: ALoCoQnx2wu0umS5Kd+hZHA2TU4rVAEZslfdRYTx5RDtUFql0CKyfIWxqN3KiNqrO+bhn2c0AF0o Cc: "freebsd-fs@FreeBSD.org Filesystems" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 15:59:47 -0000

On Apr 11, 2013, at 1:30 AM, Bruce Evans wrote:
>
> sync(2) only schedules all writing of all modified buffers to disk. Its
> man page even says this. It doesn't wait for any of the writes to complete.
> Its man page says that this is a BUG, but it is intentional and sync() has
> always done this. There is no way for sync() to guarantee that all modified
> buffers have been written to disk when it returns, since even if it waited,
> buffers might be modified while it is returning. Perhaps even ones that
> would take 8 seconds to complete can be written in the few nanoseconds that
> it takes to return.
>
> sync(8) is just a wrapper around sync(2). One that doesn't even check
> for errors. Not that it could handle sync() failure. Its man page
> bogusly first claims that it "forces completion". This is not
> completely wrong, since it doesn't claim that the completion occurs
> before sync(8) exits. But then it claims that sync(8) is suitable "to
> ensure that all disk writes have been completed in a way not suitably
> done by reboot(8) or halt(8)". This wording is poor, unless it is
> intentionally weaselishly worded so that it doesn't actually claim
> full completion.

And on the flip side, the man page for syncer says:

    It is possible on some systems that a sync(2) occurring simultaneously
    with a crash may cause file system damage. See fsck(8).

> It only claims more suitable completion than with
> reboot or halt. Actually, completion is not guaranteed, and what
> sync(8) provides is just less unsuitable than what reboot and halt
> provide.
>
> To ensure completion, you have to freeze the file systems of interest
> before rebooting. I don't know of any ways to do this from userland
> except mount -u -o ro or unmount.
>
> There should be a syscall to cause syncing with waiting. The kernel
> has a wait option for syncing, but doesn't use it for sync(2). But
> using this would only reduce the races.
>
> Bruce

I understand that sync(8) returns immediately; I guess my confusion is that calling sync(8) doesn't seem to cause *any* writes to happen.

I can have the system completely idle (absolutely no processes running that could cause any filesystem activity), call sync(8), and watching gstat(8) see no write activity happen at all, even waiting 10+ seconds afterwards, whereas "mount -u -o ro -f /" causes an instant flurry of writes to happen. My understanding was that even though sync returned immediately, flushing would also start immediately, and leave the system in a safe point, at least until another write happens.

If sync(8) isn't starting the write flush immediately, but does return immediately, I'm having trouble figuring out a situation where calling sync would ever accomplish anything useful, other than possibly creating a window where the filesystem could be damaged if a crash happens.
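Concretely, the experiment looks like this on an idle test machine (a rough sketch; the disk name "ada0", the mount point "/mnt", and the scratch file are placeholders, so don't run it on a filesystem you care about):

# Dirty some buffers, then request a sync.
dd if=/dev/zero of=/mnt/scratch bs=64k count=256
sync

# Watch the disk for write activity; as described above you may see
# little or none for quite a while. Ctrl-C to quit.
gstat -f '^ada0' -I 1s

# Downgrading the mount to read-only really does flush and wait:
mount -u -o ro /mnt
mount -u -o rw /mnt

The mount -u route is the only userland "freeze" Bruce mentions, and it normally fails if any file on the filesystem is open for writing, which is why the -f flag appears in the test above.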
-- Kevin From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 21:11:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 395F323C for ; Fri, 12 Apr 2013 21:11:31 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from moh1-ve3.go2.pl (moh1-ve3.go2.pl [193.17.41.134]) by mx1.freebsd.org (Postfix) with ESMTP id F18C31DA3 for ; Fri, 12 Apr 2013 21:11:30 +0000 (UTC) Received: from moh1-ve3.go2.pl (unknown [10.0.0.134]) by moh1-ve3.go2.pl (Postfix) with ESMTP id 3CA55665C44 for ; Fri, 12 Apr 2013 23:11:30 +0200 (CEST) Received: from unknown (unknown [10.0.0.42]) by moh1-ve3.go2.pl (Postfix) with SMTP for ; Fri, 12 Apr 2013 23:11:30 +0200 (CEST) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id rntvYA; Fri, 12 Apr 2013 23:11:29 +0200 Message-ID: <51687881.4080005@o2.pl> Date: Fri, 12 Apr 2013 23:11:29 +0200 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130324 Thunderbird/17.0.4 MIME-Version: 1.0 To: Steven Hartland Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> In-Reply-To: <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 1, 35 X-O2-SPF: neutral Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 21:11:31 -0000 On 11/04/2013 23:03, Steven Hartland wrote: > > OS version? 
> PC-BSD 9.1 -- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 21:22:08 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DFA994B8 for ; Fri, 12 Apr 2013 21:22:08 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta12.emeryville.ca.mail.comcast.net (qmta12.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:227]) by mx1.freebsd.org (Postfix) with ESMTP id C55B81E14 for ; Fri, 12 Apr 2013 21:22:08 +0000 (UTC) Received: from omta03.emeryville.ca.mail.comcast.net ([76.96.30.27]) by qmta12.emeryville.ca.mail.comcast.net with comcast id P0Bi1l0020b6N64AC9N8Mr; Fri, 12 Apr 2013 21:22:08 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta03.emeryville.ca.mail.comcast.net with comcast id P9N71l00P1t3BNj8P9N7AX; Fri, 12 Apr 2013 21:22:08 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 5B0D573A33; Fri, 12 Apr 2013 14:22:07 -0700 (PDT) Date: Fri, 12 Apr 2013 14:22:07 -0700 From: Jeremy Chadwick To: Radio =?unknown-8bit?B?bcU/b2R5Y2ggYmFuZHl0w7N3?= Subject: Re: A failed drive causes system to hang Message-ID: <20130412212207.GA81897@icarus.home.lan> References: <51672164.1090908@o2.pl> <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> <51687881.4080005@o2.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51687881.4080005@o2.pl> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365801728; bh=cIXEyR76rbMz1qDM/s05j/C5U9BGhAepK/1PBUvyWsg=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=p9r2xl1QwvdrBWz/fEtrYQG8SMrYj5Tv+bn7/XC1Oljp2RUGc5ErXF9g+/Z9ugomC dvhNN0r+KmkoS3lWjFUlJ+NLxRhCJp8kd8zvA3W8dYzAJFLau4ihkVhVVg1lPFvVin xYsyU3rI0V6ls84PmQfUI5kUObNExbQ6krrrkRgAGR3NNdO9I44bK8aodAQLpegWrB irIk+XTLBUdblex7KCnpSfeGD73ctSbiipYdS7EdOS6xwHYv8wiPKxw5wlCPm0j1CW Ice+pD1C1QIzr7fYS8NsMOuOn55SWpb4kGPv8Z4eYitKciUD3Nm/9KFZshZftdawFb wDoDG02zii4Sg== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 21:22:08 -0000 On Fri, Apr 12, 2013 at 11:11:29PM +0200, Radio m?odych bandytw wrote: > > On 11/04/2013 23:03, Steven Hartland wrote: > > > >OS version? > > > > PC-BSD 9.1 While PC-BSD is based on FreeBSD, is there a reason you didn't go the support route with PC-BSD folks first? Or did they refer you to this mailing list? http://www.pcbsd.org/en/support/ -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 21:52:52 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 24AACA91 for ; Fri, 12 Apr 2013 21:52:52 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from moh2-ve1.go2.pl (moh2-ve1.go2.pl [193.17.41.186]) by mx1.freebsd.org (Postfix) with ESMTP id 95DE31F01 for ; Fri, 12 Apr 2013 21:52:51 +0000 (UTC) Received: from moh2-ve1.go2.pl (unknown [10.0.0.186]) by moh2-ve1.go2.pl (Postfix) with ESMTP id 103D044C9A8 for ; Fri, 12 Apr 2013 23:52:44 +0200 (CEST) Received: from unknown (unknown [10.0.0.108]) by moh2-ve1.go2.pl (Postfix) with SMTP for ; Fri, 12 Apr 2013 23:52:44 +0200 (CEST) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id rGzSIl; Fri, 12 Apr 2013 23:52:42 +0200 Message-ID: <5168821F.5020502@o2.pl> Date: Fri, 12 Apr 2013 23:52:31 +0200 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130324 Thunderbird/17.0.4 MIME-Version: 1.0 To: Jeremy Chadwick , Quartz Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> In-Reply-To: <20130411212408.GA60159@icarus.home.lan> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 1, 35 X-O2-SPF: neutral Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 21:52:52 -0000 On 11/04/2013 23:24, Jeremy Chadwick wrote: > On Thu, Apr 11, 2013 at 10:47:32PM +0200, Radio młodych bandytów wrote: >> Seeing a ZFS thread, I decided to write about a similar problem that >> I experience. >> I have a failing drive in my array. I need to RMA it, but don't have >> time and it fails rarely enough to be a yet another annoyance. >> The failure is simple: it fails to respond. >> When it happens, the only thing I found I can do is switch consoles. >> Any command fails, login fails, apps hang. >> >> On the 1st console I see a series of messages like: >> >> (ada0:ahcich0:0:0:0): CAM status: Command timeout >> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED >> >> I use RAIDZ1 and I'd expect that none single failure would cause the >> system to fail... > > You need to provide full output from "dmesg", and you need to define > what the word "fails" means (re: "any command fails", "login fails").

Fails = hangs. When trying to log in, I can type my user name, but after I press enter the prompt for the password never appears. As to dmesg, tough luck. I have 2 photos on my phone and their transcripts are all I can give until the problem reappears (which should take up to 2 weeks). Photos are blurry and in many cases I'm not sure what exactly is there.

Screen 1:

(ada0:ahcich0:0:0:0): FLUSHCACHE40. ACB: (ea?) 00 00 00 00 (cut?)
(ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 05 d3(cut) 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 7b(cut) 00
(ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 d0(cut) 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated

Screen 2:

ahcich0: Timeout on slot 29 port 0
ahcich0: (unreadable, lots of numbers, some text)
(aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
(aprobe0:ahcich0:0:0:0): CAM status: Command timeout
(aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
ahcich0: Timeout on slot 29 port 0
ahcich0: (unreadable, lots of numbers, some text)
(aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
(aprobe0:ahcich0:0:0:0): CAM status: Command timeout
(aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
ahcich0: Timeout on slot 30 port 0
ahcich0: (unreadable, lots of numbers, some text)
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)

Both are from the same event. In general, messages:

(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED.

are the most common.

I've waited for more than 1/2 hour once and the system didn't return to a working state; the messages kept flowing and pretty much nothing was working. What's interesting, I remember that it happened to me even when I was using an installer (the PC-BSD one), before the actual installation began, so the disk stored no program data. And I *think* there was no ZFS yet anyway.

> > I've already demonstrated that loss of a disk in raidz1 (or even 2 disks > in raidz2) does not cause ""the system to fail"" on stable/9. However, > if you lose enough members or vdevs to cause catastrophic failure, there > may be anomalies depending on how your system is set up: > > http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html > > If the pool has failmode=wait, any I/O to that pool will block (wait) > indefinitely. This is the default. > > If the pool has failmode=continue, existing write I/O operations will > fail with EIO (I/O error) (and hopefully applications/daemons will > handle that gracefully -- if not, that's their fault) but any subsequent > I/O (read or write) to that pool will block (wait) indefinitely. > > If the pool has failmode=panic, the kernel will immediately panic. > > If the CAM layer is what's wedged, that may be a different issue (and > not related to ZFS). I would suggest running stable/9 as many > improvements in this regard have been committed recently (some related > to CAM, others related to ZFS and its new "deadman" watcher).

Yeah, because of the installer failure, I don't think it's related to ZFS. Even if it is, for now I won't set any ZFS properties, in the hope that it repeats and I can get better data.

> > Bottom line: terse output of the problem does not help. Be verbose, > provide all output (commands you type, everything!), as well as any > physical actions you take. >

Yep. In fact, having little data was what made me hesitate to write about it; since I did already, I'll do my best to get more info, though for now I can only wait for a repetition.

On 12/04/2013 00:08, Quartz wrote: > >> Seeing a ZFS thread, I decided to write about a similar problem that I >> experience. > > I'm assuming you're referring to my "Failed pool causes system to hang" > thread.
I wonder if there's some common issue with zfs where it locks up > if it can't write to disks how it wants to. > > I'm not sure how similar your problem is to mine. What's your pool setup > look like? Redundancy options? Are you booting from a pool? I'd be > interested to know if you can just yank the cable to the drive and see > if the system recovers. > > You seem to be worse off than me- I can still login and run at least a > couple commands. I'm booting from a straight ufs drive though. > > ______________________________________ > it has a certain smooth-brained appeal > Like I said, I don't think it's ZFS-specific, but just in case...: RAIDZ1, root on ZFS. I should reduce severity of a pool loss before pulling cables, so no tests for now. -- Twoje radio From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 22:03:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 99171D2A for ; Fri, 12 Apr 2013 22:03:51 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:16]) by mx1.freebsd.org (Postfix) with ESMTP id 7D9F61F83 for ; Fri, 12 Apr 2013 22:03:51 +0000 (UTC) Received: from omta04.emeryville.ca.mail.comcast.net ([76.96.30.35]) by qmta01.emeryville.ca.mail.comcast.net with comcast id NzsR1l00C0lTkoCA1A3rT0; Fri, 12 Apr 2013 22:03:51 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta04.emeryville.ca.mail.comcast.net with comcast id PA3q1l00L1t3BNj8QA3qvQ; Fri, 12 Apr 2013 22:03:50 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 5A40C73A33; Fri, 12 Apr 2013 15:03:50 -0700 (PDT) Date: Fri, 12 Apr 2013 15:03:50 -0700 From: Jeremy Chadwick To: Radio =?unknown-8bit?B?bcU/b2R5Y2ggYmFuZHl0w7N3?= Subject: Re: A failed drive causes system to hang Message-ID: <20130412220350.GA82467@icarus.home.lan> References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5168821F.5020502@o2.pl> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365804231; bh=TL2F0//uUgilTub7gstIjuKD49GAHhekt+OfD5On0I4=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=NGpTahXGwnOpVjIQ+mkwFlgccFttQq/c56By/GK18mvQG7KA/5dgpfsufgZ4PYTu6 PLYCnLVYNIfqIWyK2I4Egsm+ayKCAiAGaQPbr2bIbGuQkmM89JeGuGq1z0ezn/rvnT ZDw4ydb1c/m4h5b/MeNS2uEotGlBfaf6fwVTAqT2zJvpVWktHZisJWK4m3t7g1Xxn0 mUZRuzuaal+c7cUm8gSiHhG/ZaDJzvkjFkJBDHXhTsRZUJ1w0fGzGNbIrpgWRvErjS vdOli56ExUQ86dnJh128vWmj7V0WZvBXo6dqiftu8o4Gvcdv1exbG6bmdbiWlJiGMx vYtM8eCY22FLA== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 22:03:51 -0000 On Fri, Apr 12, 2013 at 11:52:31PM +0200, Radio m?odych bandytw wrote: > On 11/04/2013 23:24, Jeremy Chadwick wrote: > >On Thu, Apr 11, 2013 at 10:47:32PM +0200, Radio m?odych bandytw wrote: > >>Seeing a ZFS thread, I decided to write about a similar problem that > >>I experience. > >>I have a failing drive in my array. I need to RMA it, but don't have > >>time and it fails rarely enough to be a yet another annoyance. > >>The failure is simple: it fails to respond. 
> >>When it happens, the only thing I found I can do is switch consoles. > >>Any command fails, login fails, apps hang. > >> > >>On the 1st console I see a series of messages like: > >> > >>(ada0:ahcich0:0:0:0): CAM status: Command timeout > >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED > >> > >>I use RAIDZ1 and I'd expect that none single failure would cause the > >>system to fail... > > > >You need to provide full output from "dmesg", and you need to define > >what the word "fails" means (re: "any command fails", "login fails"). > Fails = hangs. When trying to log it, I can type my user name, but > after I press enter the prompt for password never appear. > As to dmesg, tough luck. I have 2 photos on my phone and their > transcripts are all I can give until the problem reappears (which > should take up to 2 weeks). Photos are blurry and in many cases I'm > not sure what exactly is there. > > Screen1: > (ada0:ahcich0:0:0:0): FLUSHCACHE40. ACB: (ea?) 00 00 00 00 (cut?) > (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut) > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 05 d3(cut) > 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 7b(cut) > 00 > (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut) > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 d0(cut) > 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > > > Screen 2: > ahcich0: Timeout on slot 29 port 0 > ahcich0: (unreadable, lots of numbers, some text) > (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut) > (aprobe0:ahcich0:0:0:0): CAM status: Command timeout > (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked > ahcich0: Timeout on slot 29 port 0 > ahcich0: (unreadable, lots of numbers, some text) > (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut) > (aprobe0:ahcich0:0:0:0): CAM status: Command timeout > (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked > ahcich0: Timeout on slot 30 port 0 > ahcich0: (unreadable, lots of numbers, some text) > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut) > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut) > > Both are from the same event. In general, messages: > > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. > > are the most common. > > I've waited for more than 1/2 hour once and the system didn't return > to a working state, the messages kept flowing and pretty much > nothing was working. What's interesting, I remember that it happened > to me even when I was using an installer (PC-BSD one), before the > actual installation began, so the disk stored no program data. And I > *think* there was no ZFS yet anyway. > > > > >I've already demonstrated that loss of a disk in raidz1 (or even 2 disks > >in raidz2) does not cause ""the system to fail"" on stable/9. 
However, > >if you lose enough members or vdevs to cause catastrophic failure, there > >may be anomalies depending on how your system is set up: > > > >http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html > > > >If the pool has failmode=wait, any I/O to that pool will block (wait) > >indefinitely. This is the default. > > > >If the pool has failmode=continue, existing write I/O operations will > >fail with EIO (I/O error) (and hopefully applications/daemons will > >handle that gracefully -- if not, that's their fault) but any subsequent > >I/O (read or write) to that pool will block (wait) indefinitely. > > > >If the pool has failmode=panic, the kernel will immediately panic. > > > >If the CAM layer is what's wedged, that may be a different issue (and > >not related to ZFS). I would suggest running stable/9 as many > >improvements in this regard have been committed recently (some related > >to CAM, others related to ZFS and its new "deadman" watcher). > > Yeah, because of the installer failure, I don't think it's related to ZFS. > Even if it is, for now I won't set any ZFS properties in hope it > repeats and I can get better data. > > > >Bottom line: terse output of the problem does not help. Be verbose, > >provide all output (commands you type, everything!), as well as any > >physical actions you take. > > > Yep. In fact having little data was what made me hesitate to write > about it; since I did already, I'll do my best to get more info, > though for now I can only wait for a repetition. > > > On 12/04/2013 00:08, Quartz wrote:> > >> Seeing a ZFS thread, I decided to write about a similar problem that I > >> experience. > > > > I'm assuming you're referring to my "Failed pool causes system to hang" > > thread. I wonder if there's some common issue with zfs where it locks up > > if it can't write to disks how it wants to. > > > > I'm not sure how similar your problem is to mine. What's your pool setup > > look like? Redundancy options? Are you booting from a pool? I'd be > > interested to know if you can just yank the cable to the drive and see > > if the system recovers. > > > > You seem to be worse off than me- I can still login and run at least a > > couple commands. I'm booting from a straight ufs drive though. > > > > ______________________________________ > > it has a certain smooth-brained appeal > > > Like I said, I don't think it's ZFS-specific, but just in case...: > RAIDZ1, root on ZFS. I should reduce severity of a pool loss before > pulling cables, so no tests for now. Key points: 1. We now know why "commands hang" and anything I/O-related blocks (waits) for you: because your root filesystem is ZFS. If the ZFS layer is waiting on CAM, and CAM is waiting on your hardware, then those I/O requests are going to block indefinitely. So now you know the answer to why that happens. 2. I agree that the problem is not likely in ZFS, but rather either with CAM, the AHCI implementation used, or hardware (either disk or storage controller). 3. Your lack of "dmesg" is going to make this virtually impossible to solve. We really, ***really*** need that. I cannot stress this enough. This will tell us a lot of information about your system. We're also going to need to see "zpool status" output, as well as "zpool get all" and "zfs get all". "pciconf -lvbc" would also be useful. 
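For concreteness, the collection step can be scripted; a rough sketch, assuming the flash drive ends up mounted at /mnt/usb (a placeholder path) and that you run it as root:

#!/bin/sh
# Dump the diagnostics requested above onto a mounted USB stick.
OUT=/mnt/usb
dmesg          > $OUT/dmesg.txt
zpool status   > $OUT/zpool-status.txt
zpool get all  > $OUT/zpool-get-all.txt   # add your pool name if your zpool version requires one
zfs get all    > $OUT/zfs-get-all.txt     # lists properties for every dataset
pciconf -lvbc  > $OUT/pciconf.txt

Run it once the machine is back in a sane state (or from single-user mode, as suggested below), then attach the resulting files to the list mail.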
There are some known "gotchas" with certain models of hard disks or AHCI controllers (which is responsible is unknown at this time), but I don't want to start jumping to conclusions until full details can be provided first. I would recommend formatting a USB flash drive as FAT/FAT32, booting into single-user mode, then mounting the USB flash drive and issuing the above commands + writing the output to files on the flash drive, then provide those here. We really need this information. 4. Please involve the PC-BSD folks in this discussion. They need to be made aware of issues like this so they (and iXSystems, potentially) can investigate from their side. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 22:10:21 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4BB8718F for ; Fri, 12 Apr 2013 22:10:21 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay03.pair.com (relay03.pair.com [209.68.5.17]) by mx1.freebsd.org (Postfix) with SMTP id E38841FCF for ; Fri, 12 Apr 2013 22:10:20 +0000 (UTC) Received: (qmail 85085 invoked by uid 0); 12 Apr 2013 22:10:18 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay03.pair.com with SMTP; 12 Apr 2013 22:10:18 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <5168864A.2090602@sneakertech.com> Date: Fri, 12 Apr 2013 18:10:18 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> <51687881.4080005@o2.pl> <20130412212207.GA81897@icarus.home.lan> In-Reply-To: <20130412212207.GA81897@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 22:10:21 -0000 > While PC-BSD is based on FreeBSD, Wait, isn't pcbsd exactly the same as vanilla freebsd except that kde/bash/trueos/etc are pre-installed? Do they have their own version of zfs or low level library differences or something? 
______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 22:20:06 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7258B3E2 for ; Fri, 12 Apr 2013 22:20:06 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta13.emeryville.ca.mail.comcast.net (qmta13.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:243]) by mx1.freebsd.org (Postfix) with ESMTP id 56D1973 for ; Fri, 12 Apr 2013 22:20:06 +0000 (UTC) Received: from omta02.emeryville.ca.mail.comcast.net ([76.96.30.19]) by qmta13.emeryville.ca.mail.comcast.net with comcast id P3Vb1l0040QkzPwADAL6BF; Fri, 12 Apr 2013 22:20:06 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta02.emeryville.ca.mail.comcast.net with comcast id PAL51l00J1t3BNj8NAL5cr; Fri, 12 Apr 2013 22:20:05 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 437A173A33; Fri, 12 Apr 2013 15:20:05 -0700 (PDT) Date: Fri, 12 Apr 2013 15:20:05 -0700 From: Jeremy Chadwick To: freebsd-fs@freebsd.org Subject: Re: A failed drive causes system to hang Message-ID: <20130412222005.GA82884@icarus.home.lan> References: <51672164.1090908@o2.pl> <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> <51687881.4080005@o2.pl> <20130412212207.GA81897@icarus.home.lan> <5168864A.2090602@sneakertech.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5168864A.2090602@sneakertech.com> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365805206; bh=o5qa+F35UcOm/TqF24SpdK6oW/P9DaSsqWLd5K9syeE=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=XUvkZnn9Oevc5jlgGFFD+XQclnSIMcbpoanVStjC2Q8+JduEurjO3aSiHSqIboWyD aIoJH+bQrxY3ZTtsM5tv6w8ZMYHCm6A2EkNjdbNahTW7xhdgPaNm6r/5/RMvO/w6RJ 4foSPV4z0gXNUQw+Ibwa2lRyrFHdtsIGW38gRM1BHFR2wSGGZPywVAs5h3IT177yqU QvfZjvekEXaOiq6CZzTklWZ8zC5Vbf6SK6TVSBQk1eVR3A1n/l5YnfD/kQi9o3vZnZ t2/Ad0M18Vy7/lDI7mX7QO+PT9LdBir7h2+8OGTjaVwt/lxewYFY5sCgl1Iukq1wA+ wwByZ390DAoaQ== X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 22:20:06 -0000 On Fri, Apr 12, 2013 at 06:10:18PM -0400, Quartz wrote: > > >While PC-BSD is based on FreeBSD, > > Wait, isn't pcbsd exactly the same as vanilla freebsd except that > kde/bash/trueos/etc are pre-installed? Do they have their own > version of zfs or low level library differences or something? Off-topic, so I will not be responding past this point: Let's say I have a router maintained/made by XYZ company that runs Linux. I have problems with this device, but am unsure what the reason is or what the cause is. Do I mail the LKML about the issue or do I go to the vendor? The correct answer is: you start by going to the vendor. Alternately one can involve both (ex. vendor and CC the LKML). The fact that it's a "purchased piece of hardware" has no bearing -- the methodology of support inquiries is the same. My point is that the PC-BSD folks need to be made aware of this issue, so their iXSystems folks can help out if necessary and so on. If they're left out of the loop, then that's bad for everyone. 
-- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 22:33:29 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D1F1A5AF for ; Fri, 12 Apr 2013 22:33:29 +0000 (UTC) (envelope-from radiomlodychbandytow@o2.pl) Received: from moh2-ve2.go2.pl (moh2-ve2.go2.pl [193.17.41.200]) by mx1.freebsd.org (Postfix) with ESMTP id 1F49CED for ; Fri, 12 Apr 2013 22:33:27 +0000 (UTC) Received: from moh2-ve2.go2.pl (unknown [10.0.0.200]) by moh2-ve2.go2.pl (Postfix) with ESMTP id 06AE3B00BE4 for ; Sat, 13 Apr 2013 00:33:22 +0200 (CEST) Received: from unknown (unknown [10.0.0.74]) by moh2-ve2.go2.pl (Postfix) with SMTP for ; Sat, 13 Apr 2013 00:33:22 +0200 (CEST) Received: from unknown [93.175.66.185] by poczta.o2.pl with ESMTP id hrKWAz; Sat, 13 Apr 2013 00:33:21 +0200 Message-ID: <51688BA6.1000507@o2.pl> Date: Sat, 13 Apr 2013 00:33:10 +0200 From: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130324 Thunderbird/17.0.4 MIME-Version: 1.0 To: Jeremy Chadwick Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> In-Reply-To: <20130412220350.GA82467@icarus.home.lan> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-O2-Trust: 1, 36 X-O2-SPF: neutral Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 22:33:29 -0000 On 13/04/2013 00:03, Jeremy Chadwick wrote: > On Fri, Apr 12, 2013 at 11:52:31PM +0200, Radio m?odych bandytw wrote: >> On 11/04/2013 23:24, Jeremy Chadwick wrote: >>> On Thu, Apr 11, 2013 at 10:47:32PM +0200, Radio m?odych bandytw wrote: >>>> Seeing a ZFS thread, I decided to write about a similar problem that >>>> I experience. >>>> I have a failing drive in my array. I need to RMA it, but don't have >>>> time and it fails rarely enough to be a yet another annoyance. >>>> The failure is simple: it fails to respond. >>>> When it happens, the only thing I found I can do is switch consoles. >>>> Any command fails, login fails, apps hang. >>>> >>>> On the 1st console I see a series of messages like: >>>> >>>> (ada0:ahcich0:0:0:0): CAM status: Command timeout >>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED >>>> >>>> I use RAIDZ1 and I'd expect that none single failure would cause the >>>> system to fail... >>> >>> You need to provide full output from "dmesg", and you need to define >>> what the word "fails" means (re: "any command fails", "login fails"). >> Fails = hangs. When trying to log it, I can type my user name, but >> after I press enter the prompt for password never appear. >> As to dmesg, tough luck. I have 2 photos on my phone and their >> transcripts are all I can give until the problem reappears (which >> should take up to 2 weeks). Photos are blurry and in many cases I'm >> not sure what exactly is there. >> >> Screen1: >> (ada0:ahcich0:0:0:0): FLUSHCACHE40. ACB: (ea?) 00 00 00 00 (cut?) 
>> (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut) >> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 05 d3(cut) >> 00 >> (ada0:ahcich0:0:0:0): CAM status: Command timeout >> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 7b(cut) >> 00 >> (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut) >> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 d0(cut) >> 00 >> (ada0:ahcich0:0:0:0): CAM status: Command timeout >> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >> >> >> Screen 2: >> ahcich0: Timeout on slot 29 port 0 >> ahcich0: (unreadable, lots of numbers, some text) >> (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut) >> (aprobe0:ahcich0:0:0:0): CAM status: Command timeout >> (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked >> ahcich0: Timeout on slot 29 port 0 >> ahcich0: (unreadable, lots of numbers, some text) >> (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut) >> (aprobe0:ahcich0:0:0:0): CAM status: Command timeout >> (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked >> ahcich0: Timeout on slot 30 port 0 >> ahcich0: (unreadable, lots of numbers, some text) >> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut) >> (ada0:ahcich0:0:0:0): CAM status: Command timeout >> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut) >> >> Both are from the same event. In general, messages: >> >> (ada0:ahcich0:0:0:0): CAM status: Command timeout >> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated >> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. >> >> are the most common. >> >> I've waited for more than 1/2 hour once and the system didn't return >> to a working state, the messages kept flowing and pretty much >> nothing was working. What's interesting, I remember that it happened >> to me even when I was using an installer (PC-BSD one), before the >> actual installation began, so the disk stored no program data. And I >> *think* there was no ZFS yet anyway. >> >>> >>> I've already demonstrated that loss of a disk in raidz1 (or even 2 disks >>> in raidz2) does not cause ""the system to fail"" on stable/9. However, >>> if you lose enough members or vdevs to cause catastrophic failure, there >>> may be anomalies depending on how your system is set up: >>> >>> http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html >>> >>> If the pool has failmode=wait, any I/O to that pool will block (wait) >>> indefinitely. This is the default. >>> >>> If the pool has failmode=continue, existing write I/O operations will >>> fail with EIO (I/O error) (and hopefully applications/daemons will >>> handle that gracefully -- if not, that's their fault) but any subsequent >>> I/O (read or write) to that pool will block (wait) indefinitely. >>> >>> If the pool has failmode=panic, the kernel will immediately panic. >>> >>> If the CAM layer is what's wedged, that may be a different issue (and >>> not related to ZFS). I would suggest running stable/9 as many >>> improvements in this regard have been committed recently (some related >>> to CAM, others related to ZFS and its new "deadman" watcher). >> >> Yeah, because of the installer failure, I don't think it's related to ZFS. >> Even if it is, for now I won't set any ZFS properties in hope it >> repeats and I can get better data. 
>>>
>>> Bottom line: terse output of the problem does not help. Be verbose,
>>> provide all output (commands you type, everything!), as well as any
>>> physical actions you take.
>>>
>> Yep. In fact, having little data was what made me hesitate to write
>> about it; since I did already, I'll do my best to get more info,
>> though for now I can only wait for a repetition.
>>
>>
>> On 12/04/2013 00:08, Quartz wrote:
>>>> Seeing a ZFS thread, I decided to write about a similar problem
>>>> that I'm experiencing.
>>>
>>> I'm assuming you're referring to my "Failed pool causes system to
>>> hang" thread. I wonder if there's some common issue with zfs where
>>> it locks up if it can't write to disks how it wants to.
>>>
>>> I'm not sure how similar your problem is to mine. What's your pool
>>> setup look like? Redundancy options? Are you booting from a pool?
>>> I'd be interested to know if you can just yank the cable to the
>>> drive and see if the system recovers.
>>>
>>> You seem to be worse off than me -- I can still log in and run at
>>> least a couple of commands. I'm booting from a straight ufs drive,
>>> though.
>>>
>>> ______________________________________
>>> it has a certain smooth-brained appeal
>>>
>> Like I said, I don't think it's ZFS-specific, but just in case:
>> RAIDZ1, root on ZFS. I should reduce the severity of a pool loss
>> before pulling cables, so no tests for now.
>
> Key points:
>
> 1. We now know why "commands hang" and anything I/O-related blocks
> (waits) for you: because your root filesystem is ZFS. If the ZFS layer
> is waiting on CAM, and CAM is waiting on your hardware, then those I/O
> requests are going to block indefinitely. So now you know why that
> happens.
>
> 2. I agree that the problem is not likely in ZFS, but rather either
> with CAM, the AHCI implementation used, or hardware (either disk or
> storage controller).
>
> 3. Your lack of "dmesg" is going to make this virtually impossible to
> solve. We really, ***really*** need that. I cannot stress this enough.
> It will tell us a lot about your system. We're also going to need to
> see "zpool status" output, as well as "zpool get all" and "zfs get
> all". "pciconf -lvbc" would also be useful.
>
> There are some known "gotchas" with certain models of hard disks or
> AHCI controllers (which of the two is responsible is unknown at this
> time), but I don't want to start jumping to conclusions until full
> details can be provided first.
>
> I would recommend formatting a USB flash drive as FAT/FAT32, booting
> into single-user mode, then mounting the USB flash drive, issuing the
> above commands, writing the output to files on the flash drive, and
> providing those here.
>
> We really need this information.
>
> 4. Please involve the PC-BSD folks in this discussion. They need to be
> made aware of issues like this so they (and iXsystems, potentially)
> can investigate from their side.
>
OK, thanks for the info. Since dmesg is so important, I'd say the best
thing is to wait for the problem to happen again. When it does, I'll
restart the thread with all the information you requested here, plus a
PC-BSD cross-post.
However, I just got a different hang a while ago. This time it was
temporary, I don't know why. I switched to console 0 after ~10 seconds;
there were 2 errors. Nothing appeared for ~1 minute, so I switched back
and the system was OK. It's a different drive; I haven't seen problems
with this one before. And I think the earlier errors came from ahci,
while here it's ata.
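
A minimal sketch of the collection procedure from point 3, assuming the
flash drive shows up as da0 with a FAT32 filesystem in its first slice
(the device name is an example; check what the kernel reports when the
stick is inserted):

    # from single-user mode
    mount -t msdosfs /dev/da0s1 /mnt

    # write the requested diagnostics to files on the flash drive
    dmesg         > /mnt/dmesg.txt
    zpool status  > /mnt/zpool-status.txt
    zpool get all > /mnt/zpool-get-all.txt
    zfs get all   > /mnt/zfs-get-all.txt
    pciconf -lvbc > /mnt/pciconf.txt

    umount /mnt

In this instance, the outputs below were captured from the running
system instead.
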
dmesg: fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.19 (ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 82 46 b8 40 25 00 00 00 01 00 (ada1:ata0:0:0:0): CAM status: Command timeout (ada1:ata0:0:0:0): Retrying command vboxdrv: fAsync=0 offMin=0x53d offMax=0x52b9 linux: pid 17170 (npviewer.bin): syscall pipe2 not implemented (ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 87 1a c7 40 1a 00 00 00 01 00 (ada1:ata0:0:0:0): CAM status: Command timeout (ada1:ata0:0:0:0): Retrying command pcbsd-8973% zpool status pool: tank1 state: DEGRADED status: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state. action: Online the device using 'zpool online' or replace the device with 'zpool replace'. scan: resilvered 855M in 19h28m with 0 errors on Fri Nov 2 18:58:41 2012 config: NAME STATE READ WRITE CKSUM tank1 DEGRADED 0 0 0 raidz1-0 DEGRADED 0 0 0 ada0p2 ONLINE 0 0 0 8834471525873181994 REMOVED 0 0 0 was /dev/ada1p2 ada1p2 ONLINE 0 0 0 errors: No known data errors pcbsd-8973% sudo pciconf -lvbc Password: hostb0@pci0:0:0:0: class=0x060000 card=0x843e1043 chip=0x96011022 rev=0x00 hdr=0x00 vendor = 'Advanced Micro Devices [AMD]' device = 'RS880 Host Bridge' class = bridge subclass = HOST-PCI cap 08[c4] = HT slave cap 08[54] = HT unit ID clumping cap 08[40] = HT retry mode cap 08[9c] = HT unknown d07c cap 08[f8] = HT unknown e000 pcib1@pci0:0:2:0: class=0x060400 card=0x843e1043 chip=0x96031022 rev=0x00 hdr=0x01 vendor = 'Advanced Micro Devices [AMD]' device = 'RS780 PCI to PCI bridge (ext gfx port 0)' class = bridge subclass = PCI-PCI cap 01[50] = powerspec 3 supports D0 D3 current D0 cap 10[58] = PCI-Express 2 root port max data 128(128) link x16(x16) cap 05[a0] = MSI supports 1 message cap 0d[b0] = PCI Bridge card=0x843e1043 cap 08[b8] = HT MSI fixed address window enabled at 0xfee00000 ecap 000b[100] = unknown 1 ecap 0002[110] = VC 1 max VC0 pcib2@pci0:0:9:0: class=0x060400 card=0x843e1043 chip=0x96081022 rev=0x00 hdr=0x01 vendor = 'Advanced Micro Devices [AMD]' device = 'RS780/RS880 PCI to PCI bridge (PCIE port 4)' class = bridge subclass = PCI-PCI cap 01[50] = powerspec 3 supports D0 D3 current D0 cap 10[58] = PCI-Express 2 root port max data 128(128) link x1(x1) cap 05[a0] = MSI supports 1 message cap 0d[b0] = PCI Bridge card=0x843e1043 cap 08[b8] = HT MSI fixed address window enabled at 0xfee00000 ecap 000b[100] = unknown 1 ecap 0002[110] = VC 1 max VC0 pcib3@pci0:0:10:0: class=0x060400 card=0x843e1043 chip=0x96091022 rev=0x00 hdr=0x01 vendor = 'Advanced Micro Devices [AMD]' device = 'RS780/RS880 PCI to PCI bridge (PCIE port 5)' class = bridge subclass = PCI-PCI cap 01[50] = powerspec 3 supports D0 D3 current D0 cap 10[58] = PCI-Express 2 root port max data 128(128) link x1(x1) cap 05[a0] = MSI supports 1 message cap 0d[b0] = PCI Bridge card=0x843e1043 cap 08[b8] = HT MSI fixed address window enabled at 0xfee00000 ecap 000b[100] = unknown 1 ecap 0002[110] = VC 1 max VC0 ahci0@pci0:0:17:0: class=0x01018f card=0x84431043 chip=0x43901002 rev=0x40 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode]' class = mass storage subclass = ATA bar [10] = type I/O Port, range 32, base 0xb000, size 8, enabled bar [14] = type I/O Port, range 32, base 0xa000, size 4, enabled bar [18] = type I/O Port, range 32, base 0x9000, size 8, enabled bar [1c] = type I/O Port, range 32, base 0x8000, size 4, enabled bar [20] = type I/O Port, range 32, base 0x7000, size 16, enabled bar [24] = type Memory, range 32, 
base 0xf9fffc00, size 1024, enabled cap 05[50] = MSI supports 4 messages, 64 bit enabled with 1 message cap 12[70] = SATA Index-Data Pair cap 13[a4] = PCI Advanced Features: FLR TP ohci0@pci0:0:18:0: class=0x0c0310 card=0x84431043 chip=0x43971002 rev=0x00 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 USB OHCI0 Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xf9ffe000, size 4096, enabled ehci0@pci0:0:18:2: class=0x0c0320 card=0x84431043 chip=0x43961002 rev=0x00 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 USB EHCI Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xf9fff800, size 256, enabled cap 01[c0] = powerspec 2 supports D0 D1 D2 D3 current D0 cap 0a[e4] = EHCI Debug Port at offset 0xe0 in map 0x14 ohci1@pci0:0:19:0: class=0x0c0310 card=0x84431043 chip=0x43971002 rev=0x00 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 USB OHCI0 Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xf9ffd000, size 4096, enabled ehci1@pci0:0:19:2: class=0x0c0320 card=0x84431043 chip=0x43961002 rev=0x00 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 USB EHCI Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xf9fff400, size 256, enabled cap 01[c0] = powerspec 2 supports D0 D1 D2 D3 current D0 cap 0a[e4] = EHCI Debug Port at offset 0xe0 in map 0x14 none0@pci0:0:20:0: class=0x0c0500 card=0x00000000 chip=0x43851002 rev=0x42 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SBx00 SMBus Controller' class = serial bus subclass = SMBus atapci1@pci0:0:20:1: class=0x01018a card=0x84431043 chip=0x439c1002 rev=0x40 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 IDE Controller' class = mass storage subclass = ATA bar [20] = type I/O Port, range 32, base 0xff00, size 16, enabled cap 05[70] = MSI supports 1 message hdac0@pci0:0:20:2: class=0x040300 card=0x841b1043 chip=0x43831002 rev=0x40 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SBx00 Azalia (Intel HDA)' class = multimedia subclass = HDA bar [10] = type Memory, range 64, base 0xf9ff8000, size 16384, enabled cap 01[50] = powerspec 2 supports D0 D3 current D0 isab0@pci0:0:20:3: class=0x060100 card=0x84431043 chip=0x439d1002 rev=0x40 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 LPC host controller' class = bridge subclass = PCI-ISA pcib4@pci0:0:20:4: class=0x060401 card=0x00000000 chip=0x43841002 rev=0x40 hdr=0x01 vendor = 'ATI Technologies Inc' device = 'SBx00 PCI to PCI Bridge' class = bridge subclass = PCI-PCI ohci2@pci0:0:20:5: class=0x0c0310 card=0x84431043 chip=0x43991002 rev=0x00 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 USB OHCI2 Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xf9ffc000, size 4096, enabled pcib5@pci0:0:21:0: class=0x060400 card=0x00001002 chip=0x43a01002 rev=0x00 hdr=0x01 vendor = 'ATI Technologies Inc' device = 'SB700/SB800 PCI to PCI bridge (PCIE port 0)' class = bridge subclass = PCI-PCI cap 01[50] = powerspec 3 supports D0 D1 D2 D3 current D0 cap 10[58] = PCI-Express 2 root port max data 128(128) link x16(x1) cap 05[a0] = MSI supports 1 message, 64 bit cap 0d[b0] = PCI Bridge card=0x00001002 cap 08[b8] = HT MSI fixed address window enabled at 0xfee00000 ecap 000b[100] = unknown 1 pcib6@pci0:0:21:1: class=0x060400 card=0x00001002 chip=0x43a11002 rev=0x00 hdr=0x01 vendor = 'ATI Technologies Inc' 
device = 'SB700/SB800 PCI to PCI bridge (PCIE port 1)' class = bridge subclass = PCI-PCI cap 01[50] = powerspec 3 supports D0 D1 D2 D3 current D0 cap 10[58] = PCI-Express 2 root port max data 128(128) link x1(x1) cap 05[a0] = MSI supports 1 message, 64 bit cap 0d[b0] = PCI Bridge card=0x00001002 cap 08[b8] = HT MSI fixed address window enabled at 0xfee00000 ecap 000b[100] = unknown 1 ohci3@pci0:0:22:0: class=0x0c0310 card=0x84431043 chip=0x43971002 rev=0x00 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 USB OHCI0 Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xf9ff7000, size 4096, enabled ehci2@pci0:0:22:2: class=0x0c0320 card=0x84431043 chip=0x43961002 rev=0x00 hdr=0x00 vendor = 'ATI Technologies Inc' device = 'SB7x0/SB8x0/SB9x0 USB EHCI Controller' class = serial bus subclass = USB bar [10] = type Memory, range 32, base 0xf9fff000, size 256, enabled cap 01[c0] = powerspec 2 supports D0 D1 D2 D3 current D0 cap 0a[e4] = EHCI Debug Port at offset 0xe0 in map 0x14 hostb1@pci0:0:24:0: class=0x060000 card=0x00000000 chip=0x12001022 rev=0x00 hdr=0x00 vendor = 'Advanced Micro Devices [AMD]' device = 'Family 10h Processor HyperTransport Configuration' class = bridge subclass = HOST-PCI cap 08[80] = HT host hostb2@pci0:0:24:1: class=0x060000 card=0x00000000 chip=0x12011022 rev=0x00 hdr=0x00 vendor = 'Advanced Micro Devices [AMD]' device = 'Family 10h Processor Address Map' class = bridge subclass = HOST-PCI hostb3@pci0:0:24:2: class=0x060000 card=0x00000000 chip=0x12021022 rev=0x00 hdr=0x00 vendor = 'Advanced Micro Devices [AMD]' device = 'Family 10h Processor DRAM Controller' class = bridge subclass = HOST-PCI hostb4@pci0:0:24:3: class=0x060000 card=0x00000000 chip=0x12031022 rev=0x00 hdr=0x00 vendor = 'Advanced Micro Devices [AMD]' device = 'Family 10h Processor Miscellaneous Control' class = bridge subclass = HOST-PCI cap 0f[f0] = unknown hostb5@pci0:0:24:4: class=0x060000 card=0x00000000 chip=0x12041022 rev=0x00 hdr=0x00 vendor = 'Advanced Micro Devices [AMD]' device = 'Family 10h Processor Link Control' class = bridge subclass = HOST-PCI vgapci0@pci0:1:0:0: class=0x030000 card=0x040110b0 chip=0x061410de rev=0xa2 hdr=0x00 vendor = 'nVidia Corporation' device = 'G92 [GeForce 9800 GT]' class = display subclass = VGA bar [10] = type Memory, range 32, base 0xfd000000, size 16777216, enabled bar [14] = type Prefetchable Memory, range 64, base 0xd0000000, size 268435456, enabled bar [1c] = type Memory, range 64, base 0xfa000000, size 33554432, enabled bar [24] = type I/O Port, range 32, base 0xcc00, size 128, enabled cap 01[60] = powerspec 3 supports D0 D3 current D0 cap 05[68] = MSI supports 1 message, 64 bit cap 10[78] = PCI-Express 2 endpoint max data 128(128) link x16(x16) ecap 0002[100] = VC 1 max VC0 ecap 0004[128] = unknown 1 ecap 000b[600] = unknown 1 fwohci0@pci0:2:0:0: class=0x0c0010 card=0x83741043 chip=0x34031106 rev=0x00 hdr=0x00 vendor = 'VIA Technologies, Inc.' 
device = 'VT6315 Series Firewire Controller' class = serial bus subclass = FireWire bar [10] = type Memory, range 64, base 0xfeaff800, size 2048, enabled bar [18] = type I/O Port, range 32, base 0xd800, size 256, enabled cap 01[50] = powerspec 3 supports D0 D2 D3 current D0 cap 05[80] = MSI supports 1 message, 64 bit, vector masks cap 10[98] = PCI-Express 1 endpoint max data 128(128) link x1(x1) ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected ecap 0003[130] = Serial 1 001e8cffff34d695 atapci0@pci0:2:0:1: class=0x010185 card=0x838f1043 chip=0x04151106 rev=0xa0 hdr=0x00 vendor = 'VIA Technologies, Inc.' device = 'VT6415 PATA IDE Host Controller' class = mass storage subclass = ATA bar [10] = type I/O Port, range 32, base 0xdc00, size 8, enabled bar [14] = type I/O Port, range 32, base 0xd480, size 4, enabled bar [18] = type I/O Port, range 32, base 0xd400, size 8, enabled bar [1c] = type I/O Port, range 32, base 0xd080, size 4, enabled bar [20] = type I/O Port, range 32, base 0xd000, size 16, enabled cap 01[50] = powerspec 3 supports D0 D2 D3 current D0 cap 05[70] = MSI supports 1 message, 64 bit, vector masks cap 10[90] = PCI-Express 1 legacy endpoint max data 128(128) link x1(x1) ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected ecap 0003[130] = Serial 1 001e8cffff34d695 xhci0@pci0:3:0:0: class=0x0c0330 card=0x84131043 chip=0x01941033 rev=0x03 hdr=0x00 vendor = 'NEC Corporation' device = 'uPD720200 USB 3.0 Host Controller' class = serial bus subclass = USB bar [10] = type Memory, range 64, base 0xfebfa000, size 8192, enabled cap 01[50] = powerspec 3 supports D0 D3 current D0 cap 05[70] = MSI supports 8 messages, 64 bit cap 11[90] = MSI-X supports 8 messages in map 0x10 cap 10[a0] = PCI-Express 2 endpoint max data 128(128) link x1(x1) ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected ecap 0003[140] = Serial 1 ffffffffffffffff ecap 0018[150] = unknown 1 re0@pci0:6:0:0: class=0x020000 card=0x84321043 chip=0x816810ec rev=0x06 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' 
device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller' class = network subclass = ethernet bar [10] = type I/O Port, range 32, base 0xe800, size 256, enabled bar [18] = type Prefetchable Memory, range 64, base 0xf8fff000, size 4096, enabled bar [20] = type Prefetchable Memory, range 64, base 0xf8ff8000, size 16384, enabled cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit cap 10[70] = PCI-Express 2 endpoint IRQ 2 max data 128(256) link x1(x1) cap 11[b0] = MSI-X supports 4 messages in map 0x20 enabled cap 03[d0] = VPD ecap 0001[100] = AER 1 0 fatal 1 non-fatal 2 corrected ecap 0002[140] = VC 1 max VC0 ecap 0003[160] = Serial 1 01000000684ce000 zpool get all tank1: NAME PROPERTY VALUE SOURCE tank1 size 4.06T - tank1 capacity 19% - tank1 altroot - default tank1 health DEGRADED - tank1 guid 8205817271403379195 default tank1 version 28 default tank1 bootfs tank1/ROOT/default local tank1 delegation on default tank1 autoreplace off default tank1 cachefile - default tank1 failmode wait default tank1 listsnapshots off default tank1 autoexpand off default tank1 dedupditto 0 default tank1 dedupratio 1.00x - tank1 free 3.28T - tank1 allocated 805G - tank1 readonly off - tank1 comment - default tank1 expandsize 0 - zfs get all: NAME PROPERTY VALUE SOURCE tank1 type filesystem - tank1 creation Wed Oct 31 19:03 2012 - tank1 used 566G - tank1 available 2.11T - tank1 referenced 40.0K - tank1 compressratio 1.03x - tank1 mounted no - tank1 quota none default tank1 reservation none default tank1 recordsize 128K default tank1 mountpoint legacy local tank1 sharenfs off default tank1 checksum on default tank1 compression off default tank1 atime off local tank1 devices on default tank1 exec on default tank1 setuid on default tank1 readonly off default tank1 jailed off default tank1 snapdir hidden default tank1 aclmode discard default tank1 aclinherit restricted default tank1 canmount on default tank1 xattr on default tank1 copies 1 default tank1 version 5 - tank1 utf8only off - tank1 normalization none - tank1 casesensitivity sensitive - tank1 vscan off default tank1 nbmand off default tank1 sharesmb off default tank1 refquota none default tank1 refreservation none default tank1 primarycache all default tank1 secondarycache all default tank1 usedbysnapshots 0 - tank1 usedbydataset 40.0K - tank1 usedbychildren 566G - tank1 usedbyrefreservation 0 - tank1 logbias latency default tank1 dedup off default tank1 mlslabel - tank1 sync standard default tank1 refcompressratio 1.00x - tank1 written 40.0K - tank1/ROOT type filesystem - tank1/ROOT creation Wed Oct 31 19:03 2012 - tank1/ROOT used 5.37G - tank1/ROOT available 2.11T - tank1/ROOT referenced 40.0K - tank1/ROOT compressratio 1.00x - tank1/ROOT mounted no - tank1/ROOT quota none default tank1/ROOT reservation none default tank1/ROOT recordsize 128K default tank1/ROOT mountpoint legacy inherited from tank1 tank1/ROOT sharenfs off default tank1/ROOT checksum on default tank1/ROOT compression off default tank1/ROOT atime off inherited from tank1 tank1/ROOT devices on default tank1/ROOT exec on default tank1/ROOT setuid on default tank1/ROOT readonly off default tank1/ROOT jailed off default tank1/ROOT snapdir hidden default tank1/ROOT aclmode discard default tank1/ROOT aclinherit restricted default tank1/ROOT canmount on default tank1/ROOT xattr on default tank1/ROOT copies 1 default tank1/ROOT version 5 - tank1/ROOT utf8only off - tank1/ROOT normalization none - tank1/ROOT casesensitivity sensitive - 
tank1/ROOT vscan off default tank1/ROOT nbmand off default tank1/ROOT sharesmb off default tank1/ROOT refquota none default tank1/ROOT refreservation none default tank1/ROOT primarycache all default tank1/ROOT secondarycache all default tank1/ROOT usedbysnapshots 0 - tank1/ROOT usedbydataset 40.0K - tank1/ROOT usedbychildren 5.37G - tank1/ROOT usedbyrefreservation 0 - tank1/ROOT logbias latency default tank1/ROOT dedup off default tank1/ROOT mlslabel - tank1/ROOT sync standard default tank1/ROOT refcompressratio 1.00x - tank1/ROOT written 40.0K - tank1/ROOT/default type filesystem - tank1/ROOT/default creation Wed Oct 31 19:03 2012 - tank1/ROOT/default used 5.37G - tank1/ROOT/default available 2.11T - tank1/ROOT/default referenced 5.37G - tank1/ROOT/default compressratio 1.00x - tank1/ROOT/default mounted yes - tank1/ROOT/default quota none default tank1/ROOT/default reservation none default tank1/ROOT/default recordsize 128K default tank1/ROOT/default mountpoint /mnt local tank1/ROOT/default sharenfs off default tank1/ROOT/default checksum on default tank1/ROOT/default compression off default tank1/ROOT/default atime off inherited from tank1 tank1/ROOT/default devices on default tank1/ROOT/default exec on default tank1/ROOT/default setuid on default tank1/ROOT/default readonly off default tank1/ROOT/default jailed off default tank1/ROOT/default snapdir hidden default tank1/ROOT/default aclmode discard default tank1/ROOT/default aclinherit restricted default tank1/ROOT/default canmount on default tank1/ROOT/default xattr off temporary tank1/ROOT/default copies 1 default tank1/ROOT/default version 5 - tank1/ROOT/default utf8only off - tank1/ROOT/default normalization none - tank1/ROOT/default casesensitivity sensitive - tank1/ROOT/default vscan off default tank1/ROOT/default nbmand off default tank1/ROOT/default sharesmb off default tank1/ROOT/default refquota none default tank1/ROOT/default refreservation none default tank1/ROOT/default primarycache all default tank1/ROOT/default secondarycache all default tank1/ROOT/default usedbysnapshots 0 - tank1/ROOT/default usedbydataset 5.37G - tank1/ROOT/default usedbychildren 0 - tank1/ROOT/default usedbyrefreservation 0 - tank1/ROOT/default logbias latency default tank1/ROOT/default dedup off default tank1/ROOT/default mlslabel - tank1/ROOT/default sync standard default tank1/ROOT/default refcompressratio 1.00x - tank1/ROOT/default written 5.37G - tank1/games type volume - tank1/games creation Sun Apr 7 11:07 2013 - tank1/games used 33.0G - tank1/games available 2.14T - tank1/games referenced 3.87G - tank1/games compressratio 1.00x - tank1/games reservation none default tank1/games volsize 32G local tank1/games volblocksize 8K - tank1/games checksum on default tank1/games compression off default tank1/games readonly off default tank1/games copies 1 default tank1/games refreservation 33.0G local tank1/games primarycache all default tank1/games secondarycache all default tank1/games usedbysnapshots 0 - tank1/games usedbydataset 3.87G - tank1/games usedbychildren 0 - tank1/games usedbyrefreservation 29.1G - tank1/games logbias latency default tank1/games dedup off default tank1/games mlslabel - tank1/games sync standard default tank1/games refcompressratio 1.00x - tank1/games written 3.87G - tank1/root type filesystem - tank1/root creation Wed Oct 31 19:03 2012 - tank1/root used 5.00M - tank1/root available 2.11T - tank1/root referenced 5.00M - tank1/root compressratio 1.00x - tank1/root mounted yes - tank1/root quota none default tank1/root 
reservation none default tank1/root recordsize 128K default tank1/root mountpoint /root local tank1/root sharenfs off default tank1/root checksum on default tank1/root compression off default tank1/root atime off inherited from tank1 tank1/root devices on default tank1/root exec on default tank1/root setuid on default tank1/root readonly off default tank1/root jailed off default tank1/root snapdir hidden default tank1/root aclmode discard default tank1/root aclinherit restricted default tank1/root canmount on default tank1/root xattr off temporary tank1/root copies 1 default tank1/root version 5 - tank1/root utf8only off - tank1/root normalization none - tank1/root casesensitivity sensitive - tank1/root vscan off default tank1/root nbmand off default tank1/root sharesmb off default tank1/root refquota none default tank1/root refreservation none default tank1/root primarycache all default tank1/root secondarycache all default tank1/root usedbysnapshots 0 - tank1/root usedbydataset 5.00M - tank1/root usedbychildren 0 - tank1/root usedbyrefreservation 0 - tank1/root logbias latency default tank1/root dedup off default tank1/root mlslabel - tank1/root sync standard default tank1/root refcompressratio 1.00x - tank1/root written 5.00M - tank1/swap type volume - tank1/swap creation Wed Oct 31 19:03 2012 - tank1/swap used 2.06G - tank1/swap available 2.11T - tank1/swap referenced 1.70G - tank1/swap compressratio 1.00x - tank1/swap reservation none default tank1/swap volsize 2G local tank1/swap volblocksize 8K - tank1/swap checksum off local tank1/swap compression off default tank1/swap readonly off default tank1/swap copies 1 default tank1/swap refreservation 2.06G local tank1/swap primarycache all default tank1/swap secondarycache all default tank1/swap usedbysnapshots 0 - tank1/swap usedbydataset 1.70G - tank1/swap usedbychildren 0 - tank1/swap usedbyrefreservation 377M - tank1/swap logbias latency default tank1/swap dedup off default tank1/swap mlslabel - tank1/swap sync standard default tank1/swap refcompressratio 1.00x - tank1/swap written 1.70G - tank1/swap org.freebsd:swap on local tank1/tmp type filesystem - tank1/tmp creation Wed Oct 31 19:03 2012 - tank1/tmp used 137K - tank1/tmp available 2.11T - tank1/tmp referenced 137K - tank1/tmp compressratio 4.44x - tank1/tmp mounted yes - tank1/tmp quota none default tank1/tmp reservation none default tank1/tmp recordsize 128K default tank1/tmp mountpoint /tmp local tank1/tmp sharenfs off default tank1/tmp checksum on default tank1/tmp compression lzjb local tank1/tmp atime off inherited from tank1 tank1/tmp devices on default tank1/tmp exec on default tank1/tmp setuid on default tank1/tmp readonly off default tank1/tmp jailed off default tank1/tmp snapdir hidden default tank1/tmp aclmode discard default tank1/tmp aclinherit restricted default tank1/tmp canmount on default tank1/tmp xattr off temporary tank1/tmp copies 1 default tank1/tmp version 5 - tank1/tmp utf8only off - tank1/tmp normalization none - tank1/tmp casesensitivity sensitive - tank1/tmp vscan off default tank1/tmp nbmand off default tank1/tmp sharesmb off default tank1/tmp refquota none default tank1/tmp refreservation none default tank1/tmp primarycache all default tank1/tmp secondarycache all default tank1/tmp usedbysnapshots 0 - tank1/tmp usedbydataset 137K - tank1/tmp usedbychildren 0 - tank1/tmp usedbyrefreservation 0 - tank1/tmp logbias latency default tank1/tmp dedup off default tank1/tmp mlslabel - tank1/tmp sync standard default tank1/tmp refcompressratio 4.44x - 
tank1/tmp written 137K - tank1/usr type filesystem - tank1/usr creation Wed Oct 31 19:03 2012 - tank1/usr used 525G - tank1/usr available 2.11T - tank1/usr referenced 40.0K - tank1/usr compressratio 1.00x - tank1/usr mounted no - tank1/usr quota none default tank1/usr reservation none default tank1/usr recordsize 128K default tank1/usr mountpoint /mnt/usr local tank1/usr sharenfs off default tank1/usr checksum on default tank1/usr compression off default tank1/usr atime off inherited from tank1 tank1/usr devices on default tank1/usr exec on default tank1/usr setuid on default tank1/usr readonly off default tank1/usr jailed off default tank1/usr snapdir hidden default tank1/usr aclmode discard default tank1/usr aclinherit restricted default tank1/usr canmount off local tank1/usr xattr on default tank1/usr copies 1 default tank1/usr version 5 - tank1/usr utf8only off - tank1/usr normalization none - tank1/usr casesensitivity sensitive - tank1/usr vscan off default tank1/usr nbmand off default tank1/usr sharesmb off default tank1/usr refquota none default tank1/usr refreservation none default tank1/usr primarycache all default tank1/usr secondarycache all default tank1/usr usedbysnapshots 0 - tank1/usr usedbydataset 40.0K - tank1/usr usedbychildren 525G - tank1/usr usedbyrefreservation 0 - tank1/usr logbias latency default tank1/usr dedup off default tank1/usr mlslabel - tank1/usr sync standard default tank1/usr refcompressratio 1.00x - tank1/usr written 40.0K - tank1/usr/home type filesystem - tank1/usr/home creation Wed Oct 31 19:03 2012 - tank1/usr/home used 497G - tank1/usr/home available 2.11T - tank1/usr/home referenced 497G - tank1/usr/home compressratio 1.00x - tank1/usr/home mounted yes - tank1/usr/home quota none default tank1/usr/home reservation none default tank1/usr/home recordsize 128K default tank1/usr/home mountpoint /usr/home local tank1/usr/home sharenfs off default tank1/usr/home checksum on default tank1/usr/home compression off default tank1/usr/home atime off inherited from tank1 tank1/usr/home devices on default tank1/usr/home exec on default tank1/usr/home setuid on default tank1/usr/home readonly off default tank1/usr/home jailed off default tank1/usr/home snapdir hidden default tank1/usr/home aclmode discard default tank1/usr/home aclinherit restricted default tank1/usr/home canmount on default tank1/usr/home xattr off temporary tank1/usr/home copies 1 default tank1/usr/home version 5 - tank1/usr/home utf8only off - tank1/usr/home normalization none - tank1/usr/home casesensitivity sensitive - tank1/usr/home vscan off default tank1/usr/home nbmand off default tank1/usr/home sharesmb off default tank1/usr/home refquota none default tank1/usr/home refreservation none default tank1/usr/home primarycache all default tank1/usr/home secondarycache all default tank1/usr/home usedbysnapshots 0 - tank1/usr/home usedbydataset 497G - tank1/usr/home usedbychildren 18.7M - tank1/usr/home usedbyrefreservation 0 - tank1/usr/home logbias latency default tank1/usr/home dedup off default tank1/usr/home mlslabel - tank1/usr/home sync standard default tank1/usr/home refcompressratio 1.00x - tank1/usr/home written 497G - tank1/usr/home/nonet type filesystem - tank1/usr/home/nonet creation Sun Dec 23 18:12 2012 - tank1/usr/home/nonet used 18.7M - tank1/usr/home/nonet available 2.11T - tank1/usr/home/nonet referenced 18.7M - tank1/usr/home/nonet compressratio 1.00x - tank1/usr/home/nonet mounted yes - tank1/usr/home/nonet quota none default tank1/usr/home/nonet reservation none default 
tank1/usr/home/nonet recordsize 128K default tank1/usr/home/nonet mountpoint /usr/home/nonet local tank1/usr/home/nonet sharenfs off default tank1/usr/home/nonet checksum on default tank1/usr/home/nonet compression off default tank1/usr/home/nonet atime off inherited from tank1 tank1/usr/home/nonet devices on default tank1/usr/home/nonet exec on default tank1/usr/home/nonet setuid on default tank1/usr/home/nonet readonly off default tank1/usr/home/nonet jailed off default tank1/usr/home/nonet snapdir hidden default tank1/usr/home/nonet aclmode discard default tank1/usr/home/nonet aclinherit restricted default tank1/usr/home/nonet canmount on default tank1/usr/home/nonet xattr off temporary tank1/usr/home/nonet copies 1 default tank1/usr/home/nonet version 5 - tank1/usr/home/nonet utf8only off - tank1/usr/home/nonet normalization none - tank1/usr/home/nonet casesensitivity sensitive - tank1/usr/home/nonet vscan off default tank1/usr/home/nonet nbmand off default tank1/usr/home/nonet sharesmb off default tank1/usr/home/nonet refquota none default tank1/usr/home/nonet refreservation none default tank1/usr/home/nonet primarycache all default tank1/usr/home/nonet secondarycache all default tank1/usr/home/nonet usedbysnapshots 0 - tank1/usr/home/nonet usedbydataset 18.7M - tank1/usr/home/nonet usedbychildren 0 - tank1/usr/home/nonet usedbyrefreservation 0 - tank1/usr/home/nonet logbias latency default tank1/usr/home/nonet dedup off default tank1/usr/home/nonet mlslabel - tank1/usr/home/nonet sync standard default tank1/usr/home/nonet refcompressratio 1.00x - tank1/usr/home/nonet written 18.7M - tank1/usr/jails type filesystem - tank1/usr/jails creation Wed Oct 31 19:03 2012 - tank1/usr/jails used 1.86G - tank1/usr/jails available 2.11T - tank1/usr/jails referenced 83.8M - tank1/usr/jails compressratio 1.00x - tank1/usr/jails mounted yes - tank1/usr/jails quota none default tank1/usr/jails reservation none default tank1/usr/jails recordsize 128K default tank1/usr/jails mountpoint /usr/jails local tank1/usr/jails sharenfs off default tank1/usr/jails checksum on default tank1/usr/jails compression off default tank1/usr/jails atime off inherited from tank1 tank1/usr/jails devices on default tank1/usr/jails exec on default tank1/usr/jails setuid on default tank1/usr/jails readonly off default tank1/usr/jails jailed off default tank1/usr/jails snapdir hidden default tank1/usr/jails aclmode discard default tank1/usr/jails aclinherit restricted default tank1/usr/jails canmount on default tank1/usr/jails xattr off temporary tank1/usr/jails copies 1 default tank1/usr/jails version 5 - tank1/usr/jails utf8only off - tank1/usr/jails normalization none - tank1/usr/jails casesensitivity sensitive - tank1/usr/jails vscan off default tank1/usr/jails nbmand off default tank1/usr/jails sharesmb off default tank1/usr/jails refquota none default tank1/usr/jails refreservation none default tank1/usr/jails primarycache all default tank1/usr/jails secondarycache all default tank1/usr/jails usedbysnapshots 0 - tank1/usr/jails usedbydataset 83.8M - tank1/usr/jails usedbychildren 1.78G - tank1/usr/jails usedbyrefreservation 0 - tank1/usr/jails logbias latency default tank1/usr/jails dedup off default tank1/usr/jails mlslabel - tank1/usr/jails sync standard default tank1/usr/jails refcompressratio 1.00x - tank1/usr/jails written 83.8M - tank1/usr/jails/.warden-chroot-amd64 type filesystem - tank1/usr/jails/.warden-chroot-amd64 creation Sat Dec 15 12:08 2012 - tank1/usr/jails/.warden-chroot-amd64 used 414M - 
tank1/usr/jails/.warden-chroot-amd64 available 2.11T - tank1/usr/jails/.warden-chroot-amd64 referenced 414M - tank1/usr/jails/.warden-chroot-amd64 compressratio 1.00x - tank1/usr/jails/.warden-chroot-amd64 mounted yes - tank1/usr/jails/.warden-chroot-amd64 quota none default tank1/usr/jails/.warden-chroot-amd64 reservation none default tank1/usr/jails/.warden-chroot-amd64 recordsize 128K default tank1/usr/jails/.warden-chroot-amd64 mountpoint /usr/jails/.warden-chroot-amd64 local tank1/usr/jails/.warden-chroot-amd64 sharenfs off default tank1/usr/jails/.warden-chroot-amd64 checksum on default tank1/usr/jails/.warden-chroot-amd64 compression off default tank1/usr/jails/.warden-chroot-amd64 atime off inherited from tank1 tank1/usr/jails/.warden-chroot-amd64 devices on default tank1/usr/jails/.warden-chroot-amd64 exec on default tank1/usr/jails/.warden-chroot-amd64 setuid on default tank1/usr/jails/.warden-chroot-amd64 readonly off default tank1/usr/jails/.warden-chroot-amd64 jailed off default tank1/usr/jails/.warden-chroot-amd64 snapdir hidden default tank1/usr/jails/.warden-chroot-amd64 aclmode discard default tank1/usr/jails/.warden-chroot-amd64 aclinherit restricted default tank1/usr/jails/.warden-chroot-amd64 canmount on default tank1/usr/jails/.warden-chroot-amd64 xattr off temporary tank1/usr/jails/.warden-chroot-amd64 copies 1 default tank1/usr/jails/.warden-chroot-amd64 version 5 - tank1/usr/jails/.warden-chroot-amd64 utf8only off - tank1/usr/jails/.warden-chroot-amd64 normalization none - tank1/usr/jails/.warden-chroot-amd64 casesensitivity sensitive - tank1/usr/jails/.warden-chroot-amd64 vscan off default tank1/usr/jails/.warden-chroot-amd64 nbmand off default tank1/usr/jails/.warden-chroot-amd64 sharesmb off default tank1/usr/jails/.warden-chroot-amd64 refquota none default tank1/usr/jails/.warden-chroot-amd64 refreservation none default tank1/usr/jails/.warden-chroot-amd64 primarycache all default tank1/usr/jails/.warden-chroot-amd64 secondarycache all default tank1/usr/jails/.warden-chroot-amd64 usedbysnapshots 1.33K - tank1/usr/jails/.warden-chroot-amd64 usedbydataset 414M - tank1/usr/jails/.warden-chroot-amd64 usedbychildren 0 - tank1/usr/jails/.warden-chroot-amd64 usedbyrefreservation 0 - tank1/usr/jails/.warden-chroot-amd64 logbias latency default tank1/usr/jails/.warden-chroot-amd64 dedup off default tank1/usr/jails/.warden-chroot-amd64 mlslabel - tank1/usr/jails/.warden-chroot-amd64 sync standard default tank1/usr/jails/.warden-chroot-amd64 refcompressratio 1.00x - tank1/usr/jails/.warden-chroot-amd64 written 1.33K - tank1/usr/jails/.warden-chroot-amd64@clean type snapshot - tank1/usr/jails/.warden-chroot-amd64@clean creation Sat Dec 15 12:08 2012 - tank1/usr/jails/.warden-chroot-amd64@clean used 1.33K - tank1/usr/jails/.warden-chroot-amd64@clean referenced 414M - tank1/usr/jails/.warden-chroot-amd64@clean compressratio 1.00x - tank1/usr/jails/.warden-chroot-amd64@clean devices on default tank1/usr/jails/.warden-chroot-amd64@clean exec on default tank1/usr/jails/.warden-chroot-amd64@clean setuid on default tank1/usr/jails/.warden-chroot-amd64@clean xattr on default tank1/usr/jails/.warden-chroot-amd64@clean version 5 - tank1/usr/jails/.warden-chroot-amd64@clean utf8only off - tank1/usr/jails/.warden-chroot-amd64@clean normalization none - tank1/usr/jails/.warden-chroot-amd64@clean casesensitivity sensitive - tank1/usr/jails/.warden-chroot-amd64@clean nbmand off default tank1/usr/jails/.warden-chroot-amd64@clean primarycache all default 
tank1/usr/jails/.warden-chroot-amd64@clean secondarycache all default tank1/usr/jails/.warden-chroot-amd64@clean defer_destroy off - tank1/usr/jails/.warden-chroot-amd64@clean userrefs 0 - tank1/usr/jails/.warden-chroot-amd64@clean mlslabel - tank1/usr/jails/.warden-chroot-amd64@clean refcompressratio 1.00x - tank1/usr/jails/.warden-chroot-amd64@clean written 414M - tank1/usr/jails/.warden-chroot-amd64@clean clones tank1/usr/jails/192.168.242.52 - tank1/usr/jails/1.2.3.6 type filesystem - tank1/usr/jails/1.2.3.6 creation Thu Nov 1 14:56 2012 - tank1/usr/jails/1.2.3.6 used 910M - tank1/usr/jails/1.2.3.6 available 2.11T - tank1/usr/jails/1.2.3.6 referenced 910M - tank1/usr/jails/1.2.3.6 compressratio 1.00x - tank1/usr/jails/1.2.3.6 mounted yes - tank1/usr/jails/1.2.3.6 quota none default tank1/usr/jails/1.2.3.6 reservation none default tank1/usr/jails/1.2.3.6 recordsize 128K default tank1/usr/jails/1.2.3.6 mountpoint /usr/jails/1.2.3.6 local tank1/usr/jails/1.2.3.6 sharenfs off default tank1/usr/jails/1.2.3.6 checksum on default tank1/usr/jails/1.2.3.6 compression off default tank1/usr/jails/1.2.3.6 atime off inherited from tank1 tank1/usr/jails/1.2.3.6 devices on default tank1/usr/jails/1.2.3.6 exec on default tank1/usr/jails/1.2.3.6 setuid on default tank1/usr/jails/1.2.3.6 readonly off default tank1/usr/jails/1.2.3.6 jailed off default tank1/usr/jails/1.2.3.6 snapdir hidden default tank1/usr/jails/1.2.3.6 aclmode discard default tank1/usr/jails/1.2.3.6 aclinherit restricted default tank1/usr/jails/1.2.3.6 canmount on default tank1/usr/jails/1.2.3.6 xattr off temporary tank1/usr/jails/1.2.3.6 copies 1 default tank1/usr/jails/1.2.3.6 version 5 - tank1/usr/jails/1.2.3.6 utf8only off - tank1/usr/jails/1.2.3.6 normalization none - tank1/usr/jails/1.2.3.6 casesensitivity sensitive - tank1/usr/jails/1.2.3.6 vscan off default tank1/usr/jails/1.2.3.6 nbmand off default tank1/usr/jails/1.2.3.6 sharesmb off default tank1/usr/jails/1.2.3.6 refquota none default tank1/usr/jails/1.2.3.6 refreservation none default tank1/usr/jails/1.2.3.6 primarycache all default tank1/usr/jails/1.2.3.6 secondarycache all default tank1/usr/jails/1.2.3.6 usedbysnapshots 0 - tank1/usr/jails/1.2.3.6 usedbydataset 910M - tank1/usr/jails/1.2.3.6 usedbychildren 0 - tank1/usr/jails/1.2.3.6 usedbyrefreservation 0 - tank1/usr/jails/1.2.3.6 logbias latency default tank1/usr/jails/1.2.3.6 dedup off default tank1/usr/jails/1.2.3.6 mlslabel - tank1/usr/jails/1.2.3.6 sync standard default tank1/usr/jails/1.2.3.6 refcompressratio 1.00x - tank1/usr/jails/1.2.3.6 written 910M - tank1/usr/jails/192.168.242.52 type filesystem - tank1/usr/jails/192.168.242.52 creation Sat Dec 15 12:08 2012 - tank1/usr/jails/192.168.242.52 used 496M - tank1/usr/jails/192.168.242.52 available 2.11T - tank1/usr/jails/192.168.242.52 referenced 910M - tank1/usr/jails/192.168.242.52 compressratio 1.00x - tank1/usr/jails/192.168.242.52 mounted yes - tank1/usr/jails/192.168.242.52 origin tank1/usr/jails/.warden-chroot-amd64@clean - tank1/usr/jails/192.168.242.52 quota none default tank1/usr/jails/192.168.242.52 reservation none default tank1/usr/jails/192.168.242.52 recordsize 128K default tank1/usr/jails/192.168.242.52 mountpoint /usr/jails/192.168.242.52 inherited from tank1/usr/jails tank1/usr/jails/192.168.242.52 sharenfs off default tank1/usr/jails/192.168.242.52 checksum on default tank1/usr/jails/192.168.242.52 compression off default tank1/usr/jails/192.168.242.52 atime off inherited from tank1 tank1/usr/jails/192.168.242.52 devices on default 
tank1/usr/jails/192.168.242.52 exec on default tank1/usr/jails/192.168.242.52 setuid on default tank1/usr/jails/192.168.242.52 readonly off default tank1/usr/jails/192.168.242.52 jailed off default tank1/usr/jails/192.168.242.52 snapdir hidden default tank1/usr/jails/192.168.242.52 aclmode discard default tank1/usr/jails/192.168.242.52 aclinherit restricted default tank1/usr/jails/192.168.242.52 canmount on default tank1/usr/jails/192.168.242.52 xattr off temporary tank1/usr/jails/192.168.242.52 copies 1 default tank1/usr/jails/192.168.242.52 version 5 - tank1/usr/jails/192.168.242.52 utf8only off - tank1/usr/jails/192.168.242.52 normalization none - tank1/usr/jails/192.168.242.52 casesensitivity sensitive - tank1/usr/jails/192.168.242.52 vscan off default tank1/usr/jails/192.168.242.52 nbmand off default tank1/usr/jails/192.168.242.52 sharesmb off default tank1/usr/jails/192.168.242.52 refquota none default tank1/usr/jails/192.168.242.52 refreservation none default tank1/usr/jails/192.168.242.52 primarycache all default tank1/usr/jails/192.168.242.52 secondarycache all default tank1/usr/jails/192.168.242.52 usedbysnapshots 0 - tank1/usr/jails/192.168.242.52 usedbydataset 496M - tank1/usr/jails/192.168.242.52 usedbychildren 0 - tank1/usr/jails/192.168.242.52 usedbyrefreservation 0 - tank1/usr/jails/192.168.242.52 logbias latency default tank1/usr/jails/192.168.242.52 dedup off default tank1/usr/jails/192.168.242.52 mlslabel - tank1/usr/jails/192.168.242.52 sync standard default tank1/usr/jails/192.168.242.52 refcompressratio 1.00x - tank1/usr/jails/192.168.242.52 written 496M - tank1/usr/obj type filesystem - tank1/usr/obj creation Wed Oct 31 19:03 2012 - tank1/usr/obj used 40.0K - tank1/usr/obj available 2.11T - tank1/usr/obj referenced 40.0K - tank1/usr/obj compressratio 1.00x - tank1/usr/obj mounted yes - tank1/usr/obj quota none default tank1/usr/obj reservation none default tank1/usr/obj recordsize 128K default tank1/usr/obj mountpoint /usr/obj local tank1/usr/obj sharenfs off default tank1/usr/obj checksum on default tank1/usr/obj compression lzjb local tank1/usr/obj atime off inherited from tank1 tank1/usr/obj devices on default tank1/usr/obj exec on default tank1/usr/obj setuid on default tank1/usr/obj readonly off default tank1/usr/obj jailed off default tank1/usr/obj snapdir hidden default tank1/usr/obj aclmode discard default tank1/usr/obj aclinherit restricted default tank1/usr/obj canmount on default tank1/usr/obj xattr off temporary tank1/usr/obj copies 1 default tank1/usr/obj version 5 - tank1/usr/obj utf8only off - tank1/usr/obj normalization none - tank1/usr/obj casesensitivity sensitive - tank1/usr/obj vscan off default tank1/usr/obj nbmand off default tank1/usr/obj sharesmb off default tank1/usr/obj refquota none default tank1/usr/obj refreservation none default tank1/usr/obj primarycache all default tank1/usr/obj secondarycache all default tank1/usr/obj usedbysnapshots 0 - tank1/usr/obj usedbydataset 40.0K - tank1/usr/obj usedbychildren 0 - tank1/usr/obj usedbyrefreservation 0 - tank1/usr/obj logbias latency default tank1/usr/obj dedup off default tank1/usr/obj mlslabel - tank1/usr/obj sync standard default tank1/usr/obj refcompressratio 1.00x - tank1/usr/obj written 40.0K - tank1/usr/pbi type filesystem - tank1/usr/pbi creation Wed Oct 31 19:03 2012 - tank1/usr/pbi used 24.4G - tank1/usr/pbi available 2.11T - tank1/usr/pbi referenced 22.7G - tank1/usr/pbi compressratio 1.00x - tank1/usr/pbi mounted yes - tank1/usr/pbi quota none default tank1/usr/pbi reservation none 
default tank1/usr/pbi recordsize 128K default tank1/usr/pbi mountpoint /usr/pbi local tank1/usr/pbi sharenfs off default tank1/usr/pbi checksum on default tank1/usr/pbi compression off default tank1/usr/pbi atime off inherited from tank1 tank1/usr/pbi devices on default tank1/usr/pbi exec on default tank1/usr/pbi setuid on default tank1/usr/pbi readonly off default tank1/usr/pbi jailed off default tank1/usr/pbi snapdir hidden default tank1/usr/pbi aclmode discard default tank1/usr/pbi aclinherit restricted default tank1/usr/pbi canmount on default tank1/usr/pbi xattr off temporary tank1/usr/pbi copies 1 default tank1/usr/pbi version 5 - tank1/usr/pbi utf8only off - tank1/usr/pbi normalization none - tank1/usr/pbi casesensitivity sensitive - tank1/usr/pbi vscan off default tank1/usr/pbi nbmand off default tank1/usr/pbi sharesmb off default tank1/usr/pbi refquota none default tank1/usr/pbi refreservation none default tank1/usr/pbi primarycache all default tank1/usr/pbi secondarycache all default tank1/usr/pbi usedbysnapshots 0 - tank1/usr/pbi usedbydataset 22.7G - tank1/usr/pbi usedbychildren 1.65G - tank1/usr/pbi usedbyrefreservation 0 - tank1/usr/pbi logbias latency default tank1/usr/pbi dedup off default tank1/usr/pbi mlslabel - tank1/usr/pbi sync standard default tank1/usr/pbi refcompressratio 1.00x - tank1/usr/pbi written 22.7G - tank1/usr/pbi/.pbi-world-amd64 type filesystem - tank1/usr/pbi/.pbi-world-amd64 creation Mon Nov 26 7:15 2012 - tank1/usr/pbi/.pbi-world-amd64 used 1.12G - tank1/usr/pbi/.pbi-world-amd64 available 2.11T - tank1/usr/pbi/.pbi-world-amd64 referenced 1.12G - tank1/usr/pbi/.pbi-world-amd64 compressratio 1.00x - tank1/usr/pbi/.pbi-world-amd64 mounted yes - tank1/usr/pbi/.pbi-world-amd64 quota none default tank1/usr/pbi/.pbi-world-amd64 reservation none default tank1/usr/pbi/.pbi-world-amd64 recordsize 128K default tank1/usr/pbi/.pbi-world-amd64 mountpoint /usr/pbi/.pbi-world-amd64 local tank1/usr/pbi/.pbi-world-amd64 sharenfs off default tank1/usr/pbi/.pbi-world-amd64 checksum on default tank1/usr/pbi/.pbi-world-amd64 compression off default tank1/usr/pbi/.pbi-world-amd64 atime off inherited from tank1 tank1/usr/pbi/.pbi-world-amd64 devices on default tank1/usr/pbi/.pbi-world-amd64 exec on default tank1/usr/pbi/.pbi-world-amd64 setuid on default tank1/usr/pbi/.pbi-world-amd64 readonly off default tank1/usr/pbi/.pbi-world-amd64 jailed off default tank1/usr/pbi/.pbi-world-amd64 snapdir hidden default tank1/usr/pbi/.pbi-world-amd64 aclmode discard default tank1/usr/pbi/.pbi-world-amd64 aclinherit restricted default tank1/usr/pbi/.pbi-world-amd64 canmount on default tank1/usr/pbi/.pbi-world-amd64 xattr off temporary tank1/usr/pbi/.pbi-world-amd64 copies 1 default tank1/usr/pbi/.pbi-world-amd64 version 5 - tank1/usr/pbi/.pbi-world-amd64 utf8only off - tank1/usr/pbi/.pbi-world-amd64 normalization none - tank1/usr/pbi/.pbi-world-amd64 casesensitivity sensitive - tank1/usr/pbi/.pbi-world-amd64 vscan off default tank1/usr/pbi/.pbi-world-amd64 nbmand off default tank1/usr/pbi/.pbi-world-amd64 sharesmb off default tank1/usr/pbi/.pbi-world-amd64 refquota none default tank1/usr/pbi/.pbi-world-amd64 refreservation none default tank1/usr/pbi/.pbi-world-amd64 primarycache all default tank1/usr/pbi/.pbi-world-amd64 secondarycache all default tank1/usr/pbi/.pbi-world-amd64 usedbysnapshots 1.33K - tank1/usr/pbi/.pbi-world-amd64 usedbydataset 1.12G - tank1/usr/pbi/.pbi-world-amd64 usedbychildren 0 - tank1/usr/pbi/.pbi-world-amd64 usedbyrefreservation 0 - tank1/usr/pbi/.pbi-world-amd64 
logbias latency default tank1/usr/pbi/.pbi-world-amd64 dedup off default tank1/usr/pbi/.pbi-world-amd64 mlslabel - tank1/usr/pbi/.pbi-world-amd64 sync standard default tank1/usr/pbi/.pbi-world-amd64 refcompressratio 1.00x - tank1/usr/pbi/.pbi-world-amd64 written 1.33K - tank1/usr/pbi/.pbi-world-amd64@clean type snapshot - tank1/usr/pbi/.pbi-world-amd64@clean creation Mon Nov 26 7:16 2012 - tank1/usr/pbi/.pbi-world-amd64@clean used 1.33K - tank1/usr/pbi/.pbi-world-amd64@clean referenced 1.12G - tank1/usr/pbi/.pbi-world-amd64@clean compressratio 1.00x - tank1/usr/pbi/.pbi-world-amd64@clean devices on default tank1/usr/pbi/.pbi-world-amd64@clean exec on default tank1/usr/pbi/.pbi-world-amd64@clean setuid on default tank1/usr/pbi/.pbi-world-amd64@clean xattr on default tank1/usr/pbi/.pbi-world-amd64@clean version 5 - tank1/usr/pbi/.pbi-world-amd64@clean utf8only off - tank1/usr/pbi/.pbi-world-amd64@clean normalization none - tank1/usr/pbi/.pbi-world-amd64@clean casesensitivity sensitive - tank1/usr/pbi/.pbi-world-amd64@clean nbmand off default tank1/usr/pbi/.pbi-world-amd64@clean primarycache all default tank1/usr/pbi/.pbi-world-amd64@clean secondarycache all default tank1/usr/pbi/.pbi-world-amd64@clean defer_destroy off - tank1/usr/pbi/.pbi-world-amd64@clean userrefs 0 - tank1/usr/pbi/.pbi-world-amd64@clean mlslabel - tank1/usr/pbi/.pbi-world-amd64@clean refcompressratio 1.00x - tank1/usr/pbi/.pbi-world-amd64@clean written 1.12G - tank1/usr/pbi/.pbi-world-amd64@clean clones tank1/usr/pbi/pypy-amd64.chroot,tank1/usr/pbi/pure-ftpd-amd64.chroot,tank1/usr/pbi/i2p-amd64.chroot - tank1/usr/pbi/i2p-amd64.chroot type filesystem - tank1/usr/pbi/i2p-amd64.chroot creation Sun Feb 10 20:41 2013 - tank1/usr/pbi/i2p-amd64.chroot used 222M - tank1/usr/pbi/i2p-amd64.chroot available 2.11T - tank1/usr/pbi/i2p-amd64.chroot referenced 1.33G - tank1/usr/pbi/i2p-amd64.chroot compressratio 1.00x - tank1/usr/pbi/i2p-amd64.chroot mounted yes - tank1/usr/pbi/i2p-amd64.chroot origin tank1/usr/pbi/.pbi-world-amd64@clean - tank1/usr/pbi/i2p-amd64.chroot quota none default tank1/usr/pbi/i2p-amd64.chroot reservation none default tank1/usr/pbi/i2p-amd64.chroot recordsize 128K default tank1/usr/pbi/i2p-amd64.chroot mountpoint /usr/pbi/i2p-amd64.chroot inherited from tank1/usr/pbi tank1/usr/pbi/i2p-amd64.chroot sharenfs off default tank1/usr/pbi/i2p-amd64.chroot checksum on default tank1/usr/pbi/i2p-amd64.chroot compression off default tank1/usr/pbi/i2p-amd64.chroot atime off inherited from tank1 tank1/usr/pbi/i2p-amd64.chroot devices on default tank1/usr/pbi/i2p-amd64.chroot exec on default tank1/usr/pbi/i2p-amd64.chroot setuid on default tank1/usr/pbi/i2p-amd64.chroot readonly off default tank1/usr/pbi/i2p-amd64.chroot jailed off default tank1/usr/pbi/i2p-amd64.chroot snapdir hidden default tank1/usr/pbi/i2p-amd64.chroot aclmode discard default tank1/usr/pbi/i2p-amd64.chroot aclinherit restricted default tank1/usr/pbi/i2p-amd64.chroot canmount on default tank1/usr/pbi/i2p-amd64.chroot xattr off temporary tank1/usr/pbi/i2p-amd64.chroot copies 1 default tank1/usr/pbi/i2p-amd64.chroot version 5 - tank1/usr/pbi/i2p-amd64.chroot utf8only off - tank1/usr/pbi/i2p-amd64.chroot normalization none - tank1/usr/pbi/i2p-amd64.chroot casesensitivity sensitive - tank1/usr/pbi/i2p-amd64.chroot vscan off default tank1/usr/pbi/i2p-amd64.chroot nbmand off default tank1/usr/pbi/i2p-amd64.chroot sharesmb off default tank1/usr/pbi/i2p-amd64.chroot refquota none default tank1/usr/pbi/i2p-amd64.chroot refreservation none default 
tank1/usr/pbi/i2p-amd64.chroot  primarycache  all  default
tank1/usr/pbi/i2p-amd64.chroot  secondarycache  all  default
tank1/usr/pbi/i2p-amd64.chroot  usedbysnapshots  0  -
tank1/usr/pbi/i2p-amd64.chroot  usedbydataset  222M  -
tank1/usr/pbi/i2p-amd64.chroot  usedbychildren  0  -
tank1/usr/pbi/i2p-amd64.chroot  usedbyrefreservation  0  -
tank1/usr/pbi/i2p-amd64.chroot  logbias  latency  default
tank1/usr/pbi/i2p-amd64.chroot  dedup  off  default
tank1/usr/pbi/i2p-amd64.chroot  mlslabel  -
tank1/usr/pbi/i2p-amd64.chroot  sync  standard  default
tank1/usr/pbi/i2p-amd64.chroot  refcompressratio  1.00x  -
tank1/usr/pbi/i2p-amd64.chroot  written  222M  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  type  filesystem  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  creation  Mon Nov 26 20:34 2012  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  used  147M  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  available  2.11T  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  referenced  1.26G  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  compressratio  1.00x  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  mounted  yes  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  origin  tank1/usr/pbi/.pbi-world-amd64@clean  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  quota  none  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  reservation  none  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  recordsize  128K  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  mountpoint  /usr/pbi/pure-ftpd-amd64.chroot  inherited from tank1/usr/pbi
tank1/usr/pbi/pure-ftpd-amd64.chroot  sharenfs  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  checksum  on  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  compression  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  atime  off  inherited from tank1
tank1/usr/pbi/pure-ftpd-amd64.chroot  devices  on  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  exec  on  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  setuid  on  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  readonly  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  jailed  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  snapdir  hidden  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  aclmode  discard  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  aclinherit  restricted  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  canmount  on  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  xattr  off  temporary
tank1/usr/pbi/pure-ftpd-amd64.chroot  copies  1  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  version  5  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  utf8only  off  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  normalization  none  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  casesensitivity  sensitive  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  vscan  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  nbmand  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  sharesmb  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  refquota  none  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  refreservation  none  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  primarycache  all  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  secondarycache  all  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  usedbysnapshots  0  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  usedbydataset  147M  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  usedbychildren  0  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  usedbyrefreservation  0  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  logbias  latency  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  dedup  off  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  mlslabel  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  sync  standard  default
tank1/usr/pbi/pure-ftpd-amd64.chroot  refcompressratio  1.00x  -
tank1/usr/pbi/pure-ftpd-amd64.chroot  written  147M  -
tank1/usr/pbi/pypy-amd64.chroot  type  filesystem  -
tank1/usr/pbi/pypy-amd64.chroot  creation  Sun Dec 9 17:51 2012  -
tank1/usr/pbi/pypy-amd64.chroot  used  184M  -
tank1/usr/pbi/pypy-amd64.chroot  available  2.11T  -
tank1/usr/pbi/pypy-amd64.chroot  referenced  1.29G  -
tank1/usr/pbi/pypy-amd64.chroot  compressratio  1.00x  -
tank1/usr/pbi/pypy-amd64.chroot  mounted  yes  -
tank1/usr/pbi/pypy-amd64.chroot  origin  tank1/usr/pbi/.pbi-world-amd64@clean  -
tank1/usr/pbi/pypy-amd64.chroot  quota  none  default
tank1/usr/pbi/pypy-amd64.chroot  reservation  none  default
tank1/usr/pbi/pypy-amd64.chroot  recordsize  128K  default
tank1/usr/pbi/pypy-amd64.chroot  mountpoint  /usr/pbi/pypy-amd64.chroot  inherited from tank1/usr/pbi
tank1/usr/pbi/pypy-amd64.chroot  sharenfs  off  default
tank1/usr/pbi/pypy-amd64.chroot  checksum  on  default
tank1/usr/pbi/pypy-amd64.chroot  compression  off  default
tank1/usr/pbi/pypy-amd64.chroot  atime  off  inherited from tank1
tank1/usr/pbi/pypy-amd64.chroot  devices  on  default
tank1/usr/pbi/pypy-amd64.chroot  exec  on  default
tank1/usr/pbi/pypy-amd64.chroot  setuid  on  default
tank1/usr/pbi/pypy-amd64.chroot  readonly  off  default
tank1/usr/pbi/pypy-amd64.chroot  jailed  off  default
tank1/usr/pbi/pypy-amd64.chroot  snapdir  hidden  default
tank1/usr/pbi/pypy-amd64.chroot  aclmode  discard  default
tank1/usr/pbi/pypy-amd64.chroot  aclinherit  restricted  default
tank1/usr/pbi/pypy-amd64.chroot  canmount  on  default
tank1/usr/pbi/pypy-amd64.chroot  xattr  off  temporary
tank1/usr/pbi/pypy-amd64.chroot  copies  1  default
tank1/usr/pbi/pypy-amd64.chroot  version  5  -
tank1/usr/pbi/pypy-amd64.chroot  utf8only  off  -
tank1/usr/pbi/pypy-amd64.chroot  normalization  none  -
tank1/usr/pbi/pypy-amd64.chroot  casesensitivity  sensitive  -
tank1/usr/pbi/pypy-amd64.chroot  vscan  off  default
tank1/usr/pbi/pypy-amd64.chroot  nbmand  off  default
tank1/usr/pbi/pypy-amd64.chroot  sharesmb  off  default
tank1/usr/pbi/pypy-amd64.chroot  refquota  none  default
tank1/usr/pbi/pypy-amd64.chroot  refreservation  none  default
tank1/usr/pbi/pypy-amd64.chroot  primarycache  all  default
tank1/usr/pbi/pypy-amd64.chroot  secondarycache  all  default
tank1/usr/pbi/pypy-amd64.chroot  usedbysnapshots  0  -
tank1/usr/pbi/pypy-amd64.chroot  usedbydataset  184M  -
tank1/usr/pbi/pypy-amd64.chroot  usedbychildren  0  -
tank1/usr/pbi/pypy-amd64.chroot  usedbyrefreservation  0  -
tank1/usr/pbi/pypy-amd64.chroot  logbias  latency  default
tank1/usr/pbi/pypy-amd64.chroot  dedup  off  default
tank1/usr/pbi/pypy-amd64.chroot  mlslabel  -
tank1/usr/pbi/pypy-amd64.chroot  sync  standard  default
tank1/usr/pbi/pypy-amd64.chroot  refcompressratio  1.00x  -
tank1/usr/pbi/pypy-amd64.chroot  written  184M  -
tank1/usr/ports  type  filesystem  -
tank1/usr/ports  creation  Wed Oct 31 19:03 2012  -
tank1/usr/ports  used  1.53G  -
tank1/usr/ports  available  2.11T  -
tank1/usr/ports  referenced  1.14G  -
tank1/usr/ports  compressratio  1.56x  -
tank1/usr/ports  mounted  yes  -
tank1/usr/ports  quota  none  default
tank1/usr/ports  reservation  none  default
tank1/usr/ports  recordsize  128K  default
tank1/usr/ports  mountpoint  /usr/ports  local
tank1/usr/ports  sharenfs  off  default
tank1/usr/ports  checksum  on  default
tank1/usr/ports  compression  gzip  local
tank1/usr/ports  atime  off  inherited from tank1
tank1/usr/ports  devices  on  default
tank1/usr/ports  exec  on  default
tank1/usr/ports  setuid  on  default
tank1/usr/ports  readonly  off  default
tank1/usr/ports  jailed  off  default
tank1/usr/ports  snapdir  hidden  default
tank1/usr/ports  aclmode  discard  default
tank1/usr/ports  aclinherit  restricted  default
tank1/usr/ports  canmount  on  default
tank1/usr/ports  xattr  off  temporary
tank1/usr/ports  copies  1  default
tank1/usr/ports  version  5  -
tank1/usr/ports  utf8only  off  -
tank1/usr/ports  normalization  none  -
tank1/usr/ports  casesensitivity  sensitive  -
tank1/usr/ports  vscan  off  default
tank1/usr/ports  nbmand  off  default
tank1/usr/ports  sharesmb  off  default
tank1/usr/ports  refquota  none  default
tank1/usr/ports  refreservation  none  default
tank1/usr/ports  primarycache  all  default
tank1/usr/ports  secondarycache  all  default
tank1/usr/ports  usedbysnapshots  0  -
tank1/usr/ports  usedbydataset  1.14G  -
tank1/usr/ports  usedbychildren  394M  -
tank1/usr/ports  usedbyrefreservation  0  -
tank1/usr/ports  logbias  latency  default
tank1/usr/ports  dedup  off  default
tank1/usr/ports  mlslabel  -
tank1/usr/ports  sync  standard  default
tank1/usr/ports  refcompressratio  1.76x  -
tank1/usr/ports  written  1.14G  -
tank1/usr/ports/distfiles  type  filesystem  -
tank1/usr/ports/distfiles  creation  Wed Oct 31 19:03 2012  -
tank1/usr/ports/distfiles  used  394M  -
tank1/usr/ports/distfiles  available  2.11T  -
tank1/usr/ports/distfiles  referenced  394M  -
tank1/usr/ports/distfiles  compressratio  1.00x  -
tank1/usr/ports/distfiles  mounted  yes  -
tank1/usr/ports/distfiles  quota  none  default
tank1/usr/ports/distfiles  reservation  none  default
tank1/usr/ports/distfiles  recordsize  128K  default
tank1/usr/ports/distfiles  mountpoint  /usr/ports/distfiles  local
tank1/usr/ports/distfiles  sharenfs  off  default
tank1/usr/ports/distfiles  checksum  on  default
tank1/usr/ports/distfiles  compression  off  local
tank1/usr/ports/distfiles  atime  off  inherited from tank1
tank1/usr/ports/distfiles  devices  on  default
tank1/usr/ports/distfiles  exec  on  default
tank1/usr/ports/distfiles  setuid  on  default
tank1/usr/ports/distfiles  readonly  off  default
tank1/usr/ports/distfiles  jailed  off  default
tank1/usr/ports/distfiles  snapdir  hidden  default
tank1/usr/ports/distfiles  aclmode  discard  default
tank1/usr/ports/distfiles  aclinherit  restricted  default
tank1/usr/ports/distfiles  canmount  on  default
tank1/usr/ports/distfiles  xattr  off  temporary
tank1/usr/ports/distfiles  copies  1  default
tank1/usr/ports/distfiles  version  5  -
tank1/usr/ports/distfiles  utf8only  off  -
tank1/usr/ports/distfiles  normalization  none  -
tank1/usr/ports/distfiles  casesensitivity  sensitive  -
tank1/usr/ports/distfiles  vscan  off  default
tank1/usr/ports/distfiles  nbmand  off  default
tank1/usr/ports/distfiles  sharesmb  off  default
tank1/usr/ports/distfiles  refquota  none  default
tank1/usr/ports/distfiles  refreservation  none  default
tank1/usr/ports/distfiles  primarycache  all  default
tank1/usr/ports/distfiles  secondarycache  all  default
tank1/usr/ports/distfiles  usedbysnapshots  0  -
tank1/usr/ports/distfiles  usedbydataset  394M  -
tank1/usr/ports/distfiles  usedbychildren  0  -
tank1/usr/ports/distfiles  usedbyrefreservation  0  -
tank1/usr/ports/distfiles  logbias  latency  default
tank1/usr/ports/distfiles  dedup  off  default
tank1/usr/ports/distfiles  mlslabel  -
tank1/usr/ports/distfiles  sync  standard  default
tank1/usr/ports/distfiles  refcompressratio  1.00x  -
tank1/usr/ports/distfiles  written  394M  -
tank1/usr/src  type  filesystem  -
tank1/usr/src  creation  Wed Oct 31 19:03 2012  -
tank1/usr/src  used  471M  -
tank1/usr/src  available  2.11T  -
tank1/usr/src  referenced  471M  -
tank1/usr/src  compressratio  3.49x  -
tank1/usr/src  mounted  yes  -
tank1/usr/src  quota  none  default
tank1/usr/src  reservation  none  default
tank1/usr/src  recordsize  128K  default
tank1/usr/src  mountpoint  /usr/src  local
tank1/usr/src  sharenfs  off  default
tank1/usr/src  checksum  on  default
tank1/usr/src  compression  gzip  local
tank1/usr/src  atime  off  inherited from tank1
tank1/usr/src  devices  on  default
tank1/usr/src  exec  on  default
tank1/usr/src  setuid  on  default
tank1/usr/src  readonly  off  default
tank1/usr/src  jailed  off  default
tank1/usr/src  snapdir  hidden  default
tank1/usr/src  aclmode  discard  default
tank1/usr/src  aclinherit  restricted  default
tank1/usr/src  canmount  on  default
tank1/usr/src  xattr  off  temporary
tank1/usr/src  copies  1  default
tank1/usr/src  version  5  -
tank1/usr/src  utf8only  off  -
tank1/usr/src  normalization  none  -
tank1/usr/src  casesensitivity  sensitive  -
tank1/usr/src  vscan  off  default
tank1/usr/src  nbmand  off  default
tank1/usr/src  sharesmb  off  default
tank1/usr/src  refquota  none  default
tank1/usr/src  refreservation  none  default
tank1/usr/src  primarycache  all  default
tank1/usr/src  secondarycache  all  default
tank1/usr/src  usedbysnapshots  0  -
tank1/usr/src  usedbydataset  471M  -
tank1/usr/src  usedbychildren  0  -
tank1/usr/src  usedbyrefreservation  0  -
tank1/usr/src  logbias  latency  default
tank1/usr/src  dedup  off  default
tank1/usr/src  mlslabel  -
tank1/usr/src  sync  standard  default
tank1/usr/src  refcompressratio  3.49x  -
tank1/usr/src  written  471M  -
tank1/var  type  filesystem  -
tank1/var  creation  Wed Oct 31 19:03 2012  -
tank1/var  used  189M  -
tank1/var  available  2.11T  -
tank1/var  referenced  40.0K  -
tank1/var  compressratio  119.03x  -
tank1/var  mounted  no  -
tank1/var  quota  none  default
tank1/var  reservation  none  default
tank1/var  recordsize  128K  default
tank1/var  mountpoint  /mnt/var  local
tank1/var  sharenfs  off  default
tank1/var  checksum  on  default
tank1/var  compression  off  default
tank1/var  atime  off  inherited from tank1
tank1/var  devices  on  default
tank1/var  exec  on  default
tank1/var  setuid  on  default
tank1/var  readonly  off  default
tank1/var  jailed  off  default
tank1/var  snapdir  hidden  default
tank1/var  aclmode  discard  default
tank1/var  aclinherit  restricted  default
tank1/var  canmount  off  local
tank1/var  xattr  on  default
tank1/var  copies  1  default
tank1/var  version  5  -
tank1/var  utf8only  off  -
tank1/var  normalization  none  -
tank1/var  casesensitivity  sensitive  -
tank1/var  vscan  off  default
tank1/var  nbmand  off  default
tank1/var  sharesmb  off  default
tank1/var  refquota  none  default
tank1/var  refreservation  none  default
tank1/var  primarycache  all  default
tank1/var  secondarycache  all  default
tank1/var  usedbysnapshots  0  -
tank1/var  usedbydataset  40.0K  -
tank1/var  usedbychildren  189M  -
tank1/var  usedbyrefreservation  0  -
tank1/var  logbias  latency  default
tank1/var  dedup  off  default
tank1/var  mlslabel  -
tank1/var  sync  standard  default
tank1/var  refcompressratio  1.00x  -
tank1/var  written  40.0K  -
tank1/var/audit  type  filesystem  -
tank1/var/audit  creation  Wed Oct 31 19:03 2012  -
tank1/var/audit  used  40.0K  -
tank1/var/audit  available  2.11T  -
tank1/var/audit  referenced  40.0K  -
tank1/var/audit  compressratio  1.00x  -
tank1/var/audit  mounted  yes  -
tank1/var/audit  quota  none  default
tank1/var/audit  reservation  none  default
tank1/var/audit  recordsize  128K  default
tank1/var/audit  mountpoint  /var/audit  local
tank1/var/audit  sharenfs  off  default
tank1/var/audit  checksum  on  default
tank1/var/audit  compression  lzjb  local
tank1/var/audit  atime  off  inherited from tank1
tank1/var/audit  devices  on  default
tank1/var/audit  exec  on  default
tank1/var/audit  setuid  on  default
tank1/var/audit  readonly  off  default
tank1/var/audit  jailed  off  default
tank1/var/audit  snapdir  hidden  default
tank1/var/audit  aclmode  discard  default
tank1/var/audit  aclinherit  restricted  default
tank1/var/audit  canmount  on  default
tank1/var/audit  xattr  off  temporary
tank1/var/audit  copies  1  default
tank1/var/audit  version  5  -
tank1/var/audit  utf8only  off  -
tank1/var/audit  normalization  none  -
tank1/var/audit  casesensitivity  sensitive  -
tank1/var/audit  vscan  off  default
tank1/var/audit  nbmand  off  default
tank1/var/audit  sharesmb  off  default
tank1/var/audit  refquota  none  default
tank1/var/audit  refreservation  none  default
tank1/var/audit  primarycache  all  default
tank1/var/audit  secondarycache  all  default
tank1/var/audit  usedbysnapshots  0  -
tank1/var/audit  usedbydataset  40.0K  -
tank1/var/audit  usedbychildren  0  -
tank1/var/audit  usedbyrefreservation  0  -
tank1/var/audit  logbias  latency  default
tank1/var/audit  dedup  off  default
tank1/var/audit  mlslabel  -
tank1/var/audit  sync  standard  default
tank1/var/audit  refcompressratio  1.00x  -
tank1/var/audit  written  40.0K  -
tank1/var/log  type  filesystem  -
tank1/var/log  creation  Wed Oct 31 19:04 2012  -
tank1/var/log  used  185M  -
tank1/var/log  available  2.11T  -
tank1/var/log  referenced  185M  -
tank1/var/log  compressratio  121.95x  -
tank1/var/log  mounted  yes  -
tank1/var/log  quota  none  default
tank1/var/log  reservation  none  default
tank1/var/log  recordsize  128K  default
tank1/var/log  mountpoint  /var/log  local
tank1/var/log  sharenfs  off  default
tank1/var/log  checksum  on  default
tank1/var/log  compression  gzip  local
tank1/var/log  atime  off  inherited from tank1
tank1/var/log  devices  on  default
tank1/var/log  exec  on  default
tank1/var/log  setuid  on  default
tank1/var/log  readonly  off  default
tank1/var/log  jailed  off  default
tank1/var/log  snapdir  hidden  default
tank1/var/log  aclmode  discard  default
tank1/var/log  aclinherit  restricted  default
tank1/var/log  canmount  on  default
tank1/var/log  xattr  off  temporary
tank1/var/log  copies  1  default
tank1/var/log  version  5  -
tank1/var/log  utf8only  off  -
tank1/var/log  normalization  none  -
tank1/var/log  casesensitivity  sensitive  -
tank1/var/log  vscan  off  default
tank1/var/log  nbmand  off  default
tank1/var/log  sharesmb  off  default
tank1/var/log  refquota  none  default
tank1/var/log  refreservation  none  default
tank1/var/log  primarycache  all  default
tank1/var/log  secondarycache  all  default
tank1/var/log  usedbysnapshots  0  -
tank1/var/log  usedbydataset  185M  -
tank1/var/log  usedbychildren  0  -
tank1/var/log  usedbyrefreservation  0  -
tank1/var/log  logbias  latency  default
tank1/var/log  dedup  off  default
tank1/var/log  mlslabel  -
tank1/var/log  sync  standard  default
tank1/var/log  refcompressratio  121.95x  -
tank1/var/log  written  185M  -
tank1/var/tmp  type  filesystem  -
tank1/var/tmp  creation  Wed Oct 31 19:04 2012  -
tank1/var/tmp  used  3.69M  -
tank1/var/tmp  available  2.11T  -
tank1/var/tmp  referenced  3.69M  -
tank1/var/tmp  compressratio  1.95x  -
tank1/var/tmp  mounted  yes  -
tank1/var/tmp  quota  none  default
tank1/var/tmp  reservation  none  default
tank1/var/tmp  recordsize  128K  default
tank1/var/tmp  mountpoint  /var/tmp  local
tank1/var/tmp  sharenfs  off  default
tank1/var/tmp  checksum  on  default
tank1/var/tmp  compression  lzjb  local
tank1/var/tmp  atime  off  inherited from tank1
tank1/var/tmp  devices  on  default
tank1/var/tmp  exec  on  default
tank1/var/tmp  setuid  on  default
tank1/var/tmp  readonly  off  default
tank1/var/tmp  jailed  off  default
tank1/var/tmp  snapdir  hidden  default
tank1/var/tmp  aclmode  discard  default
tank1/var/tmp  aclinherit  restricted  default
tank1/var/tmp  canmount  on  default
tank1/var/tmp  xattr  off  temporary
tank1/var/tmp  copies  1  default
tank1/var/tmp  version  5  -
tank1/var/tmp  utf8only  off  -
tank1/var/tmp  normalization  none  -
tank1/var/tmp  casesensitivity  sensitive  -
tank1/var/tmp  vscan  off  default
tank1/var/tmp  nbmand  off  default
tank1/var/tmp  sharesmb  off  default
tank1/var/tmp  refquota  none  default
tank1/var/tmp  refreservation  none  default
tank1/var/tmp  primarycache  all  default
tank1/var/tmp  secondarycache  all  default
tank1/var/tmp  usedbysnapshots  0  -
tank1/var/tmp  usedbydataset  3.69M  -
tank1/var/tmp  usedbychildren  0  -
tank1/var/tmp  usedbyrefreservation  0  -
tank1/var/tmp  logbias  latency  default
tank1/var/tmp  dedup  off  default
tank1/var/tmp  mlslabel  -
tank1/var/tmp  sync  standard  default
tank1/var/tmp  refcompressratio  1.95x  -
tank1/var/tmp  written  3.69M  -

OK, so I forgot an important piece of info: my pool doesn't have redundancy.... Still, I don't think it's *the* problem, because it happened in the installer, and it used to happen before I RMA'd another disk (that one was hanging the same way, but more often; they are from the same batch. Not sure about ada1, but I do have a 3rd disk from the batch. I can check it tomorrow). Actually I have the replacement disk lying around... should I connect it, or is it better to leave the system as it is and wait for the problem to reappear?

--
Twoje radio

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 12 23:35:39 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 43FE018E; Fri, 12 Apr 2013 23:35:39 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 07E432FC; Fri, 12 Apr 2013 23:35:37 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id C92D247E21; Sat, 13 Apr 2013 01:35:29 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id 9848847E15; Sat, 13 Apr 2013 01:35:29 +0200 (CEST) Message-ID: <51689A2C.4080402@platinum.linux.pl> Date: Sat, 13 Apr 2013 01:35:08 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: Andriy Gapon Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> <5167B1C5.8020402@FreeBSD.org> In-Reply-To: <5167B1C5.8020402@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Apr 2013 23:35:39 -0000

http://tepeserwery.pl/nowak/freebsd/zfs_sparse_optimization.patch.txt

Does it look sane?

On 2013-04-12 09:03, Andriy Gapon wrote:
>
> ENOTIME to really investigate, but here is a basic profile result for those
> interested:
> kernel`bzero+0xa
> kernel`dmu_buf_hold_array_by_dnode+0x1cf
> kernel`dmu_read_uio+0x66
> kernel`zfs_freebsd_read+0x3c0
> kernel`VOP_READ_APV+0x92
> kernel`vn_read+0x1a3
> kernel`vn_io_fault+0x23a
> kernel`dofileread+0x7b
> kernel`sys_read+0x9e
> kernel`amd64_syscall+0x238
> kernel`0xffffffff80747e4b
>
> That's where > 99% of time is spent.
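A side note for readers following this thread: the bzero-dominated profile above is the cost of materializing zeros for file regions that have no blocks on disk. An application that only wants allocated data can skip holes entirely with lseek(2)'s SEEK_DATA/SEEK_HOLE, which FreeBSD's ZFS supports. The sketch below only illustrates that interface (it is not the patch under discussion); the file path is hypothetical and error handling is minimal.

/* Walk the allocated (non-hole) regions of a sparse file using
 * lseek(2) SEEK_DATA/SEEK_HOLE.  Illustrative sketch only; the
 * path is hypothetical. */
#include <sys/types.h>
#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	int fd = open("/home/testfs/sparsefile", O_RDONLY);
	off_t data, hole;

	if (fd == -1)
		err(1, "open");
	data = 0;
	for (;;) {
		/* Next allocated byte at or after 'data'; lseek fails
		 * with ENXIO once only a trailing hole remains. */
		data = lseek(fd, data, SEEK_DATA);
		if (data == -1)
			break;
		/* Find where this allocated run ends. */
		hole = lseek(fd, data, SEEK_HOLE);
		if (hole == -1)
			err(1, "lseek(SEEK_HOLE)");
		printf("data at [%jd, %jd)\n", (intmax_t)data,
		    (intmax_t)hole);
		/* A real copy would read() bytes in [data, hole) here. */
		data = hole;
	}
	close(fd);
	return (0);
}

For a fully sparse file like the trunc8m example elsewhere in this thread, the loop prints nothing at all: there is no data to read, and no record-sized buffers need to be zeroed.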
> From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 00:07:33 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 89216693 for ; Sat, 13 Apr 2013 00:07:33 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:16]) by mx1.freebsd.org (Postfix) with ESMTP id 6BBEE606 for ; Sat, 13 Apr 2013 00:07:33 +0000 (UTC) Received: from omta20.emeryville.ca.mail.comcast.net ([76.96.30.87]) by qmta01.emeryville.ca.mail.comcast.net with comcast id P04P1l00D1smiN4A1C7Y38; Sat, 13 Apr 2013 00:07:32 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta20.emeryville.ca.mail.comcast.net with comcast id PC7X1l00b1t3BNj8gC7X6Q; Sat, 13 Apr 2013 00:07:32 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6C8D173A33; Fri, 12 Apr 2013 17:07:31 -0700 (PDT) Date: Fri, 12 Apr 2013 17:07:31 -0700 From: Jeremy Chadwick To: Radio =?unknown-8bit?B?bcU/b2R5Y2ggYmFuZHl0w7N3?= Subject: Re: A failed drive causes system to hang Message-ID: <20130413000731.GA84309@icarus.home.lan> References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> <51688BA6.1000507@o2.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51688BA6.1000507@o2.pl> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365811652; bh=PT0bLxF4D8dG1PKVb/GQCxVHs5VocZUZPgBR+7VBvak=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=TECqcvQC/X8xihFUzKvBwDXBiny5+fxQ3HJpjfrwIT3RiPaKWFiw0gAfa+1xAKzkB kFFk0gfjoXOP+W6SOo/6ryH3spj/dun/4w2FTukQE/wiaVDpM1baYeh6/beB2WiJc1 KKNWznV8Lz1Ze8LK2lcAtoLGq9vcUR5/XvAA8QhxoU/9GR9Nxoy2SGg0975G3zhCE2 NFhJmNvpQmAoF3nokwjHj6WU/zP0kNgV3E39yYsQaujjz1X2iclyCbko5rl3DRK97R 8Tw3pKlpuQNIOgYk1S+PRMxIUN4YfKhVAQE13WxryLBLDl7yqLU8u+syuJJ+lGQf87 RW4qeCVVkO0Hg== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 00:07:33 -0000 On Sat, Apr 13, 2013 at 12:33:10AM +0200, Radio m?odych bandytw wrote: > On 13/04/2013 00:03, Jeremy Chadwick wrote: > >On Fri, Apr 12, 2013 at 11:52:31PM +0200, Radio m?odych bandytw wrote: > >>On 11/04/2013 23:24, Jeremy Chadwick wrote: > >>>On Thu, Apr 11, 2013 at 10:47:32PM +0200, Radio m?odych bandytw wrote: > >>>>Seeing a ZFS thread, I decided to write about a similar problem that > >>>>I experience. > >>>>I have a failing drive in my array. I need to RMA it, but don't have > >>>>time and it fails rarely enough to be a yet another annoyance. > >>>>The failure is simple: it fails to respond. > >>>>When it happens, the only thing I found I can do is switch consoles. > >>>>Any command fails, login fails, apps hang. > >>>> > >>>>On the 1st console I see a series of messages like: > >>>> > >>>>(ada0:ahcich0:0:0:0): CAM status: Command timeout > >>>>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated > >>>>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED > >>>> > >>>>I use RAIDZ1 and I'd expect that none single failure would cause the > >>>>system to fail... 
> >>>
> >>>You need to provide full output from "dmesg", and you need to define
> >>>what the word "fails" means (re: "any command fails", "login fails").
> >>Fails = hangs. When trying to log in, I can type my user name, but
> >>after I press enter the prompt for the password never appears.
> >>As to dmesg, tough luck. I have 2 photos on my phone and their
> >>transcripts are all I can give until the problem reappears (which
> >>should take up to 2 weeks). Photos are blurry and in many cases I'm
> >>not sure what exactly is there.
> >>
> >>Screen1:
> >>(ada0:ahcich0:0:0:0): FLUSHCACHE40. ACB: (ea?) 00 00 00 00 (cut?)
> >>(ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
> >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 05 d3(cut)
> >>00
> >>(ada0:ahcich0:0:0:0): CAM status: Command timeout
> >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 7b(cut)
> >>00
> >>(ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
> >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 d0(cut)
> >>00
> >>(ada0:ahcich0:0:0:0): CAM status: Command timeout
> >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> >>
> >>
> >>Screen 2:
> >>ahcich0: Timeout on slot 29 port 0
> >>ahcich0: (unreadable, lots of numbers, some text)
> >>(aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
> >>(aprobe0:ahcich0:0:0:0): CAM status: Command timeout
> >>(aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
> >>ahcich0: Timeout on slot 29 port 0
> >>ahcich0: (unreadable, lots of numbers, some text)
> >>(aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
> >>(aprobe0:ahcich0:0:0:0): CAM status: Command timeout
> >>(aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
> >>ahcich0: Timeout on slot 30 port 0
> >>ahcich0: (unreadable, lots of numbers, some text)
> >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
> >>(ada0:ahcich0:0:0:0): CAM status: Command timeout
> >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
> >>
> >>Both are from the same event. In general, messages:
> >>
> >>(ada0:ahcich0:0:0:0): CAM status: Command timeout
> >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED.
> >>
> >>are the most common.
> >>
> >>I've waited for more than 1/2 hour once and the system didn't return
> >>to a working state; the messages kept flowing and pretty much
> >>nothing was working. What's interesting, I remember that it happened
> >>to me even when I was using an installer (the PC-BSD one), before the
> >>actual installation began, so the disk stored no program data. And I
> >>*think* there was no ZFS yet anyway.
> >>
> >>>
> >>>I've already demonstrated that loss of a disk in raidz1 (or even 2 disks
> >>>in raidz2) does not cause ""the system to fail"" on stable/9. However,
> >>>if you lose enough members or vdevs to cause catastrophic failure, there
> >>>may be anomalies depending on how your system is set up:
> >>>
> >>>http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html
> >>>
> >>>If the pool has failmode=wait, any I/O to that pool will block (wait)
> >>>indefinitely. This is the default.
> >>>
> >>>If the pool has failmode=continue, existing write I/O operations will
> >>>fail with EIO (I/O error) (and hopefully applications/daemons will
> >>>handle that gracefully -- if not, that's their fault) but any subsequent
> >>>I/O (read or write) to that pool will block (wait) indefinitely.
> >>>
> >>>If the pool has failmode=panic, the kernel will immediately panic.
> >>>
> >>>If the CAM layer is what's wedged, that may be a different issue (and
> >>>not related to ZFS). I would suggest running stable/9 as many
> >>>improvements in this regard have been committed recently (some related
> >>>to CAM, others related to ZFS and its new "deadman" watcher).
> >>
> >>Yeah, because of the installer failure, I don't think it's related to ZFS.
> >>Even if it is, for now I won't set any ZFS properties in hope it
> >>repeats and I can get better data.
> >>>
> >>>Bottom line: terse output of the problem does not help. Be verbose,
> >>>provide all output (commands you type, everything!), as well as any
> >>>physical actions you take.
> >>>
> >>Yep. In fact having little data was what made me hesitate to write
> >>about it; since I did already, I'll do my best to get more info,
> >>though for now I can only wait for a repetition.
> >>
> >>
> >>On 12/04/2013 00:08, Quartz wrote:>
> >>>>Seeing a ZFS thread, I decided to write about a similar problem that I
> >>>>experience.
> >>>
> >>>I'm assuming you're referring to my "Failed pool causes system to hang"
> >>>thread. I wonder if there's some common issue with zfs where it locks up
> >>>if it can't write to disks how it wants to.
> >>>
> >>>I'm not sure how similar your problem is to mine. What's your pool setup
> >>>look like? Redundancy options? Are you booting from a pool? I'd be
> >>>interested to know if you can just yank the cable to the drive and see
> >>>if the system recovers.
> >>>
> >>>You seem to be worse off than me- I can still login and run at least a
> >>>couple commands. I'm booting from a straight ufs drive though.
> >>>
> >>>______________________________________
> >>>it has a certain smooth-brained appeal
> >>>
> >>Like I said, I don't think it's ZFS-specific, but just in case...:
> >>RAIDZ1, root on ZFS. I should reduce severity of a pool loss before
> >>pulling cables, so no tests for now.
> >
> >Key points:
> >
> >1. We now know why "commands hang" and anything I/O-related blocks
> >(waits) for you: because your root filesystem is ZFS. If the ZFS layer
> >is waiting on CAM, and CAM is waiting on your hardware, then those I/O
> >requests are going to block indefinitely. So now you know the answer to
> >why that happens.
> >
> >2. I agree that the problem is not likely in ZFS, but rather either with
> >CAM, the AHCI implementation used, or hardware (either disk or storage
> >controller).
> >
> >3. Your lack of "dmesg" is going to make this virtually impossible to
> >solve. We really, ***really*** need that. I cannot stress this enough.
> >This will tell us a lot of information about your system. We're also
> >going to need to see "zpool status" output, as well as "zpool get all"
> >and "zfs get all". "pciconf -lvbc" would also be useful.
> >
> >There are some known "gotchas" with certain models of hard disks or AHCI
> >controllers (which one is responsible is unknown at this time), but I don't
> >want to start jumping to conclusions until full details can be provided
> >first.
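A quick aside on the failmode property quoted above: it is an ordinary pool property, so it can be inspected and changed at runtime with zpool(8). A minimal sketch, using a hypothetical pool named tank (the value shown is the default):

# zpool get failmode tank
NAME  PROPERTY  VALUE     SOURCE
tank  failmode  wait      default
# zpool set failmode=continue tank

As the quoted text notes, failmode only governs how ZFS reacts once the pool has already suffered catastrophic failure; it does nothing about a wedged CAM layer or how long the driver waits on an unresponsive disk.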
> >
> >I would recommend formatting a USB flash drive as FAT/FAT32, booting
> >into single-user mode, then mounting the USB flash drive and issuing
> >the above commands + writing the output to files on the flash drive,
> >then provide those here.
> >
> >We really need this information.
> >
> >4. Please involve the PC-BSD folks in this discussion. They need to be
> >made aware of issues like this so they (and iXSystems, potentially) can
> >investigate from their side.
> >
> OK, thanks for the info.
> Since dmesg is so important, I'd say the best thing is to wait for
> the problem to happen again. When it does, I'll restart the thread
> with all the information that you requested here and with a PC-BSD
> cross-post.
>
> However, I just got a different hang just a while ago. This time it
> was temporary; I don't know, I switched to console0 after ~10
> seconds, there were 2 errors. Nothing appeared for ~1 minute, so I
> switched back and the system was OK. A different drive; I haven't seen
> problems with this one. And I think they used to be ahci; here it's
> ata.
>
> dmesg:
>
> fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.19
> (ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 82 46 b8 40 25 00 00 00 01 00
> (ada1:ata0:0:0:0): CAM status: Command timeout
> (ada1:ata0:0:0:0): Retrying command
> vboxdrv: fAsync=0 offMin=0x53d offMax=0x52b9
> linux: pid 17170 (npviewer.bin): syscall pipe2 not implemented
> (ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 87 1a c7 40 1a 00 00 00 01 00
> (ada1:ata0:0:0:0): CAM status: Command timeout
> (ada1:ata0:0:0:0): Retrying command
>
> {another 150KBytes of data snipped}

The above output indicates that there was a timeout when trying to issue a 48-bit DMA request to the disk. The disk did not respond to the request within 30 seconds. If you were using AHCI, we'd be able to see if the AHCI layer was reporting signalling problems or other anomalies that could explain the behaviour. With ATA, such visibility is significantly limited. It's worse if you're hiding/not showing us all of the information.

The classic FreeBSD ATA driver does not provide command queueing (NCQ), while AHCI via CAM does. The difference is that command queueing causes xxx_FPDMA_QUEUED CDBs to be issued to the disk.

I'm going to repeat myself -- for the last time: CAN YOU PLEASE JUST PROVIDE "DMESG" FROM THE SYSTEM? Like after a fresh reboot? If you're able to provide all of the above, I don't know why you can't provide dmesg. It is the most important information that there is. I am sick and tired of stressing this point.

Furthermore, please stop changing ATA vs. AHCI interface drivers. The more you change/screw around with, the less likely people are going to help. CHANGE NOTHING ON THE SYSTEM. Leave it how it is. Do not fiddle with things or start flipping switches/changing settings/etc. to "try and relieve the problem". You're asking other people for help, which means you need to be patient and follow what we ask.

Thank you for the rest of the output, however. It looks like this is another system with an ATI-based controller (which is usually the kind involved in my aforementioned "gotchas"), but there still isn't enough information that can help. I have a gut feeling of what's about to come, but I need to see dmesg output before I can determine that.

Furthermore, can you please provide this information with its formatting intact? Your Email client is screwing up "long lines" and causing unnecessary wrapping. The mailing list will nuke attachments, so please use pastebin or some similar service + provide URLs.
> OK, so I forgot an important info that my pool doesn't have redundancy.... > Still, I don't think it's *the* problem because it happened in the > installer and it used to happen before I RMA'd another disk (That > was hanging the same, but more often. They are the same batch. Not > sure about ada1, but I do have a 3rd disk from the batch. I can > check it tomorrow). Actually I have the replacement disk laying > around...should I connect it or better leave the system as it is and > wait for the problem to reappear? You're still not giving the information needed. Your reluctance and inability to provide what's asked for is really pissing me off. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 04:36:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E31182C9 for ; Sat, 13 Apr 2013 04:36:51 +0000 (UTC) (envelope-from will@firepipe.net) Received: from mail-ie0-x22e.google.com (mail-ie0-x22e.google.com [IPv6:2607:f8b0:4001:c03::22e]) by mx1.freebsd.org (Postfix) with ESMTP id B645FE9B for ; Sat, 13 Apr 2013 04:36:51 +0000 (UTC) Received: by mail-ie0-f174.google.com with SMTP id aq17so4056525iec.19 for ; Fri, 12 Apr 2013 21:36:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type:x-gm-message-state; bh=00y47FmR+2iCG8KFzTY4sIkoZO/DGR6rc5Au8hcW8H0=; b=K23XpZj6bFrGi7rsZ9/mW6+B+RktH5nA2SL5FXRPY5Lf8VlRVQCsowNXDqxwLlbbcd BUUCs4QBAPcx/Ms5ikUTlZjZmtui3z3w9gU6HYtd/PMnkWFW/hK0h2L2t5jX6IIHBfOj gPCT9a/kISkSoXzQLosT+fEkpyijeSxUIbTLJgp1ywbvX3fIxVCJyS5yM0n+MZqCpWQK /RXfXZKKiysMHkG+BH4Xd+GbmEIG1Y1DFUcOAQTtpbfLy7451QQM0jPFiMpQejWKOh3F +JKA8EwCMGdfkvgVQSglhUPc2x56y6A3KUQJjKmPBJMLcOPrFW4s0LXh2qqq5iv/7CxT zkpw== MIME-Version: 1.0 X-Received: by 10.50.7.42 with SMTP id g10mr715559iga.97.1365827811238; Fri, 12 Apr 2013 21:36:51 -0700 (PDT) Received: by 10.231.211.133 with HTTP; Fri, 12 Apr 2013 21:36:50 -0700 (PDT) In-Reply-To: <20130411160253.V1041@besplex.bde.org> References: <87CC14D8-7DC6-481A-8F85-46629F6D2249@dragondata.com> <20130411160253.V1041@besplex.bde.org> Date: Fri, 12 Apr 2013 22:36:50 -0600 Message-ID: Subject: Re: Does sync(8) really flush everything? Lost writes with journaled SU after sync+power cycle From: Will Andrews To: Bruce Evans X-Gm-Message-State: ALoCoQkIh1VPeFG0FfoO786qcvnbIqFKOcGSlGqycySeSmPxWvoDfOSzo1Uh+56r8YSx43VZc0Cg Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-fs@FreeBSD.org Filesystems" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 04:36:51 -0000 On Thu, Apr 11, 2013 at 12:30 AM, Bruce Evans wrote: > On Wed, 10 Apr 2013, Kevin Day wrote: > > Working with an environment where a system (with journaled soft-updates) >> is going to be notified that it's going to be losing power shortly, and >> needs to shut down daemons and flush everything to disk. It doesn't >> actually shutdown though, because the "power down now" command may get >> cancelled and we need to bring things back up. 
My understanding was that we
>> could call sync(8), then just wait for the power to drop.
>>
>> The problem is that we were frequently losing the last 30-60 seconds'
>> worth of filesystem changes prior to the shutdown. i.e. newly created
>> directories would disappear or fsck would reclaim them and throw them into
>> lost+found.
>>
>> I confirmed that there is no caching disk controller, and write caching
>> is disabled on the drives themselves, and the problem continued.
>>
>> On a whim, after running sync(8) once and waiting 10 seconds, I did
>> "mount -u -o ro -f /" to force the filesystem into read-only mode. It took
>> about 8 seconds to finish, gstat showed a lot of write activity, and
>> SIGINFO on the mount command showed:
>>
>
> sync(2) only schedules all writing of all modified buffers to disk. Its
> man page even says this. It doesn't wait for any of the writes to
> complete.
> Its man page says that this is a BUG, but it is intentional and sync() has
> always done this. There is no way for sync() to guarantee that all modified
> buffers have been written to disk when it returns, since even if it waited,
> buffers might be modified while it is returning. Perhaps even ones that
> would take 8 seconds to complete can be written in the few nanoseconds that
> it takes to return.
>

The behavior of sync(2) is actually filesystem-specific. sync(8) calls
sync(2), which calls sys_sync, which calls VFS_SYNC, which means the
filesystem determines the exact behavior. In the case of ZFS, its vfs_sync
performs a ZIL commit, which means that all writes up to that point will be
committed to disk prior to returning.

> sync(8) is just a wrapper around sync(2). One that doesn't even check
> for errors. Not that it could handle sync() failure. Its man page
> bogusly first claims that it "forces completion". This is not
> completely wrong, since it doesn't claim that the completion occurs
> before sync(8) exits. But then it claims that sync(8) is suitable "to
> ensure that all disk writes have been completed" in a way not suitably
> done by reboot(8) or halt(8). This wording is poor, unless it is
> intentionally weaselishly worded so that it doesn't actually claim
> full completion. It only claims more suitable completion than with
> reboot or halt. Actually, completion is not guaranteed, and what
> sync(8) provides is just less unsuitable than what reboot and halt
> provide.

I think sync(2) should be implemented to mean, where possible, the
filesystem equivalent of a CPU memory barrier. In short, you should be
guaranteed that every write you know you made prior to calling sync has
been committed to disk. Writes performed in other contexts do not receive
any such guarantee.

> To ensure completion, you have to freeze the file systems of interest
> before rebooting. I don't know of any ways to do this from userland
> except mount -u -o ro or unmount.

This is certainly true, if you want to guarantee that all writes in all
contexts were committed. But sync(2) could never be useful for that
purpose, for the reasons you mention.

--Will.
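A practical corollary to Will's point: if what you need is a completion guarantee for specific files rather than for a whole file system, fsync(2) provides it, since it does not return until that file's dirty data has been committed to stable storage (assuming, as in Kevin's setup, that drive-level write caches are disabled). A minimal sketch in C; the path is hypothetical:

/* Durably write one file with fsync(2) before an expected power
 * loss.  Sketch only; the path is hypothetical. */
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

int
main(void)
{
	const char msg[] = "state that must survive the power cut\n";
	int fd = open("/var/db/shutdown.state",
	    O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd == -1)
		err(1, "open");
	if (write(fd, msg, sizeof(msg) - 1) != (ssize_t)(sizeof(msg) - 1))
		err(1, "write");
	/* Unlike sync(2), fsync(2) blocks until this file's data
	 * has reached the device, and reports failure via errno. */
	if (fsync(fd) == -1)
		err(1, "fsync");
	close(fd);
	return (0);
}

This only covers writes the application itself made and can name; for everything else (directories created by other processes, softupdates metadata, and so on), the freeze-via-"mount -u -o ro" approach discussed above remains the userland option.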
From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 08:10:16 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0AD7629E for ; Sat, 13 Apr 2013 08:10:16 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay02.pair.com (relay02.pair.com [209.68.5.16]) by mx1.freebsd.org (Postfix) with SMTP id A0A5C781 for ; Sat, 13 Apr 2013 08:10:15 +0000 (UTC) Received: (qmail 71105 invoked by uid 0); 13 Apr 2013 08:10:08 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay02.pair.com with SMTP; 13 Apr 2013 08:10:08 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <516912E0.8080708@sneakertech.com> Date: Sat, 13 Apr 2013 04:10:08 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> <51687881.4080005@o2.pl> <20130412212207.GA81897@icarus.home.lan> <5168864A.2090602@sneakertech.com> <20130412222005.GA82884@icarus.home.lan> In-Reply-To: <20130412222005.GA82884@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 08:10:16 -0000 > My point is that the PC-BSD folks need to be made aware of this issue, > so their iXSystems folks can help out if necessary and so on. If > they're left out of the loop, then that's bad for everyone. I was under the impression that pcbsd was entirely an "apps and UI" thing, and that low level stuff like cam or zfs wouldn't be their area of expertise, or even really their responsibility since they're not messing with any of that. I don't think your embedded linux analogy is necessarily relevant, since in those cases the vendor definitely WILL be making low level changes to the base system. Unless you're saying pcbsd does something besides just adding packages to a base install and changing a few config files. ______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 08:19:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 78715471 for ; Sat, 13 Apr 2013 08:19:56 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay01.pair.com (relay01.pair.com [209.68.5.15]) by mx1.freebsd.org (Postfix) with SMTP id 1CEE07D0 for ; Sat, 13 Apr 2013 08:19:55 +0000 (UTC) Received: (qmail 68086 invoked by uid 0); 13 Apr 2013 08:19:49 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) 
(173.48.104.62) by relay01.pair.com with SMTP; 13 Apr 2013 08:19:49 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <51691524.4050009@sneakertech.com> Date: Sat, 13 Apr 2013 04:19:48 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: =?UTF-8?B?UmFkaW8gbcWCb2R5Y2ggYmFuZHl0w7N3?= Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> In-Reply-To: <5168821F.5020502@o2.pl> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 08:19:56 -0000 > As to dmesg, tough luck. I have 2 photos on my phone and their > transcripts are all I can give until the problem reappears I think there's a communication gap here. While a messages and logs from the time the incident happens are ideal, Jeremy *also* just needs to see the generic info about your hardware, which can be found in any dmesg taken at any time. ______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 08:31:08 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DFEE582F for ; Sat, 13 Apr 2013 08:31:08 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay03.pair.com (relay03.pair.com [209.68.5.17]) by mx1.freebsd.org (Postfix) with SMTP id 84323853 for ; Sat, 13 Apr 2013 08:31:08 +0000 (UTC) Received: (qmail 84369 invoked by uid 0); 13 Apr 2013 08:31:06 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay03.pair.com with SMTP; 13 Apr 2013 08:31:06 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <516917CA.5040607@sneakertech.com> Date: Sat, 13 Apr 2013 04:31:06 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: Jeremy Chadwick Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> In-Reply-To: <20130412220350.GA82467@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 08:31:08 -0000 >If the ZFS layer > is waiting on CAM, and CAM is waiting on your hardware, then those I/O > requests are going to block indefinitely. > 2. I agree that the problem is not likely in ZFS, but rather either with > CAM, the AHCI implementation used, or hardware (either disk or storage > controller). Question: How (or does) this relate to the hang that I'm seeing with my system? You mentioned cam issues when talking to me earlier, but less decisively than your comment here. What's the difference? > We're also > going to need to see "zpool status" output, as well as "zpool get all" > and "zfs get all". "pciconf -lvbc" would also be useful. 
You never asked for these when talking to me, but I can provide any of them if you want to look at them.

______________________________________
it has a certain smooth-brained appeal

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 08:45:55 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id BDECE949 for ; Sat, 13 Apr 2013 08:45:55 +0000 (UTC) (envelope-from njm@njm.me.uk) Received: from smtp003.apm-internet.net (smtp003.apm-internet.net [85.119.248.52]) by mx1.freebsd.org (Postfix) with ESMTP id 3C3068AA for ; Sat, 13 Apr 2013 08:45:54 +0000 (UTC) Received: (qmail 29208 invoked from network); 13 Apr 2013 08:39:12 -0000 Received: from unknown (HELO meld.njm.me.uk) (86.159.26.144) by smtp003.apm-internet.net with SMTP; 13 Apr 2013 08:39:12 -0000 Received: from titania.njm.me.uk (titania.njm.me.uk [192.168.144.130]) by meld.njm.me.uk (8.14.6/8.14.6) with ESMTP id r3D8dBbL016840; Sat, 13 Apr 2013 09:39:11 +0100 (BST) (envelope-from njm@njm.me.uk) Received: from titania.njm.me.uk (localhost [127.0.0.1]) by titania.njm.me.uk (8.14.6/8.14.6) with ESMTP id r3D8dBsC075480; Sat, 13 Apr 2013 09:39:11 +0100 (BST) (envelope-from njm@njm.me.uk) Received: (from njm@localhost) by titania.njm.me.uk (8.14.6/8.14.6/Submit) id r3D8dALq075479; Sat, 13 Apr 2013 09:39:10 +0100 (BST) (envelope-from njm@njm.me.uk) Date: Sat, 13 Apr 2013 09:39:10 +0100 From: "N.J. Mann" To: Quartz Subject: Re: A failed drive causes system to hang Message-ID: <20130413083910.GA73903@titania.njm.me.uk> Mail-Followup-To: Quartz , freebsd-fs@freebsd.org References: <51672164.1090908@o2.pl> <41A207817BC94167B0C94133EC0DFD68@multiplay.co.uk> <51687881.4080005@o2.pl> <20130412212207.GA81897@icarus.home.lan> <5168864A.2090602@sneakertech.com> <20130412222005.GA82884@icarus.home.lan> <516912E0.8080708@sneakertech.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <516912E0.8080708@sneakertech.com> X-Operating-System: FreeBSD 8.3-STABLE User-Agent: mutt-NJM (2010-10-31) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 08:45:55 -0000

In message <516912E0.8080708@sneakertech.com>, Quartz (quartz@sneakertech.com) wrote:
>
> > My point is that the PC-BSD folks need to be made aware of this issue,
> > so their iXSystems folks can help out if necessary and so on. If
> > they're left out of the loop, then that's bad for everyone.
>
> I was under the impression that pcbsd was entirely an "apps and UI"
> thing, and that low level stuff like cam or zfs wouldn't be their area
> of expertise, or even really their responsibility since they're not
> messing with any of that.

They have made changes to the way the system boots, e.g. files in /etc/rc.d have been changed, as have some of the loader's Forth files. While this may have no impact in this particular case, it may be relevant in other cases where users of PC-BSD are experiencing problems.
Cheers,
Nick.
--

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 10:24:19 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 50FB96F2 for ; Sat, 13 Apr 2013 10:24:19 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 9B2F4B58 for ; Sat, 13 Apr 2013 10:24:18 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA07491; Sat, 13 Apr 2013 13:24:14 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1UQxcz-000FqH-PX; Sat, 13 Apr 2013 13:24:13 +0300 Message-ID: <5169324A.3080309@FreeBSD.org> Date: Sat, 13 Apr 2013 13:24:10 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130405 Thunderbird/17.0.5 MIME-Version: 1.0 To: Adam Nowacki Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> <5167B1C5.8020402@FreeBSD.org> <51689A2C.4080402@platinum.linux.pl> In-Reply-To: <51689A2C.4080402@platinum.linux.pl> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 10:24:19 -0000

on 13/04/2013 02:35 Adam Nowacki said the following:
> http://tepeserwery.pl/nowak/freebsd/zfs_sparse_optimization.patch.txt
>
> Does it look sane?

It's hard to tell from a quick look since the change is not small.
What is your idea of the problem and the fix?

> On 2013-04-12 09:03, Andriy Gapon wrote:
>>
>> ENOTIME to really investigate, but here is a basic profile result for those
>> interested:
>> kernel`bzero+0xa
>> kernel`dmu_buf_hold_array_by_dnode+0x1cf
>> kernel`dmu_read_uio+0x66
>> kernel`zfs_freebsd_read+0x3c0
>> kernel`VOP_READ_APV+0x92
>> kernel`vn_read+0x1a3
>> kernel`vn_io_fault+0x23a
>> kernel`dofileread+0x7b
>> kernel`sys_read+0x9e
>> kernel`amd64_syscall+0x238
>> kernel`0xffffffff80747e4b
>>
>> That's where > 99% of time is spent.
>>
>

--
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 10:27:47 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id BC4F37F9 for ; Sat, 13 Apr 2013 10:27:47 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190]) by mx1.freebsd.org (Postfix) with ESMTP id 5A64FBE0 for ; Sat, 13 Apr 2013 10:27:46 +0000 (UTC) Received: from c211-30-173-106.carlnfd1.nsw.optusnet.com.au (c211-30-173-106.carlnfd1.nsw.optusnet.com.au [211.30.173.106]) by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r3DARYAX006865 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 13 Apr 2013 20:27:35 +1000 Date: Sat, 13 Apr 2013 20:27:34 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Kevin Day Subject: Re: Does sync(8) really flush everything? Lost writes with journaled SU after sync+power cycle
In-Reply-To: Message-ID: <20130413200708.J1165@besplex.bde.org> References: <87CC14D8-7DC6-481A-8F85-46629F6D2249@dragondata.com> <20130411160253.V1041@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=Ov0XUFDt c=1 sm=1 a=Cguo-lYZyhEA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=5GGpcXspQ0YA:10 a=gIcydfCnhbxikQVEtaIA:9 a=CjuIK1q_8ugA:10 a=TEtd8y5WR3g2ypngnwZWYw==:117 Cc: "freebsd-fs@FreeBSD.org Filesystems" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 10:27:47 -0000

On Fri, 12 Apr 2013, Kevin Day wrote:

> On Apr 11, 2013, at 1:30 AM, Bruce Evans wrote:
>>
>> sync(2) only schedules all writing of all modified buffers to disk. Its
>> ...
>>
>> sync(8) is just a wrapper around sync(2). One that doesn't even check
>> ...
>
> And on the flip side, the man page for syncer says:
>
> It is possible on some systems that a sync(2) occurring simultaneously
> with a crash may cause file system damage. See fsck(8).

That is not useful. It was copied from update(8) to update(4misplaced) and then to syncer(4misplaced). It should go without saying that a crash may cause file system damage irrespective of whether it occurs simultaneously with sync(2).

Apart from that: back in 1996 when these words in update(4) were written, all syncs were done by update(8) calling sync(2). With the syncer daemon, crashes may also occur concurrently with syncer daemon activity, and that, and not sync(2), became the most common source of crashes with syncs.

I think these words are just a hint about the bug that panic() calls sync(). Back in 1996, there was little locking for sync against itself (perhaps it could deadlock). Now there is lots of locking for normal sync activity (e.g., sync(2) vs the syncer daemon). I don't know how this acts in panic() but guess that it doesn't really work. So crashes may cause file system damage, but this has little to do with sync(2) or even the syncer daemon (the sync() in panic() either has to blow away all locks so that it can complete, or possibly deadlock. If it blows away all locks then it may cause damage directly, and if it deadlocks then you cause the damage using reset or power cycling). Of course, nothing about this can be seen in fsck(8). Now there are many foofs_fsck(8)'s where the things to be seen would have been moved to, if there were any.

> ...
> I understand that sync(8) returns immediately, I guess my confusion is
> that calling sync(8) doesn't seem to cause *any* writes to happen.
>
> I can have the system completely idle (absolutely no processes running
> that could cause any filesystem activity), call sync(8), and watching
> gstat(8) can see no write activity happen at all, even waiting 10+
> seconds afterwards, whereas "mount -u -o ro -f /" causes an instant
> flurry of writes to happen. My understanding was that even though sync
> returned immediately, flushing would also start immediately, and leave
> the system in a safe point, at least until another write happens.

I thought that you got the flurry of writes from sync() too, but they took 10+ seconds. Perhaps an fs bug.
Bruce

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 12:04:47 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 6B3C1C31; Sat, 13 Apr 2013 12:04:47 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 312B3D3; Sat, 13 Apr 2013 12:04:47 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 2721A47E24; Sat, 13 Apr 2013 14:04:44 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id CBC6347E21; Sat, 13 Apr 2013 14:04:44 +0200 (CEST) Message-ID: <516949C7.4030305@platinum.linux.pl> Date: Sat, 13 Apr 2013 14:04:23 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: Andriy Gapon Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> <5167B1C5.8020402@FreeBSD.org> <51689A2C.4080402@platinum.linux.pl> <5169324A.3080309@FreeBSD.org> In-Reply-To: <5169324A.3080309@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 12:04:47 -0000

Temporary dbufs are created for each missing (unallocated on disk) record, including indirects if the hole is large enough. Those dbufs never find their way to the ARC and are freed at the end of dmu_read_uio. A small read (from a hole) would in the best case bzero 128KiB (recordsize, more if missing indirects) ... and I'm running a modified ZFS with record sizes up to 8MiB.

# zfs create -o atime=off -o recordsize=8M -o compression=off -o mountpoint=/home/testfs home/testfs
# truncate -s 8m /home/testfs/trunc8m
# dd if=/dev/zero of=/home/testfs/zero8m bs=8m count=1
1+0 records in
1+0 records out
8388608 bytes transferred in 0.010193 secs (822987745 bytes/sec)
# time cat /home/testfs/trunc8m > /dev/null
0.000u 6.111s 0:06.11 100.0% 15+2753k 0+0io 0pf+0w
# time cat /home/testfs/zero8m > /dev/null
0.000u 0.010s 0:00.01 100.0% 12+2168k 0+0io 0pf+0w

A 600x increase in system time and close to 1MB/s - insanity.

The fix - a lot of the code to efficiently handle this was already there. dbuf_hold_impl has an int fail_sparse argument to return ENOENT for holes. Just had to get there and somehow back to dmu_read_uio, where zeroing can happen at byte granularity. ... didn't have time to actually test it yet.

On 2013-04-13 12:24, Andriy Gapon wrote:
> on 13/04/2013 02:35 Adam Nowacki said the following:
>> http://tepeserwery.pl/nowak/freebsd/zfs_sparse_optimization.patch.txt
>>
>> Does it look sane?
>
> It's hard to tell from a quick look since the change is not small.
> What is your idea of the problem and the fix?
>
>> On 2013-04-12 09:03, Andriy Gapon wrote:
>>>
>>> ENOTIME to really investigate, but here is a basic profile result for those
>>> interested:
>>> kernel`bzero+0xa
>>> kernel`dmu_buf_hold_array_by_dnode+0x1cf
>>> kernel`dmu_read_uio+0x66
>>> kernel`zfs_freebsd_read+0x3c0
>>> kernel`VOP_READ_APV+0x92
>>> kernel`vn_read+0x1a3
>>> kernel`vn_io_fault+0x23a
>>> kernel`dofileread+0x7b
>>> kernel`sys_read+0x9e
>>> kernel`amd64_syscall+0x238
>>> kernel`0xffffffff80747e4b
>>>
>>> That's where > 99% of time is spent.
>>>
>>
>
>

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 12:42:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5116E59A for ; Sat, 13 Apr 2013 12:42:01 +0000 (UTC) (envelope-from paulz@vanderzwan.org) Received: from cpsmtpb-ews10.kpnxchange.com (cpsmtpb-ews10.kpnxchange.com [213.75.39.15]) by mx1.freebsd.org (Postfix) with ESMTP id B96D223F for ; Sat, 13 Apr 2013 12:42:00 +0000 (UTC) Received: from cpsps-ews03.kpnxchange.com ([10.94.84.170]) by cpsmtpb-ews10.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Sat, 13 Apr 2013 14:41:57 +0200 Received: from CPSMTPM-TLF104.kpnxchange.com ([195.121.3.7]) by cpsps-ews03.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Sat, 13 Apr 2013 14:41:56 +0200 Received: from mailvm.vanderzwan.org ([77.172.189.82]) by CPSMTPM-TLF104.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Sat, 13 Apr 2013 14:41:56 +0200 Received: from [IPv6:2001:1af8:fefb::12dd:b1ff:feb3:1119] ([IPv6:2001:1af8:fefb:0:12dd:b1ff:feb3:1119]) (authenticated bits=0) by mailvm.vanderzwan.org (8.14.6/8.14.6) with ESMTP id r3DCfooj008436 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO) for ; Sat, 13 Apr 2013 14:41:55 +0200 (CEST) (envelope-from paulz@vanderzwan.org) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) Subject: Re: FreeBSD 9.1 NFSv4 client attribute cache not caching ? From: Paul van der Zwan In-Reply-To: <15B91473-99F4-4B48-BC18-D47B3037E8DF@vanderzwan.org> Date: Sat, 13 Apr 2013 14:41:50 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <495AEA10-9B8F-4A03-B706-79BF43539482@vanderzwan.org> References: <15B91473-99F4-4B48-BC18-D47B3037E8DF@vanderzwan.org> To: freebsd-fs@freebsd.org X-Mailer: Apple Mail (2.1503) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.3.9 (mailvm.vanderzwan.org [IPv6:2001:1af8:fefb::25]); Sat, 13 Apr 2013 14:41:55 +0200 (CEST) X-OriginalArrivalTime: 13 Apr 2013 12:41:56.0526 (UTC) FILETIME=[487AC4E0:01CE3844] X-RcptDomain: freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 12:42:01 -0000

On 12 Apr 2013, at 16:28, Paul van der Zwan wrote:

>
> I am running a few VirtualBox VMs with 9.1 on my OpenIndiana server and I
> noticed that make buildworld seems to take much longer
> when the clients mount /usr/src and /usr/obj over NFS V4 than when they use V3.
> Unfortunately I have to use V4 as a buildworld on V3 hangs the server completely...
> I noticed the number of PUTFH/GETATTR/GETFH calls is in the order of a few
> thousand per second
> and if I snoop the traffic I see the same filenames appear over and over again.
> It looks like the client is not caching anything at all and issuing a server
> request every time.
> I use the default mount options:
> 192.168.178.24:/data/ports on /usr/ports (nfs, nfsv4acls)
> 192.168.178.24:/data/src on /usr/src (nfs, nfsv4acls)
> 192.168.178.24:/data/obj on /usr/obj (nfs, nfsv4acls)
>
>
> I use the default mount options: > 192.168.178.24:/data/ports on /usr/ports (nfs, nfsv4acls) > 192.168.178.24:/data/src on /usr/src (nfs, nfsv4acls) > 192.168.178.24:/data/obj on /usr/obj (nfs, nfsv4acls) >=20 >=20 I had a look with dtrace=20 $ sudo dtrace -n '::getattr:start { @[stack()]=3Dcount();}' and it seems the vast majority of the calls to getattr are from open() = and close() system calls.: kernel`newnfs_request+0x631 kernel`nfscl_request+0x75 kernel`nfsrpc_getattr+0xbe kernel`nfs_getattr+0x280 kernel`VOP_GETATTR_APV+0x74 kernel`nfs_lookup+0x3cc kernel`VOP_LOOKUP_APV+0x74 kernel`lookup+0x69e kernel`namei+0x6df kernel`kern_execve+0x47a kernel`sys_execve+0x43 kernel`amd64_syscall+0x3bf kernel`0xffffffff80784947 26 kernel`newnfs_request+0x631 kernel`nfscl_request+0x75 kernel`nfsrpc_getattr+0xbe kernel`nfs_close+0x3e9 kernel`VOP_CLOSE_APV+0x74 kernel`kern_execve+0x15c5 kernel`sys_execve+0x43 kernel`amd64_syscall+0x3bf kernel`0xffffffff80784947 26 kernel`newnfs_request+0x631 kernel`nfscl_request+0x75 kernel`nfsrpc_getattr+0xbe kernel`nfs_getattr+0x280 kernel`VOP_GETATTR_APV+0x74 kernel`nfs_lookup+0x3cc kernel`VOP_LOOKUP_APV+0x74 kernel`lookup+0x69e kernel`namei+0x6df kernel`vn_open_cred+0x330 kernel`vn_open+0x1c kernel`kern_openat+0x207 kernel`kern_open+0x19 kernel`sys_open+0x18 kernel`amd64_syscall+0x3bf kernel`0xffffffff80784947 2512 kernel`newnfs_request+0x631 kernel`nfscl_request+0x75 kernel`nfsrpc_getattr+0xbe kernel`nfs_close+0x3e9 kernel`VOP_CLOSE_APV+0x74 kernel`vn_close+0xee kernel`vn_closefile+0xff kernel`_fdrop+0x3a kernel`closef+0x332 kernel`kern_close+0x183 kernel`sys_close+0xb kernel`amd64_syscall+0x3bf kernel`0xffffffff80784947 2530 I had a look at the source of nfs_close and could not find a call to = nfsrpc_getattr, and I am wondering why close would be calling getattr = anyway. If the file is closed what do we care about it's attributes.... 
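In case someone wants to reproduce the measurement, roughly this should do it (a sketch; the nfsstat flags are from memory and the attribute-cache timeout values are only illustrative):

# nfsstat -e -z                # zero the extended NFS client counters
# make -C /usr/src buildworld > /dev/null
# nfsstat -e -c                # see how many Getattr RPCs the build generated

and to test whether explicit attribute-cache timeouts make any difference:

# umount /usr/src
# mount -t nfs -o nfsv4,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60 \
    192.168.178.24:/data/src /usr/src

If the getattr counter climbs just as fast with the ac* options set, the attribute cache is being bypassed rather than merely tuned too low.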
Paul From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 15:41:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 82513A65 for ; Sat, 13 Apr 2013 15:41:32 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta03.emeryville.ca.mail.comcast.net (qmta03.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:32]) by mx1.freebsd.org (Postfix) with ESMTP id 65AF8955 for ; Sat, 13 Apr 2013 15:41:32 +0000 (UTC) Received: from omta09.emeryville.ca.mail.comcast.net ([76.96.30.20]) by qmta03.emeryville.ca.mail.comcast.net with comcast id PRbn1l0010S2fkCA3ThXjg; Sat, 13 Apr 2013 15:41:31 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta09.emeryville.ca.mail.comcast.net with comcast id PThW1l00x1t3BNj8VThXzb; Sat, 13 Apr 2013 15:41:31 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 94EC573A33; Sat, 13 Apr 2013 08:41:30 -0700 (PDT) Date: Sat, 13 Apr 2013 08:41:30 -0700 From: Jeremy Chadwick To: Quartz Subject: Re: A failed drive causes system to hang Message-ID: <20130413154130.GA877@icarus.home.lan> References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> <516917CA.5040607@sneakertech.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <516917CA.5040607@sneakertech.com> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365867691; bh=I6easHb3pWOU/hbxwiaj98GsAAYEUajP+g6aHpDkc9w=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=j7DBOxTt2pbX3o2INYXH/VIHla6c5elqcrW+UJ4W4TijXUBFJxaF/pUvZDItcSCMv TOO4kCR/qt200CFyUWh5qxp1vtIsrTw6NVrM1bi7KPAp1u6MbExnppv4tIKa14B+LX s4vpmesnlv4gTtw+QUk9Ju9ha4xEl+aWDkIbwUgPE7ryK+nt0JkkQrtS7GJ5FhFICT 3rdHhjrQArhqqZMP+LrA8yHaPdJ6RtuzQOlCWTUDRmRjTrywqEzhdSpk0yiZS2umGC ZuSqORPcvqIPhryuaXZ23z4VBoF1Xm+xn+c2zSirCKCB9c/umXNv9PL/RfzPLt9xAq kgWmGb7YaLdzw== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 15:41:32 -0000 On Sat, Apr 13, 2013 at 04:31:06AM -0400, Quartz wrote: > >If the ZFS layer > >is waiting on CAM, and CAM is waiting on your hardware, then those I/O > >requests are going to block indefinitely. > > >2. I agree that the problem is not likely in ZFS, but rather either with > >CAM, the AHCI implementation used, or hardware (either disk or storage > >controller). > > Question: > > How (or does) this relate to the hang that I'm seeing with my > system? It doesn't relate in any way, shape, or form. This is what happens when end-users start to try and "correlate" issues to one another's without actually taking the time to fully read the thread and follow along actively. This has now happened *twice* with this thread (once from user Lawrence K. Chen, and now another from radiomlodychbandytow@o2.pl). This sort of behavioural thing has happened with FreeBSD, particularly with regards to storage/filesystems/etc., for as long as I can remember. I am not going to get into a discussion on how to solve such social dilemmas because the procedure is to use send-pr and wait for someone in-the-know to respond asking for relevant information. 
The FreeBSD Handbook goes over how to file a PR and what to put in it. http://www.freebsd.org/send-pr.html http://www.freebsd.org/doc/en_US.ISO8859-1/articles/problem-reports/article.html > You mentioned cam issues when talking to me earlier, but > less decisively than your comment here. What's the difference? Your issue: "on my raidz2 pool, when I lose more than 2 disks, I/O to the pool stalls indefinitely, but I can still use the system barring ZFS-related things; I don't know how to get the system back into a usable state from this situation". That's based on these two statements: http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016822.html http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016847.html radiomlodychbandytow@o2.pl's issue: "I'm seeing ATA-level errors from one or more of my disks, can someone help?" Lawrence K. Chen's issue: "I had a crash/issue and then the system hung for a very long time at the mountroot phase". Given the information known at this time, ALL THREE of these issues are unrelated to one another. As I've said elsewhere: it is very important every single issue reported is handled individually/separately. I was given this advice from a FreeBSD kernel developer some years ago and it's excellent. It might seem logical to try and correlate such things, but a lot of the time this turns out to be wrong and is a great waste of everyone's time. So Just Don't Do It(tm). > >We're also > >going to need to see "zpool status" output, as well as "zpool get all" > >and "zfs get all". "pciconf -lvbc" would also be useful. > > You never asked for these when talking to me, but I can provide any > of it if you want to look at it. At this point in the conversation, WRT your issue, there's no indication that it would help, but you've already given dmesg output: http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016840.html Else, all you've provided so far is a general explanation. You have still not provided concise step-by-step information like I've asked. I've gone so far as to give you an example of what to provide: http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html I will again point to the 2nd-to-last paragraph of my above referenced mail. Another example of troubleshooting and how to do it: here's effort I went through over the course of some months to track down a bug in CAM: http://lists.freebsd.org/pipermail/freebsd-fs/2013-January/016324.html READ: I'm not saying your issue is with CAM (it may be, but it may not be -- there isn't enough information right now to determine that). I'm giving you an example of the troubleshooting/debugging effort that has to go into things for issues of this nature. You can even see from my quoted material in that link that I spent many hours doing step-by-step QA only to find I messed up in the process and had to start over the following day. It happens. Once concise details are given and (highly preferable!) a step-by-step way to reproduce the issue 100% of the time (including all commands, all output seen, all physical actions taken, etc.), then the kernel folks tend to get involved. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 17:11:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8380445C for ; Sat, 13 Apr 2013 17:11:31 +0000 (UTC) (envelope-from will@firepipe.net) Received: from mail-ia0-x236.google.com (mail-ia0-x236.google.com [IPv6:2607:f8b0:4001:c02::236]) by mx1.freebsd.org (Postfix) with ESMTP id 563A6C23 for ; Sat, 13 Apr 2013 17:11:31 +0000 (UTC) Received: by mail-ia0-f182.google.com with SMTP id u20so3232919iag.41 for ; Sat, 13 Apr 2013 10:11:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type:x-gm-message-state; bh=zOYGt5sb3dEgXyfbgQb3tD7tMu1qxZ2GnLBNo/MzzZk=; b=Od9gee+KISAEEKkupt50uJhm6xGIiKr/wi3L4Pg50MfSqXlFHNWoiuNja8BfU31FYy 874tSxa/dgnQsqyQ+/uoBaasiLgoNpzwYwciVLUUXaj7BmVQEiKvv5kD8kJ+C+SkozJP NxDH7JzNeCVNlGXo5LsZWSBKu18+w/fPZ5QgAZ6/rD+ZUSPKm5HYq0iSI48ifRx0ktM6 0+HT/ubcMpQqkQw9avOH9FhKjHepF1Es5ySHCUeWwFhcD9AXCqrMMNoH/SBM8eg7b+qr 8k+ikXBaw2NpUxq9OTF6T0OS+mM0iWOIHRAk9ZeyWLI+AxgOrj4nFttoSjfSgrcx0DLW o6SQ== MIME-Version: 1.0 X-Received: by 10.42.155.66 with SMTP id t2mr6972683icw.10.1365873091004; Sat, 13 Apr 2013 10:11:31 -0700 (PDT) Received: by 10.231.211.133 with HTTP; Sat, 13 Apr 2013 10:11:30 -0700 (PDT) In-Reply-To: <516949C7.4030305@platinum.linux.pl> References: <5166EA43.7050700@platinum.linux.pl> <5167B1C5.8020402@FreeBSD.org> <51689A2C.4080402@platinum.linux.pl> <5169324A.3080309@FreeBSD.org> <516949C7.4030305@platinum.linux.pl> Date: Sat, 13 Apr 2013 11:11:30 -0600 Message-ID: Subject: Re: ZFS slow reads for unallocated blocks From: Will Andrews To: Adam Nowacki X-Gm-Message-State: ALoCoQkDBoCVcVKKzRRbj+txE6t7BH+V+4Hp8lTfFr0WG9Qp9rsX9cBl+DliPBXuDB/WUeKrBcbw Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "freebsd-fs@freebsd.org" , Andriy Gapon X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 17:11:31 -0000 Hi, I think the idea of using a pre-zeroed region as the 'source' is a good one, but probably it would be better to set a special flag on a hole dbuf than to require caller flags. That way, ZFS can lazily evaluate the hole dbuf (i.e. avoid zeroing db_data until it has to). However, that could be complicated by the fact that there are many potential users of hole dbufs that would want to write to the dbuf. This sort of optimization should be brought to the illumos zfs list. As it stands, your patch is also FreeBSD-specific, since 'zero_region' only exists in vm/vm_kern.c. Given the frequency of zero-copying, however, it's quite possible there are other versions of this region elsewhere. --Will. On Sat, Apr 13, 2013 at 6:04 AM, Adam Nowacki wrote: > Temporary dbufs are created for each missing (unallocated on disk) record, > including indirects if the hole is large enough. Those dbufs never find way > to ARC and are freed at the end of dmu_read_uio. > > A small read (from a hole) would in the best case bzero 128KiB > (recordsize, more if missing indirects) ... and I'm running modified ZFS > with record sizes up to 8MiB. 
> > # zfs create -o atime=off -o recordsize=8M -o compression=off -o
> mountpoint=/home/testfs home/testfs
> # truncate -s 8m /home/testfs/trunc8m
> # dd if=/dev/zero of=/home/testfs/zero8m bs=8m count=1
> 1+0 records in
> 1+0 records out
> 8388608 bytes transferred in 0.010193 secs (822987745 bytes/sec)
>
> # time cat /home/testfs/trunc8m > /dev/null
> 0.000u 6.111s 0:06.11 100.0% 15+2753k 0+0io 0pf+0w
>
> # time cat /home/testfs/zero8m > /dev/null
> 0.000u 0.010s 0:00.01 100.0% 12+2168k 0+0io 0pf+0w
>
> 600x increase in system time and close to 1MB/s - insanity.
>
> The fix - a lot of the code to efficiently handle this was already there.
>
> dbuf_hold_impl has an int fail_sparse argument to return ENOENT for
> holes. Just had to get there and somehow back to dmu_read_uio where
> zeroing can happen at byte granularity.
>
> ... didn't have time to actually test it yet.
>
>
> On 2013-04-13 12:24, Andriy Gapon wrote:
>
>> on 13/04/2013 02:35 Adam Nowacki said the following:
>>
>>> http://tepeserwery.pl/nowak/freebsd/zfs_sparse_optimization.patch.txt
>>>
>>> Does it look sane?
>>>
>>
>> It's hard to tell from a quick look since the change is not small.
>> What is your idea of the problem and the fix?
>>
>> On 2013-04-12 09:03, Andriy Gapon wrote:
>>>
>>>>
>>>> ENOTIME to really investigate, but here is a basic profile result for
>>>> those
>>>> interested:
>>>> kernel`bzero+0xa
>>>> kernel`dmu_buf_hold_array_by_dnode+0x1cf
>>>> kernel`dmu_read_uio+0x66
>>>> kernel`zfs_freebsd_read+0x3c0
>>>> kernel`VOP_READ_APV+0x92
>>>> kernel`vn_read+0x1a3
>>>> kernel`vn_io_fault+0x23a
>>>> kernel`dofileread+0x7b
>>>> kernel`sys_read+0x9e
>>>> kernel`amd64_syscall+0x238
>>>> kernel`0xffffffff80747e4b
>>>>
>>>> That's where > 99% of time is spent.
>>>>
>>>>
>>>
>>
>>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 17:34:02 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id BE1B6611 for ; Sat, 13 Apr 2013 17:34:02 +0000 (UTC) (envelope-from mxb@alumni.chalmers.se) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mx1.freebsd.org (Postfix) with ESMTP id 47A3ED26 for ; Sat, 13 Apr 2013 17:34:01 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id z13so3430853lbh.41 for ; Sat, 13 Apr 2013 10:33:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:content-type:content-transfer-encoding:subject :message-id:date:to:mime-version:x-mailer:x-gm-message-state; bh=4Gbx1MZ5Y2SWb62l42HgRLej+WI2sh0PD5InflILqh8=; b=NyPKiVVOYdaFZ0lV4CrrDbVJ3tAjT306BR8d3MF4RMH7tkrZGovoupSuExFM8Tw4bJ VupcfnQDgc00Q+vgYTP6QkWJR7bG39aDm5FOe0/Ht82YTx/zbNnnnT+hSsJQ1v4K22MU SLAznpeffYDUvNsZf/Z8Drn8K+J2xAzKHG3qDBTlaUvhbuIdwW1jn5OVijltKKS3t1MH 4EUJP7l+tPV9PivK1tac8GjCCQHR1xKqJz6Dl2vdCjAuDI7fqA0ocvRVpxymSxZaVwSz WVMpkioFofHe+NDh7sbhFV8TREbq/yEPjxYRwc8Dxh4nuLHbJIhSDrFwLGNx2mPHEeL+ Fc6A== X-Received: by 10.152.116.52 with SMTP id jt20mr7525679lab.52.1365874434514; Sat, 13 Apr 2013 10:33:54 -0700 (PDT) Received: from grey.home.unixconn.com (h-74-23.a183.priv.bahnhof.se.
[46.59.74.23]) by mx.google.com with ESMTPS id t17sm5129044lbd.11.2013.04.13.10.33.52 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 13 Apr 2013 10:33:53 -0700 (PDT) From: mxb Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Subject: ZFS: ZIL device export/import Message-Id: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> Date: Sat, 13 Apr 2013 19:33:51 +0200 To: freebsd-fs@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) X-Mailer: Apple Mail (2.1503) X-Gm-Message-State: ALoCoQkEJapnEYgA3XBpf24Wr+7QnvmvZnujLqWkxmZLdJbbafEP0R6HvrUJLJnavVyVUfjfaaqw X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 17:34:02 -0000

Hello list,

I currently have 2x Head Units (HU) [FreeBSD 9.1] connected to the same JBOD via SAS Expander.
Each HU has separate ZIL and L2ARC devices. Hardware on both HU (inc. SSD disks for ZIL/L2ARC) is identical.

This is basically a HA-setup.

When I do a 'zpool export tank' on the first HU and a 'zpool import tank' on the second one, only the L2ARC device appears usable.
Import fails, complaining about the ZIL device not being present.

According to the man page zpool(8), a ZIL device can be imported and exported.

"… Log devices can be added, replaced, attached, detached, imported and exported as part of the larger pool. …".

Do I miss something here?
Is this feature not implemented yet?
Any way to work around this, except moving the ZIL into the JBOD?

//maxim

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 17:53:43 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 87859CEB for ; Sat, 13 Apr 2013 17:53:43 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 4C55BE1A for ; Sat, 13 Apr 2013 17:53:43 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 1A85947E1A; Sat, 13 Apr 2013 19:53:40 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id C3D1F47E14 for ; Sat, 13 Apr 2013 19:53:40 +0200 (CEST) Message-ID: <51699B8E.7050003@platinum.linux.pl> Date: Sat, 13 Apr 2013 19:53:18 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: ZFS: ZIL device export/import References: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> In-Reply-To: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 17:53:43 -0000

On 2013-04-13 19:33, mxb wrote:
>
> Hello list,
>
> I currently have 2x Head Units (HU) [FreeBSD 9.1] connected to the same JBOD via SAS Expander.
> Each HU has separate ZIL and L2ARC devices. Hardware on both HU (inc. SSD disks for ZIL/L2ARC) is identical.
>
> This is basically a HA-setup.
>
> When I do a 'zpool export tank' on the first HU and a 'zpool import tank' on the second one, only the L2ARC device appears usable.
> Import fails, complaining about the ZIL device not being present.
>
> According to the man page zpool(8), a ZIL device can be imported and exported.
>
> "… Log devices can be added, replaced, attached, detached, imported and
> exported as part of the larger pool. …".
>
> Do I miss something here?
> Is this feature not implemented yet?
> Any way to work around this, except moving the ZIL into the JBOD?

From the same man page:
-m Enables import with missing log devices.

... but that won't be HA, since on an unclean shutdown of one, the other won't be able to replay the log and some recent writes (or worse) will be lost.

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 18:03:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E1824F1D for ; Sat, 13 Apr 2013 18:03:51 +0000 (UTC) (envelope-from mxb@alumni.chalmers.se) Received: from mail-la0-x22b.google.com (mail-la0-x22b.google.com [IPv6:2a00:1450:4010:c03::22b]) by mx1.freebsd.org (Postfix) with ESMTP id 6ADBEE73 for ; Sat, 13 Apr 2013 18:03:51 +0000 (UTC) Received: by mail-la0-f43.google.com with SMTP id eg20so2580248lab.30 for ; Sat, 13 Apr 2013 11:03:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:content-type:mime-version:subject:from:in-reply-to:date :content-transfer-encoding:message-id:references:to:x-mailer :x-gm-message-state; bh=hz4NYMDGvCLaArJKHRH344N26BpD3s8JfWyrQL+OrgM=; b=caOBUfMQku3jirtGw15QLprqHgSHju1WQDxuQRErCHCUF3YNQEJURFTCcdi3dxM/TU 07l+4yH+ZlVwu5TcAG0KAdtuCmMWMcZ7E9pJ319tGZvB6s9LOL2isKwIIHzTGNJSsorU nmgPuVVsaljk2BuFDqdN+F1t1hilwsCI4kZCqCealZUdODbBIPOhOTQDo2RcV3LjEsSM uFRBNIQhRqnXxVw18pFOVlCEIj28Af+NFc0uKXx7Uhlf90d4w/SeCpbHZQBNRBG0MFjB Jg0cQ5y23jFcDDo+OY6z/FbaSv1lFToN6FeP92uzuCaGLgPY2MNzGaTS55SXqJEjGtRK VBag== X-Received: by 10.112.1.169 with SMTP id 9mr7649128lbn.130.1365876230069; Sat, 13 Apr 2013 11:03:50 -0700 (PDT) Received: from grey.home.unixconn.com (h-74-23.a183.priv.bahnhof.se. [46.59.74.23]) by mx.google.com with ESMTPS id m9sm5187055lbm.3.2013.04.13.11.03.48 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 13 Apr 2013 11:03:48 -0700 (PDT) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) Subject: Re: ZFS: ZIL device export/import From: mxb In-Reply-To: <51699B8E.7050003@platinum.linux.pl> Date: Sat, 13 Apr 2013 20:03:47 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> <51699B8E.7050003@platinum.linux.pl> To: "freebsd-fs@freebsd.org" X-Mailer: Apple Mail (2.1503) X-Gm-Message-State: ALoCoQnVmghjBHDQ62nvkNRCwIRPKYg11dqvfQHnL3f+GGrvVV7mtK54Xh9mdIUYfhImFv/MOO71 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 18:03:51 -0000

Yes, but this will require some additional interactions.

The big question is WHY export/import works fine for the L2ARC device (cache), but not for the log (ZIL device)?

Well, I tried to partition and then label (via glabel) slices on both ZIL and L2ARC, then created a new pool with labels.
Result is the same - L2ARC gets attached upon import, but not ZIL.
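For reference, the layout was created along these lines (a sketch from memory - the device and label names are illustrative, not the exact ones I used):

# gpart create -s gpt ada1                # local SSD in the head unit
# gpart add -t freebsd-zfs -s 8G ada1     # slice for the ZIL
# gpart add -t freebsd-zfs ada1           # rest for the L2ARC
# glabel label zil0 /dev/ada1p1
# glabel label l2arc0 /dev/ada1p2
# zpool add tank log label/zil0
# zpool add tank cache label/l2arc0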
GUID reported by 'zdb' is the same for the ZIL partitions on both HU (as well as for the L2ARC). The size, of course, is the same.

//mxb

On 13 apr 2013, at 19:53, Adam Nowacki wrote:

> On 2013-04-13 19:33, mxb wrote:
>>
>> Hello list,
>>
>> I currently have 2x Head Units (HU) [FreeBSD 9.1] connected to the same JBOD via SAS Expander.
>> Each HU has separate ZIL and L2ARC devices. Hardware on both HU (inc. SSD disks for ZIL/L2ARC) is identical.
>>
>> This is basically a HA-setup.
>>
>> When I do a 'zpool export tank' on the first HU and a 'zpool import tank' on the second one, only the L2ARC device appears usable.
>> Import fails, complaining about the ZIL device not being present.
>>
>> According to the man page zpool(8), a ZIL device can be imported and exported.
>>
>> "… Log devices can be added, replaced, attached, detached, imported and
>> exported as part of the larger pool. …".
>>
>> Do I miss something here?
>> Is this feature not implemented yet?
>> Any way to work around this, except moving the ZIL into the JBOD?
>
> From the same man page:
> -m Enables import with missing log devices.
>
> ... but that won't be HA, since on an unclean shutdown of one, the other won't be able to replay the log and some recent writes (or worse) will be lost.
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 18:10:37 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A7133271 for ; Sat, 13 Apr 2013 18:10:37 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.78]) by mx1.freebsd.org (Postfix) with ESMTP id 424E7EAB for ; Sat, 13 Apr 2013 18:10:37 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1UR4uC-00055P-KK; Sat, 13 Apr 2013 20:10:29 +0200 Received: from dhcp-077-251-158-153.chello.nl ([77.251.158.153] helo=pinky) by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1UR4uC-0000VP-8A; Sat, 13 Apr 2013 20:10:28 +0200 Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes To: "freebsd-fs@freebsd.org" , mxb Subject: Re: ZFS: ZIL device export/import References: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> <51699B8E.7050003@platinum.linux.pl> Date: Sat, 13 Apr 2013 20:10:29 +0200 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Ronald Klop" Message-ID: In-Reply-To: User-Agent: Opera Mail/12.15 (Win32) X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: / X-Spam-Score: 0.8 X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.1 X-Scan-Signature: f0d5e446bfc5bbd6ce781899a390d841 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 18:10:37 -0000

On Sat, 13 Apr 2013 20:03:47 +0200, mxb wrote:
>
> Yes, but this will require some additional interactions.
>
> The big question is WHY export/import works fine for the L2ARC device
> (cache), but not for the log (ZIL device)?
>
> Well, I tried to partition and then label (via glabel) slices on both
> ZIL and L2ARC, then created a new pool with labels.
> Result is the same - L2ARC gets attached upon import, but not ZIL.
> GUID reported by 'zdb' is the same for the ZIL partitions on both HU (as
> well as for the L2ARC). The size, of course, is the same.

The L2ARC is considered empty on startup/import. The ZIL might contain valuable data after a crash. So your setup is wrong. The ZIL is supposed to be one-on-one with the pool. You should move the ZILs to the JBOD. You can make a mirror of the ZIL devices to improve failsafe operation by redundancy.

Ronald.

> //mxb
>
>
> On 13 apr 2013, at 19:53, Adam Nowacki wrote:
>
>> On 2013-04-13 19:33, mxb wrote:
>>>
>>> Hello list,
>>>
>>> I currently have 2x Head Units (HU) [FreeBSD 9.1] connected to the
>>> same JBOD via SAS Expander.
>>> Each HU has separate ZIL and L2ARC devices. Hardware on both HU (inc.
>>> SSD disks for ZIL/L2ARC) is identical.
>>>
>>> This is basically a HA-setup.
>>>
>>> When I do a 'zpool export tank' on the first HU and a 'zpool import
>>> tank' on the second one, only the L2ARC device appears usable.
>>> Import fails, complaining about the ZIL device not being present.
>>>
>>> According to the man page zpool(8), a ZIL device can be imported and
>>> exported.
>>>
>>> "… Log devices can be added, replaced, attached, detached, imported and
>>> exported as part of the larger pool. …".
>>>
>>> Do I miss something here? Yes.
>>> Is this feature not implemented yet?
>>> Any way to work around this, except moving the ZIL into the JBOD?
>>
>> From the same man page:
>> -m Enables import with missing log devices.
>>
>> ... but that won't be HA, since on an unclean shutdown of one, the other
>> won't be able to replay the log and some recent writes (or worse) will
>> be lost.
>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 18:11:21 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B609A2EE for ; Sat, 13 Apr 2013 18:11:21 +0000 (UTC) (envelope-from mxb@alumni.chalmers.se) Received: from mail-lb0-f173.google.com (mail-lb0-f173.google.com [209.85.217.173]) by mx1.freebsd.org (Postfix) with ESMTP id 3EAA6EB4 for ; Sat, 13 Apr 2013 18:11:20 +0000 (UTC) Received: by mail-lb0-f173.google.com with SMTP id w20so3538812lbh.32 for ; Sat, 13 Apr 2013 11:11:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:content-type:message-id:mime-version:subject:date :references:to:in-reply-to:x-mailer:x-gm-message-state; bh=Ea7K3EhLESh3X70WTQaQpFYH8AvDUTIGX2tMKct/e6c=; b=LqrdL9jIn41QCNczf2Fl70uv7N20vUgqIMtHsiaCYgVSkF0hVie2Mb4mBXedKhZP04 QPMjUGKNJ++jUkgi2DjChTf2Ja+pWrctEZOmSsImhK/ALe6Xtf2dzGt7e8jlbW/K0e8R yhYT3ccHZ+5Fruba3fiWTLQAQjRhftQE1R3qHKsrvkM5qL+8BkQY5ITit/0HRQA/rHTL wDbw+6dyc/kEkDbIfeNcprAxjbzto7XiMKC1/lWrt5s/a53sEr3ofRJEWLk5AanV+7Yl S8iC1CVoGHUeb3Vn2MRx6mXvSnL2yWmB7nYDw7+yC0sEgs7NzUmPAIXf8A9TJC0LFDwn EPuQ== X-Received: by 10.152.26.101 with SMTP id k5mr7557606lag.31.1365876679820; Sat, 13 Apr 2013 11:11:19 -0700 (PDT) Received: from grey.home.unixconn.com (h-74-23.a183.priv.bahnhof.se. [46.59.74.23]) by mx.google.com with ESMTPS id l20sm5179171lbv.9.2013.04.13.11.11.18 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 13 Apr 2013 11:11:19 -0700 (PDT) From: mxb Message-Id: <82C9DE61-99AC-4C38-B415-18C8795B056C@alumni.chalmers.se> Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) Subject: Re: ZFS: ZIL device export/import Date: Sat, 13 Apr 2013 20:11:17 +0200 References: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> <51699B8E.7050003@platinum.linux.pl> To: "freebsd-fs@freebsd.org" In-Reply-To: <51699B8E.7050003@platinum.linux.pl> X-Mailer: Apple Mail (2.1503) X-Gm-Message-State: ALoCoQmvxGpGTxnD3E4Ms4QjQnsKDDJdYid3XHFk4ZQwi/dipaaoe7N6lhqWK+TpxPge/66BgMax Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 18:11:21 -0000

On 13 apr 2013, at 19:53, Adam Nowacki wrote:

> From the same man page:
> -m Enables import with missing log devices.

It is not missing. zpool just does not recognize it for some reason. Sure, the hw itself is not the same - another server.
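One way to see what the second head actually thinks of the log device is to dump the on-disk vdev labels directly, e.g. (the label path is only illustrative):

# zdb -l /dev/label/zil0

Comparing the guid and pool_guid fields from both head units should show whether the importing head is really looking at the same log device the pool expects, or at a stale/different label.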
From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 18:14:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id BCA62396 for ; Sat, 13 Apr 2013 18:14:54 +0000 (UTC) (envelope-from spork@bway.net) Received: from smtp2.bway.net (smtp2.bway.net [216.220.96.28]) by mx1.freebsd.org (Postfix) with ESMTP id 9B942ECC for ; Sat, 13 Apr 2013 18:14:54 +0000 (UTC) Received: from hotlap.sporklab.com (foon.sporktines.com [96.57.144.66]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: spork@bway.net) by smtp2.bway.net (Postfix) with ESMTPSA id C32E29587F; Sat, 13 Apr 2013 14:14:43 -0400 (EDT) References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <51691524.4050009@sneakertech.com> In-Reply-To: <51691524.4050009@sneakertech.com> Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii Message-Id: <4617BC69-842C-422E-9616-3BCDC11C0048@bway.net> Content-Transfer-Encoding: quoted-printable From: Charles Sprickman Subject: Re: A failed drive causes system to hang Date: Sat, 13 Apr 2013 14:14:42 -0400 To: Quartz X-Mailer: Apple Mail (2.1085) Cc: freebsd-fs@freebsd.org, =?utf-8?Q?Radio_m=C5=82odych_bandyt=C3=B3w?= X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 18:14:54 -0000

On Apr 13, 2013, at 4:19 AM, Quartz wrote:

>
>> As to dmesg, tough luck. I have 2 photos on my phone and their
>> transcripts are all I can give until the problem reappears
>
> I think there's a communication gap here.
>
> While messages and logs from the time the incident happens are ideal, Jeremy *also* just needs to see the generic info about your hardware, which can be found in any dmesg taken at any time.

More specifically, I think the OP did supply the full output of the 'dmesg' *command*, but what I think is wanted is the contents of /var/run/dmesg.boot.
Charles

>
> ______________________________________
> it has a certain smooth-brained appeal
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 18:15:17 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9D1BA40E for ; Sat, 13 Apr 2013 18:15:17 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 62414ED4 for ; Sat, 13 Apr 2013 18:15:17 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id B529047E1A; Sat, 13 Apr 2013 20:15:15 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id 8B4BB47E14 for ; Sat, 13 Apr 2013 20:15:15 +0200 (CEST) Message-ID: <5169A09D.1010403@platinum.linux.pl> Date: Sat, 13 Apr 2013 20:14:53 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: ZFS: ZIL device export/import References: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> <51699B8E.7050003@platinum.linux.pl> <82C9DE61-99AC-4C38-B415-18C8795B056C@alumni.chalmers.se> In-Reply-To: <82C9DE61-99AC-4C38-B415-18C8795B056C@alumni.chalmers.se> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 18:15:17 -0000

It is not missing, but in an inconsistent state with the pool. If the import were to succeed, it is almost certain the entire pool would explode - because it would replay a log from the past.

On 2013-04-13 20:11, mxb wrote:
>
> On 13 apr 2013, at 19:53, Adam Nowacki wrote:
>
>> From the same man page:
>> -m Enables import with missing log devices.
>
>
> It is not missing.
> zpool just does not recognize it for some reason. Sure, the hw itself is not the same - another server.
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 18:25:44 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A510AB17 for ; Sat, 13 Apr 2013 18:25:44 +0000 (UTC) (envelope-from mxb@alumni.chalmers.se) Received: from mail-la0-x22e.google.com (mail-la0-x22e.google.com [IPv6:2a00:1450:4010:c03::22e]) by mx1.freebsd.org (Postfix) with ESMTP id 2CBF4F47 for ; Sat, 13 Apr 2013 18:25:43 +0000 (UTC) Received: by mail-la0-f46.google.com with SMTP id ea20so3310596lab.5 for ; Sat, 13 Apr 2013 11:25:43 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:content-type:message-id:mime-version:subject:date :references:to:in-reply-to:x-mailer:x-gm-message-state; bh=Ntp6/UZvDFvl5L2U+3G5OSQqqACE4vip3zk5bwseAjg=; b=nzBikJBB3TX5gi3g2jQduzmhYTs4PvUcvvX1nF2IMP4XShMs7KGBgbqCoGFuSAJyak JVobq0rxNdrQj+MLuTsOAzY2HDzAIfo/tAD1X1wdiTNWm5zelmfIYNjMQ4N57ix3WMhX ivsYcnlRS1vR4LYQgc1iG/691rzuta3VONOvZfa2krF6OPxrYUznMO6M3axvRBv48OsS vqVssyO8Pl6O87SPeHinB7P/9NqmV2N7LUtekhn09Lw8GdCDCqaOdqKCfSxuO+ZFNsxh pb9KYAVJRJEodQDB9oFtfyOSQNLibKlZ/V8JFmFKOKb6WQX8I/2mfKBPk081PoxRsWHF 0QYg== X-Received: by 10.152.105.244 with SMTP id gp20mr7617439lab.34.1365877543031; Sat, 13 Apr 2013 11:25:43 -0700 (PDT) Received: from grey.home.unixconn.com (h-74-23.a183.priv.bahnhof.se. [46.59.74.23]) by mx.google.com with ESMTPS id z10sm5217529lbz.1.2013.04.13.11.25.41 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 13 Apr 2013 11:25:42 -0700 (PDT) From: mxb Message-Id: <2DE8AD5E-B84C-4D88-A242-EA30EA4A68FD@alumni.chalmers.se> Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\)) Subject: Re: ZFS: ZIL device export/import Date: Sat, 13 Apr 2013 20:25:40 +0200 References: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> <51699B8E.7050003@platinum.linux.pl> To: "freebsd-fs@freebsd.org" In-Reply-To: X-Mailer: Apple Mail (2.1503) X-Gm-Message-State: ALoCoQkUsmsKuEz42luadxvF0WBAAyawVy3ZJU24TP3Yj9ADSKx7nQ2c/tAwSA8i7jemQddy2VAJ Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 18:25:44 -0000

On 13 apr 2013, at 20:10, "Ronald Klop" wrote:

> The L2ARC is considered empty on startup/import. The ZIL might contain valuable data after a crash. So your setup is wrong. The ZIL is supposed to be one-on-one with the pool. You should move the ZILs to the JBOD. You can make a mirror of the ZIL devices to improve failsafe operation by redundancy.

I figured that out with a mirror, thus my tests with slices - one slice on the local SSD (per HU), the second half of the mirror on a JBOD slice (dedicated SSD there too). But this requires extra 'zpool online' and 'zpool replace' steps.

As I understand it, there is no other way? I'm forced to do those steps?

A ZIL device on the JBOD is a bit odd - the idea with a local (per-HU) ZIL is to postpone transfer of the data over the SAS expander, or at least buffer it and then move it over the SAS expander.
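For completeness, the mirrored-log layout I mean would be built roughly like this (device labels are illustrative):

# zpool add tank log mirror label/zil-local label/zil-jbod
# zpool status tank     # the log mirror should show up as its own vdev

With one half of the mirror living in the JBOD, the surviving head can at least import with a degraded log instead of a missing one.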
//mxb=20= From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 19:24:33 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 71504325; Sat, 13 Apr 2013 19:24:33 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 1913910D4; Sat, 13 Apr 2013 19:24:32 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 0CD8947E27; Sat, 13 Apr 2013 21:24:30 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.1.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id 5160347E21; Sat, 13 Apr 2013 21:24:30 +0200 (CEST) Message-ID: <5169B0D7.9090607@platinum.linux.pl> Date: Sat, 13 Apr 2013 21:24:07 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: Will Andrews Subject: Re: ZFS slow reads for unallocated blocks References: <5166EA43.7050700@platinum.linux.pl> <5167B1C5.8020402@FreeBSD.org> <51689A2C.4080402@platinum.linux.pl> <5169324A.3080309@FreeBSD.org> <516949C7.4030305@platinum.linux.pl> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-fs@freebsd.org" , zfs@lists.illumos.org, Andriy Gapon X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 19:24:33 -0000 Including zfs@illumos on this. To recap: Reads from sparse files are slow with speed proportional to ratio of read size to filesystem recordsize ratio. There is no physical disk I/O. # zfs create -o atime=off -o recordsize=128k -o compression=off -o sync=disabled -o mountpoint=/home/testfs home/testfs # dd if=/dev/random of=/home/testfs/random10m bs=1024k count=10 # truncate -s 10m /home/testfs/trunc10m # dd if=/home/testfs/random10m of=/dev/null bs=512 10485760 bytes transferred in 0.078637 secs (133344041 bytes/sec) # dd if=/home/testfs/trunc10m of=/dev/null bs=512 10485760 bytes transferred in 1.011500 secs (10366544 bytes/sec) # zfs create -o atime=off -o recordsize=8M -o compression=off -o sync=disabled -o mountpoint=/home/testfs home/testfs # dd if=/home/testfs/random10m of=/dev/null bs=512 10485760 bytes transferred in 0.080430 secs (130371205 bytes/sec) # dd if=/home/testfs/trunc10m of=/dev/null bs=512 10485760 bytes transferred in 72.465486 secs (144700 bytes/sec) This is from FreeBSD 9.1 and possible solution at http://tepeserwery.pl/nowak/freebsd/zfs_sparse_optimization_v2.patch.txt - untested yet, system will be busy building packages for a few more days. On 2013-04-13 19:11, Will Andrews wrote: > Hi, > > I think the idea of using a pre-zeroed region as the 'source' is a good > one, but probably it would be better to set a special flag on a hole > dbuf than to require caller flags. That way, ZFS can lazily evaluate > the hole dbuf (i.e. avoid zeroing db_data until it has to). However, > that could be complicated by the fact that there are many potential > users of hole dbufs that would want to write to the dbuf. > > This sort of optimization should be brought to the illumos zfs list. 
> As it stands, your patch is also FreeBSD-specific, since 'zero_region' only
> exists in vm/vm_kern.c. Given the frequency of zero-copying, however,
> it's quite possible there are other versions of this region elsewhere.
>
> --Will.
>
>
> On Sat, Apr 13, 2013 at 6:04 AM, Adam Nowacki wrote:
>
> Temporary dbufs are created for each missing (unallocated on disk)
> record, including indirects if the hole is large enough. Those dbufs
> never find their way into the ARC and are freed at the end of dmu_read_uio.
>
> A small read (from a hole) would in the best case bzero 128KiB
> (recordsize, more if missing indirects) ... and I'm running modified
> ZFS with record sizes up to 8MiB.
>
> # zfs create -o atime=off -o recordsize=8M -o compression=off -o
> mountpoint=/home/testfs home/testfs
> # truncate -s 8m /home/testfs/trunc8m
> # dd if=/dev/zero of=/home/testfs/zero8m bs=8m count=1
> 1+0 records in
> 1+0 records out
> 8388608 bytes transferred in 0.010193 secs (822987745 bytes/sec)
>
> # time cat /home/testfs/trunc8m > /dev/null
> 0.000u 6.111s 0:06.11 100.0% 15+2753k 0+0io 0pf+0w
>
> # time cat /home/testfs/zero8m > /dev/null
> 0.000u 0.010s 0:00.01 100.0% 12+2168k 0+0io 0pf+0w
>
> 600x increase in system time and close to 1MB/s - insanity.
>
> The fix - a lot of the code to efficiently handle this was already
> there.
>
> dbuf_hold_impl has an int fail_sparse argument to return ENOENT for
> holes. Just had to get there and somehow back to dmu_read_uio where
> zeroing can happen at byte granularity.
>
> ... didn't have time to actually test it yet.
>
>
> On 2013-04-13 12:24, Andriy Gapon wrote:
>
> on 13/04/2013 02:35 Adam Nowacki said the following:
>
> http://tepeserwery.pl/nowak/freebsd/zfs_sparse_optimization.patch.txt
>
> Does it look sane?
>
> It's hard to tell from a quick look since the change is not small.
> What is your idea of the problem and the fix?
>
> On 2013-04-12 09:03, Andriy Gapon wrote:
>
> ENOTIME to really investigate, but here is a basic
> profile result for those
> interested:
> kernel`bzero+0xa
> kernel`dmu_buf_hold_array_by_dnode+0x1cf
> kernel`dmu_read_uio+0x66
> kernel`zfs_freebsd_read+0x3c0
> kernel`VOP_READ_APV+0x92
> kernel`vn_read+0x1a3
> kernel`vn_io_fault+0x23a
> kernel`dofileread+0x7b
> kernel`sys_read+0x9e
> kernel`amd64_syscall+0x238
> kernel`0xffffffff80747e4b
>
> That's where > 99% of time is spent.
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 20:51:12 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E56E516E for ; Sat, 13 Apr 2013 20:51:12 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.78]) by mx1.freebsd.org (Postfix) with ESMTP id A8E11134C for ; Sat, 13 Apr 2013 20:51:12 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1UR7Ph-0003Gg-RE for freebsd-fs@freebsd.org; Sat, 13 Apr 2013 22:51:10 +0200 Received: from dhcp-077-251-158-153.chello.nl ([77.251.158.153] helo=pinky) by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1UR7Pg-0004Mn-Vw for freebsd-fs@freebsd.org; Sat, 13 Apr 2013 22:51:09 +0200 Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org Subject: Re: ZFS: ZIL device export/import References: <5A2824CA-2A67-47FA-AB27-20C6EBD2C501@alumni.chalmers.se> <51699B8E.7050003@platinum.linux.pl> <2DE8AD5E-B84C-4D88-A242-EA30EA4A68FD@alumni.chalmers.se> Date: Sat, 13 Apr 2013 22:51:10 +0200 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Ronald Klop" Message-ID: In-Reply-To: <2DE8AD5E-B84C-4D88-A242-EA30EA4A68FD@alumni.chalmers.se> User-Agent: Opera Mail/12.15 (Win32) X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: / X-Spam-Score: 0.8 X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.1 X-Scan-Signature: 0ccaee305be983877c9e38c09cbf8ec4 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 20:51:13 -0000

On Sat, 13 Apr 2013 20:25:40 +0200, mxb wrote:
>
> On 13 apr 2013, at 20:10, "Ronald Klop" wrote:
>
>> The L2ARC is considered empty on startup/import. The ZIL might contain
>> valuable data after a crash. So your setup is wrong. The ZIL is
>> supposed to be one-on-one with the pool. You should move the ZILs to
>> the JBOD. You can make a mirror of the ZIL devices to improve failsafe
>> operation by redundancy.
>
>
> I figured that out with a mirror, thus my tests with slices - one slice on
> the local SSD (per HU), the second half of the mirror on a JBOD slice
> (dedicated SSD there too). But this requires extra 'zpool online' and
> 'zpool replace' steps.
>
> As I understand it, there is no other way?
> I'm forced to do those steps?
>
> A ZIL device on the JBOD is a bit odd - the idea with a local (per-HU) ZIL is to
> postpone transfer of the data over the SAS expander,
> or at least buffer it and then move it over the SAS expander.
>
> //mxb

I thought the idea of the ZIL is to be a fast buffer before the write to slow disk. Are you really sure the SAS expander is the bottleneck in the system instead of the disks?

Ronald.
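P.S. A quick way to check is to watch per-device load while the pool is under write load; both of these ship with the base system (the regex filter is only an example):

# gstat -f 'da[0-9]+$'
# iostat -x -w 1

If the individual disks sit near 100% busy while the expander link still has headroom, the disks - not the expander - are the bottleneck.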
From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 20:59:59 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7ECEC236 for ; Sat, 13 Apr 2013 20:59:59 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay00.pair.com (relay00.pair.com [209.68.5.9]) by mx1.freebsd.org (Postfix) with SMTP id 21513137A for ; Sat, 13 Apr 2013 20:59:58 +0000 (UTC) Received: (qmail 92824 invoked by uid 0); 13 Apr 2013 20:59:52 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay00.pair.com with SMTP; 13 Apr 2013 20:59:52 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <5169C747.8030806@sneakertech.com> Date: Sat, 13 Apr 2013 16:59:51 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: Jeremy Chadwick Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> <516917CA.5040607@sneakertech.com> <20130413154130.GA877@icarus.home.lan> In-Reply-To: <20130413154130.GA877@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 20:59:59 -0000 > This is what happens when end-users start to try and "correlate" issues > to one another's without actually taking the time to fully read the > thread and follow along actively. He was experiencing a system hang, which appeared to be related to zfs and/or cam. I'm experiencing a system hang, which appears to be related to zfs and/or cam. I am in fact following along with this thread. > Your issue: "on my raidz2 pool, when I lose more than 2 disks, I/O to > the pool stalls indefinitely, Close, but not quite- Yes, io to the pool stalls, but io in general also stalls. It appears the problem possibly doesn't start until there's io traffic to the pool though. >but I can still use the system barring > ZFS-related things; No. I've responded to this misconception on your part more than once- I *CANNOT* use the system in any reliable way, random commands fail. I've had it hang trying cd from one dir on the boot volume to another dir on the boot volume. The only thing I can *reliably* do is log in. Past that point all bets are off. >I don't know how to get the system back into a > usable state from this situation" "...short of having to hard reset", yes. > Else, all you've provided so far is a general explanation. You have > still not provided concise step-by-step information like I've asked. *WHAT* info? You have YET TO TELL ME WHAT THE CRAP YOU ACTUALLY NEED from me. I've said many times I'm perfectly willing to give you logs or run tests, but I'm not about to post a tarball of my entire drive and output of every possible command I could ever run. For all the harping you do about "not enough info" you're just as bad yourself. > I've gone so far as to give you an example of what to provide: > > http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html The only thing there you ask for is a dmesg, which I subsequently provided. 
Nowhere in that thread do you ask me to give you *anything* else, besides your generic mantra of "more info". And yes, I did read it again just now three times over to make sure. The closest you come is: "This is why hard data/logs/etc. are necessary, and why every single step of the way needs to be provided, including physical tasks performed." ... but you still never told me WHICH logs or WHAT data you need. I've already given you the steps I took re: removing drives, steps which *you yourself* confirmed to express the problem. > I will again point to the 2nd-to-last paragraph of my above referenced > mail. The "2nd-to-last paragraph" is: "So in summary: there seem to be multiple issues shown above, but I can confirm that failmode=continue **does** pass EIO to *running* processes that are doing I/O. Subsequent I/O, however, is questionable at this time." Unless you're typing in a language other than english, that isn't asking me jack shit. > Once concise details are given and (highly preferable!) a step-by-step > way to reproduce the issue 100% of the time *YOU'VE ALREADY REPRODUCED THIS ON YOUR OWN MACHINE.* Seriously, wtf? ______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 21:23:41 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 27EE4D8 for ; Sat, 13 Apr 2013 21:23:41 +0000 (UTC) (envelope-from quartz@sneakertech.com) Received: from relay02.pair.com (relay02.pair.com [209.68.5.16]) by mx1.freebsd.org (Postfix) with SMTP id BDEF514BA for ; Sat, 13 Apr 2013 21:23:40 +0000 (UTC) Received: (qmail 91609 invoked by uid 0); 13 Apr 2013 21:23:39 -0000 Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by relay02.pair.com with SMTP; 13 Apr 2013 21:23:39 -0000 X-pair-Authenticated: 173.48.104.62 Message-ID: <5169CCDA.2020901@sneakertech.com> Date: Sat, 13 Apr 2013 17:23:38 -0400 From: Quartz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: Jeremy Chadwick Subject: Re: A failed drive causes system to hang References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> <516917CA.5040607@sneakertech.com> <20130413154130.GA877@icarus.home.lan> <5169C747.8030806@sneakertech.com> In-Reply-To: <5169C747.8030806@sneakertech.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 21:23:41 -0000 >> Your issue: "on my raidz2 pool, when I lose more than 2 disks, I/O to >> the pool stalls indefinitely, > > Close, but not quite- Yes, io to the pool stalls, but io in general also > stalls. It appears the problem possibly doesn't start until there's io > traffic to the pool though. > > >> but I can still use the system barring >> ZFS-related things; > > No. I've responded to this misconception on your part more than once- I > *CANNOT* use the system in any reliable way, random commands fail. I've > had it hang trying cd from one dir on the boot volume to another dir on > the boot volume. The only thing I can *reliably* do is log in. Past that > point all bets are off. 
So, in looking over my thread again from the start, I realize there's been an evolution of diagnosis that may not have been immediately obvious. When I wrote up the description in my initial email, *at that time* I thought that io confined to the boot drive was in the clear. However, shortly afterwards after doing more tests over the course of the thread, I discovered that this was NOT the case. ______________________________________ it has a certain smooth-brained appeal From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 21:36:33 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 306BE267 for ; Sat, 13 Apr 2013 21:36:33 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta13.emeryville.ca.mail.comcast.net (qmta13.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:243]) by mx1.freebsd.org (Postfix) with ESMTP id 141641516 for ; Sat, 13 Apr 2013 21:36:33 +0000 (UTC) Received: from omta20.emeryville.ca.mail.comcast.net ([76.96.30.87]) by qmta13.emeryville.ca.mail.comcast.net with comcast id PYLl1l0021smiN4ADZcXJ9; Sat, 13 Apr 2013 21:36:31 +0000 Received: from koitsu.strangled.net ([67.180.84.87]) by omta20.emeryville.ca.mail.comcast.net with comcast id PZcW1l0081t3BNj8gZcWBD; Sat, 13 Apr 2013 21:36:30 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 1B73073A33; Sat, 13 Apr 2013 14:36:30 -0700 (PDT) Date: Sat, 13 Apr 2013 14:36:30 -0700 From: Jeremy Chadwick To: Quartz Subject: Re: A failed drive causes system to hang Message-ID: <20130413213630.GA6018@icarus.home.lan> References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> <516917CA.5040607@sneakertech.com> <20130413154130.GA877@icarus.home.lan> <5169C747.8030806@sneakertech.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5169C747.8030806@sneakertech.com> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1365888991; bh=uwM2kajUkPPiP9fYgMivZrhVOhM9LjfeAph+y/qGroA=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=FanHm8pzNsJalw7q8LwXm+Hu86SSANvQOhv6nt7LDrXoWbKArlUrlTgWh77gm6ZN1 1RQQRUlQPSjn11SlNwfll9SZ/FYlfqqpSgJPxqG/bhYymRNOdGoEl3B9OEYC0vojs3 BHjpOxcLN128mJ0whEHiOXr0LolrKpLoGdCopmakLz0femCCYNzgFd5ld6G7Q3yBXr m749Wur6J885nNqF7qRZMfQ/d9t5moT8wi2P7oOYvCLcnY2WzL7A05JEKow3QmAqUg LLBVYT0fREAtYqi8V1eLdITPzURiqDh4rd+/AbzACyN88Lqj+tpQKTfmwmmHV5b5SL EJo+1UWhR11ig== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Apr 2013 21:36:33 -0000 On Sat, Apr 13, 2013 at 04:59:51PM -0400, Quartz wrote: > > >This is what happens when end-users start to try and "correlate" issues > >to one another's without actually taking the time to fully read the > >thread and follow along actively. > > He was experiencing a system hang, which appeared to be related to > zfs and/or cam. I'm experiencing a system hang, which appears to be > related to zfs and/or cam. I am in fact following along with this > thread. The correlation was incorrect, however, which is my point. Treat every incident uniquely. 
> >Your issue: "on my raidz2 pool, when I lose more than 2 disks, I/O to
> >the pool stalls indefinitely,
>
> Close, but not quite- Yes, io to the pool stalls, but io in general
> also stalls. It appears the problem possibly doesn't start until
> there's io traffic to the pool though.

All I was able to reproduce was that I/O ***to the pool*** (once it's
broken) stalls. I'm still waiting on you to go through the same
method/model I did here, providing all the data:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html

> >but I can still use the system barring
> >ZFS-related things;
>
> No. I've responded to this misconception on your part more than
> once- I *CANNOT* use the system in any reliable way, random commands
> fail. I've had it hang trying to cd from one dir on the boot volume to
> another dir on the boot volume. The only thing I can *reliably* do
> is log in. Past that point all bets are off.

Quoting you:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016822.html

"I can nose around the boot drive just fine, but anything involving i/o
that so much as sneezes in the general direction of the pool hangs the
machine. Once this happens I can log in via ssh, but that's pretty much
it."

This conflicts directly with your above statement. Then you proceed to
provide "the evidence that nothing works", specifically:

"zpool destroy hangs" -- this touches the pool
"zpool replace hangs" -- this touches the pool
"zpool history hangs" -- this touches the pool
"shutdown -r now gets half way through then hangs" -- this touches the
pool (flushing of FS cache)
"reboot -q same as shutdown" -- this touches the pool (flushing of FS
cache)

If you're able to log in to the machine via SSH, it means that things
like /etc/master.passwd can be read, and also that /var/log/utx* and
similar files get updated (written to) successfully. So, to me, it
indicates that only I/O to anything involving the ZFS pool is what
causes indefinite stalling (of that application/command only). To me,
this makes perfect sense.

If you have other proof that indicates otherwise (such as non-ZFS
filesystems also starting to stall/cause problems), please provide
those details. But as it stands, we don't even know what the "boot
drive" consists of (filesystems, etc.) because you haven't provided any
of that necessary information.

Starting to see the problem? If I sound like a broken record, it's
because all the information needed to diagnose this is stuff only you
have access to.

> >I don't know how to get the system back into a
> >usable state from this situation"
>
> "...short of having to hard reset", yes.
>
> >Else, all you've provided so far is a general explanation. You have
> >still not provided concise step-by-step information like I've asked.
>
> *WHAT* info? You have YET TO TELL ME WHAT THE CRAP YOU ACTUALLY NEED
> from me. I've said many times I'm perfectly willing to give you logs
> or run tests, but I'm not about to post a tarball of my entire drive
> and output of every possible command I could ever run.

I've given you 2 examples of what's generally needed.

First example (yes, this URL again):

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html

Second example:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-January/016324.html

Read what I've written in full in both of those posts please.
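(For anyone following along: the failmode behaviour discussed
throughout this thread is a per-pool ZFS property. Checking and
changing it looks like this, with "tank" standing in for the real pool
name:

    zpool get failmode tank
    zpool set failmode=continue tank

failmode=wait -- the default -- blocks I/O to a faulted pool
indefinitely, while failmode=continue is documented to return EIO to
new write requests instead. My testing, referenced above, was about how
closely reality matches that description.)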
Don't skim -- you'll see that I go "step by step" looking at certain
things, noting what the kernel is showing on the console, and taking
notes of what transpires each step of the way (including physical
actions taken). Start with that, and if there's stuff omitted/missing
then we can get that later.

> For all the harping you do about "not enough info" you're just as
> bad yourself.

I see.

> >I've gone so far as to give you an example of what to provide:
> >
> >http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html
>
> The only thing there you ask for is a dmesg, which I subsequently
> provided. Nowhere in that thread do you ask me to give you
> *anything* else, besides your generic mantra of "more info". And
> yes, I did read it again just now three times over to make sure. The
> closest you come is:
>
> "This is why hard data/logs/etc. are necessary, and why
> every single step of the way needs to be provided, including physical
> tasks performed."
>
> ... but you still never told me WHICH logs or WHAT data you need.
> I've already given you the steps I took re: removing drives, steps
> which *you yourself* confirmed to exhibit the problem.

All I was able to confirm was the following, with regard to a
permanently damaged pool in a non-recoverable state (e.g. 3 disks lost
in a raidz2 pool), taken from here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html

The results:

* Loss of the 3rd disk does not show up in "zpool status" -- it still
  continues to show "ONLINE" state, but with incremented WRITE
  counters. As such, "zpool status" does work.

* failmode=wait causes all I/O to the pool to block/wait indefinitely.
  This includes running processes or new processes doing I/O. This is
  by design.

* failmode=continue causes existing processes with pending write I/O
  to the pool to return EIO (I/O error) and thus might (depending on
  the program and its behaviour) exit/kick you back to a shell. New
  processes issuing read I/O from the pool will block/wait. I did not
  test what happens with new processes that issue new write I/O to the
  pool.

* Doing something like "ls -l /filesystem_on_that_pool" works, which
  seems a little strange to me -- possibly this is due to the VFS or
  underlying caching layers involved.

* Re-insertion of one of the yanked (or "failed") disks does not
  result in CAM reporting disk insertion. This happens even *after*
  the CAM-related fixes that were committed in r247115. Thus "zpool
  replace" returns "cannot open 'xxx': no such GEOM provider" since
  the disk appears missing from /dev; however, "camcontrol devlist"
  still shows it attached (which is probably why ZFS still shows it as
  "ONLINE").

What you've stated happens differs from the above, and this is why I
keep asking you to please go step-by-step in reproducing your issue,
provide all output (including commands you issue) and all physical
tasks performed, plus what the console shows each step of the way. I'm
sorry, but that's the only way.

> >I will again point to the 2nd-to-last paragraph of my above referenced
> >mail.
>
> The "2nd-to-last paragraph" is:
>
> "So in summary: there seem to be multiple issues shown above, but I can
> confirm that failmode=continue **does** pass EIO to *running* processes
> that are doing I/O. Subsequent I/O, however, is questionable at this
> time."
>
> Unless you're typing in a language other than English, that isn't
> asking me jack shit.
The paragraph I was referring to:

"I'll end this Email with (hopefully) an educational statement: I hope
my analysis shows you why very thorough, detailed output/etc. needs to
be provided when reporting a problem, and not just some "general"
description. This is why hard data/logs/etc. are necessary, and why
every single step of the way needs to be provided, including physical
tasks performed."

> >Once concise details are given and (highly preferable!) a step-by-step
> >way to reproduce the issue 100% of the time
>
> *YOU'VE ALREADY REPRODUCED THIS ON YOUR OWN MACHINE.*
>
> Seriously, wtf?

No I haven't. My attempt to reproduce the issue/analysis is above. Some
of the things you have happening I cannot reproduce. So can you please
go through the same procedure/methodology and do the same write-up I
did, but with your system?

--
| Jeremy Chadwick jdc@koitsu.org |
| UNIX Systems Administrator http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 13 23:02:08 2013
Return-Path: Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by
 hub.freebsd.org (Postfix) with ESMTP id 321FA225 for ;
 Sat, 13 Apr 2013 23:02:08 +0000 (UTC) (envelope-from quartz@sneakertech.com)
Received: from relay01.pair.com (relay01.pair.com [209.68.5.15]) by
 mx1.freebsd.org (Postfix) with SMTP id CA14A1714 for ;
 Sat, 13 Apr 2013 23:02:07 +0000 (UTC)
Received: (qmail 1687 invoked by uid 0); 13 Apr 2013 23:02:06 -0000
Received: from 173.48.104.62 (HELO ?10.2.2.1?) (173.48.104.62) by
 relay01.pair.com with SMTP; 13 Apr 2013 23:02:06 -0000
X-pair-Authenticated: 173.48.104.62
Message-ID: <5169E3ED.1000900@sneakertech.com>
Date: Sat, 13 Apr 2013 19:02:05 -0400
From: Quartz
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2)
 Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Jeremy Chadwick
Subject: Re: A failed drive causes system to hang
References: <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan>
 <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan>
 <516917CA.5040607@sneakertech.com> <20130413154130.GA877@icarus.home.lan>
 <5169C747.8030806@sneakertech.com> <20130413213630.GA6018@icarus.home.lan>
In-Reply-To: <20130413213630.GA6018@icarus.home.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 13 Apr 2013 23:02:08 -0000

> This conflicts directly with your above statement.

Yeah, I realized that after looking things over again. See the email I
sent a few minutes ago. I'm going to do some testing to get a bead on
exactly what does and doesn't fail, and when. It's a slow process
though.

> If you have other proof that indicates otherwise (such as non-ZFS
> filesystems also starting to stall/cause problems), please provide
> those details.

The main counterexample I have of this is that I know I had it hang
when trying to cd to my home directory. I've also had it hang when
running random other commands I wouldn't expect to be zfs related, but
I don't remember exactly all the things I typed and how long after I
popped the drives it happened, so I'm going to try and figure out what
the pattern is.
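So it's on the record this time, the testing will look roughly like
this: a dumb shell loop that logs to the ufs boot disk and pokes both
the pool and the boot volume every few seconds, so the log shows when
each class of io dies. A sketch only -- the /tank mountpoint and log
path are placeholders for my actual setup:

    #!/bin/sh
    # probe.sh: timestamped liveness probes, pool vs boot volume.
    LOG=/var/tmp/probe.log
    while :; do
        date >> $LOG
        # the pool probe runs in the background so a wedged pool
        # can't stop the ufs probes; stuck probes just pile up
        ( ls /tank > /dev/null 2>&1 && echo "pool ok" >> $LOG ) &
        ls /etc > /dev/null 2>&1 && echo "ufs ok" >> $LOG
        sleep 10
    done

Then I'll start popping drives and watch where the log stops.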
> But as it stands, we don't even know what the "boot drive"
> consists of (filesystems, etc.) because you haven't provided any of
> that necessary information.

Yes I did, on Tuesday in fact, but I'll repeat it here again since you
obviously missed it. It's a single ufs disk that houses the entirety of
the system. Root, var, home, /etc, swap partition... all of it. It's
not any form of raid or dual boot or anything special, just a stock
single-disk default install with no custom config.

> What you've stated happens differs from the above,

Everything you wrote is the same thing I'm seeing, which is why I
considered you to have confirmed the base problem. The only things that
differ are that "zpool status" and other zfs related commands are not
guaranteed to work (or at least not indefinitely), and "all io on the
boot drive works for me guaranteed" (which you didn't list here but I'm
assuming is implied).

> The paragraph I was referring to:

[pedantic] Ok, that's the last paragraph. The thing after it is a
postscript, which isn't counted. [/pedantic]

> All I was able to reproduce was that I/O ***to the pool*** (once it's
> broken) stalls. I'm still waiting on you to go through the same
> method/model I did here, providing all the data:

> you'll see that I go "step by step" looking at certain things,

> and this is why I
> keep asking you to please go step-by-step in reproducing your issue,
> provide all output (including commands you issue) and all physical
> tasks performed, plus what the console shows each step of the way.

> So can you
> please go through the same procedure/methodology and do the same
> write-up I did, but with your system?

Look, I can't read your mind any more than you can read mine. I can do
"run command xyz -abc and tell me what FOO is set to", or "yank each
drive in order by /dev ID, wait one second and run zpool status". I
can't do "find all the important stuffs", and I'm not going to
automatically assume that just because someone ran some test or series
of commands that they silently expect me to do the same. If you wanted
me to run through those exact steps in that exact order and give you
the output, it would've been nice if you'd actually *said* that at some
point.

-

Although you seem to want to disagree with me, you did confirm the main
issue in that you're also seeing a zfs related hang. The point of
contention is my observation that zfs commands and non-zfs-related io
may also fail after some undefined period of time. It's not clear to me
how long you had your test system up and whether you waited long enough
to see the same issues. Either way, I need to do more testing to try
and figure out if there's a pattern here.

______________________________________
it has a certain smooth-brained appeal
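PS: since "what does the boot drive consist of" keeps coming up, next
round I'll also capture the output of all of the following up front
(stock tools only; "tank" is a stand-in for whatever the pool is
actually named):

    uname -a
    gpart show
    mount -p
    df -h
    zpool status
    zpool get all tank
    camcontrol devlist

If something's missing from that list, say so now and I'll add it to
the pile.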