From owner-freebsd-fs@FreeBSD.ORG Mon Dec 29 06:38:54 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id F276A1065673
    for ; Mon, 29 Dec 2008 06:38:53 +0000 (UTC)
    (envelope-from morganw@chemikals.org)
Received: from warped.bluecherry.net (unknown [IPv6:2001:440:eeee:fffb::2])
    by mx1.freebsd.org (Postfix) with ESMTP id 3334D8FC08
    for ; Mon, 29 Dec 2008 06:38:53 +0000 (UTC)
    (envelope-from morganw@chemikals.org)
Received: from volatile.chemikals.org (unknown [74.193.182.107])
    (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
    (Client did not present a certificate)
    by warped.bluecherry.net (Postfix) with ESMTPSA id F0989A34A0E7
    for ; Mon, 29 Dec 2008 00:38:51 -0600 (CST)
Received: from localhost (morganw@localhost [127.0.0.1])
    by volatile.chemikals.org (8.14.3/8.14.3) with ESMTP id mBT6cndx005172
    for ; Mon, 29 Dec 2008 00:38:49 -0600 (CST)
    (envelope-from morganw@chemikals.org)
Date: Mon, 29 Dec 2008 00:38:49 -0600 (CST)
From: Wes Morgan
To: freebsd-fs@freebsd.org
In-Reply-To:
Message-ID:
References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Subject: Re: zpool devices "stuck" (was zpool resilver restarting)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 29 Dec 2008 06:38:54 -0000

On Sat, 27 Dec 2008, Wes Morgan wrote:

> On Fri, 26 Dec 2008, Wes Morgan wrote:
>
>> On Fri, 26 Dec 2008, Wes Morgan wrote:
>>
>>> I just did a zpool replace on a new drive, and now it's resilvering.
>>> Only, when it gets about 20mb resilvered it restarts. I can see all the
>>> drive activity simply halting for a period then resuming in gstat.
>>> I see some bugs in the opensolaris tracker about this, but no
>>> resolutions. It doesn't seem to be related to calling "zpool status"
>>> because I can watch gstat and see it restarting... Anyone seen this
>>> before, and hopefully have a workaround...?
>>>
>>> The pool lost a drive on Wednesday and was running with a device missing,
>>> however due to the device numbering changing on the scsi bus, I had to
>>> export/import the pool to get it to come up, the same for after replacing
>>> it.
>>
>> Replying to myself with some more information. zpool history -l -i shows
>> the scrub loop happening:
>>
>> 2008-12-26.21:39:46 [internal pool scrub done txg:6463875] complete=0 [user root on volatile]
>> 2008-12-26.21:39:46 [internal pool scrub txg:6463875] func=1 mintxg=3 maxtxg=6463720 [user root on volatile]
>> 2008-12-26.21:41:23 [internal pool scrub done txg:6463879] complete=0 [user root on volatile]
>> 2008-12-26.21:41:23 [internal pool scrub txg:6463879] func=1 mintxg=3 maxtxg=6463720 [user root on volatile]
>> 2008-12-26.21:43:00 [internal pool scrub done txg:6463883] complete=0 [user root on volatile]
>> 2008-12-26.21:43:00 [internal pool scrub txg:6463883] func=1 mintxg=3 maxtxg=6463720 [user root on volatile]
>> 2008-12-26.21:44:38 [internal pool scrub done txg:6463887] complete=0 [user root on volatile]
>> 2008-12-26.21:44:38 [internal pool scrub txg:6463887] func=1 mintxg=3 maxtxg=6463720 [user root on volatile]
>
>
> It seems that the resilver and drive replacement were "fighting" each other
> somehow. Detaching the new drive allowed the resilver to complete, but now
> I'm stuck with two nonexistent devices trying to replace each other, and I
> can't replace a device that is being replaced:
>
>     replacing               UNAVAIL      0 36.4K     0  insufficient replicas
>       17628927049345412941  FAULTED      0     0     0  was /dev/da4
>       5474360425105728553   FAULTED      0     0     0  was /dev/da4
>
> errors: No known data errors
>
> So, how the heck do I cancel that replacement and restart it using /dev/da4?

Ok, dear sweet mercy, I think I've dug myself out of the huge hole. I found
a bug in the opensolaris tracker that is basically the same as my issue:

http://bugs.opensolaris.org/view_bug.do?bug_id=6782540

So, I spent most of the weekend trying to figure out how to repair the
damage. I ended up re-creating the actual zfs disk label for the 547xxx
device and dumping that onto the drive. After some trouble with checksums,
the system came back to life a few hours ago and I thought I was out of the
woods when the resilver started up. However, I was not... I had simply got
myself back into the resilver loop that I could not stop. Back to the
drawing board...

Using gvirstor, I created a 500gb volume (with only 100gb available to back
it), dumped the label of the 176xxxx device onto it, export/import and then
the resilver starts back up. Checking gstat showed that the true device was
not being written to at all, so I realized that it was going to try to
resilver the 176 device first before doing the replacement. Not good...

After some more floundering, I discovered that I could "zpool detach" the
virstor volume, leaving me with only real devices in the pool. Except now it
did not want to do a complete and true resilver, only resilvering a tiny bit
of data, about 20mb or something. My wild guess is that it might have
something to do with txg IDs and how the resilver tries to only do the data
that is "new". Since there is no way (that I know of) to force a resilver
with zpool, I simply started scrubbing the array.
This would probably have worked, but it was going to take far too long, and
was simply throwing up millions of checksum errors on the new drive. So I
cancelled the scrub and figured I could just offline the drive and replace
it with itself... Nope, no dice, it was reported as "busy". However, after
mucking around with the label some more, I was able to finally get the drive
to replace itself and start resilvering. Hopefully it will finish
successfully.

I'm still not sure what went wrong. Part of what happened seems to be
related to scsi devices not being wired down like atapi devices, so
successive reboots replaced "offline" devices with "faulted", and the pool
kept trying to write to them, just generating more errors.

Do the folks on the opensolaris zfs-discuss take reports from FreeBSD users,
or do they just toss it back at you? I did actually boot an opensolaris live
cd at one point, but it couldn't match the vdevs with devices well enough to
import the pool. I don't think it would have handled it properly anyway,
given the bug I found in their database.

Hope no one ever has to deal with this themselves! Whew...
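
[Archive note] The restart loop in the "zpool history -l -i" output quoted in this message shows up as a run of "scrub done" records with complete=0. A minimal Python sketch for spotting that pattern in saved history output follows. The function names and the threshold are hypothetical, chosen only for illustration, and the regular expression assumes the record layout is exactly as quoted in this thread; other ZFS versions may log differently.

```python
import re

# Matches "scrub done" records in the layout quoted in this thread, e.g.
#   [internal pool scrub done txg:6463875] complete=0
# Assumption: other ZFS versions may format these records differently.
DONE_RE = re.compile(r"\[internal pool scrub done txg:(\d+)\] complete=(\d+)")


def incomplete_scrub_txgs(history_text):
    """Return the txg of every 'scrub done' record that has complete=0."""
    return [
        int(txg)
        for txg, complete in DONE_RE.findall(history_text)
        if complete == "0"
    ]


def looks_like_restart_loop(history_text, threshold=3):
    """Heuristic only: flag the log when at least `threshold` scrubs or
    resilvers ended without completing. The threshold is an arbitrary
    illustrative value, not something ZFS defines."""
    return len(incomplete_scrub_txgs(history_text)) >= threshold
```

Feeding it the saved output of "zpool history -l -i" for the affected pool would be the intended use; it only reads text and never touches the pool.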

From owner-freebsd-fs@FreeBSD.ORG Mon Dec 29 11:02:01 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 5AB0A106564A
    for ; Mon, 29 Dec 2008 11:02:01 +0000 (UTC)
    (envelope-from gary.jennejohn@freenet.de)
Received: from mout5.freenet.de (mout5.freenet.de [IPv6:2001:748:100:40::2:7])
    by mx1.freebsd.org (Postfix) with ESMTP id DC4348FC1B
    for ; Mon, 29 Dec 2008 11:02:00 +0000 (UTC)
    (envelope-from gary.jennejohn@freenet.de)
Received: from [195.4.92.20] (helo=10.mx.freenet.de)
    by mout5.freenet.de with esmtpa (ID gary.jennejohn@freenet.de) (port 25)
    (Exim 4.69 #73) id 1LHFsk-0007qz-Ph; Mon, 29 Dec 2008 12:01:58 +0100
Received: from tc267.t.pppool.de ([89.55.194.103]:40876 helo=ernst.jennejohn.org)
    by 10.mx.freenet.de with esmtpa (ID gary.jennejohn@freenet.de) (port 25)
    (Exim 4.69 #73) id 1LHFsk-0002Jy-IY; Mon, 29 Dec 2008 12:01:58 +0100
Date: Mon, 29 Dec 2008 12:01:55 +0100
From: Gary Jennejohn
To: Wes Morgan
Message-ID: <20081229120155.224a34b6@ernst.jennejohn.org>
In-Reply-To:
References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org>
X-Mailer: Claws Mail 3.5.0 (GTK+ 2.12.11; amd64-portbld-freebsd8.0)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org
Subject: Re: zpool devices "stuck" (was zpool resilver restarting)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: gary.jennejohn@freenet.de
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 29 Dec 2008 11:02:01 -0000

On Mon, 29 Dec 2008 00:38:49 -0600 (CST)
Wes Morgan wrote:

> I'm still not sure what went wrong.
> Part of what happened seems to be related to scsi devices not being wired
> down like atapi devices, so successive reboots replaced "offline" devices
> with "faulted", and the pool kept trying to write to them, just generating
> more errors.
>

This is probably irrelevant now, but it is possible to wire down SCSI
devices in /boot/device.hints. I had this in there when I was still using
SCSI:

hint.scbus.0.at="ahc0"
hint.scbus.0.bus="0"
hint.da.0.at="scbus0"
hint.da.0.target="8"
hint.da.0.unit="0"
hint.da.1.at="scbus0"
hint.da.1.target="10"
hint.da.1.unit="0"
hint.da.2.at="scbus0"
hint.da.2.target="12"
hint.da.2.unit="0"
hint.da.3.at="scbus0"
hint.da.3.target="14"
hint.da.3.unit="0"
hint.da.4.at="scbus0"
hint.da.4.target="1"
hint.da.4.unit="0"

---
Gary Jennejohn

From owner-freebsd-fs@FreeBSD.ORG Mon Dec 29 11:06:54 2008
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 8D16A1065673
    for ; Mon, 29 Dec 2008 11:06:54 +0000 (UTC)
    (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id 72FE88FC23
    for ; Mon, 29 Dec 2008 11:06:54 +0000 (UTC)
    (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id mBTB6sPd024431
    for ; Mon, 29 Dec 2008 11:06:54 GMT
    (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.3/8.14.3/Submit) id mBTB6rib024427
    for freebsd-fs@FreeBSD.org; Mon, 29 Dec 2008 11:06:53 GMT
    (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 29 Dec 2008 11:06:53 GMT
Message-Id: <200812291106.mBTB6rib024427@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
    owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 29 Dec 2008 11:06:54 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129174  fs         [nfs][zfs][panic] NFS v3 Panic when under high load ex
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129084  fs         [udf] [panic] udf panic: getblk: size(67584) > MAXBSIZ
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/128633  fs         [zfs] [lor] lock order reversal in zfs
o kern/128514  fs         [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad
o kern/128173  fs         [ext2fs] ls gives "Input/output error" on mounted ext3
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127213  fs         [tmpfs] sendfile on tmpfs data corruption
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125536  fs         [ext2fs] ext 2 mounts cleanly but fails on commands li
o kern/125149  fs         [nfs][panic] changing into .zfs dir from nfs client ca
o kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
o kern/122888  fs         [zfs] zfs hang w/ prefetch on, zil off while running t
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D

29 problems total.

From owner-freebsd-fs@FreeBSD.ORG Tue Dec 30 01:39:04 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 123041065676
    for ; Tue, 30 Dec 2008 01:39:04 +0000 (UTC)
    (envelope-from bryanalves@gmail.com)
Received: from rv-out-0506.google.com (rv-out-0506.google.com [209.85.198.230])
    by mx1.freebsd.org (Postfix) with ESMTP id DBCF98FC14
    for ; Tue, 30 Dec 2008 01:39:03 +0000 (UTC)
    (envelope-from bryanalves@gmail.com)
Received: by rv-out-0506.google.com with SMTP id b25so6227973rvf.43
    for ; Mon, 29 Dec 2008 17:39:03 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
    h=domainkey-signature:received:received:message-id:date:from:to
    :subject:mime-version:content-type;
    bh=fwdTIsa7mhCkgYr34mt2LX4KSn2Sw/D/LxK144rF0Rc=;
    b=SbDt0wIy68xtAtvJ8nGTWuxB4gmDme3gpJUos9l1KBu0mI2br14PPmdubD1YniiFI+
    +QiTXRQcGjH1VzoTd7SpHkMXjwCL1WZLQJqHPpcBwJTrU8pepG0vftsi8t+4tsbcL6pt
    gGYJNvXxoiSvCVTQAFlwRIqH+mM5oZkPznwZA=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
    h=message-id:date:from:to:subject:mime-version:content-type;
    b=yAkepm9c/rQ4JU4zQYXdKhDeI/Sr+qjiJ9OAKtlBlLSpbqW7GIR37hAwBODRJheYn/
    qEhcLGPGWypczgudLFL9xSkkESa4wt1lj/T1ZkNAAaQ8oThmp98rVj/0rMCOsUbqwyqt
    Bp8tVkJAyxx9+Ptvh2WQa63AES7jzYkJ4PcXQ=
Received: by 10.114.74.18 with SMTP id w18mr9396698waa.40.1230601143535;
    Mon, 29 Dec 2008 17:39:03 -0800 (PST)
Received: by 10.114.103.20 with HTTP; Mon, 29 Dec 2008 17:39:03 -0800 (PST)
Message-ID: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com>
Date: Mon, 29 Dec 2008 20:39:03 -0500
From: "Bryan Alves"
To: freebsd-fs@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: NFS locking problems with 7.0-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 30 Dec 2008 01:39:04 -0000

I'm running a FreeBSD Server (7.0-RELEASE, latest patchlevel, problem has
existed on previous patchlevels). Running an NFS server, with statd and
lockd. Client is an Ubuntu 8.10 machine. Of note is that the FreeBSD
server (in a home environment) is also running PF and doing the packet
filtering for the house.

When I export my home directory and mount it on my linux client, I run into
all sorts of problems with file locking. The biggest problem is the
inability to run firefox. When stracing an execution of firefox, execution
hangs when opening its .parentlock file for F_GETLK. I also notice
messages in /var/log/messages on the client on occasion:

Dec 29 20:08:01 balves-ubuntu-desktop kernel: [ 5430.560020] lockd: server 192.168.10.1 not responding, still trying
Dec 29 20:08:28 balves-ubuntu-desktop kernel: [ 5457.560725] lockd: server 192.168.10.1 OK

192.168.10.1 is the internal address for the FreeBSD server. Nothing
related to NFS appears in /var/log/messages on the FreeBSD server.

I've made sure to turn off scrubbing for PF on internal interfaces, because
of its problems with NFS.

Of note is that things that don't need locks (for example, video playback
with some players, music playback, etc.) work fine on the nfs mount. I have
a device in my living room (a popcorn hour) that connects to the FreeBSD
server and streams via NFS without issue.

The only problems I've come across occur with file locks. Restarting
various services (rpc, statd, lockd, nfsd) on the server doesn't help,
neither does remounting. Rebooting doesn't help either. The only thing that
makes the mount usable is using samba instead of nfs. This is unfortunate
because samba is much slower on my network (> 20 MB/s drop in throughput
using samba instead of NFS).

Here is my pf.conf for those of you who want to verify that I've turned off
scrubbing correctly:

===BEGIN pf.conf===
ext_if = "em1"
int_if = "em0"
localnet = $int_if:network
torrent_ports = "57100:57199"
web_ports = "81"
vpn_ports = "1723"
gateway = "192.168.10.1"
httpd_jail = "192.168.10.200"
samba_jail = "192.168.10.201"
slimserver_jail = "192.168.10.202"
torrent_jail = "192.168.10.203"

set skip on { lo0 }
set loginterface $ext_if

#scrub in all
scrub in on $ext_if

altq on $ext_if bandwidth 4500Kb hfsc queue { q_high, q_med, q_low }
queue q_high bandwidth 25% priority 6 qlimit 250 hfsc
queue q_med bandwidth 45% priority 4 qlimit 250 hfsc (default)
queue q_low bandwidth 30% priority 3 qlimit 250 hfsc

nat on $ext_if from $localnet to any -> ($ext_if)

#Port Forwards
rdr on $ext_if proto tcp from any to any port ssh -> $gateway port ssh
rdr on $ext_if proto tcp from any to any port $web_ports -> $httpd_jail port $web_ports
rdr on $ext_if proto tcp from any to any port $torrent_ports -> $torrent_jail port $torrent_ports

#Nat Reflection
rdr on $int_if proto tcp from $localnet to $ext_if port ssh -> $gateway
rdr on $int_if proto tcp from $localnet to $ext_if port $web_ports -> $httpd_jail
no nat on $int_if proto tcp from $int_if to $localnet
nat on $int_if proto tcp from $localnet to $gateway port ssh -> $int_if
nat on $int_if proto tcp from $localnet to $httpd_jail port $web_ports -> $int_if

antispoof for $ext_if

block all

#In on ext_if
pass in on $ext_if proto tcp from any to any port $web_ports keep state queue (q_high)
pass in on $ext_if proto { tcp, udp } from any to $torrent_jail keep state queue (q_low)
pass in on $ext_if proto tcp from any to port ssh modulate state queue (q_high)
pass in on $ext_if proto gre from any to any keep state queue (q_high)
pass in on $ext_if proto tcp from any to any port $vpn_ports keep state queue (q_high)

#Out on ext_if
pass out on $ext_if proto tcp all modulate state queue (q_med)
pass out on $ext_if proto { udp, icmp } all keep state queue (q_med)
pass out on $ext_if proto gre all keep state queue (q_high)
pass out on $ext_if proto tcp from $torrent_jail to any keep state queue (q_low)

#Allow all LAN traffic
pass in on $int_if from $localnet to any keep state
pass out on $int_if from any to $localnet keep state
===END pf.conf===

I realize that the linux NFS client implementation isn't spectacular, but
the same ubuntu setup works when connected to a netapp, which leads me to
believe that the problem is with the freebsd nfs server implementation.

If anyone can suggest some additional troubleshooting steps to provide some
more information, or propose some suggested solutions, it would be
appreciated.

--Bryan

From owner-freebsd-fs@FreeBSD.ORG Tue Dec 30 19:36:03 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id EA5D5106564A
    for ; Tue, 30 Dec 2008 19:36:03 +0000 (UTC)
    (envelope-from rmacklem@uoguelph.ca)
Received: from gigi.cs.uoguelph.ca (gigi.cs.uoguelph.ca [131.104.94.210])
    by mx1.freebsd.org (Postfix) with ESMTP id 8536B8FC1E
    for ; Tue, 30 Dec 2008 19:36:03 +0000 (UTC)
    (envelope-from rmacklem@uoguelph.ca)
Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102])
    by gigi.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id mBUIwO5l017317;
    Tue, 30 Dec 2008 13:58:24 -0500
Received: from localhost (rmacklem@localhost)
    by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id mBUJ0c518468;
    Tue, 30 Dec 2008 14:00:38 -0500 (EST)
X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process
    doing -bs
Date: Tue, 30 Dec 2008 14:00:38 -0500 (EST)
From: Rick Macklem
X-X-Sender: rmacklem@muncher.cs.uoguelph.ca
To: Bryan Alves
In-Reply-To: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com>
Message-ID:
References: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Scanned-By: MIMEDefang 2.63 on 131.104.94.210
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS locking problems with 7.0-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 30 Dec 2008 19:36:04 -0000

On Mon, 29 Dec 2008, Bryan Alves wrote:

[stuff snipped]
> When I export my home directory and mount it on my linux client, I run into
> all sorts of problems with file locking. The biggest problem is the
> inability to run firefox.
> When stracing an execution of firefox, execution hangs when opening its
> .parentlock file for F_GETLK. I also notice messages in /var/log/messages
> on the client on occasion:
>

I can't help w.r.t. getting the NLM to work (I've always thought that the
NLM protocol was a crock and avoided using it). But, here's a couple of
things you could try to avoid using the NLM.

- Do the Linux mount with the "nolock" option. (If Ubuntu has a
  "locallock" option, that would be even better, but I'm not sure what
  options recent Linux's have for nfs mounts.)
- Download my server patches (ftp.cis.uoguelph.ca/pub/nfsv4/FreeBSD7) and
  switch to using nfsv4, which has integral locking in the protocol.

Have a good holiday, rick

From owner-freebsd-fs@FreeBSD.ORG Tue Dec 30 20:46:06 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 34301106566C
    for ; Tue, 30 Dec 2008 20:46:06 +0000 (UTC)
    (envelope-from bryanalves@gmail.com)
Received: from wa-out-1112.google.com (wa-out-1112.google.com [209.85.146.177])
    by mx1.freebsd.org (Postfix) with ESMTP id 035648FC16
    for ; Tue, 30 Dec 2008 20:46:05 +0000 (UTC)
    (envelope-from bryanalves@gmail.com)
Received: by wa-out-1112.google.com with SMTP id m34so3035808wag.27
    for ; Tue, 30 Dec 2008 12:46:05 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
    h=domainkey-signature:received:received:message-id:date:from:to
    :subject:in-reply-to:mime-version:content-type:references;
    bh=jtDhu8DTeQuZSwCDhSHwxZY5uwBHbq2CG7Q+xqMAmX0=;
    b=sxMF+gcJpN/79WI6c8zB83VdbBmWNy5lWkyVtCdQtvmROgQ0CagpwIFWhxc8bQaA5X
    qJvz83K23D0VCz07IWvktdllB7YHtbaBXekPNDzVATx3Nf0BfHTpY6s6vL3cULJEVf2z
    5RsDQ1ihpcEDpX7n2rw4m7VktlQpxT9Q36OoI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
    h=message-id:date:from:to:subject:in-reply-to:mime-version
    :content-type:references;
    b=kktyRwDj9bLrPx9bJkYao425bBY41vOzvydcXeRXHSi+9zpuyhqTBzFqwwF6JGNkw5
    CqjJL3JwI2ERmIWHskI7QFo3Jep23SCKa+7CIMU/MorZ3ZM97rVdrl2IqZJGHBX9W1OO
    39L4/HNQUGHVsVRlSziFcTDhYK8xgzE1Uw4L8=
Received: by 10.114.103.4 with SMTP id a4mr10002065wac.91.1230669965303;
    Tue, 30 Dec 2008 12:46:05 -0800 (PST)
Received: by 10.114.103.20 with HTTP; Tue, 30 Dec 2008 12:46:05 -0800 (PST)
Message-ID: <92f477740812301246k7ed77511oc969c22a3b5aad4d@mail.gmail.com>
Date: Tue, 30 Dec 2008 15:46:05 -0500
From: "Bryan Alves"
To: freebsd-fs@freebsd.org
In-Reply-To:
MIME-Version: 1.0
References: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Re: NFS locking problems with 7.0-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 30 Dec 2008 20:46:06 -0000

On Tue, Dec 30, 2008 at 2:00 PM, Rick Macklem wrote:

>
>
> On Mon, 29 Dec 2008, Bryan Alves wrote:
>
> [stuff snipped]
>
>> When I export my home directory and mount it on my linux client, I run
>> into all sorts of problems with file locking. The biggest problem is the
>> inability to run firefox. When stracing an execution of firefox,
>> execution hangs when opening its .parentlock file for F_GETLK. I also
>> notice messages in /var/log/messages on the client on occasion:
>>
> I can't help w.r.t. getting the NLM to work (I've always thought that the
> NLM protocol was a crock and avoided using it). But, here's a couple of
> things you could try to avoid using the NLM.
>
> - Do the Linux mount with the "nolock" option. (If Ubuntu has a
>   "locallock" option, that would be even better, but I'm not sure what
>   options recent Linux's have for nfs mounts.)
>
> - Download my server patches (ftp.cis.uoguelph.ca/pub/nfsv4/FreeBSD7) and
>   switch to using nfsv4, which has integral locking in the protocol.
>
> Have a good holiday, rick
>

Is there another location where I can get the nfs4 patches? That FTP seems
to be down.

Also, outside the scope of this list, but since the discussion is opened, I
might as well ask: If this NFS is the only remote mount that involves
writing (it's opened read-only in other locations), and it's read/write
locally, is it safe to use local locking?

--Bryan

From owner-freebsd-fs@FreeBSD.ORG Tue Dec 30 21:13:20 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id A63C5106564A
    for ; Tue, 30 Dec 2008 21:13:20 +0000 (UTC)
    (envelope-from grafan@gmail.com)
Received: from mail-bw0-f19.google.com (mail-bw0-f19.google.com [209.85.218.19])
    by mx1.freebsd.org (Postfix) with ESMTP id E9DFA8FC0C
    for ; Tue, 30 Dec 2008 21:13:19 +0000 (UTC)
    (envelope-from grafan@gmail.com)
Received: by bwz12 with SMTP id 12so14617546bwz.19
    for ; Tue, 30 Dec 2008 13:13:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
    h=domainkey-signature:received:received:message-id:date:from:to
    :subject:cc:in-reply-to:mime-version:content-type
    :content-transfer-encoding:content-disposition:references;
    bh=9q/y8gn+QXqZxUl5JaNrWfyCcUx8Rt8LiPGEoDMhJQ8=;
    b=dD4TG1z/F5F42xq73tmHrZqhsWY8E3zHVJry5wByHvH040je+XaI8/FdFqePlKeXjY
    xgDUWCrYuIutzt3wSjAJQR+5t+KJ2B7g4GhKw+nAT5s86XlKC++S3bOASwUMDgSop4ES
    05aNWVoqcvG0KyMpI+OraefwdmXZrGEd+C18A=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
    h=message-id:date:from:to:subject:cc:in-reply-to:mime-version
    :content-type:content-transfer-encoding:content-disposition
    :references;
    b=F2dU3U4OI43aKstMQQAfGn/tI6Zj5tNxrcSqXYiIzSF4MlfMTBNGUE9gunVnlpteee
    xt0FJi/mOUtfoALPvrNvnqqrBeSh88GnMjYOGFOd15HIVdrx9bGeb2DWdCVVUPinEhbv
    eeN3JECIwvsryQcZWnOmis+LK8sPol3v8wV3k=
Received: by 10.223.112.201 with SMTP id x9mr11050146fap.69.1230670062264;
    Tue, 30 Dec 2008 12:47:42 -0800 (PST)
Received: by 10.223.104.2 with HTTP; Tue, 30 Dec 2008 12:47:42 -0800 (PST)
Message-ID: <6eb82e0812301247uaf5eb45v529765e29220fd80@mail.gmail.com>
Date: Wed, 31 Dec 2008 04:47:42 +0800
From: "Rong-en Fan"
To: "Bryan Alves"
In-Reply-To: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS locking problems with 7.0-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 30 Dec 2008 21:13:20 -0000

On Tue, Dec 30, 2008 at 9:39 AM, Bryan Alves wrote:
> I'm running a FreeBSD Server (7.0-RELEASE, latest patchlevel, problem has
> existed on previous patchlevels). Running an NFS server, with statd and
> lockd. Client is an Ubuntu 8.10 machine. Of note is that the FreeBSD
> server (in a home environment) is also running PF and doing the packet
> filtering for the house.
>
> When I export my home directory and mount it on my linux client, I run into
> all sorts of problems with file locking. The biggest problem is the
> inability to run firefox. When stracing an execution of firefox, execution
> hangs when opening its .parentlock file for F_GETLK. I also notice
> messages in /var/log/messages on the client on occasion:
>
[...]
>
> I realize that the linux NFS client implementation isn't spectacular, but
> the same ubuntu setup works when connected to a netapp, which leads me to
> believe that the problem is with the freebsd nfs server implementation.
>
> If anyone can suggest some additional troubleshooting steps to provide some
> more information, or propose some suggested solutions, it would be
> appreciated.

You may want to upgrade to latest RELENG_7 and use
the rewritten lockd in kernel space (w/ NFS_LOCKD in your
kernel configuration).

Regards,
Rong-En Fan

From owner-freebsd-fs@FreeBSD.ORG Wed Dec 31 05:09:06 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 1B4E81065676
    for ; Wed, 31 Dec 2008 05:09:06 +0000 (UTC)
    (envelope-from bryanalves@gmail.com)
Received: from wa-out-1112.google.com (wa-out-1112.google.com [209.85.146.176])
    by mx1.freebsd.org (Postfix) with ESMTP id E07408FC17
    for ; Wed, 31 Dec 2008 05:09:01 +0000 (UTC)
    (envelope-from bryanalves@gmail.com)
Received: by wa-out-1112.google.com with SMTP id m34so3119404wag.27
    for ; Tue, 30 Dec 2008 21:09:01 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
    h=domainkey-signature:received:received:message-id:date:from:to
    :subject:in-reply-to:mime-version:content-type:references;
    bh=v2WI/GPd9oZ/e41GJlBBjDQhdjRKfQ5D4Uk7fI8OXaU=;
    b=HE4WecmKNWvy8G6iVFIjgXDWwdDuGhQJGzC3RfcHCdGiLL2tNAR3bS3QIFYxqnPe8G
    asRxiwtx1ZAiA8B005PyumYzu8BHYdR22LUp0wkKW7PvM5XMoBVhGFNBzEy53Q8btv3c
    L/GSCURUKsv/SsURI5xapX1ZDRUsrhOQAUKbY=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
    h=message-id:date:from:to:subject:in-reply-to:mime-version
    :content-type:references;
    b=asumBUxbjPP0EdWYsktKHgUe/YH8MjPBwTdRmdCEOhhtsvaLEIRBhKZyqU+XO+95Xp
    4UJwG88lo0mY2H8b+v7535QboShZLcjXVDt/qZbwUjGiOVu5lfAEt4T8fqU5vwl3yO9m
    pUjxax/VU/qkIuH7fh4dkvC+/5ITj6ATEqX9s=
Received: by 10.114.169.20 with SMTP id r20mr10259614wae.110.1230700140933;
    Tue, 30 Dec 2008 21:09:00 -0800 (PST)
Received: by 10.114.155.13 with HTTP; Tue, 30 Dec 2008 21:09:00 -0800 (PST)
Message-ID: <92f477740812302109n78d9f303y5c49b8ca6ab082c5@mail.gmail.com>
Date: Wed, 31 Dec 2008 00:09:00 -0500
From: "Bryan Alves"
To: freebsd-fs@freebsd.org
In-Reply-To: <6eb82e0812301247uaf5eb45v529765e29220fd80@mail.gmail.com>
MIME-Version: 1.0
References: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com>
    <6eb82e0812301247uaf5eb45v529765e29220fd80@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Re: NFS locking problems with 7.0-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 31 Dec 2008 05:09:06 -0000

On Tue, Dec 30, 2008 at 3:47 PM, Rong-en Fan wrote:

> On Tue, Dec 30, 2008 at 9:39 AM, Bryan Alves wrote:
> > I'm running a FreeBSD Server (7.0-RELEASE, latest patchlevel, problem has
> > existed on previous patchlevels). Running an NFS server, with statd and
> > lockd. Client is an Ubuntu 8.10 machine. Of note is that the FreeBSD
> > server (in a home environment) is also running PF and doing the packet
> > filtering for the house.
> >
> > When I export my home directory and mount it on my linux client, I run
> > into all sorts of problems with file locking. The biggest problem is the
> > inability to run firefox. When stracing an execution of firefox,
> > execution hangs when opening its .parentlock file for F_GETLK. I also
> > notice messages in /var/log/messages on the client on occasion:
> >
> [...]
> >
> > I realize that the linux NFS client implementation isn't spectacular, but
> > the same ubuntu setup works when connected to a netapp, which leads me to
> > believe that the problem is with the freebsd nfs server implementation.
> >
> > If anyone can suggest some additional troubleshooting steps to provide
> > some more information, or propose some suggested solutions, it would be
> > appreciated.
>
> You may want to upgrade to the latest RELENG_7 and use
> the rewritten lockd in kernel space (w/ NFS_LOCKD in your
> kernel configuration).
>
> Regards,
> Rong-En Fan

Where can I find more information about the in-kernel NFS_LOCKD? It doesn't
seem to exist on Google at all, and I'm hesitant to upgrade from RELEASE
without doing due diligence in terms of research.

Is rpc.lockd_enable still required in rc.conf when using this?

Do I need to do anything else besides update to RELENG_7 and
installworld/installkernel with this new option?

From owner-freebsd-fs@FreeBSD.ORG Wed Dec 31 10:26:37 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 06508106564A for ; Wed, 31 Dec 2008 10:26:37 +0000 (UTC) (envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id D0C858FC0C for ; Wed, 31 Dec 2008 10:26:36 +0000 (UTC) (envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41]) by cyrus.watson.org (Postfix) with ESMTP id 443B146B03; Wed, 31 Dec 2008 05:26:36 -0500 (EST)
Date: Wed, 31 Dec 2008 10:26:36 +0000 (GMT)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: Bryan Alves
In-Reply-To: <92f477740812302109n78d9f303y5c49b8ca6ab082c5@mail.gmail.com>
Message-ID:
References: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com> <6eb82e0812301247uaf5eb45v529765e29220fd80@mail.gmail.com> <92f477740812302109n78d9f303y5c49b8ca6ab082c5@mail.gmail.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS locking problems with 7.0-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 31 Dec 2008 10:26:37 -0000

On Wed, 31 Dec 2008, Bryan Alves wrote:

> Where can I find more information about the in-kernel NFS_LOCKD? It doesn't
> seem to exist on Google at all, and I'm hesitant to upgrade from RELEASE
> without doing due diligence in terms of research.

I'm not sure there's a specific web page/etc. on it, but you can find the
initial patch announcement here:

http://lists.freebsd.org/pipermail/freebsd-current/2008-March/084446.html

I believe the "-k" described in the original post is no longer required.

> Is rpc.lockd_enable still required in rc.conf when using this?

Hmm. I believe so.

> Do I need to do anything else besides update to RELENG_7 and
> installworld/installkernel with this new option?

You can do the normal upgrade -- build world, build kernel, install kernel,
reboot, mergemaster -p, installworld, full mergemaster, reboot. Or you can
wait another week and install FreeBSD 7.1-RELEASE, if you want to be running
on a release rather than doing incremental updates along the branch.
Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-fs@FreeBSD.ORG Wed Dec 31 19:13:53 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E57B7106564A for ; Wed, 31 Dec 2008 19:13:53 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca)
Received: from aeryn.cs.uoguelph.ca (aeryn.cs.uoguelph.ca [131.104.20.160]) by mx1.freebsd.org (Postfix) with ESMTP id A47128FC1B for ; Wed, 31 Dec 2008 19:13:53 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca)
Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by aeryn.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id mBVJDqDh002301; Wed, 31 Dec 2008 14:13:52 -0500
Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id mBVJGAs20550; Wed, 31 Dec 2008 14:16:10 -0500 (EST)
X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs
Date: Wed, 31 Dec 2008 14:16:10 -0500 (EST)
From: Rick Macklem
X-X-Sender: rmacklem@muncher.cs.uoguelph.ca
To: Bryan Alves
In-Reply-To: <92f477740812301246k7ed77511oc969c22a3b5aad4d@mail.gmail.com>
Message-ID:
References: <92f477740812291739o7c0b840bsd1cce4375577c41f@mail.gmail.com> <92f477740812301246k7ed77511oc969c22a3b5aad4d@mail.gmail.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Scanned-By: MIMEDefang 2.63 on 131.104.20.161
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS locking problems with 7.0-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 31 Dec 2008 19:13:54 -0000

On Tue, 30 Dec 2008, Bryan Alves wrote:

>> - Download my server patches (ftp.cis.uoguelph.ca/pub/nfsv4/FreeBSD7) and
>> switch to using nfsv4, which has integral locking in the protocol.
>>
>> Have a good holiday, rick
>>
>>
> Is there another location where I can get the nfs4 patches? That FTP seems
> to be down.
>
Seems to be working here. Just "ftp ftp.cis.uoguelph.ca", login "anonymous",
then "cd pub/nfsv4/FreeBSD7". (Is it that you can't find the machine? Its
IP# is 131.104.48.112.)

> Also, outside the scope of this list, but since the discussion is opened, I
> might as well ask:
>
> If this NFS is the only remote mount that involves writing (it's opened
> read-only in other locations), and it's read/write locally, is it safe to
> use local locking?
>
Yes, I believe so. Even if there are multiple clients rw mounting a file
system, local locking should be fine unless there are multiple clients
writing the same file in the file system.

(With a single writer and multiple readers, an application might run into
coherency problems if that application was written to use byte range locking
to maintain coherency (i.e. most recently written data visible to the
readers), but that seems unlikely to matter for most
applications/environments. And I'm not sure if the NLM is wired into NFS in
such a way as to maintain full coherency for the locked byte ranges anyhow,
since normally NFS does not maintain full coherency?)

Have a happy new year, rick
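
As an additional troubleshooting step for the locking questions in this
thread, a small client-side probe can show whether a byte range lock request
on the NFS mount is answered at all. The following is only a minimal sketch
in Python and is not taken from any message in the thread: the function name
probe_byte_range_lock and its parameters are hypothetical, and it requests a
non-blocking exclusive lock (the fcntl F_SETLK path) rather than the F_GETLK
query seen in the strace output. If lockd/statd are not responding, the call
may simply stall, which is itself the symptom being looked for.

```python
import fcntl
import os


def probe_byte_range_lock(path, start=0, length=1):
    """Try a non-blocking exclusive lock on bytes [start, start + length).

    Returns True if the lock was granted (and then released), False if a
    conflicting lock is held by another process. Any other OSError, such
    as ENOLCK from a client without a working lock manager, propagates to
    the caller. The name and parameters are illustrative assumptions.
    """
    # Open read/write, creating the file if needed; exclusive fcntl locks
    # require a descriptor opened for writing.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                        length, start, os.SEEK_SET)
        except (BlockingIOError, PermissionError):
            # EAGAIN or EACCES: some other process holds a conflicting lock.
            return False
        # Release the range again so the probe leaves no lock behind.
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
        return True
    finally:
        os.close(fd)
```

Calling probe_byte_range_lock() on a scratch file inside the NFS-mounted home
directory, and again on a file on local disk, gives a quick comparison. A
call that returns promptly on local disk but never returns on the NFS mount
points at the lock manager path rather than at the application.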