From owner-freebsd-fs@freebsd.org Sun Jun 28 15:37:59 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E5A9398FE53 for ; Sun, 28 Jun 2015 15:37:59 +0000 (UTC) (envelope-from johan.ponin.pro@gmail.com) Received: from mail-la0-x22d.google.com (mail-la0-x22d.google.com [IPv6:2a00:1450:4010:c03::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 72B29151B for ; Sun, 28 Jun 2015 15:37:59 +0000 (UTC) (envelope-from johan.ponin.pro@gmail.com) Received: by lacny3 with SMTP id ny3so100322920lac.3 for ; Sun, 28 Jun 2015 08:37:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=Yun7BDK33aVVxpAGyo19b5VOu3W8oTphK5Rlbjp9Qj4=; b=VVOC+qGtXxxOWxJOS69qo5SWqzgK3XkLfxVAaqQRGZLqQGFVeSiv2O/9IfM6nq54Nf MvMsn6oNgojm3psmM01v2JPFJGHvjO8hyento9+Xgsc3V7SK1tyE+GdTaFi1ZBRYb+fC OXszXwKpWZQ8iRJvmE4K8jfR2KRrKs6E1uSPNp5j0q8oXYfiiLjeVz5gGTv8D4dToRMF 9mSVIUtzf5c7/rh+xibtPibq2mgU9/wY4ekEK5VEXUdf09abW+dyNlA7MiAA6/PJseQ2 pG04hLU0qqNxbtEfNMB7G2EZ22cFzpZCl1KRn7Bh1/sz2bY3CMYol9GfQ/CKlUY+rSJB mcbA== MIME-Version: 1.0 X-Received: by 10.112.186.35 with SMTP id fh3mr10146229lbc.82.1435505877291; Sun, 28 Jun 2015 08:37:57 -0700 (PDT) Received: by 10.152.29.35 with HTTP; Sun, 28 Jun 2015 08:37:57 -0700 (PDT) Date: Sun, 28 Jun 2015 17:37:57 +0200 Message-ID: Subject: Lost pool after nas4free configuration asynchrony. 
[x-post] From: Johan PONIN To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 28 Jun 2015 15:38:00 -0000

Pre-disclaimer: I sent this to zfs-discuss before realizing it was a zfsonlinux ML, so I'm copying the mail here.

Hi,

Disclaimer: this is my first NAS and I barely know FreeBSD, though I have a computer engineering degree and use Arch Linux as a daily driver. Consider this a mostly PEBCAK story. I hope you'll read through nonetheless.

Setup
 - NAS4Free from a 4GB usbkey on HP N54L.
 - 3x1TB WD Red as a vdev
 - 1 pool, 1 dataset, raidz (single parity AFAICR).

The usbkey failed and the nas4free config was lost. I `dd`'ed it onto a new 8GB usbkey, booted, and went to the console to reset to the factory config to gain access to the webgui (I remember doing that once before without causing any issue with the zfs pools and datasets).

The webgui showed no more pool/dataset, but running /etc/rc.d/zfs start mounted everything fine. zpool scrub reported no errors. Data was there, a few `less /mnt/../file` showing good data.

Trying to recreate the pool config in the GUI to avoid mounting by hand, I got some errors, but the GUI told me it had detected a pool config and offered to synchronize. I was afraid to accept, not knowing in which direction it would synchronize. I removed the pool I had created. The pool and dataset were still accessible. I rebooted and tried the "same" thing a second time. But now the new pool was validated, the old pool wasn't detected, and I got an empty dataset.

I tried some commands (shouldn't have, idiocy). That said, I didn't add data to this new pool/dataset, so I'm hoping the actual zfs object graph is intact. Of course I don't know how deeply 'zpool ...' commands change things; that might be very naive of me.

What options are there? Could I copy the 3 1TB blocks onto another drive and attempt some form of uberblock scan?

Thanks in advance for your time.

PS: who else would be knowledgeable about this? The freebsd-fs ML? I have read George Wilson's name, but writing to him directly seems impolite.

From owner-freebsd-fs@freebsd.org Sun Jun 28 16:16:36 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F270198F373 for ; Sun, 28 Jun 2015 16:16:36 +0000 (UTC) (envelope-from karli.sjoberg@slu.se) Received: from exch2-4.slu.se (exch2-4.slu.se [77.235.224.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "webmail.slu.se", Issuer "TERENA SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8992B1A01 for ; Sun, 28 Jun 2015 16:16:35 +0000 (UTC) (envelope-from karli.sjoberg@slu.se) Received: from exch2-4.slu.se (77.235.224.124) by exch2-4.slu.se (77.235.224.124) with Microsoft SMTP Server (TLS) id 15.0.1076.9; Sun, 28 Jun 2015 18:15:57 +0200 Received: from exch2-4.slu.se ([fe80::3117:818f:aa48:9d9b]) by exch2-4.slu.se ([fe80::3117:818f:aa48:9d9b%22]) with mapi id 15.00.1076.000; Sun, 28 Jun 2015 18:15:57 +0200 From: Karli Sjöberg To: Johan PONIN CC: "freebsd-fs@freebsd.org" Subject: Re: Lost pool after nas4free configuration asynchrony. [x-post] Thread-Topic: Lost pool after nas4free configuration asynchrony. 
[x-post] Thread-Index: AQHQsb22JvPyVVbeq0qlfyZDTkAe5Q== Date: Sun, 28 Jun 2015 16:15:56 +0000 Message-ID: <76f0a25b-3307-4de1-8cba-4b869203888a@email.android.com> Accept-Language: sv-SE, en-US Content-Language: sv-SE X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-transport-fromentityheader: Hosted MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 28 Jun 2015 16:16:37 -0000

On 28 Jun 2015 at 5:37 PM, Johan PONIN <johan.ponin.pro@gmail.com> wrote:
>
> pre-disclaimer, I sent this onto zfs-discuss before realizing it was a
> zfsonlinux ML, so I'm copying the mail here.
>
> Hi,
>
> Disclaimer, this is my first NAS, I barely know FreeBSD, have a
> computer engineer degree though and archlinux as a daily driver.
> Consider this a mostly PEBCAK story. I hope you'll read through
> nonetheless.
>
> Setup
>  - NAS4Free from a 4GB usbkey on HP N54L.
>  - 3x1TB WD Red as a vdev
>  - 1 pool, 1 dataset, raidz (single parity AFAICR).
>
> The usbkey failed, nas4free config was lost. I `dd` it onto a new 8GB
> usbkey, boot and got to the console to reset factory config to gain
> access to the webgui (I remember doing that once causing no issue with
> the zfs pools and datasets)
>
> The webgui showed no more pool/dataset, but running /etc/rc.d/zfs
> start mounted everything fine. zfs scrub reported no errors. Data was
> there, a few `less /mnt/../file` showing good data.
>
> Trying to recreate the pool config in the gui to avoid mounting by
> hand, I got some errors, but the GUI told me it detected a pool
> config, offered to synchronize but I was afraid not knowing in what
> direction it would do it. I removed the pool I created. Pool and
> dataset still accessible. I reboot, try a second time the "same"
> thing. But now the new pool is validated, the old pool isn't detected,
> and I got an empty dataset.
>
> I tried some commands (shouldn't have, idiocy). That said, I didn't
> add data to this new pool/dataset, and so I'm hoping the actual zfs
> object graph is intact. Of course I don't know how deep 'zpool ...'
> commands change things, that might be very naive of me.
>
> What options are there ? Could I copy the 3 1TB blocks onto another
> drive and attempt some form of uberblock scan ?
>
> Thanks in advance for your time.
>
> ps: what are other knowledgeable people ? freebsd-fs ML ? I have read
> George Wilson's name but it seems impolite.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

How about you drop to a shell and show:

# zpool status
and
# zpool import

?

/K

From owner-freebsd-fs@freebsd.org Sun Jun 28 21:00:25 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: 
from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D0F598FA25 for ; Sun, 28 Jun 2015 21:00:25 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6899A1EDE for ; Sun, 28 Jun 2015 21:00:25 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t5SL0P01013799 for ; Sun, 28 Jun 2015 21:00:25 GMT (envelope-from bugzilla-noreply@FreeBSD.org) Message-Id: <201506282100.t5SL0P01013799@kenobi.freebsd.org> From: bugzilla-noreply@FreeBSD.org To: freebsd-fs@FreeBSD.org Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 Date: Sun, 28 Jun 2015 21:00:25 +0000 Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 28 Jun 2015 21:00:25 -0000 To view an individual PR, use: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id). The following is a listing of current problems submitted by FreeBSD users, which need special attention. These represent problem reports covering all versions including experimental development code and obsolete releases. 
Status      |    Bug Id | Description
------------+-----------+---------------------------------------------------
Open        |    136470 | [nfs] Cannot mount / in read-only, over NFS
Open        |    139651 | [nfs] mount(8): read-only remount of NFS volume d
Open        |    144447 | [zfs] sharenfs fsunshare() & fsshare_main() non f

3 problems total for which you should take action.

From owner-freebsd-fs@freebsd.org Mon Jun 29 17:40:05 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BE686990A89 for ; Mon, 29 Jun 2015 17:40:05 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AB2CF1E4B for ; Mon, 29 Jun 2015 17:40:05 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t5THe5MH058284 for ; Mon, 29 Jun 2015 17:40:05 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 168158] [zfs] incorrect parsing of sharenfs options in zfs (fsshare.c) Date: Mon, 29 Jun 2015 17:40:05 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: unspecified X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: eborisch+FreeBSD@gmail.com X-Bugzilla-Status: In Progress X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc attachments.created Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" 
Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Jun 2015 17:40:05 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=168158

eborisch+FreeBSD@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eborisch+FreeBSD@gmail.com

--- Comment #3 from eborisch+FreeBSD@gmail.com ---
Created attachment 158168
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=158168&action=edit
Patch to support hyphenated host names in sharenfs zfs property.

Simple patch that only ignores leading hyphens in options.

With this change:
'-ro,a-server -alldirs' -> '-ro a-server -alldirs'

instead of (current behavior):
'-ro,a-server -alldirs' -> '-ro a server -alldirs'

Changed by removing '-' from strsep() call, and handling (skipping) hyphen if first character immediately after strsep() call.

All test cases listed immediately prior to the code still work; I'm not aware of any supported forms that are broken by this change.

--
You are receiving this mail because:
You are the assignee for the bug.
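
As a rough illustration of the change described in comment #3, the sketch below models the two tokenising behaviours in Python. It is not the fsshare.c code and not the attached patch: the function name `translate`, the `split_on_hyphen` flag, and the `KNOWN_OPTS` subset are all assumptions made for this example. It only reproduces the two input/output pairs quoted in the comment.

```python
import re

# Assumption: an illustrative subset of export options that get a "-" prefix.
# The real list lives in fsshare.c and is not quoted in the report.
KNOWN_OPTS = {"ro", "alldirs", "maproot", "mapall", "network", "mask"}


def translate(sharenfs, split_on_hyphen):
    """Turn a sharenfs property value into an exports(5)-style option string.

    split_on_hyphen=True models the current behavior (hyphen is a separator).
    split_on_hyphen=False models the patched behavior (a hyphen is skipped
    only when it is the first character of a token).
    """
    separators = r"[-, ]" if split_on_hyphen else r"[, ]"
    out = []
    for token in re.split(separators, sharenfs):
        if not split_on_hyphen and token.startswith("-"):
            token = token[1:]  # skip a single leading hyphen
        if not token:
            continue  # empty field between adjacent separators
        name = token.split("=", 1)[0]
        out.append("-" + token if name in KNOWN_OPTS else token)
    return " ".join(out)


print(translate("-ro,a-server -alldirs", split_on_hyphen=True))   # -ro a server -alldirs
print(translate("-ro,a-server -alldirs", split_on_hyphen=False))  # -ro a-server -alldirs
```

With the hyphen treated as a separator, the host name `a-server` is cut into `a` and `server`; once the hyphen is only skipped at the start of a token, the host name survives intact.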
From owner-freebsd-fs@freebsd.org Wed Jul 1 01:16:55 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C021A9900AB for ; Wed, 1 Jul 2015 01:16:55 +0000 (UTC) (envelope-from petehodur@gmail.com) Received: from mail-lb0-x22d.google.com (mail-lb0-x22d.google.com [IPv6:2a00:1450:4010:c04::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 43EC61E4C for ; Wed, 1 Jul 2015 01:16:55 +0000 (UTC) (envelope-from petehodur@gmail.com) Received: by lbcpe5 with SMTP id pe5so3797311lbc.2 for ; Tue, 30 Jun 2015 18:16:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=zefAk02ZGdz6aoBAhvW0VzkQOzHu9LPX2RLG31F8upU=; b=y5C54u1rX600HuDK4AtsysS/JQUPwbWJyp5DVCDeIWrDLNA0MF+kfE3X3I4SlHSyoa FTdxQrLkIov5ravqd6hMYL40DPlQpRLakQ7mC4U/3WlL2yGhsI29mN2D5IqAlVgtlJlD rmDdwMhOe8jHnWCKb8V5f+70g6Gc1ID1BVdxaBeRAZFmnwKZ0pJgkgFU6AkT114FDv1e 5RWzeZwBtmX7ToREc4QWJrvr8a9FyiBB9fBIxSs7QZSdqc8cvnhDsnmCroaXarGkOA7c zMiuwdqiiEigJH3CtBMbV+L2H2MoalJYCf2z8tFmZh289J5hJDDAIxhYGoEJBYji8oYJ fCvw== X-Received: by 10.112.204.199 with SMTP id la7mr22520771lbc.114.1435713413872; Tue, 30 Jun 2015 18:16:53 -0700 (PDT) MIME-Version: 1.0 Received: by 10.152.30.39 with HTTP; Tue, 30 Jun 2015 18:16:34 -0700 (PDT) From: Peter Hodur Date: Wed, 1 Jul 2015 03:16:34 +0200 Message-ID: Subject: NFSv4 & ZFS question To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Jul 2015 
01:16:55 -0000

Hello,

I am just starting to migrate a storage server from NFSv3 to NFSv4. I have read everything I could find, including man nfsv4, and found this:

..............
The nfsd(8) allows a limited subset of operations to be performed on non-exported subtrees of the local file system, so that traversal of the tree to the exported subtrees is possible. As such, the ``<rootdir>'' can be in a non-exported file system. The exception is ZFS, which checks exports and, as such, all ZFS file systems below the ``<rootdir>'' must be exported. However, the entire tree that is rooted at that point must be in local file systems that are of types that can be NFS exported. Since the NFSv4 file system is rooted at ``<rootdir>'', setting this to anything other than ``/'' will result in clients being required to use different mount paths for NFSv4 than for NFS Version 2 or 3. Unlike NFS Version 2 and 3, Version 4 allows a client mount to span across multiple server file systems, although not all clients are capable of doing this.
..................

How should I understand "The exception is ZFS, which checks exports and, as such, all ZFS file systems below the ``<rootdir>'' must be exported."? Why is that? So when I add "V4: /" to /etc/exports, must I then export ALL file systems (including zroot)? Is this a bug, or is it intended, in which case is it better to create /exports and move all "shared filesystems" below that directory?
thanks Peter From owner-freebsd-fs@freebsd.org Wed Jul 1 16:23:56 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5DB1D9922FC for ; Wed, 1 Jul 2015 16:23:56 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: from mail-wg0-x22b.google.com (mail-wg0-x22b.google.com [IPv6:2a00:1450:400c:c00::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 12AE01EB8 for ; Wed, 1 Jul 2015 16:23:56 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: by wguu7 with SMTP id u7so41368298wgu.3 for ; Wed, 01 Jul 2015 09:23:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=htW/F40WOO254sm2YStfrod3IeotbjgX3YYwHXMYybM=; b=m3b75DNxwrtrtH0h5/YAaKmc55skAMg3WTmfTirhXV0FcTL3xgNYt69+0cD14Iu/tX F3GYuYw4BKozS4fb6PJ7V3PWAmJxl4NCM9B/OMAkSL+rPnHBAvDfPLm28gGJ8g42RrBJ T6exOYErOyx2vuiKF3VXhix9NG2IRAU0ZnmTFUNrBNE86Wk0QEA/Mb3G1KYtoPY/SLu7 uxV2exog5nnmZXNBFSCnnajSZUyAqWTFUFy8xaWBSQ+zKxjoOno57luv80fisFynCsBP bzpKUBNQ91G6ztvQVNGDtfQ//6DDSrpdG1ob+C/CNZKIGcoyMzzMKnHsbwm29raAWo2+ Q0xw== X-Received: by 10.194.6.229 with SMTP id e5mr3196200wja.158.1435767834455; Wed, 01 Jul 2015 09:23:54 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.6.143 with HTTP; Wed, 1 Jul 2015 09:23:35 -0700 (PDT) From: Ahmed Kamal Date: Wed, 1 Jul 2015 18:23:35 +0200 Message-ID: Subject: Linux NFSv4 clients are getting (bad sequence-id error!) 
To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Jul 2015 16:23:56 -0000

Hi all,

I'm a refugee from linux land. I just set up my first freebsd 10.1 zfs box, sharing /home over nfs. Since every home directory is its own zfs dataset, I chose to use nfsv4 to enable recursively sharing/mounting any directory under /home (I understand nfs4 is a must in this scenario!)

I'm able to mount from linux (rhel5, latest kernel) successfully. Users are working fine. However, every now and then a user screams that his session is frozen. Usually the processes are stuck in nfs_wait or rpc_* state. I tried using a much newer linux kernel (3.2), but it still faced the same problem. The errors in the Linux log files are mostly:

Jul 1 17:41:47 mammoth kernel: NFS: v4 server nas returned a *bad sequence-id error*!
Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_locks: unhandled error -11. Zeroing state
Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_open_state: Lock reclaim failed!

My search led me to a detailed analysis of the issue (https://access.redhat.com/solutions/1328073), which you can read here: https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf. NetApp confirmed this was a bug for them (I'm wondering if this is still in FreeBSD?!)

PS: Right before sending this, I saw dmesg on the freebsd box advising me to increase vfs.nfsd.tcphighwater, so I raised it to 64000. I also raised the number of nfs server threads (-t) from 10 to 60 (we're roughly 40 linux machines).

Any advice is most appreciated!

Thanks

From owner-freebsd-fs@freebsd.org Wed Jul 1 23:20:08 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B8129920D7 for ; Wed, 1 Jul 2015 23:20:08 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 0A0AE18D5 for ; Wed, 1 Jul 2015 23:20:07 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2DhBABGdZRV/61jaINbg2VfBoMZuiGBZAqFLkoCggYSAQEBAQEBAYEKhCIBAQECAQEBAQEgBCcgCwULAgEIDgoCAg0ZAgInAQkmAgQIBwQBGgIEiAYIDbYBlxEBAQEBAQEEAQEBAQEBAQEagSGKKYQ0AQEFFzQHgmiBQwWMFod6hF2ENoQIRIZdj2cCJoQWIjEBBoEFOoECAQEB X-IronPort-AV: E=Sophos;i="5.15,389,1432612800"; d="scan'208";a="223364281" Received: from nipigon.cs.uoguelph.ca (HELO zcs1.mail.uoguelph.ca) ([131.104.99.173]) by esa-annu.net.uoguelph.ca with ESMTP; 01 Jul 2015 19:19:46 -0400 Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 3D08115F533; Wed, 1 Jul 2015 19:19:46 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca 
([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id YxGeaC-6_u76; Wed, 1 Jul 2015 19:19:45 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 6ED9815F54D; Wed, 1 Jul 2015 19:19:45 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs1.mail.uoguelph.ca Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id bScfgRANA9oB; Wed, 1 Jul 2015 19:19:45 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca (zcs1.mail.uoguelph.ca [172.17.95.18]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 5363315F533; Wed, 1 Jul 2015 19:19:45 -0400 (EDT) Date: Wed, 1 Jul 2015 19:19:45 -0400 (EDT) From: Rick Macklem To: Ahmed Kamal Cc: freebsd-fs@freebsd.org Message-ID: <2124485979.2769788.1435792785282.JavaMail.zimbra@uoguelph.ca> In-Reply-To: References: Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.12] X-Mailer: Zimbra 8.0.9_GA_6191 (ZimbraWebClient - FF34 (Win)/8.0.9_GA_6191) Thread-Topic: Linux NFSv4 clients are getting (bad sequence-id error!) Thread-Index: p02pgg/QbRC4TZXd26n3g9GKfmu8sQ== X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Jul 2015 23:20:08 -0000 Ahmed Kamal wrote: > Hi all, > > I'm a refugee from linux land. I just set up my first freebsd 10.1 zfs box, > sharing /home over nfs. Since every home directory is its own zfs dataset, > I chose to use nfsv4 to enable recursively sharing/mounting any directory > under /home (I understand nfs4 is a must in this scenario!) > > I'm able to mount form linux (rhel5 latest kernel) successfully. Users are > working fine. 
However every now and then a user screams that his session is
> frozen. Usually the processes are stuck in nfs_wait or rpc_* state. I tried
> using a much newer linux kernel (3.2 however it still faced the same
> problem). The errors in Linux log files are mostly:
> Jul 1 17:41:47 mammoth kernel: NFS: v4 server nas returned a *bad
> sequence-id error*!
> Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_locks: unhandled error -11.
> Zeroing state
> Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_open_state: Lock reclaim
> failed!
>
> My search led me to (https://access.redhat.com/solutions/1328073) a
> detailed analysis of the issue, which you can read over here
> https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf .. NetApp
> confirmed this was a bug for them (I'm wondering if this is still in
> FreeBSD?!)
>
Well, the NetApp NFS server code is proprietary to them and has no commonality with the FreeBSD code, so it seems unlikely that they will have the same bug.

> PS: Right before sending this, I saw dmesg on the freebsd box advising
> increasing vfs.nfsd.tcphighwater .. So I up'ed that to 64000. I also up'ed
> the number of nfs server threads (-t) from 10 to 60 (we're roughly 40 linux
> machines)
>
This indicates that the server's DRC (duplicate request cache) has gotten constipated, and this could cause issues for NFSv4.0.
Things to try:
- the above increase of vfs.nfsd.tcphighwater might do the trick.
  --> You can also try decreasing vfs.nfsd.tcpcachetimeo.
  If you can't find values of these that avoid the constipation, you can disable the DRC for TCP by setting vfs.nfsd.cachetcp to 0.

Alternately, go to the Linux client and see if the mount is using minorversion 0 or 1. (I think "nfsstat -m" on the client will do that.) Then use the minorversion= option to force it to use the other minorversion (i.e. if it's 0, force it to 1 or vice versa).
Since NFSv4.1 doesn't use the DRC, I'd guess these are NFSv4.0 mounts and using NFSv4.1 would avoid any DRC related issues.

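
A rough sketch of how the suggestions above might be written down, assuming the sysctls are persisted in /etc/sysctl.conf on the FreeBSD server and the Linux client mounts from /etc/fstab. Only the 64000 figure, the sysctl names, the `minorversion=` option, the server name `nas`, and /home come from the thread; the tcpcachetimeo value and the fstab layout are illustrative assumptions, and the client kernel has to support NFSv4.1 for the last line to work.

```
# /etc/sysctl.conf on the FreeBSD server (sketch)
# Value already tried in this thread:
vfs.nfsd.tcphighwater=64000
# Hypothetical lower timeout; pick a value by experiment:
vfs.nfsd.tcpcachetimeo=300
# Last resort, turns the DRC off for TCP (left commented out):
#vfs.nfsd.cachetcp=0

# /etc/fstab on a Linux client (sketch): force NFSv4.1 instead of NFSv4.0
nas:/home  /home  nfs4  minorversion=1  0 0
```

Running "nfsstat -m" on the client before and after the change should show which minor version the mount is actually using.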
Good luck with it, rick > Any advice is most appreciated! > > Thanks > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@freebsd.org Wed Jul 1 23:32:34 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 272AB99243C for ; Wed, 1 Jul 2015 23:32:34 +0000 (UTC) (envelope-from allan@physics.umn.edu) Received: from mail.physics.umn.edu (smtp.spa.umn.edu [128.101.220.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0B19E21B3 for ; Wed, 1 Jul 2015 23:32:33 +0000 (UTC) (envelope-from allan@physics.umn.edu) Received: from peevish.spa.umn.edu ([128.101.220.230]) by mail.physics.umn.edu with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.77 (FreeBSD)) (envelope-from ) id 1ZAQv8-000GlU-PA for freebsd-fs@freebsd.org; Wed, 01 Jul 2015 17:55:58 -0500 Message-ID: <55946FFE.8070402@physics.umn.edu> Date: Wed, 01 Jul 2015 17:55:58 -0500 From: Graham Allan Organization: Physics, University of Minnesota User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Strange NFS problem implicating nfsuserd? Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Jul 2015 23:32:34 -0000 I spent a few days digging into a strange NFSv4 problem at our site, which I think I may have finally resolved but don't really understand why. 

We have a bunch of large-ish NFS servers running FreeBSD 9.3 exporting ZFS filesystems to mostly "RHEL-clone" linux clients. Over the last few weeks I started getting reports that people's jobs would fail erratically with i/o errors, and it became apparent that they pointed in general to all our FreeBSD NFS servers rather than just one.

Ultimately I could trivially reproduce the problem by running "find . -type f -exec cat {} > /dev/null \;" on one of the NFS-mounted filesystems. Linux clients would eventually error with "Input/output error"; FreeBSD clients would eventually error with "Permission denied" on files or directories which should be readable.

Reverting to earlier patch releases didn't make any difference, though it seemed like the problem started roughly when I updated p8->p13.

Finally I seem to have pinpointed it to one change made in rc.conf for nfsuserd, which I committed at around the right date:

nfsuserd_flags="-usermax 500 -usertimeout 600 16"

became:

nfsuserd_flags="-domain xxx.yyy.zzz -usermax 500 -usertimeout 600 16"

probably because I saw a user mapping failure somewhere previously, and decided to make the domain explicit.

Undoing this change appears to eliminate the problem - but this makes no sense to me. Starting nfsuserd with either set of options (adding -verbose) prints the same output:

Starting nfsuserd.
nfsuserd: domain=xxx.yyy.zzz usermax=500 usertimeout=36000

So the domain chosen by default is the same as the one explicitly specified (as I would expect).

I've reproduced this across 4-5 different servers and a similar number of different client systems. I'm wondering if any plausible explanation suggests itself?


Graham
--

From owner-freebsd-fs@freebsd.org Wed Jul 1 23:36:19 2015
Date: Wed, 1 Jul 2015 19:36:16 -0400 (EDT)
From: Rick Macklem
To: Ahmed Kamal
Cc: freebsd-fs@freebsd.org
Message-ID: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)

Ahmed Kamal wrote:
> Hi all,
>
> I'm a refugee from linux land. I just set up my first freebsd 10.1 zfs box,
> sharing /home over nfs. Since every home directory is its own zfs dataset,
> I chose to use nfsv4 to enable recursively sharing/mounting any directory
> under /home (I understand nfs4 is a must in this scenario!)
>
> I'm able to mount from linux (rhel5 latest kernel) successfully. Users are
> working fine. However every now and then a user screams that his session is
> frozen. Usually the processes are stuck in nfs_wait or rpc_* state. I tried
> using a much newer linux kernel (3.2), however it still faced the same
> problem. The errors in Linux log files are mostly:
> Jul 1 17:41:47 mammoth kernel: NFS: v4 server nas returned a *bad sequence-id error*!
> Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_locks: unhandled error -11. Zeroing state
> Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_open_state: Lock reclaim failed!
>
Btw, a client should only do "reclaim" operations after the server has
replied with NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID.
I am pretty certain that the FreeBSD NFSv4 server only generates these
replies after it has rebooted, so assuming the server didn't reboot, I
have no idea why the client would attempt these and am not surprised
they failed.

I'm guessing that the DRC constipation somehow caused the Linux client
to go into recovery mode?

rick

> My search led me to (https://access.redhat.com/solutions/1328073) a
> detailed analysis of the issue, which you can read over here
> https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf .. NetApp
> confirmed this was a bug for them (I'm wondering if this is still in
> FreeBSD?!)
>
> PS: Right before sending this, I saw dmesg on the freebsd box advising
> increasing vfs.nfsd.tcphighwater .. So I up'ed that to 64000. I also up'ed
> the number of nfs server threads (-t) from 10 to 60 (we're roughly 40 linux
> machines)
>
> Any advice is most appreciated!
>
> Thanks

From owner-freebsd-fs@freebsd.org Wed Jul 1 23:44:56 2015
In-Reply-To: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca>
References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca>
From: Ahmed Kamal
Date: Thu, 2 Jul 2015 01:44:35 +0200
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
To: Rick Macklem
Cc: freebsd-fs@freebsd.org

Thanks a lot Rick .. Actually I did reboot the nas box after setting the
below in /etc/sysctl.conf (just to clean things up)

vfs.nfsd.tcphighwater=61000
vfs.nfsd.tcpcachetimeout=300

Since clients are rhel5 only, they don't seem to support v4.1, only v4.0 ..
So this one is not an option for now.

The good news is that after raising tcphighwater, these messages (nfsd
server cache flooded, try increasing vfs.nfsd.tcphighwater) have
completely stopped appearing in freebsd dmesg.

The not so great news is, after updating sysctl and rebooting the nas box,
I still saw a few (NFS: v4 server nas returned a bad sequence-id error!)
lines in logs. Users have already left, so I don't know how bad it is ..

Could you share more info on what this error means? RedHat seems to think
the client can skip-by-1 and choose larger IDs and that would be totally
fine? Also how serious is this error, would it cause an NFS session stall
like that?

From owner-freebsd-fs@freebsd.org Wed Jul 1 23:49:04 2015
Message-ID: <55947C6E.5060409@delphij.net>
Date: Wed, 01 Jul 2015 16:49:02 -0700
From: Xin Li
Reply-To: d@delphij.net
Organization: The FreeBSD Project
To: Ahmed Kamal, Rick Macklem
CC: freebsd-fs@freebsd.org
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 07/01/15 16:44, Ahmed Kamal via freebsd-fs wrote:
> The not so great news is, after updating sysctl and rebooting the
> nas box, I still saw a few (NFS: v4 server nas returned a bad
> sequence-id error!) lines in logs. Users have already left, so I
> don't know how bad is it ..
>
> Could you share more info on what this error means? RedHat seems to
> think the client can skip-by-1 and choose larger IDs and that would
> be totally fine ? Also how serious is this error, would it cause
> NFS session stall like that ?
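
For context on the question quoted above: under the usual reading of the NFSv4.0 rules, the server remembers the last sequence id it accepted for each open or lock owner, treats a repeat of that value as a retransmission, accepts last + 1 as the next request, and answers anything else with a bad sequence-id error. The Python model below is only a sketch of that decision and is not the FreeBSD kernel code. The name `check_seqid` and the `extra_skip` parameter are made up for illustration, with `extra_skip=1` standing in for the "skip-by-1" reading; the real server also has to confirm it still holds the cached reply before treating a request as a replay.

```python
SEQID_MASK = 0xFFFFFFFF  # seqid4 is an unsigned 32-bit value, so arithmetic wraps


def check_seqid(last_seqid, seqid, extra_skip=0):
    """Classify an incoming seqid against the last one the server accepted.

    Returns "next" for a new in-order request, "replay" for a
    retransmission of the previous request, and "bad_seqid" otherwise.
    extra_skip=0 models the strict reading (only last + 1 is accepted);
    extra_skip=1 is a hypothetical knob that also accepts last + 2.
    """
    delta = (seqid - last_seqid) & SEQID_MASK
    if 1 <= delta <= 1 + extra_skip:
        return "next"
    if delta == 0:
        return "replay"
    return "bad_seqid"


if __name__ == "__main__":
    for candidate in (10, 11, 12, 13):
        strict = check_seqid(10, candidate)
        loose = check_seqid(10, candidate, extra_skip=1)
        print(f"last=10 seqid={candidate}: strict={strict} skip-by-1={loose}")
```

In this model a client that jumps from 10 to 12 gets "bad_seqid" from a strict server and "next" from a loosened one, which is the difference the thread is arguing about.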

I wonder if this would help, which loosens the check:

Index: sys/fs/nfsserver/nfs_nfsdstate.c
===================================================================
- --- sys/fs/nfsserver/nfs_nfsdstate.c (revision 285016)
+++ sys/fs/nfsserver/nfs_nfsdstate.c (working copy)
@@ -3805,7 +3805,8 @@ nfsrv_checkseqid(struct nfsrv_descript *nd, u_int3
                 printf("refcnt=%d\n", stp->ls_op->rc_refcnt);
                 panic("nfsrvstate op refcnt");
         }
- -        if ((stp->ls_seq + 1) == seqid) {
+        if ((stp->ls_seq + 1) == seqid ||
+            (stp->ls_seq + 2) == seqid) {
                 if (stp->ls_op)
                         nfsrvd_derefcache(stp->ls_op);
                 stp->ls_op = op;

Personally I don't quite buy the skip-by-1 is Okay argument but it
seems that the RFC text can be interpreted that way.

Cheers,
- --
Xin LI https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.1.5 (FreeBSD)

iQIcBAEBCgAGBQJVlHxuAAoJEJW2GBstM+nskvsP/ire8QyTfL6mF1njMNZwI/k5
AQ+BwWs5r8LzcRN/4v7/gelbS+lXnYVbVHMl8q6j+HzUzQ3yId4ZGlJWpJtHDNnj
+gV8kmFt/og1QTrQRbN81i4GEr914SlKWmo7LsxrWmEhAiKsN0sYsjELD/mH5BZX
1wRe3vTvyrMwm+6u1krqT8ZrxRANBFBmNqiFb8sag7B3oJQZsGhAyUSsJvUhb00o
ozwC2NT5y8Jv0QcZdC/wGeYc8FmRNQTAjE22WkzbsUey/e7FxL7vflCGgngYCIxE
zbZNW65xThZO8fti5MxiepJ27VPa5ocX0CQihBFYp5haG6fzWBGalV/ggAOwYL44
nz1caLhdKIj9JSd8QwLdTArq8+6H8Sx4jp4iGzQnppNo8PtG/AlHlw9uDKaUF4iw
H+tMb6qMu2FQJ9X+phtplzvjZxCbBbwY205GeTm5eElOkYzIyYvqIvZasos02ze0
v3SQXtpIHjrnndXMVNRJOkhYquGxVFxUm5IJ7o+0wrgVJp1V3cBKd4vs0o84Mgu5
EPGKCyt8x/B6ujCxkunODpNOb+sFyq6aqsDLAO6JSih5HfQntpxoZTjm8p4KjsG6
nPqXQXmi2NoOd6WPOunp7w/y+fKA4YdLAhPC7rbXQwpLL81UqNH141BrtscN0ovi
pyRlJ4r3Zs75qUwVSkzL
=3/OG
-----END PGP SIGNATURE-----

From owner-freebsd-fs@freebsd.org Thu Jul 2 00:10:34 2015
Date: Wed, 1 Jul 2015 20:10:31 -0400 (EDT)
From: Rick Macklem
To: Graham Allan
Cc: freebsd-fs@freebsd.org
Message-ID: <972685551.2776991.1435795831472.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55946FFE.8070402@physics.umn.edu>
References: <55946FFE.8070402@physics.umn.edu>
Subject: Re: Strange NFS problem implicating nfsuserd?

Graham Allan wrote:
> I've reproduced this across 4-5 different servers and a similar number
> of different client systems. I'm wondering if any plausible explanation
> suggests itself?
>
As far as I know, the domain is only set when the nfsuserd is started
and it just uses the domain part of the machine's host name if not
explicitly defined by "-domain". Maybe there is some bug in nfsuserd.c
that gets tickled by the option, although I just looked and the argument
parsing looks ok.

If your xxx.yyy.zzz is identical, then I can't see how this would affect
anything.

What will cause intermittent mapping problems is having more than one
username that maps to the same uid. (One of them will be cached at random.)
(There was a common case of both "root" and "toor" in the password database
for uid == 0.)

rick

> Graham
> --

From owner-freebsd-fs@freebsd.org Thu Jul 2 00:25:07 2015
Date: Wed, 1 Jul 2015 20:25:03 -0400 (EDT)
From: Rick Macklem
To: d@delphij.net
Cc: Ahmed Kamal, freebsd-fs@freebsd.org
Message-ID: <284567732.2779201.1435796703628.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55947C6E.5060409@delphij.net>
References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca> <55947C6E.5060409@delphij.net>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)

Xin Li wrote:
> On 07/01/15 16:44, Ahmed Kamal via freebsd-fs wrote:
> > The not so great news is, after updating sysctl and rebooting the
> > nas box, I still saw a few (NFS: v4 server nas returned a bad
> > sequence-id error!) lines in logs. Users have already left, so I
> > don't know how bad is it ..
> >
> > Could you share more info on what this error means? RedHat seems to
> > think the client can skip-by-1 and choose larger IDs and that would
> > be totally fine ? Also how serious is this error, would it cause
> > NFS session stall like that ?
>
Ok, I see the "skip-by-1" comment in the RedHat stuff. I have no idea
where they got that from, because I certainly don't read it that way in
the RFC. I'll post on nfsv4@ietf.org to see what the story is on this.
(Very recently, a new RFC was published that replaces RFC-3530. Maybe it
allows this skip-by-1. It shouldn't, because the whole idea is to order
lock operations. It may be that the OPEN case has been changed. The Linux
client author is always trying to come up with tricks that allow OPENs to
be done concurrently.)

This patch does look correct if this "skip-by-1" is supposed to be allowed
for all operations. It will be interesting to see if this patch resolves
the problem.

Thanks for spotting this. I would never have guessed that this "skip-by-1"
would be done.

rick

From owner-freebsd-fs@freebsd.org Thu Jul 2 00:43:04 2015
Message-ID: <55948916.4080405@physics.umn.edu>
Date: Wed, 01 Jul 2015 19:43:02 -0500
From: Graham Allan
To: Rick Macklem
CC: freebsd-fs@freebsd.org
Subject: Re: Strange NFS problem implicating nfsuserd?
References: <55946FFE.8070402@physics.umn.edu> <972685551.2776991.1435795831472.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <972685551.2776991.1435795831472.JavaMail.zimbra@uoguelph.ca>

On 7/1/2015 7:10 PM, Rick Macklem wrote:
>>
>> I've reproduced this across 4-5 different servers and a similar number
>> of different client systems. I'm wondering if any plausible explanation
>> suggests itself?
>>
>
> As far as I know, the domain is only set when
> the nfsuserd is started and it just uses the domain part of the machine's
> host name if not explicitly defined by "-domain". Maybe there is some bug
> in nfsuserd.c that gets tickled by the option, although I just looked and
> the argument parsing looks ok.
>
> If your xxx.yyy.zzz is identical, then I can't see how this would affect
> anything.
> > What will cause intermittent mapping problems is having more than one > username that maps to the same uid. (One of them will be cached at random.) > (There was a common case of both "root" and "toor" in the password database > for uid == 0.) Yes, on the face of it this report appears crazy to me too :-) If I hadn't tried a dozen other things including reverting FreeBSD patch level, linux kernel/package versions, tweaking/checking ldap lookup settings (nslcd etc), before simply removing the "domain=" argument to nfsuserd, I wouldn't believe it possible. I also took a quick look through nfsuserd.c and couldn't see anything to explain it. I want to think something else must be going on, but adding or removing that parameter appears to toggle the problem on and off deterministically. I was always able to get a failure within 10-60 minutes or so, so having the nfsuserd cache timeout at 600 minutes seems like it should eliminate any intermittent id lookup issues. I guess I could try... (1) rpcdebug on the linux client, though I'm not sure which flags to enable to log idmapping issues. (2) watch nfsuserd with truss and look for different behaviors. 
(3) capture NFS traffic, examine with wireshark

Graham

From owner-freebsd-fs@freebsd.org Thu Jul 2 01:09:46 2015
Date: Wed, 1 Jul 2015 21:09:43 -0400 (EDT)
From: Rick Macklem
To: d@delphij.net
Cc: Ahmed Kamal, freebsd-fs@freebsd.org
Message-ID: <1491630362.2785531.1435799383802.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55947C6E.5060409@delphij.net>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)

Xin Li wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On 07/01/15 16:44, Ahmed Kamal via freebsd-fs wrote:
> > The not so great news is, after updating sysctl and rebooting the
> > nas box, I still saw a few (NFS: v4 server nas returned a bad
> > sequence-id error!) lines in logs. Users have already left, so I
> > don't know how bad is it ..
> >
> > Could you share more info on what this error means? RedHat seems to
> > think the client can skip-by-1 and choose larger IDs and that would
> > be totally fine ? Also how serious is this error, would it cause
> > NFS session stall like that ?
>
I just looked at RFC-7530 (which replaced RFC-3530 for NFSv4.0) and here
is the snippet w.r.t. this:

   When a request is received, its sequence number (r) is compared to
   that of the last one received (L). Only if it has the correct next
   sequence, normally L + 1, is the request processed beyond the point
   of seqid checking. Given a properly functioning client, the response
   to (r) must have been received before the last request (L) was sent.
   If a duplicate of last request (r == L) is received, the stored
   response is returned. If the sequence value received is any other
   value, it is rejected with the return of error NFS4ERR_BAD_SEQID.
   Sequence history is reinitialized whenever the SETCLIENTID/
   SETCLIENTID_CONFIRM sequence changes the client verifier.

It seems clear that "L + 2" is not supposed to be allowed.
(The wording in RFC-3530 meant that L + 2 was the case of past the
correct one, but it was poorly worded I'd say.)

However, the updated RFC has changed one case. For RFC-3530, the correct
value that follows UINT32_MAX is 0. However, for RFC-7530, the code is
supposed to skip over 0 and use 1. (I think that means that the server
must accept both 0 and 1 as the correct value following UINT32_MAX.)
It seems that is why the above states "normally L + 1".

I am going to post to nfsv4@ietf.org to see what they say.

Please let me know if Xin Li's patch resolves your problem, even though I
don't believe it is correct except for the UINT32_MAX case.

Good luck with it, rick

> I wonder if this would help, which loosen the check:
>
> Index: sys/fs/nfsserver/nfs_nfsdstate.c
> ===================================================================
> - --- sys/fs/nfsserver/nfs_nfsdstate.c (revision 285016)
> +++ sys/fs/nfsserver/nfs_nfsdstate.c (working copy)
> @@ -3805,7 +3805,8 @@ nfsrv_checkseqid(struct nfsrv_descript *nd, u_int3
> printf("refcnt=%d\n", stp->ls_op->rc_refcnt);
> panic("nfsrvstate op refcnt");
> }
> - - if ((stp->ls_seq + 1) == seqid) {
> + if ((stp->ls_seq + 1) == seqid ||
> + (stp->ls_seq + 2) == seqid) {
> if (stp->ls_op)
> nfsrvd_derefcache(stp->ls_op);
> stp->ls_op = op;
>
>
> Personally I don't quite buy the skip-by-1 is Okay argument but it
> seems that the RFC text can be interpreted that way.
> > Cheers, > > > > On Thu, Jul 2, 2015 at 1:36 AM, Rick Macklem > > wrote: > > > >> Ahmed Kamal wrote: > >>> Hi all, > >>> > >>> I'm a refugee from linux land. I just set up my first freebsd > >>> 10.1 zfs > >> box, > >>> sharing /home over nfs. Since every home directory is its own > >>> zfs > >> dataset, > >>> I chose to use nfsv4 to enable recursively sharing/mounting any > >>> directory under /home (I understand nfs4 is a must in this > >>> scenario!) > >>> > >>> I'm able to mount form linux (rhel5 latest kernel) > >>> successfully. Users > >> are > >>> working fine. However every now and then a user screams that > >>> his session > >> is > >>> frozen. Usually the processes are stuck in nfs_wait or rpc_* > >>> state. I > >> tried > >>> using a much newer linux kernel (3.2 however it still faced the > >>> same problem). The errors in Linux log files are mostly: Jul 1 > >>> 17:41:47 mammoth kernel: NFS: v4 server nas returned a *bad > >>> sequence-id error*! Jul 1 17:52:32 mammoth kernel: > >>> nfs4_reclaim_locks: unhandled error -11. Zeroing state Jul 1 > >>> 17:52:32 mammoth kernel: nfs4_reclaim_open_state: Lock reclaim > >>> failed! > >>> > >> Btw, a client should only do "reclaim" operations after the > >> server has replied with NFS4ERR_STALE_CLIENTID or > >> NFS4ERR_STALE_STATEID. I am pretty certain that the FreeBSD NFSv4 > >> server only generates these replies after it has rebooted, so > >> assuming the server didn't reboot, I have no idea why the client > >> would attempt these and am not surprised they failed. > >> > >> I'm guessing that the DRC constipation somehow caused the Linux > >> client to go into recovery mode? > >> > >> rick > >> > >>> My search led me to > >>> (https://access.redhat.com/solutions/1328073) a detailed > >>> analysis of the issue, which you can read over here > >>> https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf > >>> .. NetApp confirmed this was a bug for them (I'm wondering if > >>> this is still in FreeBSD?!) 
> >>>
> >>> PS: Right before sending this, I saw dmesg on the freebsd box
> >>> advising increasing vfs.nfsd.tcphighwater .. So I up'ed that to
> >>> 64000. I also up'ed
> >>> the number of nfs server threads (-t) from 10 to 60 (we're
> >>> roughly 40 linux machines)
> >>>
> >>> Any advice is most appreciated!
> >>>
> >>> Thanks

From owner-freebsd-fs@freebsd.org Thu Jul 2 01:14:05 2015
Date: Wed, 1 Jul 2015 21:14:02 -0400 (EDT)
From: Rick Macklem
To: Graham Allan
Cc: freebsd-fs@freebsd.org
Message-ID: <1203156989.2786078.1435799642755.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55948916.4080405@physics.umn.edu>
Subject: Re: Strange NFS problem implicating nfsuserd?

Graham Allan wrote:
> On 7/1/2015 7:10 PM, Rick Macklem wrote:
> >>
> >> I've reproduced this across 4-5 different servers and a similar number
> >> of different client systems. I'm wondering if any plausible explanation
> >> suggests itself?
> >>
> >
> > As far as I know, the domain is only set when
> > the nfsuserd is started and it just uses the domain part of the machine's
> > host name if not explicitly defined by "-domain". Maybe there is some bug
> > in nfsuserd.c that gets tickled by the option, although I just looked and
> > the argument parsing looks ok.
> >
> > If your xxx.yyy.zzz is identical, then I can't see how this would affect
> > anything.
> >
> > What will cause intermittent mapping problems is having more than one
> > username that maps to the same uid. (One of them will be cached at random.)
> > (There was a common case of both "root" and "toor" in the password database
> > for uid == 0.)
>
> Yes, on the face of it this report appears crazy to me too :-)
>
> If I hadn't tried a dozen other things including reverting FreeBSD patch
> level, linux kernel/package versions, tweaking/checking ldap lookup
> settings (nslcd etc), before simply removing the "domain=" argument to
> nfsuserd, I wouldn't believe it possible. I also took a quick look
> through nfsuserd.c and couldn't see anything to explain it. I want to
> think something else must be going on, but adding or removing that
> parameter appears to toggle the problem on and off deterministically.
>
> I was always able to get a failure within 10-60 minutes or so, so having
> the nfsuserd cache timeout at 600 minutes seems like it should eliminate
> any intermittent id lookup issues.
>
I'll take another look at nfsuserd.c. Maybe it does something stupid like
getting the length of the argument wrong (trailing blank or null or something
like that, that doesn't show up when it is printed out).
All I can think of is a subtle bug in nfsuserd.c when the argument is
specified.

> I guess I could try...
> (1) rpcdebug on the linux client, though I'm not sure which flags to
> enable to log idmapping issues.
> (2) watch nfsuserd with truss and look for different behaviors.
> (3) capture NFS traffic, examine with wireshark
>
I'd try #3 if I were you and see if the owner and owner_group names look right.
I'll post if I find anything in nfsuserd.c, rick

> Graham
>

From owner-freebsd-fs@freebsd.org Thu Jul 2 03:29:32 2015
Message-ID: <5594B008.10202@freebsd.org>
Date: Thu, 02 Jul 2015 11:29:12 +0800
From: Julian Elischer
To: Rick Macklem, d@delphij.net
CC: freebsd-fs@freebsd.org
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
In-Reply-To: <1491630362.2785531.1435799383802.JavaMail.zimbra@uoguelph.ca>

On 7/2/15 9:09 AM, Rick Macklem wrote:
> I am going to post to nfsv4@ietf.org to see what they say. Please
> let me know if Xin Li's patch resolves your problem, even though I
> don't believe it is correct except for the UINT32_MAX case. Good
> luck with it, rick

and please keep us all in the loop as to what they say!

the general N+2 bit sounds like bullshit to me.. it's always N+1 in a
number field that has a bit of slack at wrap time (probably due to some
ambiguity in the original spec).
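The seqid rule quoted from RFC 7530 earlier in this thread can be written
down as a small decision function. What follows is only a sketch of the
rule as discussed here, not the FreeBSD code (that is nfsrv_checkseqid()
in sys/fs/nfsserver/nfs_nfsdstate.c). The names check_seqid and
accept_skip_zero are made up for illustration, and the optional "1 may
follow UINT32_MAX" branch models Rick's tentative reading of RFC 7530,
which is an assumption rather than settled behaviour.

```python
UINT32_MAX = 0xFFFFFFFF


def next_seqid(last):
    """Return the seqid that normally follows `last` (32-bit wrap to 0)."""
    return (last + 1) & UINT32_MAX


def check_seqid(last, received, accept_skip_zero=False):
    """Classify a received seqid against the last one processed.

    Returns "process" for the correct next value, "replay" when the
    request duplicates the last one (the stored reply is resent), and
    "bad_seqid" for anything else, including last + 2.
    """
    if received == next_seqid(last):
        return "process"
    # Assumption: under the RFC 7530 reading discussed in the thread,
    # a client may skip 0 and send 1 right after UINT32_MAX.
    if accept_skip_zero and last == UINT32_MAX and received == 1:
        return "process"
    if received == last:
        return "replay"
    return "bad_seqid"


if __name__ == "__main__":
    for last, received in [(5, 6), (5, 5), (5, 7), (UINT32_MAX, 0)]:
        print(last, received, check_seqid(last, received))
```

With this reading, a client that sends last + 2 always gets the
"bad_seqid" outcome, which matches the NFS4ERR_BAD_SEQID messages in the
Linux logs; the patch quoted above would instead treat last + 2 like
last + 1.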
From owner-freebsd-fs@freebsd.org Thu Jul 2 10:02:40 2015
In-Reply-To: <557C282D.8060809@freebsd.org>
Date: Thu, 2 Jul 2015 12:02:39 +0200
Subject: Re: Uberblock location
From: Felipe Monteiro de Carvalho
To: Julian Elischer
Cc: "freebsd-fs@freebsd.org"

Hello,

Ok, thanks a lot =)

I am now working from the FreeBSD source code, and I have made a lot of
progress adapting it to my needs. My use is very different, so it's not a
simple "use as is"; I'm trying to understand what goes on in every step in
order to adapt it, and I have arrived at a point where I got lost.

Ok, so far I have the following working:

Part 1> Read the disk label: based on the function vdev_probe, this reads a
vdev_phys_t from the disk and yields the following data:

[0] DATA_TYPE_UINT64 name="version" val=5000
[1] DATA_TYPE_STRING name="name" val="zfs_felipe_2"
[2] DATA_TYPE_UINT64 name="state" val=1
[3] DATA_TYPE_UINT64 name="txg" val=164
[4] DATA_TYPE_UINT64 name="pool_guid" val=8752563210577670337
[5] DATA_TYPE_UINT64 name="errata" val=0
[6] DATA_TYPE_UINT64 name="hostid" val=8323329
[7] DATA_TYPE_STRING name="hostname" val="felipe-VirtualBox"
[8] DATA_TYPE_UINT64 name="top_guid" val=2451523895343473151
[9] DATA_TYPE_UINT64 name="guid" val=2451523895343473151
[10] DATA_TYPE_UINT64 name="vdev_children" val=1
[11] DATA_TYPE_NVLIST name="vdev_tree"
    [0] DATA_TYPE_STRING name="type" val="disk"
    [1] DATA_TYPE_UINT64 name="id" val=0
    [2] DATA_TYPE_UINT64 name="guid" val=2,45152389534347E+018
    [3] DATA_TYPE_STRING name="path" val="/dev/loop7"
    [4] DATA_TYPE_UINT64 name="whole_disk" val=0
    [5] DATA_TYPE_UINT64 name="metaslab_array" val=34
    [6] DATA_TYPE_UINT64 name="metaslab_shift" val=21
    [7] DATA_TYPE_UINT64 name="ashift" val=9
    [8] DATA_TYPE_UINT64 name="asize" val=257425408
    [9] DATA_TYPE_UINT64 name="is_log" val=0
    [10] DATA_TYPE_UINT64 name="create_txg" val=4
[12] DATA_TYPE_NVLIST name="features_for_read"
    [0] DATA_TYPE_BOOLEAN name="com.delphix:hole_birth"
    [1] DATA_TYPE_BOOLEAN name="com.delphix:embedded_data"

This part is OK, although I was upset that in a pool with 2 disks, there is
no info on each disk about the other one, except that vdev_children=2 =(
It would have been great if the path to the second disk were there, but
anyway, this is no huge problem.

Part 2> Read the uberblocks and figure out which is the most current one.
I select the uberblock with the largest ub_timestamp as the best one, and
then I try to read the data pointed to by uberblock.ub_rootbp, which is of
type blkptr_t.

The best uberblock has the following ub_rootbp (fields read using functions
such as BP_GET_COMPRESS, etc.):

[0] VolumePos=$28000 DiskPos=$28000 IsVolumePosValid=1 IsSBValid=1 OriginNr=4
    SB.ub_version=5000 SB.ub_timestamp=555CA4DF SB.ub_software_version=5000
    BlockPtr=DVA[0]= DVA[1]= DVA[2]= LEVEL=0 TYPE=B LSIZE=800 PSIZE=200
    COMP=F BIRTH=A0 PHYS_BIRTH=A0

And this part is where I am stuck, because:

1> The offset 0x325D400, counting from the start of the disk image, is
filled with CC CC CC CC .... It doesn't look at all like something valid.
Maybe the offset should not be counted from the start of the disk image,
but from somewhere else?

2> comp=F: wow, so even basic blocks use compression? Or is my reading of
the data somehow wrong? It will be a lot of work to get decompression
working.

Any ideas?

Attached is the "best" uberblock, with block pointer data with blue
background.
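On the two questions above, a short sketch under stated assumptions may
help. As far as I know from the ZFS on-disk format, a DVA offset is stored
in 512-byte units and is relative to the end of the 4 MiB label and boot
area at the front of a leaf vdev, so it is not counted from the start of
the disk image. ZFS also picks the active uberblock by highest txg first
and only uses the timestamp as a tie-breaker. The helper names
dva_to_byte_offset and best_uberblock and the dict layout are invented for
illustration; this covers a plain "disk" vdev like the one in the label
above, not raidz.

```python
SPA_MINBLOCKSHIFT = 9                    # DVA offsets count 512-byte sectors
VDEV_LABEL_START_SIZE = 4 * 1024 * 1024  # two front labels plus boot area


def dva_to_byte_offset(dva_offset_sectors):
    """Map a DVA offset field to a byte offset from the start of the device.

    Assumes a single-disk (non-raidz) top-level vdev.
    """
    return (dva_offset_sectors << SPA_MINBLOCKSHIFT) + VDEV_LABEL_START_SIZE


def best_uberblock(uberblocks):
    """Pick the active uberblock: highest txg, then highest timestamp.

    `uberblocks` is a list of dicts with "txg" and "timestamp" keys
    (a hypothetical in-memory form of already validated uberblocks).
    """
    return max(uberblocks, key=lambda ub: (ub["txg"], ub["timestamp"]))


if __name__ == "__main__":
    # Hypothetical values, only to show the arithmetic.
    print(hex(dva_to_byte_offset(0x10)))
    print(best_uberblock([{"txg": 163, "timestamp": 9},
                          {"txg": 164, "timestamp": 7}]))
```

If 0x325D400 was obtained by shifting the DVA offset but without adding
the 4 MiB (0x400000) label area, that would be the first thing to try. On
comp=F: value 15 corresponds to lz4 in the zio_compress enumeration, and
LSIZE=800 with PSIZE=200 is consistent with a compressed metadata block,
so decompression is probably unavoidable here.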
thanks,
--
Felipe Monteiro de Carvalho

From owner-freebsd-fs@freebsd.org Thu Jul 2 11:59:35 2015
Date: Thu, 2 Jul 2015 07:59:20 -0400 (EDT)
From: Rick Macklem
To: Julian Elischer
Cc: d@delphij.net, freebsd-fs@freebsd.org
Message-ID: <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <5594B008.10202@freebsd.org>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)

Julian Elischer wrote:
> On 7/2/15 9:09 AM, Rick Macklem wrote:
> > I am going to post to nfsv4@ietf.org to see what they say. Please
> > let me know if Xin Li's patch resolves your problem, even though I
> > don't believe it is correct except for the UINT32_MAX case. Good
> > luck with it, rick
> and please keep us all in the loop as to what they say!
>
> the general N+2 bit sounds like bullshit to me.. its always N+1 in a
> number field that has a
> bit of slack at wrap time (probably due to some ambiguity in the
> original spec).
>
Actually, since N is the lock op already done, N + 1 is the next lock
operation in order. Since lock ops need to be strictly ordered, allowing
N + 2 (which means N + 2 would be done before N + 1) makes no sense.

I think the author of the RFC meant that N + 2 or greater fails, but it
was poorly worded.

I will pass along whatever I get from nfsv4@ietf.org.
(There is an archive of it somewhere, but I can't remember where.;-)

rick

From owner-freebsd-fs@freebsd.org Thu Jul 2 12:07:54 2015
In-Reply-To: <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca>
From: Ahmed Kamal
Date: Thu, 2 Jul 2015 14:07:32 +0200
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
To: Rick Macklem
Cc: Julian Elischer, freebsd-fs@freebsd.org, d@delphij.net

Appreciating the fruitful discussion! Can someone please explain to me what
would happen in the current situation (Linux client doing this skip-by-1
thing, and FreeBSD not doing it)? What is the effect of that? What do users
see? Any chance of data loss?

Also, I find it strange that NetApp has acknowledged this is a bug on their
side, which has been fixed since then!
I also find it strange that I'm the first to hit this :) Is no one running
nfs4 yet?

On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem wrote:
> Julian Elischer wrote:
> > On 7/2/15 9:09 AM, Rick Macklem wrote:
> > > I am going to post to nfsv4@ietf.org to see what they say. Please
> > > let me know if Xin Li's patch resolves your problem, even though I
> > > don't believe it is correct except for the UINT32_MAX case. Good
> > > luck with it, rick
> > and please keep us all in the loop as to what they say!
> >
> > the general N+2 bit sounds like bullshit to me.. its always N+1 in a
> > number field that has a
> > bit of slack at wrap time (probably due to some ambiguity in the
> > original spec).
> >
> Actually, since N is the lock op already done, N + 1 is the next lock
> operation in order. Since lock ops need to be strictly ordered, allowing
> N + 2 (which means N + 2 would be done before N + 1) makes no sense.
>
> I think the author of the RFC meant that N + 2 or greater fails, but it
> was poorly worded.
> > I will pass along whatever I get from nfsv4@ietf.org. (There is an archive > of it somewhere, but I can't remember where.;-) > > rick > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@freebsd.org Thu Jul 2 20:37:47 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 04A469934E2 for ; Thu, 2 Jul 2015 20:37:47 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E5E9115B8 for ; Thu, 2 Jul 2015 20:37:46 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t62KbkuI004668 for ; Thu, 2 Jul 2015 20:37:46 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 165950] [ffs] SU+J and fsck problem Date: Thu, 02 Jul 2015 20:37:47 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 9.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: jb.1234abcd@gmail.com X-Bugzilla-Status: Closed X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: resolution bug_status Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: 
https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jul 2015 20:37:47 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=165950 jb.1234abcd@gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |Unable to Reproduce Status|In Progress |Closed -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Thu Jul 2 21:41:20 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3317E992559 for ; Thu, 2 Jul 2015 21:41:20 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id C5FC4112E; Thu, 2 Jul 2015 21:41:19 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2B7AwCzr5VV/61jaINbg2ZfBoMZuhgJgWQKhXgCggkUAQEBAQEBAYEKhCMBAQECAgEBASAERwsMBAIBCA4KAgINGQICJzACBBOILw22WpYgAQEBAQEFAQEBAQEBARcEgSGKKYQ7FwEzB4JogUMFhwiGMoZYhGGEYoNdFIcPj2oCJoQWIjEBgUaBBAEBAQ X-IronPort-AV: E=Sophos;i="5.15,395,1432612800"; d="scan'208";a="223544926" Received: from nipigon.cs.uoguelph.ca (HELO zcs1.mail.uoguelph.ca) ([131.104.99.173]) by esa-annu.net.uoguelph.ca with ESMTP; 02 Jul 2015 17:41:18 -0400 Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 9EDC615F533; Thu, 2 Jul 2015 17:41:18 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 1XfrOzacyfLA; Thu, 2 Jul 2015 
17:41:15 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 23C7C15F54D; Thu, 2 Jul 2015 17:41:15 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs1.mail.uoguelph.ca Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id KcyImgs8proe; Thu, 2 Jul 2015 17:41:15 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca (zcs1.mail.uoguelph.ca [172.17.95.18]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 06CC215F533; Thu, 2 Jul 2015 17:41:15 -0400 (EDT) Date: Thu, 2 Jul 2015 17:41:15 -0400 (EDT) From: Rick Macklem To: Ahmed Kamal Cc: Julian Elischer , Freebsd fs , d@delphij.net Message-ID: <1919954909.3441620.1435873275001.JavaMail.zimbra@uoguelph.ca> In-Reply-To: References: <1427974645.2786896.1435800121257.JavaMail.zimbra@uoguelph.ca> Subject: Re: [nfsv4] Is "skip by 1" allowed for the NFSv4.0 seqid? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [172.17.95.12] X-Mailer: Zimbra 8.0.9_GA_6191 (ZimbraWebClient - FF34 (Win)/8.0.9_GA_6191) Thread-Topic: Is "skip by 1" allowed for the NFSv4.0 seqid? Thread-Index: QLEgmE4pXFcQ2frzSVqAv4V1Er5qqg== X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jul 2015 21:41:20 -0000

Here is the response on nfsv4@ietf.org. From reading this, I think the
Redhat client is broken. It will be interesting to see if anyone working
on the Linux client responds.

rick

----- Original Message -----
> > Is "skip by 1" allowed for NFSv4.0 seqid?
>
> No it isn't. As you indicated RFC7530, which obsoletes RFC3530, indicates
> you are only allowed to increment by one, with the exception of the
> wraparound case.
>
> > I'll admit the wording in RFC-3530 isn't ideal,
>
> I think the text "(r == L+2)" is intended to be read as "(e.g. r == L+2)"
> but the existing text is unclear.
>
> > but I've never heard of this interpretation before.
>
> Me either.
>
> > Is an NFSv4.0 server supposed to accept both 0 and 1 as the correct next sequence # after UINT32_MAX?
>
> I would say it isn't. RFC7530 says "If the sequence value received is any
> other value, it is rejected with the return of error NFS4ERR_BAD_SEQID".
> In this case the correct value is 1. On the other hand, this isn't a
> "MUST". If there are clients out there that don't do seqid wraparound
> correctly, it seems like a reasonable accommodation for a server to
> accept the incorrect value zero in this case.
>
> On Wed, Jul 1, 2015 at 9:22 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> Hi,
>
> If you look here (on page #6) it seems to indicate that
> incrementing the seqid by 2 is allowed by the RFC.
> (I'll admit the wording in RFC-3530 isn't ideal, but I've
> never heard of this interpretation before.)
> https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf
>
> RFC-7530 seems clear that it should only be incremented by 1.
> However, I do notice that a wraparound of the seqid is supposed
> to skip 0 and go to 1. I don't see any mention of this in RFC-3530.
> --> Is an NFSv4.0 server supposed to accept both 0 and 1 as the
> correct next sequence # after UINT32_MAX?
>
> Thanks for any help clarifying this, rick
>
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www.ietf.org/mailman/listinfo/nfsv4
>
>
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www.ietf.org/mailman/listinfo/nfsv4
>

From owner-freebsd-fs@freebsd.org Thu Jul 2 21:53:16 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BBB86992994 for ; Thu, 2 Jul 2015 21:53:16 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 5AFDB1CA1; Thu, 2 Jul 2015 21:53:16 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2DGBACxsZVV/61jaINbg2ZfBoMZuiGBZAqFLkoCggsSAQEBAQEBAYEKhCMBAQEDAQEBASArIAsFCwIBCA4KAgINGQICJwEJJgIECAcEARwEiAYIDbZdlh8BAQEBBgEBAQEBARyBIYophDQBAQUXNAeCaIFDBZQShGGCWYFdhAlEg1CPHoNbAiaEFiIxB4EGOoEEAQEB X-IronPort-AV: E=Sophos;i="5.15,395,1432612800"; d="scan'208";a="221687711" Received: from nipigon.cs.uoguelph.ca (HELO zcs1.mail.uoguelph.ca) ([131.104.99.173]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 02 Jul 2015 17:53:16 -0400 Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id D9DF715F533; Thu, 2 Jul 2015 17:53:14 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id VBzJAe1invUe; Thu, 2 Jul 2015 17:53:14 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 1713615F54D; Thu, 2 Jul 2015 17:53:14 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs1.mail.uoguelph.ca Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost
(zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id N7IJZE_IIbcS; Thu, 2 Jul 2015 17:53:14 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca (zcs1.mail.uoguelph.ca [172.17.95.18]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id EE22715F533; Thu, 2 Jul 2015 17:53:13 -0400 (EDT) Date: Thu, 2 Jul 2015 17:53:13 -0400 (EDT) From: Rick Macklem To: Ahmed Kamal Cc: Julian Elischer , freebsd-fs@freebsd.org, d@delphij.net Message-ID: <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca> In-Reply-To: References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca> <55947C6E.5060409@delphij.net> <1491630362.2785531.1435799383802.JavaMail.zimbra@uoguelph.ca> <5594B008.10202@freebsd.org> <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca> Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.11] X-Mailer: Zimbra 8.0.9_GA_6191 (ZimbraWebClient - FF34 (Win)/8.0.9_GA_6191) Thread-Topic: Linux NFSv4 clients are getting (bad sequence-id error!) Thread-Index: ASr502p894cBKPvi9Cv5OEpe/TaJXA== X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jul 2015 21:53:16 -0000 Ahmed Kamal wrote: > Appreciating the fruitful discussion! Can someone please explain to me, > what would happen in the current situation (linux client doing this > skip-by-1 thing, and freebsd not doing it) ? What is the effect of that? Well, as you've seen, the Linux client doesn't function correctly against the FreeBSD server (and probably others that don't support this "skip-by-1" case). > What do users see? Any chances of data loss? Hmm. Mostly it will cause Opens to fail, but I can't guess what the Linux client behaviour is after receiving NFS4ERR_BAD_SEQID. 
You're the guy observing it.

>
> Also, I find it strange that netapp have acknowledged this is a bug on
> their side, which has been fixed since then!
Yea, I think Netapp screwed up. For some reason their server allowed this,
then was fixed to not allow it and then someone decided that was broken and
reversed it.

> I also find it strange that I'm the first to hit this :) Is no one running
> nfs4 yet!
>
Well, it seems to be slowly catching on. I suspect that the Linux client
mounting a Netapp is the most common use of it. Since it appears that they
flip flopped w.r.t. who's bug this is, it has probably persisted.

It may turn out that the Linux client has been fixed or it may turn out
that most servers allowed this "skip-by-1" even though David Noveck (one
of the main authors of the protocol) seems to agree with me that it should
not be allowed.

It is possible that others have bumped into this, but it wasn't isolated
(I wouldn't have guessed it, so it was good you pointed to the RedHat discussion)
and they worked around it by reverting to NFSv3 or similar.
The protocol is rather complex in this area and changed completely for NFSv4.1,
so many have also probably moved onto NFSv4.1 where this won't be an issue.
(NFSv4.1 uses sessions to provide exactly once RPC semantics and doesn't use
these seqid fields.)

This is all just mho, rick

> On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem wrote:
>
> > Julian Elischer wrote:
> > > On 7/2/15 9:09 AM, Rick Macklem wrote:
> > > > I am going to post to nfsv4@ietf.org to see what they say. Please
> > > > let me know if Xin Li's patch resolves your problem, even though I
> > > > don't believe it is correct except for the UINT32_MAX case. Good
> > > > luck with it, rick
> > > and please keep us all in the loop as to what they say!
> > >
> > > the general N+2 bit sounds like bullshit to me.. its always N+1 in a
> > > number field that has a
> > > bit of slack at wrap time (probably due to some ambiguity in the
> > > original spec).
> > > > > Actually, since N is the lock op already done, N + 1 is the next lock > > operation in order. Since lock ops need to be strictly ordered, allowing > > N + 2 (which means N + 2 would be done before N + 1) makes no sense. > > > > I think the author of the RFC meant that N + 2 or greater fails, but it > > was poorly worded. > > > > I will pass along whatever I get from nfsv4@ietf.org. (There is an archive > > of it somewhere, but I can't remember where.;-) > > > > rick > > _______________________________________________ > > freebsd-fs@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > > From owner-freebsd-fs@freebsd.org Thu Jul 2 22:59:31 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4DEA29935BE for ; Thu, 2 Jul 2015 22:59:31 +0000 (UTC) (envelope-from alex.burlyga.ietf@gmail.com) Received: from mail-yk0-x22e.google.com (mail-yk0-x22e.google.com [IPv6:2607:f8b0:4002:c07::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0468A2A0E for ; Thu, 2 Jul 2015 22:59:31 +0000 (UTC) (envelope-from alex.burlyga.ietf@gmail.com) Received: by ykfy125 with SMTP id y125so82028727ykf.1 for ; Thu, 02 Jul 2015 15:59:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ceWq26hfN1eCoRPjUSv5kx6IgnmaXjJ5N8bBWmN9P44=; b=XEx+UfgqSvuKpxgEAhIjS8UsnKnEv0B0/c/pvMqKXPw4wDcx+ohBIKhj9TRG6hzdwV E6QrNJbuZOlpRYkyLZmACCVqVGaIyE/eTWWBaYnh40uaUhftdX3jTzKoW1YYZWpi6YWI ir9p+b7cQhidrhx5Qh93tBIgvXVkKRX5NorjQdQG4wMdE2cuhwnxBpjIeHnYBOQRaDJs 
4jkF/Y2qnvZX44yZByScPRxA6rwLRVr26Rkal2zDjTtu1XrA71FsrpkUjn3xTotahL0d pF/+ENA8rS/2noYwXpjW15mOHybAzl9kjzoACUGpZ9t4tGqDc96Q6G3EMKB+C+61i/1v kMCA== MIME-Version: 1.0 X-Received: by 10.129.85.69 with SMTP id j66mr42333260ywb.40.1435877970013; Thu, 02 Jul 2015 15:59:30 -0700 (PDT) Received: by 10.13.244.65 with HTTP; Thu, 2 Jul 2015 15:59:29 -0700 (PDT) In-Reply-To: References: <1969046464.61534041.1434897034960.JavaMail.root@uoguelph.ca> Date: Thu, 2 Jul 2015 15:59:29 -0700 Message-ID: Subject: Re: [nfs][client] - Question about handling of the NFS3_EEXIST error in SYMLINK rpc From: "alex.burlyga.ietf alex.burlyga.ietf" To: Rick Macklem Cc: freebsd-fs Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jul 2015 22:59:31 -0000 Rick, I was able to test the patch. Works as expected. With sysctl set to 1 I can reproduce original issue, setting sysctl to 0 fixes the issue. I'll leave running tests over night, but everything looks great. Alex On Mon, Jun 22, 2015 at 2:02 PM, alex.burlyga.ietf alex.burlyga.ietf wrote: > Rick, > > Thank you for a quick turn around, see answers inline: > > On Sun, Jun 21, 2015 at 7:30 AM, Rick Macklem wrote: >> Alex Burlyga wrote: >>> Hi, >>> >>> NFS client code in nfsrpc_symlink() masks server returned NFS3_EEXIST >>> error >>> code >>> by returning 0 to the upper layers. I'm assuming this was an attempt >>> to >>> work around >>> some server's broken replay cache out there, however, it breaks a >>> more >>> common >>> case where server is returning EEXIST for legitimate reason and >>> application >>> is expecting this error code and equipped to deal with it. >>> >>> To fix it I see three ways of doing this: >>> * Remove offending code >>> * Make it optional, sysctl? 
>>> * On NFS3_EEXIST send READLINK rpc to make sure symlink content is >>> right >>> >>> Which of the ways will maximize the chances of getting this fix >>> upstream? >>> >> I've attached a patch for testing/review that does essentially #2. >> It has no effect on trivial tests, since the syscall does a Lookup >> before trying to create the symlink and fails with EEXIST. >> Do you have a case where competing clients are trying to create >> the symlink or something like that, which runs into this? > > That's exactly failing test case we are running into. >> >> Please test the attached patch, since I don't know how to do that, rick > Great! I'll test it. I was leaning towards option 3 for SYMLINK and > option 2 for MKDIR. > This will work. Thanks for taking your time to generate the patch! > >> >>> One more point, old client circa FreeBSD 7.0 does not exhibit this >>> problem. >>> >>> Alex >>> _______________________________________________ >>> freebsd-fs@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >>> From owner-freebsd-fs@freebsd.org Thu Jul 2 23:15:41 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1D02A993924 for ; Thu, 2 Jul 2015 23:15:41 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: from mail-wi0-x234.google.com (mail-wi0-x234.google.com [IPv6:2a00:1450:400c:c05::234]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B1B7D1219; Thu, 2 Jul 2015 23:15:40 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: by wiwl6 with SMTP id l6so210834880wiw.0; Thu, 02 Jul 2015 16:15:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=Up9HFwc6FqCrF8j+7EeB6EkhlllyalxCpj6T/ng2YP8=; b=WlkaDabmto+JNnHaWwuuCBSRQVVBo5jYpwQpjIsuFmCjnoVYz94bq2ZcfNk5blSr0y /obPVDSYiZAkmkk012qDH4Ms/S9S66eNd62DbLuHzCzrSDcA4969FrqBz54ajhyqY8Nd 9fkgXaTQ+BGSSKSuARdqk/DHg6ZXyymXa2jbRsZbrZwZyxTsd1r0PZ60/wXmrm8lsaDd mNsTv1/Ck/IZVmBVXHIElrMTOde/rs6YeuNp0gn/V8YE8ro3Bo6znL5LaIW5Myci+9Uq N6ZjA8TR8lHLaEOuhBju7umlVsMBRs6FI0ieVCGnY/hE4KkgonWB9JwvfigV+uC5yKaW +vpw== X-Received: by 10.180.7.199 with SMTP id l7mr21553672wia.28.1435878938349; Thu, 02 Jul 2015 16:15:38 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.6.143 with HTTP; Thu, 2 Jul 2015 16:15:18 -0700 (PDT) In-Reply-To: <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca> References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca> <55947C6E.5060409@delphij.net> <1491630362.2785531.1435799383802.JavaMail.zimbra@uoguelph.ca> <5594B008.10202@freebsd.org> <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca> <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca> From: Ahmed Kamal Date: Fri, 3 Jul 2015 01:15:18 +0200 Message-ID: Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!) To: Rick Macklem Cc: Julian Elischer , freebsd-fs@freebsd.org, d@delphij.net Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jul 2015 23:15:41 -0000 Thanks all .. I understand now we're doing the "right thing" .. Although if mounting keeps wedging, I will have to solve it somehow! Either using Xin's patch .. or Upgrading RHEL to 6.x and using NFS4.1. Regarding Xin's patch, is it possible to build the patched nfsd code, as a kernel module ? 
I'm looking to minimize my delta to upstream.

Also would adopting Xin's patch and hiding it behind a
kern.nfs.allow_linux_broken_client be an option (I'm probably not the last
person on earth to hit this) ?

Thanks a lot for all the help!

On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem wrote:

> Ahmed Kamal wrote:
> > Appreciating the fruitful discussion! Can someone please explain to me,
> > what would happen in the current situation (linux client doing this
> > skip-by-1 thing, and freebsd not doing it) ? What is the effect of that?
> Well, as you've seen, the Linux client doesn't function correctly against
> the FreeBSD server (and probably others that don't support this "skip-by-1"
> case).
>
> > What do users see? Any chances of data loss?
> Hmm. Mostly it will cause Opens to fail, but I can't guess what the Linux
> client behaviour is after receiving NFS4ERR_BAD_SEQID. You're the guy
> observing it.
>
> >
> > Also, I find it strange that netapp have acknowledged this is a bug on
> > their side, which has been fixed since then!
> Yea, I think Netapp screwed up. For some reason their server allowed this,
> then was fixed to not allow it and then someone decided that was broken and
> reversed it.
>
> > I also find it strange that I'm the first to hit this :) Is no one running
> > nfs4 yet!
> >
> Well, it seems to be slowly catching on. I suspect that the Linux client
> mounting a Netapp is the most common use of it. Since it appears that they
> flip flopped w.r.t. who's bug this is, it has probably persisted.
>
> It may turn out that the Linux client has been fixed or it may turn out
> that most servers allowed this "skip-by-1" even though David Noveck (one
> of the main authors of the protocol) seems to agree with me that it should
> not be allowed.
>
> It is possible that others have bumped into this, but it wasn't isolated
> (I wouldn't have guessed it, so it was good you pointed to the RedHat
> discussion)
> and they worked around it by reverting to NFSv3 or similar.
> The protocol is rather complex in this area and changed completely for
> NFSv4.1,
> so many have also probably moved onto NFSv4.1 where this won't be an issue.
> (NFSv4.1 uses sessions to provide exactly once RPC semantics and doesn't
> use these seqid fields.)
>
> This is all just mho, rick
>
> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem wrote:
> >
> > > Julian Elischer wrote:
> > > > On 7/2/15 9:09 AM, Rick Macklem wrote:
> > > > > I am going to post to nfsv4@ietf.org to see what they say. Please
> > > > > let me know if Xin Li's patch resolves your problem, even though I
> > > > > don't believe it is correct except for the UINT32_MAX case. Good
> > > > > luck with it, rick
> > > > and please keep us all in the loop as to what they say!
> > > >
> > > > the general N+2 bit sounds like bullshit to me.. its always N+1 in a
> > > > number field that has a
> > > > bit of slack at wrap time (probably due to some ambiguity in the
> > > > original spec).
> > > >
> > > Actually, since N is the lock op already done, N + 1 is the next lock
> > > operation in order. Since lock ops need to be strictly ordered, allowing
> > > N + 2 (which means N + 2 would be done before N + 1) makes no sense.
> > >
> > > I think the author of the RFC meant that N + 2 or greater fails, but it
> > > was poorly worded.
> > >
> > > I will pass along whatever I get from nfsv4@ietf.org.
(There is an > archive > > > of it somewhere, but I can't remember where.;-) > > > > > > rick > > > _______________________________________________ > > > freebsd-fs@freebsd.org mailing list > > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > > > > > From owner-freebsd-fs@freebsd.org Thu Jul 2 23:21:21 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 87AD9993A5D for ; Thu, 2 Jul 2015 23:21:21 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: from mail-wg0-x229.google.com (mail-wg0-x229.google.com [IPv6:2a00:1450:400c:c00::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 15D9C1B0B; Thu, 2 Jul 2015 23:21:21 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: by wguu7 with SMTP id u7so74978454wgu.3; Thu, 02 Jul 2015 16:21:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=y2hFDBC3puKaXnVcqWs64+MbWLyNNV5ObTnVxpvIq14=; b=RhvV0MQF3f7wVD3MuRZB7dUlPFRfqyHaCu8//6Tvo/yjyU82WBpYkpc98IdkcRJeGk yJ4FcznLtr75ql7IoiH0yFLRpwSVMaO0JSOs0GcYuM9e5gOx8HldScxPw37Ps2jnCG6k epWvsfh9FRvfHGxVuwaqTyFdGnI0CwiQvaZ4aGraZ9l4957eWwrzUHwGMgfXJjEpKRdx nyBf6NHttlIGYPai9A9MvnIcPNvVZ5ZrAv0Ot9DmiQgPaw+clwTc7RxUcCi71zHLthkR g7BRI46aec267entF+05XZlTPBzud9O/Hr44eo6A5bDCxJ2Zqt4+v39aUmFIXoCADmPB KFjQ== X-Received: by 10.180.79.133 with SMTP id j5mr59233508wix.38.1435879279526; Thu, 02 Jul 2015 16:21:19 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.6.143 with HTTP; Thu, 2 Jul 2015 16:21:00 -0700 (PDT) In-Reply-To: References: 
<684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca> <55947C6E.5060409@delphij.net> <1491630362.2785531.1435799383802.JavaMail.zimbra@uoguelph.ca> <5594B008.10202@freebsd.org> <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca> <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca> From: Ahmed Kamal Date: Fri, 3 Jul 2015 01:21:00 +0200 Message-ID: Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!) To: Rick Macklem Cc: Julian Elischer , freebsd-fs@freebsd.org, Xin LI Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jul 2015 23:21:21 -0000 PS: Today (after adjusting tcp.highwater) I didn't get any screaming reports from users about hung vnc sessions. So maybe just maybe, linux clients are able to somehow recover from this bad sequence messages. I could still see the bad sequence error message in logs though Why isn't the highwater tunable set to something better by default ? I mean this server is certainly not under a high or unusual load (it's only 40 PCs mounting from it) On Fri, Jul 3, 2015 at 1:15 AM, Ahmed Kamal wrote: > Thanks all .. I understand now we're doing the "right thing" .. Although > if mounting keeps wedging, I will have to solve it somehow! Either using > Xin's patch .. or Upgrading RHEL to 6.x and using NFS4.1. > > Regarding Xin's patch, is it possible to build the patched nfsd code, as a > kernel module ? I'm looking to minimize my delta to upstream. > > Also would adopting Xin's patch and hiding it behind a > kern.nfs.allow_linux_broken_client be an option (I'm probably not the last > person on earth to hit this) ? > > Thanks a lot for all the help! 
>
> On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem wrote:
>
>> Ahmed Kamal wrote:
>> > Appreciating the fruitful discussion! Can someone please explain to me,
>> > what would happen in the current situation (linux client doing this
>> > skip-by-1 thing, and freebsd not doing it) ? What is the effect of that?
>> Well, as you've seen, the Linux client doesn't function correctly against
>> the FreeBSD server (and probably others that don't support this
>> "skip-by-1" case).
>>
>> > What do users see? Any chances of data loss?
>> Hmm. Mostly it will cause Opens to fail, but I can't guess what the Linux
>> client behaviour is after receiving NFS4ERR_BAD_SEQID. You're the guy
>> observing it.
>>
>> >
>> > Also, I find it strange that netapp have acknowledged this is a bug on
>> > their side, which has been fixed since then!
>> Yea, I think Netapp screwed up. For some reason their server allowed this,
>> then was fixed to not allow it and then someone decided that was broken
>> and reversed it.
>>
>> > I also find it strange that I'm the first to hit this :) Is no one running
>> > nfs4 yet!
>> >
>> Well, it seems to be slowly catching on. I suspect that the Linux client
>> mounting a Netapp is the most common use of it. Since it appears that they
>> flip flopped w.r.t. who's bug this is, it has probably persisted.
>>
>> It may turn out that the Linux client has been fixed or it may turn out
>> that most servers allowed this "skip-by-1" even though David Noveck (one
>> of the main authors of the protocol) seems to agree with me that it should
>> not be allowed.
>>
>> It is possible that others have bumped into this, but it wasn't isolated
>> (I wouldn't have guessed it, so it was good you pointed to the RedHat
>> discussion)
>> and they worked around it by reverting to NFSv3 or similar.
>> The protocol is rather complex in this area and changed completely for
>> NFSv4.1,
>> so many have also probably moved onto NFSv4.1 where this won't be an
>> issue.
>> (NFSv4.1 uses sessions to provide exactly once RPC semantics and doesn't
>> use these seqid fields.)
>>
>> This is all just mho, rick
>>
>> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem wrote:
>> >
>> > > Julian Elischer wrote:
>> > > > On 7/2/15 9:09 AM, Rick Macklem wrote:
>> > > > > I am going to post to nfsv4@ietf.org to see what they say. Please
>> > > > > let me know if Xin Li's patch resolves your problem, even though I
>> > > > > don't believe it is correct except for the UINT32_MAX case. Good
>> > > > > luck with it, rick
>> > > > and please keep us all in the loop as to what they say!
>> > > >
>> > > > the general N+2 bit sounds like bullshit to me.. its always N+1 in a
>> > > > number field that has a
>> > > > bit of slack at wrap time (probably due to some ambiguity in the
>> > > > original spec).
>> > > >
>> > > Actually, since N is the lock op already done, N + 1 is the next lock
>> > > operation in order. Since lock ops need to be strictly ordered, allowing
>> > > N + 2 (which means N + 2 would be done before N + 1) makes no sense.
>> > >
>> > > I think the author of the RFC meant that N + 2 or greater fails, but it
>> > > was poorly worded.
>> > >
>> > > I will pass along whatever I get from nfsv4@ietf.org.
(There is an >> archive >> > > of it somewhere, but I can't remember where.;-) >> > > >> > > rick >> > > _______________________________________________ >> > > freebsd-fs@freebsd.org mailing list >> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> > > >> > >> > > From owner-freebsd-fs@freebsd.org Thu Jul 2 23:24:18 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4C584993B64 for ; Thu, 2 Jul 2015 23:24:18 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: from mail-wg0-x22c.google.com (mail-wg0-x22c.google.com [IPv6:2a00:1450:400c:c00::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DD7CA1C34 for ; Thu, 2 Jul 2015 23:24:17 +0000 (UTC) (envelope-from email.ahmedkamal@googlemail.com) Received: by wgqq4 with SMTP id q4so75017008wgq.1 for ; Thu, 02 Jul 2015 16:24:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=O5iazDqgNMivlvJuzXWgWgO/xs5dhu4sLEgZH2sTfXk=; b=SayPNOZN6Iz3ApsZYXELwy6joA79LfHjz3oixO7D8Uf87xkNdt6Bxmjpz6XxwSlrg4 rcfcivHEa8qPWP+vuSWX7VvP5vpaKhdZG/h3c5dzKVKwy+gjsGzddtPbBj1I8KKdiuYj luVhCGm/0D0BU/X5U4qsqED5qgGw3Q1myc+kum5ZOzib37UvzIMmempB+V7p8P7MpizM hXQ03J16oc2/INCIbA+NWvk1cZk+dMEGuEm+IEBlb+QgxFBl/iBJoivhR04CAYfHCNnX u5gINatHFG+VfCINL0NAROzS23q4bB3SUwES+hrY2db2tiLJOlHEJDX/Nu919EIxJQzm TXGw== X-Received: by 10.194.6.229 with SMTP id e5mr16123508wja.158.1435879456372; Thu, 02 Jul 2015 16:24:16 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.6.143 with HTTP; Thu, 2 Jul 2015 16:23:56 -0700 (PDT) From: Ahmed Kamal Date: Fri, 3 Jul 2015 01:23:56 
+0200 Message-ID: Subject: NFS4 exports are blocked when NIS master server is unreachable To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jul 2015 23:24:18 -0000 I opened this ticket [ https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201155] a week ago, but haven't heard since. I thought to discuss it here. Here's a paste of the bug for convenience. Thanks all! My Setup is: * A FreeBSD 10.1 with ZFS exporting NFS4 to a Linux server * That Linux server is a NIS master for the FreeBSD box The issue is: * Upon power failure .. Both servers starting fresh * FreeBSD boots faster (grin) * FreeBSD doesn't find NIS master and bombards logs with hundreds of thousands of messages!! like Jun 27 13:47:53 nas /usr/sbin/ypbind[827]: NIS server [192.168.10.254] for domain "SIVISION" not responding Jun 27 13:48:24 nas last message repeated 51704 times Jun 27 13:49:26 nas last message repeated 104381 times * In this state, FreeBSD does not allow any other machine to NFS4 mount its shares. This includes the NIS master Linux box * Linux boot hangs waiting for the NFS share to allow it to mount it * We are in a deadlock. 
Linux waiting for the NFS share to mount, while FreeBSD is waiting for the NIS server on Linux to start From owner-freebsd-fs@freebsd.org Fri Jul 3 00:51:59 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 09167993A67 for ; Fri, 3 Jul 2015 00:51:59 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id A6E302D79; Fri, 3 Jul 2015 00:51:58 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2DGBABK3JVV/61jaINVBoNmXwaDGbojgWQKhS5KAoIKEwEBAQEBAQGBCoQjAQEBAwEBAQEgKyALEAIBCA4KAgINGQICJwEJJgIECAcEARwEh3kDCggNtxGQMA2FYAEBAQEGAQEBAQEdgSGKKYJNgVYQAgEFCAEONAeCaIFDBYwXh3uEYYJZgV2ECUSDUI8eg1sCJoQWIjEHf0GBBAEBAQ X-IronPort-AV: E=Sophos;i="5.15,396,1432612800"; d="scan'208";a="223561723" Received: from nipigon.cs.uoguelph.ca (HELO zcs1.mail.uoguelph.ca) ([131.104.99.173]) by esa-annu.net.uoguelph.ca with ESMTP; 02 Jul 2015 20:51:43 -0400 Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 6744115F533; Thu, 2 Jul 2015 20:51:43 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id yFGAulvIrRXU; Thu, 2 Jul 2015 20:51:42 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 52DC115F54D; Thu, 2 Jul 2015 20:51:42 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs1.mail.uoguelph.ca Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id GOQkAL2s6YM4; Thu, 2 Jul 2015 20:51:42 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca (zcs1.mail.uoguelph.ca [172.17.95.18]) by 
zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 32A6115F533; Thu, 2 Jul 2015 20:51:42 -0400 (EDT) Date: Thu, 2 Jul 2015 20:51:42 -0400 (EDT) From: Rick Macklem To: Ahmed Kamal Cc: Julian Elischer , freebsd-fs@freebsd.org, Xin LI Message-ID: <2010996878.3611963.1435884702063.JavaMail.zimbra@uoguelph.ca> In-Reply-To: References: <1491630362.2785531.1435799383802.JavaMail.zimbra@uoguelph.ca> <5594B008.10202@freebsd.org> <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca> <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca> Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.12] X-Mailer: Zimbra 8.0.9_GA_6191 (ZimbraWebClient - FF34 (Win)/8.0.9_GA_6191) Thread-Topic: Linux NFSv4 clients are getting (bad sequence-id error!) Thread-Index: qJtbvx6IOu1CAIPoeaFQqFZzEklYJg== X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 00:51:59 -0000 Ahmed Kamal wrote: > PS: Today (after adjusting tcp.highwater) I didn't get any screaming > reports from users about hung vnc sessions. So maybe just maybe, linux > clients are able to somehow recover from this bad sequence messages. I > could still see the bad sequence error message in logs though > > Why isn't the highwater tunable set to something better by default ? I mean > this server is certainly not under a high or unusual load (it's only 40 PCs > mounting from it) > > On Fri, Jul 3, 2015 at 1:15 AM, Ahmed Kamal > wrote: > > > Thanks all .. I understand now we're doing the "right thing" .. Although > > if mounting keeps wedging, I will have to solve it somehow! Either using > > Xin's patch .. or Upgrading RHEL to 6.x and using NFS4.1. 
> > > > Regarding Xin's patch, is it possible to build the patched nfsd code, as a > > kernel module ? I'm looking to minimize my delta to upstream. > > Yes, you can build the nfsd as a module. If your kernel config does not include "options NFSD" the module will get loaded/used. It is also possible to replace the module without rebooting, but you need to kill of the nfsd daemon then kldunload nfsd.ko and replace nfsd.ko with the new one. (In /boot/.) > > Also would adopting Xin's patch and hiding it behind a > > kern.nfs.allow_linux_broken_client be an option (I'm probably not the last > > person on earth to hit this) ? > > If it fixes your problem, I think this is reasonable. I'm also hoping that someone that works on the Linux client reports if/when this was changed. rick > > Thanks a lot for all the help! > > > > On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem > > wrote: > > > >> Ahmed Kamal wrote: > >> > Appreciating the fruitful discussion! Can someone please explain to me, > >> > what would happen in the current situation (linux client doing this > >> > skip-by-1 thing, and freebsd not doing it) ? What is the effect of that? > >> Well, as you've seen, the Linux client doesn't function correctly against > >> the FreeBSD server (and probably others that don't support this > >> "skip-by-1" > >> case). > >> > >> > What do users see? Any chances of data loss? > >> Hmm. Mostly it will cause Opens to fail, but I can't guess what the Linux > >> client behaviour is after receiving NFS4ERR_BAD_SEQID. You're the guy > >> observing > >> it. > >> > >> > > >> > Also, I find it strange that netapp have acknowledged this is a bug on > >> > their side, which has been fixed since then! > >> Yea, I think Netapp screwed up. For some reason their server allowed this, > >> then was fixed to not allow it and then someone decided that was broken > >> and > >> reversed it. > >> > >> > I also find it strange that I'm the first to hit this :) Is no one > >> running > >> > nfs4 yet! 
> >> > > >> Well, it seems to be slowly catching on. I suspect that the Linux client > >> mounting a Netapp is the most common use of it. Since it appears that they > >> flip flopped w.r.t. who's bug this is, it has probably persisted. > >> > >> It may turn out that the Linux client has been fixed or it may turn out > >> that most servers allowed this "skip-by-1" even though David Noveck (one > >> of the main authors of the protocol) seems to agree with me that it should > >> not be allowed. > >> > >> It is possible that others have bumped into this, but it wasn't isolated > >> (I wouldn't have guessed it, so it was good you pointed to the RedHat > >> discussion) > >> and they worked around it by reverting to NFSv3 or similar. > >> The protocol is rather complex in this area and changed completely for > >> NFSv4.1, > >> so many have also probably moved onto NFSv4.1 where this won't be an > >> issue. > >> (NFSv4.1 uses sessions to provide exactly once RPC semantics and doesn't > >> use > >> these seqid fields.) > >> > >> This is all just mho, rick > >> > >> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem > >> wrote: > >> > > >> > > Julian Elischer wrote: > >> > > > On 7/2/15 9:09 AM, Rick Macklem wrote: > >> > > > > I am going to post to nfsv4@ietf.org to see what they say. Please > >> > > > > let me know if Xin Li's patch resolves your problem, even though I > >> > > > > don't believe it is correct except for the UINT32_MAX case. Good > >> > > > > luck with it, rick > >> > > > and please keep us all in the loop as to what they say! > >> > > > > >> > > > the general N+2 bit sounds like bullshit to me.. its always N+1 in a > >> > > > number field that has a > >> > > > bit of slack at wrap time (probably due to some ambiguity in the > >> > > > original spec). > >> > > > > >> > > Actually, since N is the lock op already done, N + 1 is the next lock > >> > > operation in order. 
Since lock ops need to be strictly ordered, > >> allowing > >> > > N + 2 (which means N + 2 would be done before N + 1) makes no sense. > >> > > > >> > > I think the author of the RFC meant that N + 2 or greater fails, but > >> it > >> > > was poorly worded. > >> > > > >> > > I will pass along whatever I get from nfsv4@ietf.org. (There is an > >> archive > >> > > of it somewhere, but I can't remember where.;-) > >> > > > >> > > rick > >> > > _______________________________________________ > >> > > freebsd-fs@freebsd.org mailing list > >> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > >> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > >> > > > >> > > >> > > > > > From owner-freebsd-fs@freebsd.org Fri Jul 3 01:22:36 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BD1FC993F15 for ; Fri, 3 Jul 2015 01:22:36 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 659591B67 for ; Fri, 3 Jul 2015 01:22:35 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2B9AwBY45VV/61jaINbg2ZfBoMZuhoJgWQKhS5KAoIJFAEBAQEBAQGBCoQjAQEBAwEBAQEgKyALBQsCAQgYAgINGQICIQYBCSYCBAgHBAEcBId5AwoIDbcDkDINhWABAQEBAQUBAQEBAQEBG4EhiimCTYFnAQEFFzQHgmiBQwWUEoRhhDZrgx5Eg1CINYMsgz2DWwImY4FagVkiMQEGgQY6gQQBAQE X-IronPort-AV: E=Sophos;i="5.15,396,1432612800"; d="scan'208";a="223564660" Received: from nipigon.cs.uoguelph.ca (HELO zcs1.mail.uoguelph.ca) ([131.104.99.173]) by esa-annu.net.uoguelph.ca with ESMTP; 02 Jul 2015 21:22:34 -0400 Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id E491915F533; Thu, 2 Jul 2015 21:22:34 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by 
localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id YK7_2h4WZww8; Thu, 2 Jul 2015 21:22:34 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 70F6415F54D; Thu, 2 Jul 2015 21:22:34 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs1.mail.uoguelph.ca Received: from zcs1.mail.uoguelph.ca ([127.0.0.1]) by localhost (zcs1.mail.uoguelph.ca [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id vqaZ4-RiwksL; Thu, 2 Jul 2015 21:22:34 -0400 (EDT) Received: from zcs1.mail.uoguelph.ca (zcs1.mail.uoguelph.ca [172.17.95.18]) by zcs1.mail.uoguelph.ca (Postfix) with ESMTP id 5606815F533; Thu, 2 Jul 2015 21:22:34 -0400 (EDT) Date: Thu, 2 Jul 2015 21:22:34 -0400 (EDT) From: Rick Macklem To: "alex.burlyga.ietf alex.burlyga.ietf" Cc: freebsd-fs Message-ID: <797076669.3619739.1435886554308.JavaMail.zimbra@uoguelph.ca> In-Reply-To: References: <1969046464.61534041.1434897034960.JavaMail.root@uoguelph.ca> Subject: Re: [nfs][client] - Question about handling of the NFS3_EEXIST error in SYMLINK rpc MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.10] X-Mailer: Zimbra 8.0.9_GA_6191 (ZimbraWebClient - FF34 (Win)/8.0.9_GA_6191) Thread-Topic: - Question about handling of the NFS3_EEXIST error in SYMLINK rpc Thread-Index: QLE8Kc3+KYQd0YcnUBj0OOjjkjQlOQ== X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 01:22:36 -0000 Alex Burlyga wrote: > Rick, > > I was able to test the patch. Works as expected. With sysctl set to 1 > I can reproduce original issue, setting sysctl to 0 fixes the issue. > I'll leave running tests over night, but everything looks great. > The patch has been committed to head. I don't know if it will be in 10.2. That will be up to re@. 
rick > Alex > > On Mon, Jun 22, 2015 at 2:02 PM, alex.burlyga.ietf alex.burlyga.ietf > wrote: > > Rick, > > > > Thank you for a quick turn around, see answers inline: > > > > On Sun, Jun 21, 2015 at 7:30 AM, Rick Macklem wrote: > >> Alex Burlyga wrote: > >>> Hi, > >>> > >>> NFS client code in nfsrpc_symlink() masks server returned NFS3_EEXIST > >>> error > >>> code > >>> by returning 0 to the upper layers. I'm assuming this was an attempt > >>> to > >>> work around > >>> some server's broken replay cache out there, however, it breaks a > >>> more > >>> common > >>> case where server is returning EEXIST for legitimate reason and > >>> application > >>> is expecting this error code and equipped to deal with it. > >>> > >>> To fix it I see three ways of doing this: > >>> * Remove offending code > >>> * Make it optional, sysctl? > >>> * On NFS3_EEXIST send READLINK rpc to make sure symlink content is > >>> right > >>> > >>> Which of the ways will maximize the chances of getting this fix > >>> upstream? > >>> > >> I've attached a patch for testing/review that does essentially #2. > >> It has no effect on trivial tests, since the syscall does a Lookup > >> before trying to create the symlink and fails with EEXIST. > >> Do you have a case where competing clients are trying to create > >> the symlink or something like that, which runs into this? > > > > That's exactly failing test case we are running into. > >> > >> Please test the attached patch, since I don't know how to do that, rick > > Great! I'll test it. I was leaning towards option 3 for SYMLINK and > > option 2 for MKDIR. > > This will work. Thanks for taking your time to generate the patch! > > > >> > >>> One more point, old client circa FreeBSD 7.0 does not exhibit this > >>> problem. 
> >>> > >>> Alex > >>> _______________________________________________ > >>> freebsd-fs@freebsd.org mailing list > >>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs > >>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > >>> > From owner-freebsd-fs@freebsd.org Fri Jul 3 03:22:22 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2E6249944B9 for ; Fri, 3 Jul 2015 03:22:22 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 130971543 for ; Fri, 3 Jul 2015 03:22:22 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t633MLnN013846 for ; Fri, 3 Jul 2015 03:22:21 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 201141] [zfs] System Crashes on access to certain files on a dataset. ZFS Data set Refers to impossibly large size. 
Date: Fri, 03 Jul 2015 03:22:22 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: short_desc assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 03:22:22 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201141 Mark Linimon changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|System Crashes on access to |[zfs] System Crashes on |certain files on a dataset. |access to certain files on | ZFS Data set Refers to |a dataset. ZFS Data set |impossibly large size. |Refers to impossibly large | |size. Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. 
From owner-freebsd-fs@freebsd.org Fri Jul 3 03:23:42 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 23E63994532 for ; Fri, 3 Jul 2015 03:23:42 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 119AA1896 for ; Fri, 3 Jul 2015 03:23:42 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t633NfOI015212 for ; Fri, 3 Jul 2015 03:23:41 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 201155] NFS4 exports are blocked when NIS master server is unreachable Date: Fri, 03 Jul 2015 03:23:42 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 03:23:42 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201155 Mark Linimon changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Fri Jul 3 03:33:56 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A7349994733 for ; Fri, 3 Jul 2015 03:33:56 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 949891F68 for ; Fri, 3 Jul 2015 03:33:56 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t633XuNg030551 for ; Fri, 3 Jul 2015 03:33:56 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 201073] [nfsclient] RPCSEC_GSS principal includes inappropriate directory path Date: Fri, 03 Jul 2015 03:33:56 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-RELEASE X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to keywords Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: 
https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 03:33:56 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201073 Mark Linimon changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org Keywords| |patch -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Fri Jul 3 03:36:14 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DE501994833 for ; Fri, 3 Jul 2015 03:36:14 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CB6B52270 for ; Fri, 3 Jul 2015 03:36:14 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t633aEPN032665 for ; Fri, 3 Jul 2015 03:36:14 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 201118] A crash when I/O-ing a UFS USB drive and appearing as going through ZFS (which is on the HD /) Date: Fri, 03 Jul 2015 03:36:15 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: 
linimon@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 03:36:15 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201118 Mark Linimon changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Fri Jul 3 10:19:47 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B066A9932E1 for ; Fri, 3 Jul 2015 10:19:47 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9DE201A08 for ; Fri, 3 Jul 2015 10:19:47 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t63AJlkE082924 for ; Fri, 3 Jul 2015 10:19:47 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 199775] ZFS hangs while removing large file Date: Fri, 03 Jul 2015 10:19:46 +0000 
X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.1-STABLE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: In Progress X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to bug_status Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 10:19:47 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199775 Andriy Gapon changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-fs@FreeBSD.org |avg@FreeBSD.org Status|New |In Progress --- Comment #6 from Andriy Gapon --- Change in base r284593 should help with this problem. -- You are receiving this mail because: You are the assignee for the bug. 
From owner-freebsd-fs@freebsd.org Fri Jul 3 16:10:02 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 810B6993FED for ; Fri, 3 Jul 2015 16:10:02 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 5DAA82926 for ; Fri, 3 Jul 2015 16:10:01 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from Julian-MBP3.local (ppp121-45-224-92.lns20.per1.internode.on.net [121.45.224.92]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id t63G9vKO009917 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Fri, 3 Jul 2015 09:10:00 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <5596B3CF.50703@freebsd.org> Date: Sat, 04 Jul 2015 00:09:51 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: Felipe Monteiro de Carvalho CC: "freebsd-fs@freebsd.org" Subject: Re: Uberblock location References: <557B0255.8060809@freebsd.org> <01184F08-1C6B-4282-9203-1BF98F07A05A@gmail.com> <557C282D.8060809@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jul 2015 16:10:02 -0000 On 7/2/15 6:02 PM, Felipe Monteiro de Carvalho wrote: > Hello, > > Ok, thanks a lot =) I am now working based on the FreeBSD source code, which? the kernel code or the bootblock code? 
I believe the bootblock version from /usr/src/sys/boot/zfs would be easier to start with.

> and I made a lot of progress at adapting it to my needs. My use is
> very different, so it's not a simple "use as is", so I'm trying to
> understand what goes on in every step to adapt it, and I have arrived
> at a point where I got lost.
>
> Ok, so far I have the following working:
>
> Part 1> Read the disk label: based on function vdev_probe, reads from
> the disk the type vdev_phys_t
>
> This issues the following data:
>
> [0] DATA_TYPE_UINT64 name="version" val=5000
> [1] DATA_TYPE_STRING name="name" val="zfs_felipe_2"
> [2] DATA_TYPE_UINT64 name="state" val=1
> [3] DATA_TYPE_UINT64 name="txg" val=164
> [4] DATA_TYPE_UINT64 name="pool_guid" val=8752563210577670337
> [5] DATA_TYPE_UINT64 name="errata" val=0
> [6] DATA_TYPE_UINT64 name="hostid" val=8323329
> [7] DATA_TYPE_STRING name="hostname" val="felipe-VirtualBox"
> [8] DATA_TYPE_UINT64 name="top_guid" val=2451523895343473151
> [9] DATA_TYPE_UINT64 name="guid" val=2451523895343473151
> [10] DATA_TYPE_UINT64 name="vdev_children" val=1
> [11] DATA_TYPE_NVLIST name="vdev_tree"
>> =[0] data_type_string 'name'="type" val="disk"
>> =[1] data_type_uint64 'name'="id" val=0
>> =[2] data_type_uint64 'name'="guid" val=2,45152389534347E+018
>> =[3] data_type_string 'name'="path" val="/dev/loop7"
>> =[4] data_type_uint64 'name'="whole_disk" val=0
>> =[5] data_type_uint64 'name'="metaslab_array" val=34
>> =[6] data_type_uint64 'name'="metaslab_shift" val=21
>> =[7] data_type_uint64 'name'="ashift" val=9
>> =[8] data_type_uint64 'name'="asize" val=257425408
>> =[9] data_type_uint64 'name'="is_log" val=0
>> =[10] data_type_uint64 'name'="create_txg" val=4
> [12] DATA_TYPE_NVLIST name="features_for_read"
>> =[0] data_type_boolean 'name'="com.delphix:hole_birth"
>> =[1] data_type_boolean 'name'="com.delphix:embedded_data"
> This part is OK, although I was upset that in a pool with 2 disks,
> there is no info in each disk about the other
> one, except that
> vdev_children=2 =( It would have been great if there was the path to
> the second disk, but anyway, this is no huge problem.

Define "path" in a way that is OS-independent and meaningful when running in a BIOS environment (bootblocks).

>
> Part 2> Read the uberblocks and figure out which is the most current one.
>
> I select the uberblock with the largest ub_timestamp as the best one,
> and then I try to read the data pointed to by the uberblock.ub_rootbp
> of type blkptr_t
>
> The best uberblock has the following ub_rootbp:
>
> [0] VolumePos=$28000 DiskPos=$28000 IsVolumePosValid=1 IsSBValid=1
> OriginNr=4 SB.ub_version=5000 SB.ub_timestamp=555CA4DF
> SB.ub_software_version=5000
> BlockPtr=DVA[0]=
> DVA[1]=
> DVA[2]= LEVEL=0 TYPE=B LSIZE=800
> PSIZE=200 COMP=F BIRTH=A0 PHYS_BIRTH=A0
>
> read using the functions such as BP_GET_COMPRESS, etc.
>
> And this part is where I am stuck, because:
>
> 1> The offset 0x325D400 counting from disk image start is filled with
> CC CC CC CC .... it doesn't look like at all that it is something
> valid.
>
> Maybe the offset should not be read from disk image start, but instead
> counting from somewhere else?

It should be from the base of the partition containing the filesystem, but I feel you are probably already doing this, or you probably wouldn't have got this far.

>
> 2> comp=F wow, so even basic blocks use compression? Or my reading of
> the data somehow wrong? It will be a lot of work to get decompression
> working.
>
> any ideas?
>
> Attached is the "best" uberblock, with block pointer data with blue background.

You have gone as far as I can help you. I'll have to leave it to others.
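
[Editorial sketch, not part of the original message.] Two points from the exchange above may be easier to follow in code. As far as I know, ZFS ranks uberblocks by transaction group (ub_txg) first and uses ub_timestamp only to break ties, and the FreeBSD boot code adds the 4 MiB label and boot area (VDEV_LABEL_START_SIZE) to a DVA byte offset before reading from the vdev. The Python below illustrates both under those assumptions. The Uberblock class and both function names are hypothetical, and magic-number and checksum validation are left out.

```python
from dataclasses import dataclass

# Space reserved at the front of every vdev for two labels plus the boot
# block (4 MiB). Assumption: DVA offsets are relative to the end of it.
VDEV_LABEL_START_SIZE = 0x400000


@dataclass(frozen=True)
class Uberblock:
    # Hypothetical container holding only the fields used for ranking.
    txg: int
    timestamp: int


def best_uberblock(candidates):
    """Pick the candidate with the highest txg, using timestamp as tie-break."""
    return max(candidates, key=lambda ub: (ub.txg, ub.timestamp))


def dva_to_byte_offset(dva_offset_bytes):
    """Convert a DVA byte offset (as a DVA_GET_OFFSET-style macro would
    return it) into a byte offset from the start of the vdev."""
    return dva_offset_bytes + VDEV_LABEL_START_SIZE
```

If the 0x325D400 quoted above is such a DVA byte offset, this sketch would place the block at 0x365D400 from the start of the vdev rather than at 0x325D400. That is only a guess about how the value was obtained.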
>
> thanks,

From owner-freebsd-fs@freebsd.org Fri Jul 3 20:48:38 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 201073] [nfsclient] RPCSEC_GSS principal includes inappropriate directory path
Date: Fri, 03 Jul 2015 20:48:38 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.1-RELEASE
X-Bugzilla-Keywords: patch
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: rmacklem@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: assigned_to cc
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201073

Rick Macklem changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-fs@FreeBSD.org      |rmacklem@FreeBSD.org
                 CC|                            |rmacklem@FreeBSD.org

--- Comment #1 from Rick Macklem ---
I'll take this one.

Btw, normally mount_nfs creates a "principal" argument even if the caller
hasn't specified one. (See line#590-600 of mount_nfs.c.)
The only way I can see that this would fail is if getaddrinfo() fails
to acquire a canonical hostname for the server.
--> As such, I suspect this bug only affects sites with unusual hostname
    configurations, such as not using DNS or ???

--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Sat Jul 4 01:01:15 2015
In-Reply-To: <797076669.3619739.1435886554308.JavaMail.zimbra@uoguelph.ca>
References: <1969046464.61534041.1434897034960.JavaMail.root@uoguelph.ca> <797076669.3619739.1435886554308.JavaMail.zimbra@uoguelph.ca>
Date: Fri, 3 Jul 2015 18:01:13 -0700
Subject: Re: [nfs][client] - Question about handling of the NFS3_EEXIST error in SYMLINK rpc
From: "alex.burlyga.ietf alex.burlyga.ietf"
To: Rick Macklem
Cc: freebsd-fs

Great, hopefully this will get picked up for 10.2 so we can pick it up
from upstream.

Alex

On Thu, Jul 2, 2015 at 6:22 PM, Rick Macklem wrote:
> Alex Burlyga wrote:
>> Rick,
>>
>> I was able to test the patch. Works as expected. With sysctl set to 1
>> I can reproduce original issue, setting sysctl to 0 fixes the issue.
>> I'll leave running tests over night, but everything looks great.
>>
> The patch has been committed to head. I don't know if it will be in 10.2.
> That will be up to re@.
>
> rick
>
>> Alex
>>
>> On Mon, Jun 22, 2015 at 2:02 PM, alex.burlyga.ietf alex.burlyga.ietf
>> wrote:
>> > Rick,
>> >
>> > Thank you for a quick turn around, see answers inline:
>> >
>> > On Sun, Jun 21, 2015 at 7:30 AM, Rick Macklem wrote:
>> >> Alex Burlyga wrote:
>> >>> Hi,
>> >>>
>> >>> NFS client code in nfsrpc_symlink() masks server returned NFS3_EEXIST
>> >>> error code by returning 0 to the upper layers. I'm assuming this was
>> >>> an attempt to work around some server's broken replay cache out there,
>> >>> however, it breaks a more common case where server is returning EEXIST
>> >>> for legitimate reason and application is expecting this error code and
>> >>> equipped to deal with it.
>> >>>
>> >>> To fix it I see three ways of doing this:
>> >>> * Remove offending code
>> >>> * Make it optional, sysctl?
>> >>> * On NFS3_EEXIST send READLINK rpc to make sure symlink content is
>> >>>   right
>> >>>
>> >>> Which of the ways will maximize the chances of getting this fix
>> >>> upstream?
>> >>>
>> >> I've attached a patch for testing/review that does essentially #2.
>> >> It has no effect on trivial tests, since the syscall does a Lookup
>> >> before trying to create the symlink and fails with EEXIST.
>> >> Do you have a case where competing clients are trying to create
>> >> the symlink or something like that, which runs into this?
>> >
>> > That's exactly failing test case we are running into.
>> >>
>> >> Please test the attached patch, since I don't know how to do that, rick
>> > Great! I'll test it. I was leaning towards option 3 for SYMLINK and
>> > option 2 for MKDIR.
>> > This will work. Thanks for taking your time to generate the patch!
>> >
>> >>
>> >>> One more point, old client circa FreeBSD 7.0 does not exhibit this
>> >>> problem.
>> >>>
>> >>> Alex
>> >>> _______________________________________________
>> >>> freebsd-fs@freebsd.org mailing list
>> >>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> >>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>> >>>
>>

From owner-freebsd-fs@freebsd.org Sat Jul 4 19:13:17 2015
Date: Sat, 4 Jul 2015 12:13:16 -0700
Subject: UDF v2.5 in head - for testing please
From: Waitman Gobble
To: freebsd-fs@freebsd.org

Hi,

Here's a patch against r285141:
https://gist.github.com/waitman/07849a747d0d633a7009

It provides UDF v2.5 support to 'mount'. This driver could be useful in
setting up a 'cold storage' system, or to read a Blu-ray disc created for
backup purposes.

UDF2.5 driver notes - FreeBSD-11.0-Current patch

This driver was originally ported over from NetBSD some years ago, as a
GSOC project, and updated by William Devries for FreeBSD 10.0 Current
about 2 years ago (I think work was on 10 then). I updated the source to
build on FreeBSD-11.0-Current. It's a cut down and simplified version of
the NetBSD driver, and is useful for mounting and reading UDF v2.5
Blu-ray discs. The current NetBSD driver provides write support, however
it seems to me not necessary (see write examples below).

My goal wasn't to provide a way to 'play' encrypted Blu-ray discs,
however the driver has been tested successfully on non-encrypted
commercial discs. (I understand production runs of less than 300 units
are typically not encrypted, at least that's my experience. I've been
told by powers that be that if a person produces over 300 units then they
_must_ be encrypted due to patent licensing. Perhaps that's true.) You
can mount an encrypted disc and look at the file system, however I
believe you must make use of the aacs library and keys to actually play
an encrypted disc.

The software creates the 'mount_udf2' program, located in /sbin, and two
kernel modules, udf2.ko and udf2_icon.ko, located in /boot/kernel. There
are adjustments made to /sbin/mount to add the udf2 switch. Support is
also added to scsi.

1) Create an iso file of the "Music" directory

# mkisofs -R -J -joliet-long -udf -iso-level 3 -o Music.iso Music

The -joliet-long switch allows for filenames which do not conform to the
ancient 8.3 standard.

2) Burn Blu-ray disc (set speed to device/disc caps)

Install ports/sysutils/dvd+rw-tools.

# growisofs -speed=4 -dvd-compat -Z /dev/cd0=Music.iso

Note: burning a Blu-ray disc on a drive connected over SATA, writing a
10GB iso takes about 30 minutes. There appears to be some issue burning
on a drive connected over USB 3.0: it is very slow and reportedly will
take 1 to 7 days to complete. That is an issue that perhaps needs to be
investigated, however it's not really relevant to the UDF2 driver.
Perhaps it's missing firmware, or a faulty drive I'm using.

The -dvd-compat switch tells growisofs to close the session on the disc.

3) Mount disc

# mkdir /br
# mount_udf2 /dev/cd0 /br
(or)
# mount -t udf2 /dev/cd0 /br

4) Play video disc mounted on /br using mplayer (mplayer needs libbluray
support compiled in)

# mplayer br:////br -zoom -fs

Note: I'm mounting to /br because my recollection is that the aacs
library seems to depend on that mount point when trying to read/play an
encrypted Blu-ray disc. This may not be accurate, I haven't tested for
some months. The way to play an encrypted Blu-ray disc is to use the aacs
library and have access to the corresponding key files.

I only mention playing encrypted discs because I presume that will be
something a user wishes to accomplish, and will be disappointed right off
the bat. However, a few points: it seems not a trivial task to access the
necessary encryption keys, and it's probably IMHO not worth the trouble
to get it working. It's more productive to play your movies in a $50
player, in my opinion. I'm presuming it is trivial to make a backup of an
encrypted Blu-ray disc, either by a sector dump or just simply mirroring
the filesystem, however I have not tried. It seems the video files on the
disc are encrypted, not the filesystem.

However, I think having UDFv2.5 read support is useful for file backups,
and developing cold storage systems using FreeBSD.

As a 'sanity' test, I used a BD disc created on FreeBSD, opened on an MS
Windows machine. It seems to work.

https://www.dropbox.com/s/5hoib3tzs819zb9/sshot-drive-bd.png?dl=0
https://www.dropbox.com/s/o7jngsqgg6dgb3k/sshot-drive-bd-play.png?dl=0

Error reports, comments, suggestions and improvements are appreciated.

Thanks,

--
Waitman Gobble
Los Altos California USA
650-999-0406