From: Rick Macklem
To: J David
CC: freebsd-fs@freebsd.org
Subject: Re: Major issues with nfsv4
Date: Sat, 12 Dec 2020 03:40:55 +0000

J David wrote:
>On Fri, Dec 11, 2020 at 6:28 PM Rick Macklem wrote:
>> I am afraid I know nothing about nullfs and jails. I suspect it will be
>> something related to when file descriptors in the NFS client mount
>> get closed.
>
>What does NFSv4 do differently than NFSv3 that might upset a low-level
>consumer like nullfs?
The opens, for one. When a file is opened, it finds its way to VOP_OPEN().
--> For NFSv3, all it does is some client-side cache consistency checks.
--> For NFSv4, it must acquire or update an NFSv4 Open, which is a form
      of lock that is acquired/updated by an Open operation in an RPC.
      Then the client stores this locking info in a structure in a linked list
      off of the mount point.
      Once all file descriptors for the vnode are closed, then, and only
      then, can a Close operation be done against the server and the linked
      list data structure be freed.
      --> Does having nullfs between the file descriptors and the NFS vnodes
            for the same file affect when the v_usecount decrements to 0 on
            the NFS vnode?
            I don't know,
            but if it delays it, then these linked list structures
            will not be freed as soon and might accumulate.
            --> The more structures, the longer the linked list and the more
                  overhead/CPU will be used processing them.
      The fact that processes are spending a long time in exit() might
      be a hint that there are a large number of these NFSv4 Opens to deal with
      when files are being closed implicitly during exit.

      As I mentioned, "nfsstat -c -E" will tell you how many Opens there
      are under the "OpenOwners ..." line.

>> Well, NFSv3 is not going away any time soon, so if you don't need
>> any of the additional features it offers...
>
>If we did not want the additional features, we definitely would not be
>attempting this.
>
>> a user would have to run their own custom hacked
>> userland NFS client. Although doable, I have never heard of it being done.
>
>Alex beat me to libnfs.
And you have users running jobs in this environment who would want to
maliciously access the NFS server? (Other than reverting to NFSv3, allowing
clients to use non-reserved port#s is probably your other choice, from what
I can see. Whatever the interaction between nullfs and the NFSv4 mount
is, it probably won't be fixed quickly, if ever.)

>What about this as a stopgap measure?
>
>> How explosive would adding SO_REUSEADDR to the NFS client be? It's
>> not a full solution, but it would handle the TIME_WAIT side of the
>> issue.
>
>The kernel NFS networking code is confusing to me. I can't even
>figure out where/how NFSv4 binds a client socket to know if it's
>possible.
>(Pretty sure the code in sys/nfs/krpc_subr.c is not it.)
It's done in the kernel RPC code, found in the sys/rpc directory,
mostly in clnt_rc.c and clnt_vc.c.
If there is a timeout for an RPC (slow server, network problem,...),
the code in clnt_rc.c will create a new TCP connection. The old
connection could easily still be around.
As such, I do not believe that SO_REUSEADDR or SO_REUSEPORT
is feasible.

rick

Thanks!
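
The accumulation argument above (Opens kept on a list off the mount point and freed only when the last reference goes away) can be illustrated with a toy model. This is a sketch in Python, not FreeBSD kernel code: ToyOpen, ToyMount, open_file, release and run are hypothetical names, the "extra reference" is only an assumed stand-in for whatever an intermediate layer such as nullfs might do, and the step counter is a rough proxy for CPU spent walking the list.

```python
class ToyOpen:
    """Stand-in for the per-file NFSv4 Open state kept by the client."""

    def __init__(self, path):
        self.path = path
        self.usecount = 0  # loosely plays the role of v_usecount


class ToyMount:
    """Keeps Opens in a plain list, mimicking a linked list off the mount."""

    def __init__(self):
        self.opens = []
        self.steps = 0  # entries examined during list walks

    def _find(self, path):
        # Linear walk, as with a simple linked list.
        for entry in self.opens:
            self.steps += 1
            if entry.path == path:
                return entry
        return None

    def open_file(self, path):
        entry = self._find(path)
        if entry is None:
            entry = ToyOpen(path)
            self.opens.append(entry)
        entry.usecount += 1

    def release(self, path):
        entry = self._find(path)
        entry.usecount -= 1
        if entry.usecount == 0:
            # Only now could a Close be sent and the structure freed.
            self.opens.remove(entry)


def run(n_files, extra_ref):
    """Open and close n_files files; return (Opens left, list steps)."""
    mount = ToyMount()
    for i in range(n_files):
        path = f"file{i}"
        mount.open_file(path)
        if extra_ref:
            # Assumption: something else still holds a reference.
            mount.open_file(path)
        mount.release(path)  # the file descriptor is closed
    return len(mount.opens), mount.steps


if __name__ == "__main__":
    print("prompt release :", run(1000, extra_ref=False))
    print("delayed release:", run(1000, extra_ref=True))
```

In this model, releasing promptly leaves the list empty and every walk short. When each entry keeps one lingering reference, all 1000 entries stay on the list and the total number of entries examined grows roughly with the square of the number of files, which is the kind of overhead described above.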