Date:      Fri, 2 Jul 2021 02:40:49 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Peter Eriksson <pen@lysator.liu.se>
Cc:        freebsd-net <freebsd-net@freebsd.org>
Subject:   Re: RFC: NFS trunking (multiple TCP connections for a mount
Message-ID:  <YQXPR0101MB09680E95ACA0D07F817688AEDD1F9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YQXPR0101MB0968C4F4865ADA058CCEEA17DD009@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
References:  <YQXPR0101MB0968DC173855A82AAF45F08FDD039@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>,<362300CE-30DA-4552-A3E4-0F3DFE385B2A@lysator.liu.se>,<YQXPR0101MB0968C4F4865ADA058CCEEA17DD009@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>

Rick Macklem wrote:
>In case anyone is interested in testing and/or reviewing the patch,
>it is at https://reviews.freebsd.org/D30970.
>
>Only lightly tested at this point.
>
>The NFS mount option is "nconnect=<N>", where 2 <= N <= 16,
>same as Linux. (I haven't done a man page patch yet.)
I have updated the patch so that the original TCP connection is
used for RPCs that consist of small messages (therefore not needing
much network bandwidth) and the RPCs (Read/Readdir/Write) that
use larger messages are sent on the N-1 additional TCP connections
in a round robin fashion.

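The split described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the patch: the class name `ConnectionPicker`, its `pick` method and the use of connection index 0 for the original connection are all assumptions made for the example.

```python
# Procedures the message names as using larger messages.
LARGE_RPCS = {"Read", "Readdir", "Write"}


class ConnectionPicker:
    """Hypothetical helper: choose a TCP connection index for each RPC.

    Index 0 stands for the original connection; indices 1..N-1 stand for
    the additional connections created by nconnect=<N>.
    """

    def __init__(self, nconnect):
        # The message gives the allowed range as 2 <= N <= 16.
        if not 2 <= nconnect <= 16:
            raise ValueError("nconnect must be between 2 and 16")
        self.nconnect = nconnect
        self._next = 0  # round-robin position among the N-1 extra connections

    def pick(self, procedure):
        # Small (mostly metadata) RPCs stay on the original connection.
        if procedure not in LARGE_RPCS:
            return 0
        # Large RPCs rotate over the additional connections.
        conn = 1 + self._next
        self._next = (self._next + 1) % (self.nconnect - 1)
        return conn


picker = ConnectionPicker(nconnect=4)
order = [picker.pick(p) for p in
         ["Getattr", "Read", "Write", "Lookup", "Read", "Read"]]
print(order)  # [0, 1, 2, 0, 3, 1]
```

With nconnect=4, the Getattr and Lookup calls stay on connection 0 while the Read and Write calls cycle through connections 1, 2 and 3.
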
The message below was posted a couple of days ago on
linux-nfs@vger.kernel.org.
It might be unfair to put it here, out of context, but I think it at least
suggests that separating the larger RPC messages from the small ones
(mostly Lookup/Getattr/Access metadata related RPCs) may be useful
under certain circumstances.
> The original issue described was how a high read/write process on the
> client could slow another process trying to do heavy metadata
> operations (like walking the filesystem). Using a different mount to
> the same multi-homed server seems to help a lot (probably because of
> the independent slot table).
--> For this implementation, there is no separate session/slot table.
      (Note that each I/O RPC only uses one table slot.)

I did not make this small vs large RPCs on a separate TCP connection
a separate option, since I believe there are already too many mount options.
If others feel it should be a separate mount option, please speak up.

The phabricator patch has been updated. Please test/review/comment.

Thanks, rick

Thanks everyone, for your input, rick

________________________________________
From: Peter Eriksson <pen@lysator.liu.se>
Sent: Tuesday, June 29, 2021 5:11 AM
To: Rick Macklem
Cc: freebsd-net
Subject: Re: RFC: NFS trunking (multiple TCP connections for a mount

> I don't understand how multiple TCP connections to the same
> server IP address will distribute the load across multiple network
> interfaces?
> I thought that lagg would have handled this?


A lagg typically keeps all data in a TCP stream on a specific lagg member
(depending on how the lagg is set up, unless you select the "roundrobin"
option in FreeBSD - don't do that unless you like out-of-order packets...)

Network equipment with laggs typically hash the IP streams over the lagg
members based on MAC addresses (source&target), IP addresses (source&target)
and port numbers.

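To make the hashing point concrete, here is a rough Python sketch of per-flow member selection. It is an illustration only: the SHA-256 based hash, the function names and the addresses are assumptions for the example, and real switches and the FreeBSD lagg driver use their own hash functions.

```python
import hashlib


def lagg_member(src_mac, dst_mac, src_ip, dst_ip, src_port, dst_port,
                n_members):
    """Map one flow to a lagg member index using a stand-in hash."""
    key = "|".join([src_mac, dst_mac, src_ip, dst_ip,
                    str(src_port), str(dst_port)]).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_members


def members_for_ports(src_ports, n_members=2):
    # Hypothetical client and server addresses; 2049 is the NFS port.
    return [lagg_member("02:00:00:00:00:01", "02:00:00:00:00:02",
                        "192.0.2.10", "192.0.2.20", port, 2049, n_members)
            for port in src_ports]


# One TCP connection: every packet of the flow hashes to the same member.
print(members_for_ports([50000, 50000, 50000]))

# Several connections from different source ports: each flow is hashed
# separately, so they may (but are not guaranteed to) use different members.
print(members_for_ports(range(50000, 50004)))
```

A single connection can therefore never use more than one member, while several connections to the same server address at least get the chance to be spread out, depending on how the hash falls.
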
(We have been diagnosing a fun problem locally where we see packet
losses/performance drops over our internal backbone network for certain
combinations of odd/even IP addresses/port numbers when things pass certain
SPB "routers" (which typically hash the streams over many "channels" between
routers)... Fun fun. :-)

I think the multiple NFS TCP streams could make for some nice performance
improvements in certain cases. And it would be more of a generalisation of
having multiple streams between two hosts - one-or-many over IPv4 and
one-or-many over IPv6 at the same time. Windows SMB has a similar feature.

Just avoid the Linux NFS mounting deadlock issue with "down" servers
please :-)

- Peter


