Date:      Tue, 5 Jun 2001 15:20:28 +0300
From:      Ruslan Ermilov <ru@FreeBSD.org>
To:        Jesper Skriver <jesper@skriver.dk>, Jonathan Lemon <jlemon@FreeBSD.org>
Cc:        freebsd-net@FreeBSD.org
Subject:   Re: control TCP send/recieve window size based on port numbers ?  and a bug(?) in sendpipe/recvpipe handling ...
Message-ID:  <20010605152028.A12215@sunbay.com>
In-Reply-To: <20010527000854.B98021@skriver.dk>; from jesper@skriver.dk on Sun, May 27, 2001 at 12:08:54AM +0200
References:  <20010526213442.A95985@skriver.dk> <20010527000854.B98021@skriver.dk>


--HcAYCG3uE/tztfnV
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, May 27, 2001 at 12:08:54AM +0200, Jesper Skriver wrote:
> On Sat, May 26, 2001 at 09:34:42PM +0200, Jesper Skriver wrote:
> > Hi,
> > 
> > I'm currently looking at ways to tune an FTP server, and when
> > tuning net.inet.tcp.sendspace/net.inet.tcp.recvspace and
> > NMBCLUSTERS, I came to think that in an FTP server role, half the
> > TCP sessions will be control sessions, which don't transfer much
> > data, so there is no reason to reserve the same number of buffers
> > for sendspace/recvspace for these, compared to the data sessions.
> > 
> > I was thinking of adding 3 new sysctl's 
> > 
> > net.inet.tcp.override_sendspace
> > net.inet.tcp.override_recvspace
> > net.inet.tcp.override_ports
> > 
> > The latter controls which (if any) src/dst ports trigger the
> > session to get the overridden send and recv-space applied.
> > 
> > Does this make any sense ?
> 
> As Mike Silbersack has educated me, the sendspace and recvspace are
> only the upper limit per session, and are not statically allocated,
> so this is not a problem, and thus this patch doesn't give us
> anything.
> 
> So the only thing remaining is the bug where the sendpipe/recvpipe
> doesn't have any effect.
> 
It does, but only if the pipesize from the rtentry is greater than
the mss.  IOW, the buffer sizes never fall below the MSS.  I wonder
whether this was intentional, though; the code for rmx_recvpipe
suggests it was.

: 	/*
: 	 * If there's a pipesize, change the socket buffer
: 	 * to that size.  Make the socket buffers an integral
: 	 * number of mss units; if the mss is larger than
: 	 * the socket buffer, decrease the mss.
: 	 */
: #ifdef RTV_SPIPE
: 	if ((bufsize = rt->rt_rmx.rmx_sendpipe) == 0)
: #endif
: 		bufsize = so->so_snd.sb_hiwat;
: 	if (bufsize < mss)
: 		mss = bufsize;
: 	else {
: 		bufsize = roundup(bufsize, mss);
: 		if (bufsize > sb_max)
: 			bufsize = sb_max;
: 		(void)sbreserve(&so->so_snd, bufsize, so, NULL);
: 	}
: 	tp->t_maxseg = mss;
: 
: #ifdef RTV_RPIPE
: 	if ((bufsize = rt->rt_rmx.rmx_recvpipe) == 0)
: #endif
: 		bufsize = so->so_rcv.sb_hiwat;
: 	if (bufsize > mss) {
: 		bufsize = roundup(bufsize, mss);
: 		if (bufsize > sb_max)
: 			bufsize = sb_max;
: 		(void)sbreserve(&so->so_rcv, bufsize, so, NULL);
: 	}
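
To make the effect of that excerpt easier to follow, here is a minimal
userland model of it.  This is only a sketch: struct buf_model,
model_roundup() and model_apply_pipes() are hypothetical names that do
not exist in the kernel, sbreserve() is modelled as an assignment that
always succeeds, and sb_max is passed in as a parameter.

```c
/* Hypothetical stand-ins for the kernel state the excerpt touches. */
struct buf_model {
	unsigned long snd_hiwat;	/* models so->so_snd.sb_hiwat */
	unsigned long rcv_hiwat;	/* models so->so_rcv.sb_hiwat */
	unsigned long mss;		/* models the local variable mss */
};

/* Round x up to the next multiple of y, like the kernel's roundup(). */
static unsigned long
model_roundup(unsigned long x, unsigned long y)
{
	return ((x + y - 1) / y) * y;
}

/*
 * Apply the logic of the excerpt.  A pipe value of 0 means that the
 * route carries no metric, so the current socket buffer size is used.
 */
static void
model_apply_pipes(struct buf_model *m, unsigned long sendpipe,
    unsigned long recvpipe, unsigned long sb_max)
{
	unsigned long bufsize;

	bufsize = sendpipe != 0 ? sendpipe : m->snd_hiwat;
	if (bufsize < m->mss)
		m->mss = bufsize;	/* buffer untouched, mss shrinks */
	else {
		bufsize = model_roundup(bufsize, m->mss);
		if (bufsize > sb_max)
			bufsize = sb_max;
		m->snd_hiwat = bufsize;	/* models sbreserve() */
	}

	bufsize = recvpipe != 0 ? recvpipe : m->rcv_hiwat;
	if (bufsize > m->mss) {
		bufsize = model_roundup(bufsize, m->mss);
		if (bufsize > sb_max)
			bufsize = sb_max;
		m->rcv_hiwat = bufsize;	/* models sbreserve() */
	}
}
```

With an mss of 1460 and a recvpipe of 1000, the receive buffer in this
model is left alone, which matches the observation that a pipesize at
or below the mss has no effect on the buffer.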

Also, there is the related PR kern/11966, which complains about
this code overriding user-set buffer sizes.  The problem can be
demonstrated with a loopback connection (through lo0), for
which lortrequest() always sets the send and receive pipes to
3 * LOMTU = 49152:

: # sysctl net.inet.tcp.recvspace net.inet.tcp.rfc1323
: net.inet.tcp.recvspace: 65535
: net.inet.tcp.rfc1323: 1
: 
: # route -n get 127.1
:    route to: 127.0.0.1
: destination: 127.0.0.1
:   interface: lo0
:       flags: <UP,HOST,DONE,LOCAL>
:  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
:    49152     49152         0         0         0         0     16384         0 
: 
: # ./tcp
: rcv. buffer size before connect(): 65535 bytes
: rcv. buffer size after connect(): 57344 bytes

where:
	mss = rounddown(mtu - 40, MCLBYTES) =
	    rounddown(16384 - 40, 2048) = 14336

	rcvbuf = roundup(recvpipe, mss) =
	    roundup(49152, 14336) = 57344
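
The same arithmetic can be checked with a few lines of C.  This is
just a sketch: the MODEL_* constants and the helper names are made up
for illustration, and the values are the ones quoted above (the 40
bytes are the IP and TCP headers without options).

```c
/* Values taken from the calculation above; the names are hypothetical. */
enum {
	MODEL_LOMTU = 16384,	/* mtu of the lo0 route */
	MODEL_HDRLEN = 40,	/* IP + TCP headers, no options */
	MODEL_MCLBYTES = 2048	/* mbuf cluster size used above */
};

/* Round x down to a multiple of y. */
static unsigned long
model_rounddown(unsigned long x, unsigned long y)
{
	return (x / y) * y;
}

/* Round x up to a multiple of y. */
static unsigned long
model_roundup2(unsigned long x, unsigned long y)
{
	return ((x + y - 1) / y) * y;
}

/* mss = rounddown(mtu - 40, MCLBYTES) */
static unsigned long
loopback_mss(void)
{
	return model_rounddown(MODEL_LOMTU - MODEL_HDRLEN, MODEL_MCLBYTES);
}

/* rcvbuf = roundup(recvpipe, mss) */
static unsigned long
loopback_rcvbuf(unsigned long recvpipe)
{
	return model_roundup2(recvpipe, loopback_mss());
}
```

Calling loopback_mss() gives 14336, and loopback_rcvbuf(3 * 16384)
gives 57344, the size reported after connect().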


In the rfc1323=1 case, this is even worse.  The user initially sets
a large receive buffer, which then gets announced via the window
scale option, and this code then resets the receive buffer to the
smaller size.

The attached patch fixes this by changing the buffer size only when
the new value is greater than the current one.  The impact of this
patch should be low, as (by default) only routes through the loopback
interface have these routing metrics set.  Please review.
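
For reviewers, here is a small userland sketch comparing the receive
side before and after the patch.  The function names are hypothetical,
sbreserve() is modelled as a plain assignment, and the sb_max value
used in the note below (262144) is an assumption on my part.

```c
/* Round x up to a multiple of y, like the kernel's roundup(). */
static unsigned long
cmp_roundup(unsigned long x, unsigned long y)
{
	return ((x + y - 1) / y) * y;
}

/* Receive-side logic before the patch: always resize. */
static unsigned long
rcvbuf_before_patch(unsigned long hiwat, unsigned long recvpipe,
    unsigned long mss, unsigned long sb_max)
{
	unsigned long bufsize = recvpipe != 0 ? recvpipe : hiwat;

	if (bufsize > mss) {
		bufsize = cmp_roundup(bufsize, mss);
		if (bufsize > sb_max)
			bufsize = sb_max;
		hiwat = bufsize;	/* models sbreserve() */
	}
	return hiwat;
}

/* Receive-side logic with the patch: resize only if the buffer grows. */
static unsigned long
rcvbuf_after_patch(unsigned long hiwat, unsigned long recvpipe,
    unsigned long mss, unsigned long sb_max)
{
	unsigned long bufsize = recvpipe != 0 ? recvpipe : hiwat;

	if (bufsize > mss) {
		bufsize = cmp_roundup(bufsize, mss);
		if (bufsize > sb_max)
			bufsize = sb_max;
		if (bufsize > hiwat)
			hiwat = bufsize;	/* models sbreserve() */
	}
	return hiwat;
}
```

For the loopback case (hiwat 65535, recvpipe 49152, mss 14336, sb_max
262144), the first function returns 57344 and the second keeps 65535.
A socket that starts at 16384 still grows to 57344 in both.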


Cheers,
-- 
Ruslan Ermilov		Oracle Developer/DBA,
ru@sunbay.com		Sunbay Software AG,
ru@FreeBSD.org		FreeBSD committer,
+380.652.512.251	Simferopol, Ukraine

http://www.FreeBSD.org	The Power To Serve
http://www.oracle.com	Enabling The Information Age

--HcAYCG3uE/tztfnV
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=p

Index: tcp_input.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_input.c,v
retrieving revision 1.107.2.8
diff -u -p -r1.107.2.8 tcp_input.c
--- tcp_input.c	2001/04/18 17:55:23	1.107.2.8
+++ tcp_input.c	2001/06/05 11:55:03
@@ -2786,7 +2786,8 @@ tcp_mss(tp, offer)
 		bufsize = roundup(bufsize, mss);
 		if (bufsize > sb_max)
 			bufsize = sb_max;
-		(void)sbreserve(&so->so_snd, bufsize, so, NULL);
+		if (bufsize > so->so_snd.sb_hiwat)
+			(void)sbreserve(&so->so_snd, bufsize, so, NULL);
 	}
 	tp->t_maxseg = mss;
 
@@ -2798,7 +2799,8 @@ tcp_mss(tp, offer)
 		bufsize = roundup(bufsize, mss);
 		if (bufsize > sb_max)
 			bufsize = sb_max;
-		(void)sbreserve(&so->so_rcv, bufsize, so, NULL);
+		if (bufsize > so->so_rcv.sb_hiwat)
+			(void)sbreserve(&so->so_rcv, bufsize, so, NULL);
 	}
 
 	/*

--HcAYCG3uE/tztfnV--

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message



