Date:      Mon, 3 Dec 2001 02:07:27 -0500 (EST)
From:      Mike Silbersack <silby@silby.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        Luigi Rizzo <rizzo@aciri.org>, Richard Sharpe <sharpe@ns.aus.com>, <freebsd-hackers@FreeBSD.ORG>
Subject:   Re: Patch #3 (TCP / Linux / Performance)
Message-ID:  <Pine.BSF.4.30.0112030205080.46337-100000@niwun.pair.com>
In-Reply-To: <200112030536.fB35ac395075@apollo.backplane.com>


This part of the thread sounds really familiar.  I recall someone coming
up with a patch for this a few weeks ago, possibly committing it to
-current.  I'm too tired and it's too late, though; I'll look for it
tomorrow if Matt doesn't find the thread in the archives first.

Mike "Silby" Silbersack


On Sun, 2 Dec 2001, Matthew Dillon wrote:

>
> :curious, as the loopback's MTU is normally 16384.
> :Also, any idea on where does the 4096 limit (1460*2+1176) come from ?
> :
> :	cheers
> :	luigi
>
>     It comes from the size of an mbuf, which is 2K.  If you are trying to
>     send 4100 bytes of data what winds up happening is this:
>
> 	* construct 2048 byte mbuf and queue	(TF_MORETOCOME set)
> 		1460 byte packet gets pushed out
> 	* construct 2048 byte mbuf and queue	(TF_MORETOCOME set)
> 		1460 byte packet gets pushed out
> 		(1176 bytes left over in mbuf)
> 	    <<--- ack is received (semi synchronous)
> 		1176 bytes in transmit buffer are pushed out due to the ack
> 	* construct 4 byte mbuf and queue 	(TF_MORETOCOME clear)
> 		4 bytes are pushed out due to TCP_NODELAY being set.
>
>     There are two localhost MTUs.  If you use 'localhost' the MTU is 16384.
>     If you use the IP address of an ethernet interface on the machine the
>     MTU winds up being 1500 even though it is effectively a localhost
>     connection.  An MTU of 1500 generates the 1460 byte push-outs.
>
>     However, even with an MTU of 16384 you still have the same problem when
>     sending, say, 16384+2052 bytes of data.  After it pushes out a 16384 byte
>     segment it winds up with 2048 bytes queued in the mbuf and a
>     received ack (again, semi synchronous because this is localhost) will
>     cause it to push out the 2048 bytes prematurely, before the last 4 bytes
>     can get queued.
>
>     What we need is a mechanism in the tcp_input() code to NOT call
>     tcp_output() when an ACK is received, under certain circumstances.
>     I was thinking of taking the TF_MORETOCOME flag and causing it to be
>     left set for the duration of the write (except for the last sub-write).
>     At the moment it is set and cleared for each sub-write and the ack wiggles
>     its way in while it happens to be clear.  In any case, this would allow
>     tcp_input() to skip calling tcp_output() prematurely.  But it isn't so
>     easy to implement since the TF_ flags are in the 'tp' structure, not
>     the 'so' socket structure, and higher levels do not have direct access
>     to the tcp-specific 'tp' structure.
>
> 					-Matt
> 					Matthew Dillon
> 					<dillon@backplane.com>
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message
>
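
The segment sizes in Matt's walkthrough can be reproduced with a small
user-space model.  This is only a sketch, not kernel code: the function
name, its parameters, and the rule that an ACK flushes whatever partial
data is queued are illustrative assumptions.  It assumes 2048-byte
sub-writes, takes the segment size straight from the numbers in the
message, and uses hold_moretocome to model the proposed behaviour of
leaving TF_MORETOCOME set for the whole write.

```python
def simulate_write(total, mss, cluster=2048, ack_after=(), hold_moretocome=False):
    """Return the segment sizes emitted for one write() of `total` bytes.

    Hypothetical model, not the FreeBSD implementation.

    total           -- bytes handed to a single write()
    mss             -- largest payload sent in one segment
    cluster         -- bytes queued per sub-write (2048 in the message)
    ack_after       -- 1-based sub-write numbers after which an ACK arrives
    hold_moretocome -- if True, an ACK arriving mid-write does not flush
                       a partial segment (the proposed fix)
    """
    segments = []
    queued = 0
    remaining = total
    subwrite = 0
    while remaining > 0:
        subwrite += 1
        chunk = min(cluster, remaining)
        remaining -= chunk
        queued += chunk
        more_to_come = remaining > 0

        # Full-sized segments are always pushed out.
        while queued >= mss:
            segments.append(mss)
            queued -= mss

        # Last sub-write: the flag is clear, so the remainder goes out too.
        if queued and not more_to_come:
            segments.append(queued)
            queued = 0

        # An ACK arriving between sub-writes sees the flag clear and
        # flushes the partial data, unless the flag is held for the write.
        if more_to_come and subwrite in ack_after and queued and not hold_moretocome:
            segments.append(queued)
            queued = 0
    return segments


# 4100-byte write over the 1500-byte MTU path, ACK after the second sub-write.
print(simulate_write(4100, 1460, ack_after={2}))
# Same write with the flag held across the whole write.
print(simulate_write(4100, 1460, ack_after={2}, hold_moretocome=True))
```

Under these assumptions the first call yields the 1460/1460/1176/4 split
from the walkthrough, and the second folds the trailing 4 bytes into the
final segment instead of sending them on their own.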





