Date:      Sun, 2 Dec 2001 21:36:38 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Luigi Rizzo <rizzo@aciri.org>
Cc:        Richard Sharpe <sharpe@ns.aus.com>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: Patch #3 (TCP / Linux / Performance)
Message-ID:  <200112030536.fB35ac395075@apollo.backplane.com>
References:  <20011128153817.T61580@monorchid.lemis.com> <15364.38174.938500.946169@caddis.yogotech.com> <20011128104629.A43642@walton.maths.tcd.ie> <5.1.0.14.1.20011130181236.00a80160@postamt1.charite.de> <200111302047.fAUKlT811090@apollo.backplane.com> <200111302130.fAULUU324648@apollo.backplane.com> <3C08CF9D.2030109@ns.aus.com> <200112012138.fB1LcG837063@apollo.backplane.com> <200112020810.fB28Arr77757@apollo.backplane.com> <20011202204702.A54149@iguana.aciri.org>

:curious, as the loopback's MTU is normally 16384.
:Also, any idea on where does the 4096 limit (1460*2+1176) come from ?
:
:	cheers
:	luigi

    It comes from the size of an mbuf cluster, which is 2K.  If you are
    trying to send 4100 bytes of data, what winds up happening is this:

	* construct 2048 byte mbuf and queue	(TF_MORETOCOME set)
		1460 byte packet gets pushed out
	* construct 2048 byte mbuf and queue	(TF_MORETOCOME set)
		1460 byte packet gets pushed out
		(1176 bytes left over in mbuf)
	    <<--- ack is received (semi synchronous)
		1176 bytes in transmit buffer are pushed out due to the ack
	* construct 4 byte mbuf and queue 	(TF_MORETOCOME clear)
		4 bytes are pushed out due to TCP_NODELAY being set.
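
    The sequence above can be reproduced with a small simulation.  This is
    only a sketch under stated assumptions, not the kernel's logic: the
    names simulate_write, ack_every and more_to_come are hypothetical, the
    receiver is assumed to ack every second segment, and the ack is assumed
    to be processed immediately (loopback) with TF_MORETOCOME clear.

```python
MCLBYTES = 2048  # size of one mbuf cluster, per the explanation above


def simulate_write(total, mss, cluster=MCLBYTES, ack_every=2):
    """Return the segment sizes emitted for one write of `total` bytes.

    Assumptions (not taken from the kernel source): TCP_NODELAY is set,
    the peer acks every `ack_every` segments, and the ack is handled
    synchronously between sub-writes, when TF_MORETOCOME is clear.
    """
    segments = []
    queued = 0      # bytes sitting in the send buffer
    unacked = 0     # segments sent since the last ack
    remaining = total

    def tcp_output(more_to_come):
        nonlocal queued, unacked
        # Full-sized segments always go out.
        while queued >= mss:
            segments.append(mss)
            queued -= mss
            unacked += 1
        # A partial segment goes out only if no more data is promised.
        if queued and not more_to_come:
            segments.append(queued)
            queued = 0
            unacked += 1

    while remaining:
        chunk = min(cluster, remaining)
        remaining -= chunk
        queued += chunk
        # Sub-write: TF_MORETOCOME is set unless this is the last chunk.
        tcp_output(more_to_come=remaining > 0)
        # Ack arrives between sub-writes, while the flag is clear.
        if unacked >= ack_every:
            unacked = 0
            tcp_output(more_to_come=False)
    return segments


print(simulate_write(4100, 1460))
```

    With 4100 bytes and a 1460 byte segment size this yields four segments
    of 1460, 1460, 1176 and 4 bytes, matching the trace above.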

    There are two localhost MTUs.  If you use 'localhost' the MTU is 16384.
    If you use the IP address of an ethernet interface on the machine the
    MTU winds up being 1500 even though it is effectively a localhost
    connection.  An MTU of 1500 generates the 1460 byte push-outs.
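
    The 1460 figure is just the MTU minus the headers.  A minimal sketch of
    the arithmetic, assuming 20 byte IPv4 and 20 byte TCP headers with no
    options (TCP options such as timestamps would shrink the payload
    further); mss_from_mtu is a hypothetical helper name:

```python
IP_HEADER = 20   # IPv4 header without options (assumption)
TCP_HEADER = 20  # TCP header without options (assumption)


def mss_from_mtu(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


print(mss_from_mtu(1500))    # ethernet-address case
print(mss_from_mtu(16384))   # 'localhost' case
```

    For an MTU of 1500 this gives 1460, and two such segments plus the
    1176 byte remainder account for the 4096 bytes held in two 2K mbufs.
    For the loopback MTU the payload comes out slightly below 16384.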

    However, even with an MTU of 16384 you still have the same problem when
    sending, say, 16384+2052 bytes of data.  After it pushes out a 16384 byte
    segment it winds up with 2048 bytes queued in the mbuf and a
    received ack (again, semi synchronous because this is localhost) will
    cause it to push out the 2048 bytes prematurely, before the last 4 bytes
    can get queued.

    What we need is a mechanism in the tcp_input() code to NOT call 
    tcp_output() when an ACK is received, under certain circumstances.
    I was thinking of taking the TF_MORETOCOME flag and causing it to be
    left set for the duration of the write (except for the last sub-write).
    At the moment it is set and cleared for each sub-write and the ack wiggles
    its way in while it happens to be clear.  In any case, this would allow
    tcp_input() to skip calling tcp_output() prematurely.  But it isn't so
    easy to implement since the TF_ flags are in the 'tp' structure, not
    the 'so' socket structure, and higher levels do not have direct access
    to the tcp-specific 'tp' structure.
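
    The effect of the idea can be illustrated with a toy model.  This is a
    sketch under stated assumptions, not a patch: segments_for_write and
    hold_flag are hypothetical names, the peer is assumed to ack every
    second segment, acks are assumed to be handled synchronously between
    sub-writes, and TCP_NODELAY is assumed to be set.

```python
def segments_for_write(total, mss, hold_flag, cluster=2048, ack_every=2):
    """Segment sizes for one write, with or without the proposed change.

    hold_flag=False models the current behaviour: TF_MORETOCOME is clear
    whenever an ack arrives between sub-writes.
    hold_flag=True models the proposal: the flag stays set until the
    last sub-write, so ack-driven output skips partial segments.
    """
    out = []
    queued = 0
    unacked = 0
    left = total

    def tcp_output(more_to_come):
        nonlocal queued, unacked
        while queued >= mss:              # full segments always go out
            out.append(mss)
            queued -= mss
            unacked += 1
        if queued and not more_to_come:   # partial only if nothing pending
            out.append(queued)
            queued = 0
            unacked += 1

    while left:
        n = min(cluster, left)
        left -= n
        queued += n
        last = left == 0
        tcp_output(more_to_come=not last)
        if unacked >= ack_every:          # ack arrives between sub-writes
            unacked = 0
            tcp_output(more_to_come=hold_flag and not last)
    return out


print(segments_for_write(4100, 1460, hold_flag=False))
print(segments_for_write(4100, 1460, hold_flag=True))
```

    In this model the current behaviour produces 1460, 1460, 1176 and 4
    byte segments, while holding the flag produces 1460, 1460 and a single
    1180 byte tail.  It says nothing about how the flag would be reached
    from the socket layer, which is the hard part described above.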

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

