Date:      Tue, 8 Jan 2002 19:33:25 -0500 (EST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        qa@FreeBSD.org
Subject:   Re: Reduced reliability due to larger socket queue defaults for TCP
Message-ID:  <Pine.NEB.3.96L.1020108192957.32228A-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1020106174749.96223A-100000@fledge.watson.org>

The temptation here is to do one of two things:

(1) Back out the change that increased the default send-queue socket buffer
    size, and restore the tuning(7) advice to increase the value manually, or

(2) Add the following text to the release notes:

	In 4.5-RELEASE, the default socket buffer sizes are increased to
	maximize performance on high-speed networks.  However, under
	some circumstances this can dramatically increase the memory
	requirements of the network subsystem, requiring the kernel
	NMBCLUSTERS setting to be bumped by hand.  This can be set
	using the kern.ipc.nmbclusters tunable.
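
Concretely, what that note asks people to do would look something like the
sketch below (the 8192 is purely illustrative, and I'm assuming the usual
4.x boot-time tunable / kernel option here, since nmbclusters can't be
raised on a running kernel):

	# check current cluster usage and the compiled-in limit
	netstat -m
	sysctl kern.ipc.nmbclusters

	# raise the limit at the next boot via the loader (value illustrative)...
	echo 'kern.ipc.nmbclusters="8192"' >> /boot/loader.conf

	# ...or bake it into the kernel config and rebuild:
	#   options  NMBCLUSTERS=8192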

My temptation is to do a bit of (1), bumping the new default back down
somewhat, possibly also to bump up the keepalive rate, and to stick in this
note.  Reliability==good.
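
For completeness, both halves of that are just sysctl twiddling; a rough
sketch, with purely illustrative numbers (the keepalive timers are in
version-dependent units, so take the knob names rather than any particular
values):

	# partially back out (1): pull the default send buffer back down,
	# e.g. toward the old 16k default (illustrative value)
	sysctl -w net.inet.tcp.sendspace=16384

	# make keepalives apply to all sessions, and inspect the timers
	sysctl -w net.inet.tcp.always_keepalive=1
	sysctl net.inet.tcp.keepidle net.inet.tcp.keepintvl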

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

On Sun, 6 Jan 2002, Robert Watson wrote:

> 
> Recently ran into the following circumstance on a server with about a
> 15-day uptime (and hence a roughly 15-day-old version of -STABLE):
> 
> tcp4       0  33090  204.156.12.50.80       213.197.75.52.2378 FIN_WAIT_1
> tcp4       0  33304  204.156.12.50.80       198.54.202.4.24052 FIN_WAIT_1
> tcp4       0  32120  204.156.12.50.80       24.27.14.83.50129 FIN_WAIT_1
> tcp4       0  33089  204.156.12.50.80       213.197.75.52.2381 FIN_WAIT_1
> tcp4       0  33304  204.156.12.50.80       198.54.202.4.23509 FIN_WAIT_1
> tcp4       0  33304  204.156.12.50.80       212.182.63.102.28130 FIN_WAIT_1
> tcp4       0  33304  204.156.12.50.80       62.233.128.65.13712 FIN_WAIT_1
> tcp4       0  33580  204.156.12.50.80       212.182.13.23.3473 LAST_ACK
> tcp4       0  31856  204.156.12.50.80       198.54.202.4.20584 FIN_WAIT_1
> tcp4       0  31856  204.156.12.50.80       212.182.63.102.29962 LAST_ACK
> tcp4       0  33304  204.156.12.50.80       198.54.202.4.23960 FIN_WAIT_1
> tcp4       0  31482  204.156.12.50.80       213.197.75.52.2373 FIN_WAIT_1
> tcp4       0  32551  204.156.12.50.80       213.197.75.52.2374 FIN_WAIT_1
> 
> (on the order of hundreds of these), resulting in mbufs getting exhausted. 
> maxusers is set to 256, so nmbclusters is 4608, which was previously a
> reasonable default.  Presumably the problem I'm experiencing is that dud
> connections have doubled in capacity due to a larger send queue size. I've
> temporarily dropped the send queue max until I can reboot the machine to
> increase nmbclusters, but this failure mode does seem unfortunate. It's
> also worth considering adding a release note entry indicating that while
> this can improve performance, it can also reduce scalability.  I suppose
> this shouldn't have caught me by surprise, but it did, since that server
> had previously not had a problem... :-) 
> 
> I don't suppose the TCP spec allows us to drain send socket queues in
> FIN_WAIT_1 or LAST_ACK? :-)  Any other bright suggestions on ways we can
> make this change "safer"?
> 
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
> robert@fledge.watson.org      NAI Labs, Safeport Network Services
> 
> 
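
As a stopgap while waiting for that reboot, the damage from sessions like
the ones above can at least be watched.  A rough sketch, assuming the
netstat column layout shown above (Send-Q in the third field, state in the
sixth):

	# how many dud sessions are still holding send-queue data?
	netstat -an | egrep 'FIN_WAIT_1|LAST_ACK' | wc -l

	# roughly how many bytes (and hence mbuf clusters) are they pinning?
	netstat -an | awk '$6 == "FIN_WAIT_1" || $6 == "LAST_ACK" { b += $3 }
	    END { print b, "bytes queued" }'

	# how close is that to the cluster limit?
	netstat -m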





