From owner-cvs-all@FreeBSD.ORG Wed Sep 6 15:01:53 2006
Date: Wed, 6 Sep 2006 19:01:30 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Mike Silbersack
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/netinet in_pcb.c tcp_subr.c tcp_timer.c tcp_var.h
Message-ID: <20060906150129.GT40020@FreeBSD.org>
In-Reply-To: <20060906093553.L6691@odysseus.silby.com>

  Mike,

On Wed, Sep 06, 2006 at 09:49:15AM -0500, Mike Silbersack wrote:
M> >Then we found the CPU hog in in_pcblookup_local(). I've added
M> >counters and gathered stats via ktr(4). When a lag occurred, the
M> >following data was gathered:
M> >
M> >112350 return 0x0, iterations 0, expired 0
M> >112349 return 0xc5154888, iterations 19998, expired 745
M>
M> Ah, I think I see what's happening. It's probably spinning because the
M> heuristic isn't triggering on each entry; that doesn't surprise me. What
M> does surprise me is that it's expiring more than one entry - my original
M> intent with that code was for it to free just one entry, which it would
M> then use... meaning that I goofed up the implementation.
M>
M> I had been thinking of rewriting that heuristic anyway; I'm sure that I
M> can go back and find something far more efficient if you give me a few
M> days. (Or a week.)

I think we should free the oldest tcptw entry in the case where we can't
find a local endpoint. We can tell for certain that no endpoint is
available only in in_pcbbind_setup(), in the "do {} while
(in_pcblookup_local)" loop where EADDRNOTAVAIL is returned. We can't tell
this in in_pcblookup_local() itself, since it doesn't know whether the
port being tried is the last one. The oldest tcptw entry can be taken
straight from the ordered list, the same way tcp_timer_2msl_tw() does.

However, I don't like the idea of "finding" a free port at all. It makes
connect()/bind() performance dependent on the number of busy endpoints.
Shouldn't we use an algorithm where free endpoints are stored, so that we
don't need to _find_ one, but can simply _take_ one?

M> >1.78 hasn't yet been merged to RELENG_6, and we faced the problem on
M> >RELENG_6 boxes where the periodic merging cycle is present. So the
M> >problem is not in 1.78 of tcp_timer.c. We have a lot of tcptw entries
M> >because we have a very big connection rate, not because they are
M> >leaked or not purged.
M>
M> Ok, just checking.
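The "take, don't find" idea above could be sketched roughly as below. This
is a hypothetical user-space model, not actual FreeBSD code: free ports of
the ephemeral range are kept on a stack, so allocation is O(1) regardless
of how many endpoints are busy; all names (port_pool_init, port_take,
port_release) are made up for illustration.

```c
#include <assert.h>

/*
 * Sketch: keep every free ephemeral port on a stack so that bind()
 * can take one in O(1), instead of probing candidate ports through
 * in_pcblookup_local() until an unused one turns up.
 */
#define PORT_MIN 49152
#define PORT_MAX 65535
#define NPORTS   (PORT_MAX - PORT_MIN + 1)

static unsigned short freelist[NPORTS];
static int nfree;

/* Fill the stack with the whole ephemeral range. */
static void port_pool_init(void)
{
	nfree = 0;
	for (int p = PORT_MIN; p <= PORT_MAX; p++)
		freelist[nfree++] = (unsigned short)p;
}

/* O(1) allocation: pop a known-free port; 0 means the range is exhausted. */
static unsigned short port_take(void)
{
	if (nfree == 0)
		return 0;
	return freelist[--nfree];
}

/* O(1) release when the endpoint (or its tcptw entry) goes away. */
static void port_release(unsigned short p)
{
	freelist[nfree++] = p;
}
```

In the real stack the pool would have to be per (laddr, faddr, fport) or
guarded by the pcbinfo lock, but the cost model is the point: no
iteration over busy endpoints on the allocation path.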
M>
M> With this code removed, are you not seeing the web frontends delaying new
M> connections when they can't find a free port to use?

No. We monitor the number of entries in the tcptw zone, and it is the same
as before. So the periodic cycle purges tcptw states at the same rate as
in_pcblookup_local() did, except that it consumes less CPU time doing so.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE