From owner-cvs-all@FreeBSD.ORG Wed Sep 6 15:01:53 2006
Date: Wed, 6 Sep 2006 19:01:30 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Mike Silbersack
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/netinet in_pcb.c tcp_subr.c tcp_timer.c tcp_var.h
Message-ID: <20060906150129.GT40020@FreeBSD.org>
In-Reply-To: <20060906093553.L6691@odysseus.silby.com>

  Mike,

On Wed, Sep 06, 2006 at 09:49:15AM -0500, Mike Silbersack wrote:
M> >Then we found the CPU hog in in_pcblookup_local(). I've added
M> >counters and gathered stats via ktr(4). When a lag occurred, the
M> >following data was gathered:
M> >
M> >112350 return 0x0, iterations 0, expired 0
M> >112349 return 0xc5154888, iterations 19998, expired 745
M>
M> Ah, I think I see what's happening. It's probably spinning because the
M> heuristic isn't triggering on each entry; that doesn't surprise me. What
M> does surprise me is that it's expiring more than one entry - my original
M> intent with that code was for it to free just one entry, which it would
M> then use... meaning that I goofed up the implementation.
M>
M> I had been thinking of rewriting that heuristic anyway; I'm sure that I
M> can go back and find something far more efficient if you give me a few
M> days. (Or a week.)

I think we should free the oldest tcptw entry in the case where we can't
find a local endpoint. We can tell for certain that no endpoint is
available only in in_pcbbind_setup(), in the "do {} while
(in_pcblookup_local)" loop where EADDRNOTAVAIL is returned. We can't tell
this in in_pcblookup_local() itself, since it doesn't know whether the
port being tried is the last one. The oldest tcptw entry can be taken
straight from the ordered list, the same way tcp_timer_2msl_tw() does.

However, I don't like the idea of "finding" a free port at all. It makes
connect()/bind() performance dependent on the number of busy endpoints.
Shouldn't we use an algorithm where free endpoints are stored, so that we
don't need to _find_ one, but can simply _take_ one?

M> >1.78 hasn't yet been merged to RELENG_6, and we faced the problem on
M> >RELENG_6 boxes where the periodic merging cycle is present. So the
M> >problem is not in 1.78 of tcp_timer.c. We have a lot of tcptw entries
M> >because we have a very big connection rate, not because they are
M> >leaked or not purged.
M>
M> Ok, just checking.
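The "take, don't find" idea above could be sketched roughly as below. This
is a hypothetical user-space model, not actual FreeBSD code: free ports of
the ephemeral range are kept on a stack, so allocation is O(1) regardless
of how many endpoints are busy; all names (port_pool_init, port_take,
port_release) are made up for illustration.

```c
#include <assert.h>

/*
 * Sketch: keep every free ephemeral port on a stack so that bind()
 * can take one in O(1), instead of probing candidate ports through
 * in_pcblookup_local() until an unused one turns up.
 */
#define PORT_MIN 49152
#define PORT_MAX 65535
#define NPORTS   (PORT_MAX - PORT_MIN + 1)

static unsigned short freelist[NPORTS];
static int nfree;

/* Fill the stack with the whole ephemeral range. */
static void port_pool_init(void)
{
	nfree = 0;
	for (int p = PORT_MIN; p <= PORT_MAX; p++)
		freelist[nfree++] = (unsigned short)p;
}

/* O(1) allocation: pop a known-free port; 0 means the range is exhausted. */
static unsigned short port_take(void)
{
	if (nfree == 0)
		return 0;
	return freelist[--nfree];
}

/* O(1) release when the endpoint (or its tcptw entry) goes away. */
static void port_release(unsigned short p)
{
	freelist[nfree++] = p;
}
```

In the real stack the pool would have to be per (laddr, faddr, fport) or
guarded by the pcbinfo lock, but the cost model is the point: no
iteration over busy endpoints on the allocation path.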
M>
M> With this code removed, are you not seeing the web frontends delaying new
M> connections when they can't find a free port to use?

No. We monitor the number of entries in the tcptw zone, and it is the same
as before. So the periodic cycle purges tcptw states at the same rate as
in_pcblookup_local() did, except that it consumes less CPU time doing so.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE