From owner-freebsd-current  Fri Nov  1 17:10: 3 2002
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3E15537B401
	for <current@freebsd.org>; Fri,  1 Nov 2002 17:10:02 -0800 (PST)
Received: from gull.mail.pas.earthlink.net (gull.mail.pas.earthlink.net [207.217.120.84])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D3C2643E6E
	for <current@freebsd.org>; Fri,  1 Nov 2002 17:10:01 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0092.cvx21-bradley.dialup.earthlink.net ([209.179.192.92] helo=mindspring.com)
	by gull.mail.pas.earthlink.net with esmtp (Exim 3.33 #1)
	id 187mnV-0004uO-00; Fri, 01 Nov 2002 17:09:58 -0800
Message-ID: <3DC32598.A0D0909A@mindspring.com>
Date: Fri, 01 Nov 2002 17:08:40 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Bill Fenner <fenner@research.att.com>
Cc: mime@traveller.cz, current@FreeBSD.ORG
Subject: Re: crash with network load (in tcp syncache ?)
References: <200211012246.gA1Mki5n001478@stash.attlabs.att.com> <3DC31EB0.2B79F42E@mindspring.com> <200211020100.RAA10356@windsor.research.att.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-current.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-current>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-current>
X-Loop: FreeBSD.ORG

Bill Fenner wrote:
> >I think this can still crash (just like my patch); the problem is in
> >what happens when it fails to allocate memory.  Unless you set one of
> >the flags, it's still going to panic in the same place, I think, when
> >you run out of memory.
> 
> No.  The flags are only checked when so_head is not NULL.  sonewconn()
> was handing sofree() an inconsistent struct so (so_head was set without
> being on either queue), i.e. sonewconn() was creating an invalid data
> structure.

You're right... I missed that; I was thinking too hard about the other
situations (e.g. soabort()) that could trigger that code, and not
enough about the code itself.
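
To make the invariant Bill describes concrete, here is a minimal
user-space sketch, not the kernel code.  The struct, the TOY_ flag
values, and toy_sofree_check() are all hypothetical names invented for
illustration; only the idea comes from the discussion above: the queue
flags are consulted only when so_head is non-NULL, so a socket with
so_head set but sitting on neither queue is an inconsistent structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the "on a listen queue" state flags. */
#define TOY_SS_INCOMP 0x1   /* on the listener's incomplete queue */
#define TOY_SS_COMP   0x2   /* on the listener's completed queue  */

/* Toy model of the two fields that matter for this bug. */
struct toy_socket {
    struct toy_socket *so_head;   /* listening socket, or NULL */
    int                so_state;  /* TOY_SS_* flags            */
};

enum toy_result {
    TOY_FREED,               /* no listener: flags never examined        */
    TOY_DEQUEUED_AND_FREED,  /* removed from incomplete queue, then freed */
    TOY_KEPT,                /* on completed queue: left for accept()     */
    TOY_INCONSISTENT         /* so_head set but on neither queue          */
};

/*
 * Assumed shape of the check: the flags only matter when so_head is
 * set.  Where this returns TOY_INCONSISTENT, a kernel would have no
 * sane way to proceed.
 */
static enum toy_result
toy_sofree_check(struct toy_socket *so)
{
    if (so->so_head == NULL)
        return TOY_FREED;
    if (so->so_state & TOY_SS_INCOMP) {
        so->so_state &= ~TOY_SS_INCOMP;
        so->so_head = NULL;
        return TOY_DEQUEUED_AND_FREED;
    }
    if (so->so_state & TOY_SS_COMP)
        return TOY_KEPT;
    return TOY_INCONSISTENT;
}

/* Returns 1 if the failure-path state described above is flagged. */
static int
toy_demo(void)
{
    struct toy_socket listener = { NULL, 0 };
    struct toy_socket half_built = { &listener, 0 };  /* head set, no queue */

    return toy_sofree_check(&half_built) == TOY_INCONSISTENT;
}
```

In this model, running out of memory is not what matters: a socket
whose so_head is still NULL is freed without the flags ever being
looked at, and only the half-built state (so_head set, no queue flag)
lands in the inconsistent branch.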

> The call in sonewconn() used to be to sodealloc(), which didn't care
> about whether or not the data structure was self-consistent.  The code
> was refactored to do reference counting, but the fact that the socket
> was inconsistent at that point wasn't noticed until now.

Yeah; I looked at doing a ref() of the thing as a partial fix,
but the unref() did the sotryfree() anyway.


> The problem is not at all based on what happens in the allocation or
> protocol attach failure cases.  The SYN cache is not involved, this is
> a bug in sonewconn(), plain and simple.

I still think there is a potential failure case, but the amount of
code you'd have to read through to follow it is immense.  It has to
do with the connection completing at NETISR, instead of in a process
context, in the allocation failure case.  I ran into the same issue
when trying to run connections to completion up to the accept() at
interrupt, in the LRP case.  The SYN cache case is very similar, in
the case of a cookie that hits when there are no resources remaining.
He might be able to trigger it with his setup, by setting the cache
size way, way down, and thus relying on cookies, and then flooding it
with connection requests until he runs it out of resources.

-- Terry
