From: Robert Watson
Date: Fri, 1 Sep 2006 10:46:03 +0100 (BST)
To: Václav Haisman
Cc: freebsd-stable@freebsd.org
Subject: Re: malloc(M_WAITOK) of "g_bio", forcing M_NOWAIT with non-sleepable locks held:
Message-ID: <20060901104141.J4921@fledge.watson.org>
In-Reply-To: <44F67DC2.1060900@sh.cvut.cz>

On Thu, 31 Aug 2006, Václav Haisman wrote:

> I found this in logs of 6.1 box that I admin this morning. The machine keeps
> running after that.

Indeed, there does appear to be a problem in the TCP socket option code with
respect to performing copyin/copyout while holding the inpcb lock. This
problem is not present in the IP layer socket option code. However, the code
between HEAD and 6-STABLE here differs significantly, so fixing this will
require different changes in the two branches. Could you file a problem
report on this, and forward me the PR receipt? I'm on travel in India
currently, with mixed connectivity, so it may be a little bit before I can get
to fixing the problem.

In principle, the risk here is a deadlock, but the fix is a little
complicated: if we release the lock there, the state of the TCP socket can
change, so when the code resumes after the copyin/copyout, it needs to
validate that the operation is still valid on the socket (i.e., that the
connection hasn't been reset during the system call, perhaps while the
application is blocked waiting on disk I/O for a paged-out page that contains
the socket option). This is very unlikely to trigger in practice, and the
warning there is quite conservative, but it needs to be addressed properly.

Thanks for the report,

Robert N M Watson
Computer Laboratory
University of Cambridge
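
To illustrate the sequence described above (release the lock, do the copy
that may sleep, reacquire the lock, then revalidate the connection), here is a
minimal userspace sketch. It is not the FreeBSD kernel code and not the
eventual fix: a pthread mutex stands in for the inpcb lock, and struct conn,
copy_fn, plain_copy, copy_with_reset and conn_set_nodelay are all hypothetical
names invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a protocol control block (not struct inpcb). */
struct conn {
    pthread_mutex_t lock;   /* plays the role of the inpcb lock */
    bool dropped;           /* set when the connection has been reset */
    int nodelay;            /* example socket option value */
};

/* Hypothetical hook standing in for a copyin() that may fault and sleep. */
typedef int (*copy_fn)(void *dst, const void *src, size_t len, struct conn *c);

/* Ordinary copy: nothing happens to the connection meanwhile. */
static int plain_copy(void *dst, const void *src, size_t len, struct conn *c)
{
    (void)c;
    memcpy(dst, src, len);
    return 0;
}

/* Simulates the connection being reset while the copy is in progress. */
static int copy_with_reset(void *dst, const void *src, size_t len,
    struct conn *c)
{
    memcpy(dst, src, len);
    pthread_mutex_lock(&c->lock);   /* safe: the caller has dropped the lock */
    c->dropped = true;
    pthread_mutex_unlock(&c->lock);
    return 0;
}

/* Set an option without holding the lock across the copy. */
static int conn_set_nodelay(struct conn *c, const void *uval, copy_fn copy)
{
    int val, error;

    pthread_mutex_lock(&c->lock);
    if (c->dropped) {
        pthread_mutex_unlock(&c->lock);
        return ECONNRESET;
    }
    /* Drop the lock before the copy, which may block. */
    pthread_mutex_unlock(&c->lock);

    error = copy(&val, uval, sizeof(val), c);
    if (error != 0)
        return error;

    /* Reacquire and revalidate: the state may have changed meanwhile. */
    pthread_mutex_lock(&c->lock);
    if (c->dropped) {
        pthread_mutex_unlock(&c->lock);
        return ECONNRESET;
    }
    c->nodelay = val;
    pthread_mutex_unlock(&c->lock);
    return 0;
}
```

Calling conn_set_nodelay() with plain_copy applies the option; calling it with
copy_with_reset returns ECONNRESET and leaves the option untouched, which is
the revalidation step the fix needs. Build with -pthread.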