Date: Thu, 23 Feb 2006 18:13:29 -0600 (CST)
From: Greg Rivers
To: Kris Kennaway
Cc: stable@freebsd.org, "Michael R. Wayne"
Subject: Re: Disk I/O system hang on 5.4-RELEASE-p8 i386
Message-ID: <20060223175345.U12100@w10.sac.fedex.com>
In-Reply-To: <20060223235055.GA93873@xor.obsecurity.org>
References: <200602231753.k1NHr8c1079056@manor.msen.com>
 <20060223163849.I12100@w10.sac.fedex.com>
 <20060223235055.GA93873@xor.obsecurity.org>

On Thu, 23 Feb 2006, Kris Kennaway wrote:

>> I believe this issue has made it onto the show-stopper list for
>> 6.1-RELEASE and is being actively worked on.
>
> It's on the todo list, but I don't think it's being worked on yet.
> The main problem is that we need a way to reproduce it on command.
> I'd forgotten that snapshots are involved, so maybe it's just a matter
> of running lots of mksnap_ffs while I/O is in progress.
>
> kris
>

It happens with or without snapshots, but snapshots are a lot more
likely to make it happen. In my case, approximately 1 in 3 snapshots
will do it. Without snapshots, I get a deadlock about every ten days in
a population of three hosts.

Tor Egge and Don Lewis were kind enough to work with me off-list for a
bit last December. They analyzed several of the core files I produced
and I think they have a fair understanding of what the problems are.
But I wouldn't presume to put words in their mouths; perhaps they'll
give us an update. I see from the todo list that Tor may already be
working on the deadlock for amd64.

I'm at the disposal of anyone who's willing to look into this further.

-- 
Greg