Date: Wed, 12 May 2010 10:44:34 -1000 (HST)
From: Jeff Roberson
To: Ulrich Spörlein
Cc: Attilio Rao, current@freebsd.org, Peter Jeremy
Subject: Re: LOR: ufs vs bufwait

On Wed, 12 May 2010, Ulrich Spörlein wrote:

> On Mon, 10.05.2010 at 22:53:32 +0200, Attilio Rao wrote:
>> 2010/5/10 Peter Jeremy:
>>> On 2010-May-08 12:20:05 +0200, Ulrich Spörlein wrote:
>>>> This LOR also is not yet listed on the LOR page, so I guess it's rather
>>>> new. I do use SUJ.
>>>>
>>>> lock order reversal:
>>>> 1st 0xc48388d8 ufs (ufs) @ /usr/src/sys/kern/vfs_lookup.c:502
>>>> 2nd 0xec0fe304 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_softdep.c:11363
>>>> 3rd 0xc49e56b8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2091
>>>
>>> I'm seeing exactly the same LOR (and subsequent deadlock) on a recent
>>> -current without SUJ.
>>
>> I think this LOR was reported since a long time.
>> The deadlock may be new and someway related to the vm_page_lock work
>> (if not SUJ).
>
> I was not able to reproduce this with a kernel prior to SUJ, a kernel
> just after SUJ went in shows this "deadlock" or infinite loop ...
>
> Now it might be that the SUJ kernel only increases the pressure so it
> happens during a system's uptime. It does not seem directly related to
> actually using SUJ on a volume, as I could reproduce it with SU only,
> too.
>
> I will try to get a hang not involving GELI and also re-do my tests when
> the volumes have neither SUJ nor SU enabled, which led to 10-20s "hangs"
> of the system IIRC. It seems SU/SUJ then only prolongs these hangs ad
> infinitum.

I think Peter Holm also saw this once while we were testing SUJ and
reproduced ~30 second hangs with stock sources. At this point we need to
brainstorm ideas for adding debugging instrumentation and come up with
the quickest possible repro. It would probably be good to add some KTR
tracing and log that when it wedges. The core I looked at was hung in
bufwait.

Is there any CPU activity or I/O activity when things hang? You'll
probably have to keep iostat/vmstat in memory to find out, so they don't
try to fault in pages once things are hung.

Thanks,
Jeff

>
> I'll be back next week with new results here
>
> Uli
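
One way to get a monitor that stays resident, as suggested above, is a small sampler that wires its own pages with mlockall(2) before the hang and then only writes to the terminal. The sketch below is an illustration, not something from this thread: the function names are made up, the MCL_* values are assumed to match <sys/mman.h> on the machine in question, and the load average is only a rough stand-in for what iostat/vmstat report. Locking usually needs root, and the output should go to a console or serial line rather than to a file on the affected filesystem.

```python
import ctypes
import ctypes.util
import os
import sys
import time

# Assumption: these values match <sys/mman.h> on the target system
# (they do on FreeBSD and on Linux x86/arm; check the header elsewhere).
MCL_CURRENT = 0x1
MCL_FUTURE = 0x2


def lock_process_memory():
    """Try to wire all current and future pages; return True on success."""
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return False
    libc = ctypes.CDLL(libc_name, use_errno=True)
    # mlockall(2) returns 0 on success and -1 on failure,
    # for example when the process lacks the privilege to lock memory.
    return libc.mlockall(MCL_CURRENT | MCL_FUTURE) == 0


def format_sample(now, loadavg):
    """Format one line: epoch seconds plus the 1/5/15-minute load averages."""
    one, five, fifteen = loadavg
    return "%d load %.2f %.2f %.2f" % (now, one, five, fifteen)


def run(interval=1.0, count=None, out=sys.stdout, sampler=os.getloadavg):
    """Write one sample per interval; loop until interrupted if count is None."""
    done = 0
    while count is None or done < count:
        out.write(format_sample(time.time(), sampler()) + "\n")
        out.flush()
        done += 1
        if count is None or done < count:
            time.sleep(interval)


if __name__ == "__main__":
    # Lock first, so nothing has to be paged in once the system wedges.
    sys.stdout.write("memory locked: %s\n" % lock_process_memory())
    run()
```

Start it before triggering the hang and leave it running in the foreground; if the samples keep arriving while everything else is stuck, that at least shows the scheduler is still alive.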